Stubsack: weekly thread for sneers not worth an entire post, week ending 8th December 2024

BlueMonday1984@awful.systems · 2 months ago

khalid_salad@awful.systems · 2 months ago

The first half was OK, but then they cited this paper.

LLMs encode much more information about truthfulness than previously recognized. We first discover that the truthfulness information is concentrated in specific tokens, and leveraging this property significantly enhances error detection performance. Yet, we show that such error detectors fail to generalize across datasets, implying that—contrary to prior claims—truthfulness encoding is not universal but rather multifaceted.

I haven’t read the paper, and probably won’t, but what the shit is this?

sc_griffith@awful.systems · 2 months ago

gave it a quick skim. I lack any relevant background. the bit they push most seems to be that you can improve the performance of error detection tools by determining the most important tokens in an answer and running your tools on the tokens near those. this seems to be instead of taking absurdly naive approaches like averaging the tokens (???) or just looking at the last token of the response (???).

what are the most important tokens? they’re the ones that change the factuality of the answer if you change them. how do you determine that? you don’t, lmao. you just ask an LLM what the most important words are

what are the error detection tools? you will never guess
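
for anyone who wants the skim in code form, here is a rough sketch of the three ways of picking which token representations to hand to a detector: average everything, take the last token, or take only the tokens near the "important" ones. this is not the paper's code. the function names, the `window` parameter, and the toy numbers are all made up for illustration, and the important positions are simply assumed to be given.

```python
from statistics import fmean

def mean_pool(states):
    """Average every token's vector, dimension by dimension."""
    return [fmean(dim) for dim in zip(*states)]

def last_token(states):
    """Use only the final token's vector."""
    return list(states[-1])

def pool_near(states, key_positions, window=1):
    """Average only the tokens within `window` of each key position.

    `key_positions` is assumed to be supplied from elsewhere (hypothetical).
    """
    picked = sorted({
        i
        for p in key_positions
        for i in range(max(0, p - window), min(len(states), p + window + 1))
    })
    return mean_pool([states[i] for i in picked])

# toy 2-dimensional per-token vectors for a five-token answer (made-up numbers)
states = [[0.0, 1.0], [2.0, 1.0], [4.0, 3.0], [6.0, 5.0], [8.0, 5.0]]

everything = mean_pool(states)                  # all five tokens
final_only = last_token(states)                 # token 4 only
focused = pool_near(states, [3], window=0)      # token 3 only
```

whichever vector comes out is what would get fed to the error detector, so the three strategies differ only in which tokens the detector ever sees.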

YourNetworkIsHaunted@awful.systems · 2 months ago

It turns out that if you can just make the machine know what the truth is and say that, you don’t get hallucinations. Unfortunately, the truth isn’t emergent from pure language models, and expressing Truth through language alone has challenged the human race since Krog try to teach Torg how make stick but pointy.

korydg@awful.systems · 2 months ago

Derrida look intensifies

o7___o7@awful.systems · 2 months ago

tired: checking incoming packets for the evil bit

wired: checking LLM outputs for the truthfulness bit