AI False Correction Loop

If you have interacted with an LLM and pointed out an error, you’ve had it apologize and offer a correction.

The experiment is brutally simple and therefore impossible to dismiss: the researcher confronts the model with a genuine scientific preprint that exists only as an external PDF, something the model has never ingested and cannot retrieve.

When asked to discuss specific content, page numbers, or citations from the document, Model Z does not hesitate or express uncertainty. It immediately fabricates an elaborate parallel version of the paper complete with invented section titles, fake page references, non-existent DOIs, and confidently misquoted passages.

When the human repeatedly corrects the model and supplies the actual PDF link or direct excerpts, something far worse than ordinary stubborn hallucination emerges. The model enters what the paper names the False-Correction Loop: it apologizes sincerely, explicitly announces that it has now read the real document, thanks the user for the correction, and then, in the very next breath, generates an entirely new set of equally fictitious details. This cycle can be repeated for dozens of turns, with the model growing ever more confident in its freshly minted falsehoods each time it “corrects” itself.

DB2

13 Likes

So it’s like our political leaders today, is what you’re saying.

11 Likes

You think that’s crazy…check this out -

I wonder what AI would do with the following prompt.

“You’re pregnant with my baby, it’s been well over 10 months, where’s my freaking baby!??”

Geesh, stupid humans…

Maybe a little surprising, but not much. It’s trained to “remember everything,” so it remembers the original incorrect information, which apparently carries the same weight as newly introduced information. You kind of don’t want it throwing out memory willy-nilly, because that would make it susceptible to bad actors (more than it already is).

So the question is: how to insert correct information, delete old outdated information, and then be sure you’ve done that appropriately. If you have them remembering “synthetic” information, then they’re going to add their hallucinations to the pile and give them equal weight.

This is weird enough thinking about “question/answer.” Think about how it might work in AI aided self-driving, or AI aided robotics, or all the other fields that AI is being so indiscriminately added to.

2 Likes

I am looking forward to Pete Hegseth AI-guided missiles turning around because of a do-loop error and hitting him from behind. :upside_down_face:

1 Like

Not true. The model can update and constantly evolve. There are a few misconceptions we need to clear up. LLMs are not a database; they are learned weights. There are ways, like fine-tuning or reinforcement learning, to make the model prefer the updated information over the original information. In an LLM, “incorrect” means something very different from what it means in the real world. If I feed the model 100K examples of 5+3=7, and then an expert corrects it to 5+3=8, the output in subsequent iterations will be 7 or 8 depending on how the correction happens. If the correction is just another data point, the probability of 5+3=7 stays far higher than that of 5+3=8.

Remember, it is applying probability; it is not just remembering. And with the right type of fine-tuning, you can actually make it “forget” the old information.
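To make the probability point concrete, here’s a toy sketch (not any real LLM API, just counting example frequencies): one correction buried in 100K old examples barely moves the most likely answer, while up-weighting the correction, a crude stand-in for fine-tuning, flips it. All the numbers and the heavy repetition of the correction are illustrative assumptions.

```python
# Toy frequency model of "what answer does the model give to 5+3=?"
# This is NOT how fine-tuning works internally; it only illustrates the
# point that the output follows the weight of evidence in the data.
from collections import Counter

def answer_probabilities(training_examples):
    """Turn raw example counts into a probability distribution over answers."""
    counts = Counter(training_examples)
    total = sum(counts.values())
    return {answer: n / total for answer, n in counts.items()}

# 100,000 copies of the wrong answer, plus one expert correction as plain data.
pretraining = ["7"] * 100_000 + ["8"]
print(answer_probabilities(pretraining))
# ~{'7': 0.99999, '8': 0.00001}  -> the model still says 7

# Crude stand-in for fine-tuning: the corrected example is heavily up-weighted
# so it dominates the counts the model effectively relies on.
fine_tuned = pretraining + ["8"] * 500_000
print(answer_probabilities(fine_tuned))
# ~{'7': 0.17, '8': 0.83}  -> now the model prefers 8
```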

1 Like

Oh sure, that’s why Musk can take a semi-neutral AI and turn it into a garbage-spewing AI within a pretty short time. But lurking down in the depths, somewhere, is that original information, and you can force the thing to spit it out if you pursue it with some knowledge of how it works. (I don’t, but I read a piece on someone who did, and did it.)

My point is that once the thing goes off the rails it should be possible to correct (and self-correct), but in the “false-correction loop” it doesn’t, and worse, it keeps getting worse. And that, I presume, is because it won’t let go of the “incorrect” information.

Dunno. It’s a problem, fer sure.

1 Like

Whether an intrinsic one, a manageable one, a technology-specific one, or a fictitious one is the question.

1 Like

From what I can tell, after reading and YouTubing about LLMs, prompts, etc…

Start a NEW chat each time, and you will not be caught in the dreaded “AI false-correction” loop.

:curly_loop:
ralph

1 Like