> This means that the content that is more replicated will be considered more true by the system, regardless of its connection to reality or its coherence with the rest of the knowledge in the system.
I understand the problem, but what better way do we currently have to measure its connection to reality? At least from a practical point of view, LLMs seem to have achieved far better performance than other methods in this regard, so how often something is repeated doesn't look like that bad a metric. Or rather, I think it's the best we currently have.
> I understand the problem, but what better way do we currently have to measure its connection to reality?
We can consider its responses to a broader range of questions than those with an unambiguous, well-known answer. Its propensity for making up 'facts' and for fabricating 'explanations' that are incoherent or even self-contradictory shows that any apparent understanding of the world represented in the text is illusory.