Varun Srinivasan pfp
Varun Srinivasan
@v
Is there a credible solution to the LLM hallucination problem? Any interesting research papers or discussions on this?
9 replies
3 recasts
58 reactions

ȷď𝐛𝐛 pfp
ȷď𝐛𝐛
@jenna
Curious whether/how, for you, “hallucination” is distinct from a broad “wrong answer” category (delivered confidently in both cases, ofc lol)
1 reply
0 recast
0 reaction

jj 🛟 pfp
jj 🛟
@jj
https://openai.com/index/webgpt/ It’s been worked on for 4 years now, and it’s come a long way.
0 reply
0 recast
3 reactions

Shashank pfp
Shashank
@0xshash
Amazon recommends using RAG on knowledge sources to get a hallucination score: https://aws.amazon.com/blogs/machine-learning/reducing-hallucinations-in-large-language-models-with-custom-intervention-using-amazon-bedrock-agents/
Research paper: https://www.amazon.science/publications/hallucination-detection-in-llm-enriched-product-listings
0 reply
0 recast
0 reaction
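The scoring idea above can be made concrete. Here is a minimal sketch, assuming hypothetical llm() and retrieve() helpers (a completion call and a knowledge-base lookup); these are illustrative stand-ins, not Bedrock APIs, and the actual intervention flow in the AWS post is more involved.

```python
# Sketch of RAG-based hallucination scoring: retrieve trusted context for
# the question, then ask a judge model whether each claim in the answer is
# supported by that context. llm() and retrieve() are hypothetical helpers.

def hallucination_score(answer: str, question: str) -> float:
    """Fraction of the answer's claims unsupported by retrieved context."""
    context = "\n".join(retrieve(question))  # passages from a knowledge source
    claims = [c.strip() for c in answer.split(".") if c.strip()]
    unsupported = 0
    for claim in claims:
        verdict = llm(
            f"Context:\n{context}\n\n"
            f'Claim: "{claim}"\n'
            "Is the claim supported by the context? Answer YES or NO."
        )
        if verdict.strip().upper().startswith("NO"):
            unsupported += 1
    return unsupported / len(claims) if claims else 0.0
```

The resulting fraction can gate an intervention step, e.g. regenerating or refusing answers whose score exceeds a threshold.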

mk pfp
mk
@mk
Seems like, since hallucinations happen rarely, multiple LLMs working together should be able to detect them. I don’t understand why we keep applying one LLM when our own minds use multiple models.
1 reply
0 recast
0 reaction
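A toy sketch of the multi-model idea above, assuming a hypothetical ask_model() helper and placeholder model names. The premise is that independently trained models rarely confabulate the same detail, so disagreement is a cheap hallucination signal.

```python
# Cross-check a question across several models and flag the answer when
# there is no strict majority. ask_model() is a hypothetical wrapper
# around whichever providers you wire up.

MODELS = ["model-a", "model-b", "model-c"]  # placeholder identifiers

def cross_check(question: str) -> tuple[str, bool]:
    """Return the most common answer and whether it should be flagged."""
    answers = [ask_model(m, question) for m in MODELS]
    counts: dict[str, int] = {}
    for a in answers:
        key = a.strip().lower()               # crude normalization
        counts[key] = counts.get(key, 0) + 1
    best, votes = max(counts.items(), key=lambda kv: kv[1])
    flagged = votes <= len(MODELS) // 2       # no strict majority
    return best, flagged
```

Exact string matching is a deliberately crude comparison; real ensembles would compare answers semantically, e.g. with another judge model.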

Vinay Vasanji pfp
Vinay Vasanji
@vinayvasanji.eth
Perhaps something directionally to do with reasoning capability https://www.machinesforhumans.com/post/unreasonable-ai---the-difference-between-large-language-models-llms-and-human-reasoning
0 reply
0 recast
0 reaction

Stephan pfp
Stephan
@stephancill
How prevalent is it in coding applications? Feels like I never experience LLM hallucination but coding is pretty much my only use case
1 reply
0 recast
0 reaction

Dean Pierce 👨‍💻🌎🌍 pfp
Dean Pierce 👨‍💻🌎🌍
@deanpierce.eth
The main solution is for the agent to provide reputable sources for any claims, but it's usually cheaper and easier for the human operator to do the fact checking. Letting an agent randomly browse around online to do fact checking is slow and expensive, so it's not the default, but it's totally possible.
0 reply
0 recast
0 reaction
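A rough sketch of the source-citation approach above: have the model attach a URL and a verbatim quote to each claim, then mechanically check that the quote appears in the fetched page. llm_with_citations() is a hypothetical helper returning (claim, url, quote) triples; a real pipeline would need HTML stripping, caching, and rate limiting, which is part of why this is slow and expensive.

```python
import urllib.request

# Verify model-cited sources by fetching each URL and checking that the
# quoted snippet actually appears in the page. Naive exact-match check.

def verify_citations(question: str) -> list[tuple[str, bool]]:
    results = []
    for claim, url, quote in llm_with_citations(question):
        try:
            page = urllib.request.urlopen(url, timeout=10).read().decode(
                "utf-8", errors="ignore")
            supported = quote in page        # quote found verbatim in source
        except OSError:
            supported = False                # unreachable source: unsupported
        results.append((claim, supported))
    return results
```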

Idan Levin 🎩 pfp
Idan Levin 🎩
@idanlevin
there is a concept of a truth oracle (a sort of fact-checking service) that some are exploring
0 reply
0 recast
0 reaction

Mkkstacks pfp
Mkkstacks
@mkkstacks
@sophia-indrajaal
1 reply
0 recast
1 reaction