Vitalik Buterin
@vitalik.eth
A quick test of the new llama3 models (and old ones):
L3 70b: https://i.imgur.com/HPgNLnW.png
L3 8b: https://i.imgur.com/HTxZmy9.png
Mixtral 8x7b: https://i.imgur.com/qjEk93V.png
ChatGPT: https://i.imgur.com/DIQ5loP.png
ChatGPT is correct, L3 70b is nearly correct; the others are wrong.
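[Editor's note: a minimal sketch of the kind of side-by-side test described in the cast: send one prompt to several chat models and compare the answers. It assumes the open models (Llama 3 70b/8b, Mixtral 8x7b) are served behind an OpenAI-compatible endpoint such as a local Ollama or vLLM server; the model IDs, base URLs, and the prompt itself are placeholders, not values from the original post.]

```python
from openai import OpenAI

PROMPT = "Your test question here"  # the original post does not show the exact prompt

# (display name, base_url, model id) -- all hypothetical values for illustration
BACKENDS = [
    ("ChatGPT",      "https://api.openai.com/v1", "gpt-4"),
    ("Llama 3 70b",  "http://localhost:11434/v1", "llama3:70b"),
    ("Llama 3 8b",   "http://localhost:11434/v1", "llama3:8b"),
    ("Mixtral 8x7b", "http://localhost:11434/v1", "mixtral:8x7b"),
]

for name, base_url, model in BACKENDS:
    # API key and base_url depend on how each model is hosted
    client = OpenAI(base_url=base_url, api_key="...")
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": PROMPT}],
    )
    print(f"--- {name} ---")
    print(resp.choices[0].message.content)
```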
Hottogo
@hottogo
Love tests like these. Have you seen the recent tests of ChatGPT vs ChatDKG showing ChatGPT hallucinating when it isn't backed by a decentralized RAG in the form of a knowledge graph? Very interesting evolution of these models. https://twitter.com/DrevZiga/status/1755625036486988174?t=AU1KDEXEyeFd43hKHDaUyg&s=19
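[Editor's note: a minimal, self-contained sketch of the idea in the reply: ground an LLM answer in facts retrieved from a knowledge graph rather than the model's memory. The toy triple store, the naive keyword retrieval, and the `ask_llm` stub are all hypothetical; ChatDKG's actual decentralized knowledge graph and retrieval work differently.]

```python
KNOWLEDGE_GRAPH = [
    # (subject, predicate, object) triples -- toy data for illustration
    ("Ethereum", "launched_in", "2015"),
    ("Ethereum", "consensus", "proof of stake"),
]

def retrieve(question: str) -> list[str]:
    """Return triples whose subject appears in the question (naive keyword match)."""
    return [
        f"{s} {p} {o}"
        for s, p, o in KNOWLEDGE_GRAPH
        if s.lower() in question.lower()
    ]

def ask_llm(prompt: str) -> str:
    """Stub standing in for any chat-model call; swap in a real client here."""
    return f"(model answer, grounded in)\n{prompt}"

def answer(question: str) -> str:
    # Retrieved facts are prepended so the model answers from them instead of
    # hallucinating; with no matching facts it falls back to the bare question.
    facts = "\n".join(retrieve(question))
    return ask_llm(f"Facts:\n{facts}\n\nQuestion: {question}")

print(answer("When was Ethereum launched?"))
```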