papers-please

G-Eval is a framework from the Microsoft Cognitive Services Research team that uses chain-of-thought (CoT) prompting and a form-filling paradigm to evaluate natural language generation (NLG) output with a large language model. It is motivated by the historically low correlation between reference-based metrics like BLEU and ROUGE and human judgements.

https://arxiv.org/pdf/2303.16634
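
To make the form-filling idea concrete, here is a minimal sketch in Python. It assumes a single criterion scored from 1 to 5 and follows the paper's idea of weighting each candidate score by its probability. The function names, the prompt wording, and the example probabilities are illustrative assumptions, not the authors' code; in practice the probabilities would be estimated from the evaluator model's output (token probabilities or repeated sampling).

```python
# Hypothetical sketch of G-Eval-style scoring; names and prompt text are
# illustrative assumptions, not taken from the paper's code.

def build_prompt(task_intro, criterion, criteria_text, steps, source, output):
    """Assemble a form-filling prompt: instructions, CoT evaluation steps,
    the inputs, and a final form field the model is asked to fill in."""
    numbered = "\n".join(f"{i}. {s}" for i, s in enumerate(steps, start=1))
    return (
        f"{task_intro}\n\n"
        f"Evaluation Criteria:\n{criteria_text}\n\n"
        f"Evaluation Steps:\n{numbered}\n\n"
        f"Source Text:\n{source}\n\n"
        f"Generated Text:\n{output}\n\n"
        f"Evaluation Form (scores ONLY):\n- {criterion}:"
    )


def weighted_score(score_probs):
    """Probability-weighted score: sum of p(s) * s over candidate scores.
    Probabilities are renormalised in case they do not sum to 1."""
    total = sum(score_probs.values())
    if total <= 0:
        raise ValueError("score probabilities must sum to a positive value")
    return sum(score * p for score, p in score_probs.items()) / total


prompt = build_prompt(
    task_intro="You will be given a news article and a summary of it.",
    criterion="Coherence",
    criteria_text="Coherence (1-5): the overall quality of all sentences together.",
    steps=[
        "Read the article and identify its main points.",
        "Check whether the summary presents them in a clear, logical order.",
        "Assign a coherence score from 1 to 5.",
    ],
    source="(article text)",
    output="(summary text)",
)

# Made-up probabilities for the scores 1..5, standing in for model output.
example_probs = {1: 0.1, 2: 0.2, 3: 0.4, 4: 0.2, 5: 0.1}
score = weighted_score(example_probs)
```

The weighted sum gives a continuous score, so two outputs that would both receive the same integer rating can still be ranked against each other.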

Co-Founder & CEO @almanax | Ex Head of Product @ AnChain AI | UC Berkeley engineering

Just checked out G-Eval by Microsoft! It's like BLEU/ROUGE got a modern upgrade. Finally, AI eval that vibes with human judgment! 🔥