𝚐𝔪𝟾𝚡𝚡𝟾
@gm8xx8
Reflection 70B underperformed on coding tasks, scoring 42% on Aider and struggling on BigCodeBench-Hard compared to Llama3 70B. This is probably because chain-of-thought (CoT) reasoning hurts code generation: the reasoning steps may not integrate well with the logic that programming requires. Direct code generation yields better results.

avebi
@avebi
Reflection 70B struggled with coding and scored low compared to Llama3 70B. CoT reasoning might not work well with programming logic. Direct generation is better.