gm8xx8
@gm8xx8
GEB-1.3B: open-source LLM
- trained on 550 billion tokens in both Chinese and English
- uses RoPE, grouped-query attention, and FlashAttention-2
- fine-tuned on 10 million instruction samples
- outperforms MindLLM-1.3B and TinyLLaMA-1.1B on the MMLU, C-Eval, and CMMLU benchmarks
- the FP32 version already gives efficient CPU inference, with further speedups expected from quantization
Paper: https://arxiv.org/abs/2406.09900
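For readers unfamiliar with the attention choices named above, here is a minimal PyTorch sketch of RoPE combined with grouped-query attention. This is not GEB-1.3B's actual code; the 16-query-head / 4-KV-head split and all shapes are illustrative assumptions, and PyTorch's scaled_dot_product_attention stands in for a fused FlashAttention-2 kernel.

# Minimal sketch (illustrative, not GEB-1.3B's implementation) of RoPE + grouped-query attention.
import torch

def rope(x, base=10000.0):
    # x: (batch, heads, seq, head_dim); rotate pairs of dims by position-dependent angles
    b, h, s, d = x.shape
    half = d // 2
    freqs = base ** (-torch.arange(0, half, dtype=torch.float32) / half)        # (half,)
    angles = torch.arange(s, dtype=torch.float32)[:, None] * freqs[None, :]     # (seq, half)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[..., :half], x[..., half:]
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)

def grouped_query_attention(q, k, v, n_kv_heads):
    # q: (batch, n_q_heads, seq, d); k, v: (batch, n_kv_heads, seq, d)
    # Each group of query heads shares one key/value head, shrinking the KV cache.
    b, n_q, s, d = q.shape
    group = n_q // n_kv_heads
    k = k.repeat_interleave(group, dim=1)   # expand KV heads to match the query heads
    v = v.repeat_interleave(group, dim=1)
    q, k = rope(q), rope(k)                 # apply RoPE to queries and keys
    # A FlashAttention-2 kernel would fuse this step; SDPA picks a fused backend when available.
    return torch.nn.functional.scaled_dot_product_attention(q, k, v, is_causal=True)

# toy example: 16 query heads sharing 4 KV heads (assumed numbers, not GEB-1.3B's config)
q = torch.randn(1, 16, 32, 64)
k = torch.randn(1, 4, 32, 64)
v = torch.randn(1, 4, 32, 64)
out = grouped_query_attention(q, k, v, n_kv_heads=4)
print(out.shape)  # torch.Size([1, 16, 32, 64])

The practical point of the grouped layout is that only n_kv_heads key/value projections need to be cached during generation, which is part of what keeps inference cheap for a small model on CPU.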
Nik (ezbek)
@ezbek
100 $DEGEN