gm8xx8
@gm8xx8
Samba 3.8B
- Mamba + Sliding Window Attention architecture.
- Outperforms Phi3-mini on benchmarks like MMLU, GSM8K, and HumanEval.
- Supports infinite context length with complexity linear in sequence length.
https://github.com/microsoft/Samba
1 reply
4 recasts
53 reactions
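For anyone wondering where the "linear complexity" in the post above comes from on the attention side, here is a minimal sketch of a causal sliding-window mask. It is not taken from the Samba repo: the function names and the window size are made up for illustration, and the Mamba layers are not modeled at all. The point is only that each token attends to at most `window` positions, so the number of attended pairs grows like `seq_len * window` rather than `seq_len ** 2`.

```python
def sliding_window_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Causal sliding-window mask (illustrative, not Samba's code).

    Row i is True at column j when token i may attend to token j,
    i.e. j is one of the last `window` positions up to and including i.
    """
    return [
        [i - window < j <= i for j in range(seq_len)]
        for i in range(seq_len)
    ]


def attended_pairs(seq_len: int, window: int) -> int:
    """Count (query, key) pairs the mask allows."""
    return sum(sum(row) for row in sliding_window_mask(seq_len, window))


if __name__ == "__main__":
    # Hypothetical sizes, chosen only to keep the printout small.
    for row in sliding_window_mask(seq_len=8, window=3):
        print("".join("x" if allowed else "." for allowed in row))
    print(attended_pairs(8, 3), "pairs, bounded by", 8 * 3)
```

With a fixed window, doubling the sequence length roughly doubles the work, whereas full causal attention would roughly quadruple it.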
Jonas γ
@jmaaloe
5 $degen
0 reply
0 recast
0 reaction