gm8xx8
@gm8xx8
I like when Mistral casually drops magnet links ☺️ Mistral releases Pixtral, a 12B VLM with a Mistral Nemo 12B text backbone and a 400M vision adapter. It processes 1024x1024 images in 16x16 pixel patches, has a 131,072-token vocabulary, and includes special image tokens (img, img_break, img_end). The model uses bf16 weights, GeLU in the vision adapter, and 2D RoPE in the vision encoder. I'm looking forward to the inference code release!
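Quick back-of-envelope math on those specs: a 1024x1024 image split into 16x16 patches gives a 64x64 patch grid. A minimal sketch, assuming a standard ViT-style patchifier (names here are illustrative, not Mistral's actual implementation):

```python
# Patch-count arithmetic for the figures quoted in the post.
# Assumption: square images, non-overlapping patches (ViT-style).
IMAGE_SIZE = 1024   # post: processes 1024x1024 images
PATCH_SIZE = 16     # post: 16x16 pixel patches

patches_per_side = IMAGE_SIZE // PATCH_SIZE   # 1024 / 16 = 64
total_patches = patches_per_side ** 2         # 64 * 64 = 4096

print(f"{patches_per_side}x{patches_per_side} grid, "
      f"{total_patches} image tokens per full-res image")
```

So a single full-resolution image costs on the order of 4096 vision tokens before any text, which is why a 131,072-entry vocabulary with dedicated img_break/img_end tokens is needed to delimit rows and images in the sequence.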