gm8xx8
@gm8xx8
MistralAI: Pixtral Large & Mistral Large
Pixtral Large: https://huggingface.co/mistralai/Pixtral-Large-Instruct-2411
- Mistral Commercial License
- 124B open-weights multimodal model built on top of Mistral Large 2
- Combines a 123B multimodal decoder with a 1B-parameter vision encoder.
- 128K context window supports at least 30 high-resolution images.
- SOTA on MathVista, DocVQA, & VQAv2 benchmarks.
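For trying it locally, the model card points to vLLM; below is a minimal multimodal-chat sketch along those lines. The serving flags, parallelism degree, and image URL are illustrative assumptions, and a 124B model will need multiple GPUs:

```python
# Hedged sketch: serving Pixtral Large with vLLM, mirroring the model card's
# recommended stack. Flags and sizes are illustrative, not authoritative.
from vllm import LLM
from vllm.sampling_params import SamplingParams

llm = LLM(
    model="mistralai/Pixtral-Large-Instruct-2411",
    tokenizer_mode="mistral",   # use the tekken tokenizer shipped with the weights
    config_format="mistral",
    load_format="mistral",
    tensor_parallel_size=8,     # 124B weights won't fit on a single GPU
    limit_mm_per_prompt={"image": 4},
)

messages = [{
    "role": "user",
    "content": [
        {"type": "text", "text": "Describe this chart in two sentences."},
        # Placeholder image URL -- swap in your own.
        {"type": "image_url", "image_url": {"url": "https://example.com/chart.png"}},
    ],
}]

outputs = llm.chat(messages, sampling_params=SamplingParams(max_tokens=256))
print(outputs[0].outputs[0].text)
```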
Mistral Large: https://huggingface.co/mistralai/Mistral-Large-Instruct-2411
- Mistral Research License
- Supports dozens of languages, including English, French, German, Chinese, Spanish, and more.
- Trained on 80+ programming languages, including Python, Java, C++, and Bash.
- Excels in agent-based and mathematical tasks, with native function calling and JSON output (see the sketch after this list).
- 128K context window with improved system prompt reliability.
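The function-calling claim is easy to exercise through Mistral's hosted API. Here's a hedged sketch using the mistralai Python SDK (v1-style client); the get_weather tool is a made-up example, not part of the API:

```python
# Hedged sketch: native function calling via the mistralai Python SDK (v1 client).
# The get_weather tool is hypothetical -- define whatever schema your app needs.
import json
from mistralai import Mistral

client = Mistral(api_key="YOUR_API_KEY")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.complete(
    model="mistral-large-latest",
    messages=[{"role": "user", "content": "What's the weather in Paris right now?"}],
    tools=tools,
    tool_choice="auto",
)

# If the model chose to call the tool, its name and JSON arguments come back here.
call = resp.choices[0].message.tool_calls[0]
print(call.function.name, json.loads(call.function.arguments))
```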
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models
paper: arxiv.org/abs/2411.04905
project page: opencoder-llm.github.io
OpenCoder is a family of open and reproducible code language models, available in 1.5B and 8B parameter sizes, supporting both English and Chinese. Pretrained on 2.5 trillion tokens (90% raw code, 10% code-related web data) and fine-tuned with 4.5M high-quality examples, it achieves performance comparable to top-tier models. OpenCoder provides complete model weights, inference code, training data, data processing pipeline, experimental results, and detailed training logs.
> Full transparency: releases model weights, inference code, data-cleaning scripts, synthetic data, intermediate checkpoints, and over 4.5 million fine-tuning entries, making it one of the most transparent code-model releases available.
> High-quality synthetic data: ships the synthetic data generation pipeline alongside the 4.5 million fine-tuning entries, providing a strong foundation for training.
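A quick way to kick the tires is loading the instruct variant with transformers. The repo id below (infly/OpenCoder-8B-Instruct) is my assumption of where the released weights live; adjust if they've moved:

```python
# Hedged sketch: code generation with OpenCoder-8B-Instruct via transformers.
# The repo id is assumed from the project page; adjust if the weights moved.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "infly/OpenCoder-8B-Instruct"  # assumed HF repo id
tok = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "Write a Python function that checks whether a string is a palindrome."}]
inputs = tok.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

out = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```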