𝚐π”ͺ𝟾𝚑𝚑𝟾 pfp

𝚐π”ͺ𝟾𝚑𝚑𝟾

@gm8xx8

187 Following
130,641 Followers


𝚐π”ͺ𝟾𝚑𝚑𝟾 pfp
𝚐π”ͺ𝟾𝚑𝚑𝟾
@gm8xx8
vibe check
2 replies
2 recasts
11 reactions

𝚐π”ͺ𝟾𝚑𝚑𝟾 pfp
𝚐π”ͺ𝟾𝚑𝚑𝟾
@gm8xx8
World Labs is out of stealth, with a team of brilliant minds focused on spatial intelligence. https://www.worldlabs.ai/about
1 reply
0 recast
4 reactions

𝚐π”ͺ𝟾𝚑𝚑𝟾 pfp
𝚐π”ͺ𝟾𝚑𝚑𝟾
@gm8xx8
World Labs is out of stealth, with a team of brilliant minds focused on spatial intelligence. https://www.worldlabs.ai/about
1 reply
0 recast
5 reactions

𝚐π”ͺ𝟾𝚑𝚑𝟾 pfp
𝚐π”ͺ𝟾𝚑𝚑𝟾
@gm8xx8
DataGemma release
DataGemma is a set of open models that addresses hallucination by grounding LLMs in real-world statistical data from Google’s Data Commons, using RIG (Retrieval-Interleaved Generation) and RAG (Retrieval-Augmented Generation) for fact-checking and responsible AI development.
blog: https://blog.google/technology/ai/google-datagemma-ai-llm/
πŸ€—: https://huggingface.co/collections/google/datagemma-release-66df7636084d2b150a4e6643
1 reply
0 recast
3 reactions
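The RAG half of the approach can be sketched as a toy: retrieve a matching statistic from a small table standing in for Data Commons and prepend it to the prompt, so the model answers from data rather than memory. The stats table and function names here are hypothetical; the real pipeline queries the Data Commons API.

```python
# Minimal RAG grounding sketch. STATS is a hypothetical stand-in for
# Google's Data Commons; a real system would query its API instead.
STATS = {
    "california population": "California population (2023): ~38.9 million",
    "renewable energy share": "US renewable electricity share (2023): ~21%",
}

def retrieve(query: str) -> list[str]:
    """Return stored statistics whose key words overlap the query terms."""
    terms = set(query.lower().split())
    return [fact for key, fact in STATS.items() if terms & set(key.split())]

def build_grounded_prompt(question: str) -> str:
    """Prepend retrieved facts so the LLM is grounded in retrieved data."""
    facts = retrieve(question)
    context = "\n".join(facts) if facts else "(no statistics found)"
    return f"Use only these statistics:\n{context}\n\nQuestion: {question}"

print(build_grounded_prompt("What is the population of California?"))
```

RIG differs in that the model itself interleaves Data Commons queries into its generation rather than receiving retrieved context up front.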

𝚐π”ͺ𝟾𝚑𝚑𝟾 pfp
𝚐π”ͺ𝟾𝚑𝚑𝟾
@gm8xx8
The shift toward smaller, more efficient specialized models continues, moving away from the old approach of using one large model for everything in a single pass and toward sequential inference for better performance.
2 replies
3 recasts
15 reactions

𝚐π”ͺ𝟾𝚑𝚑𝟾 pfp
𝚐π”ͺ𝟾𝚑𝚑𝟾
@gm8xx8
o1-mini scored the same as GPT-4o on aider’s code editing benchmark.
1 reply
6 recasts
10 reactions

𝚐π”ͺ𝟾𝚑𝚑𝟾 pfp
𝚐π”ͺ𝟾𝚑𝚑𝟾
@gm8xx8
o1-mini-2024-09-12 on BigCodeBench-Hard:
- Complete: 27.0%
- Instruct: 27.7%
- Average: 27.4%

o1-preview-2024-09-12 on BigCodeBench-Hard:
- Complete: 34.5% (slightly better than Claude-3.5-Sonnet-20240620)
- Instruct: 23.0% (☹︎ ☹︎ ☹︎)
- Average: 28.8%

Both show underwhelming results on SWE-bench and a clear performance gap between pre- and post-mitigation. Notably, o1-mini outperformed o1-preview on SQL-eval, excelling at converting natural language into SQL queries. While o1-preview handles complex questions well, it tends to overthink simpler ones, leading to mistakes. In contrast, o1-mini is less effective on complex questions but consistently gets simpler ones right. Additionally, with temperature=1, results vary across runs.
πŸ”—: https://openai.com/index/openai-o1-system-card/
2 replies
0 recast
6 reactions
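The reported averages are just the mean of the Complete and Instruct splits, rounded to one decimal; a quick arithmetic check:

```python
# Verify the reported BigCodeBench-Hard averages are the mean of
# the Complete and Instruct scores from the post.
scores = {
    "o1-mini-2024-09-12": (27.0, 27.7),     # reported average: 27.4
    "o1-preview-2024-09-12": (34.5, 23.0),  # reported average: 28.8
}
for model, (complete, instruct) in scores.items():
    avg = (complete + instruct) / 2
    print(f"{model}: {avg}")
```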

𝚐π”ͺ𝟾𝚑𝚑𝟾 pfp
𝚐π”ͺ𝟾𝚑𝚑𝟾
@gm8xx8
1 reply
1 recast
7 reactions

𝚐π”ͺ𝟾𝚑𝚑𝟾 pfp
𝚐π”ͺ𝟾𝚑𝚑𝟾
@gm8xx8
OpenAI’s o1 update enhances reasoning through reinforcement learning, enabling step-by-step problem-solving similar to human thought. The longer it β€œthinks,” the better it performs, introducing a new scaling paradigm beyond pretraining. Rather than relying solely on prompting, o1’s chain-of-thought reasoning improves with adaptive compute, which can be scaled at inference time.
- o1 outperforms GPT-4o in reasoning, ranking in the 89th percentile on Codeforces.
- It uses chain-of-thought to break down problems, correct errors, and adapt, though some specifics remain unclear.
- It excels in areas like data analysis, coding, and math.
- o1-preview and o1-mini are available now, with evals showing it’s not just a one-off improvement. Trusted API users will get access soon.
- Results on AIME and GPQA are strong, with o1 showing significant improvement on complex prompts where GPT-4o struggles.
- The system card (https://openai.com/index/openai-o1-system-card/) showcases o1’s best capabilities.
5 replies
5 recasts
28 reactions

𝚐π”ͺ𝟾𝚑𝚑𝟾 pfp
𝚐π”ͺ𝟾𝚑𝚑𝟾
@gm8xx8
Qwen2-VL > Pixtral
3 replies
2 recasts
7 reactions

𝚐π”ͺ𝟾𝚑𝚑𝟾 pfp
𝚐π”ͺ𝟾𝚑𝚑𝟾
@gm8xx8
☺︎
1 reply
0 recast
9 reactions

𝚐π”ͺ𝟾𝚑𝚑𝟾 pfp
𝚐π”ͺ𝟾𝚑𝚑𝟾
@gm8xx8
Fish Speech 1.4 is now open-source.

What’s new:
- Trained on 700k hours of multilingual data (up from 200k)
- Supports 8 languages: English, Chinese, German, Japanese, French, Spanish, Korean, and Arabic
- Fully open-source for developers and researchers

Key features:
- Fast TTS with low latency
- Instant voice cloning
- Self-host or use the cloud service
- Flat-rate pricing

playground β†’ https://fish.audio
1 reply
0 recast
6 reactions

𝚐π”ͺ𝟾𝚑𝚑𝟾 pfp
𝚐π”ͺ𝟾𝚑𝚑𝟾
@gm8xx8
we are not the same.
2 replies
2 recasts
7 reactions

𝚐π”ͺ𝟾𝚑𝚑𝟾 pfp
𝚐π”ͺ𝟾𝚑𝚑𝟾
@gm8xx8
I like when Mistral casually drops magnet links ☺︎
Mistral releases Pixtral, a 12B VLM with a Mistral Nemo 12B text backbone and a 400M vision adapter. It processes 1024x1024 images as 16x16 pixel patches, has a 131,072-token vocabulary, and adds special tokens img, img_break, and img_end. The model uses bf16 weights, GeLU for the vision adapter, and 2D RoPE for the encoder. I’m looking forward to the inference code release!
0 reply
0 recast
10 reactions
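From the numbers in the post, a 1024x1024 image split into 16x16 pixel patches gives a 64x64 grid, i.e. 4,096 patch tokens per full-resolution image before any special tokens. A quick sanity check (pure arithmetic on the stated sizes):

```python
# Patch-count arithmetic for Pixtral's stated 1024x1024 input
# with 16x16 pixel patches.
image_size = 1024
patch_size = 16

patches_per_side = image_size // patch_size  # 64 patches along each axis
total_patches = patches_per_side ** 2        # 4096 patches per image
print(patches_per_side, total_patches)
```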

𝚐π”ͺ𝟾𝚑𝚑𝟾 pfp
𝚐π”ͺ𝟾𝚑𝚑𝟾
@gm8xx8
Mistral released Pixtral 12B, a vision-language model with a 12B text backbone and a 400M vision adapter. It supports a larger vocabulary, new image tokens, 1024x1024 image input, and bf16 weights. Mistral back at it again πŸ”₯ πŸ”—: https://x.com/mistralai/status/1833758285167722836?s=46 πŸ€—: https://huggingface.co/mistral-community/pixtral-12b-240910
0 reply
0 recast
10 reactions

𝚐π”ͺ𝟾𝚑𝚑𝟾 pfp
𝚐π”ͺ𝟾𝚑𝚑𝟾
@gm8xx8
Writing with AI: five ways professional writers are leveraging ChatGPT
πŸ”—: https://openai.com/chatgpt/use-cases/writing-with-ai/
3 replies
1 recast
9 reactions

𝚐π”ͺ𝟾𝚑𝚑𝟾 pfp
𝚐π”ͺ𝟾𝚑𝚑𝟾
@gm8xx8
Open Interpreter 01 App ↓
code: https://github.com/OpenInterpreter/01-app
app: https://apps.apple.com/us/app/01-light/id6601937732
blog: https://changes.openinterpreter.com/log/01-app
2 replies
0 recast
6 reactions

𝚐π”ͺ𝟾𝚑𝚑𝟾 pfp
𝚐π”ͺ𝟾𝚑𝚑𝟾
@gm8xx8
They trained a diffusion policy on 26,000 demonstrations across 5 tasks on the Aloha 2 robot, relying solely on imitation learning. The model has 217 million parameters, and training ran for 2 million steps on 64 TPUv5e chips. Absolute πŸ”₯
πŸ”—: https://openreview.net/pdf?id=gvdXE7ikHI
𝕏: https://x.com/remicadene/status/1832911834208149683?s=46
2 replies
2 recasts
17 reactions

𝚐π”ͺ𝟾𝚑𝚑𝟾 pfp
𝚐π”ͺ𝟾𝚑𝚑𝟾
@gm8xx8
3 replies
1 recast
13 reactions

𝚐π”ͺ𝟾𝚑𝚑𝟾 pfp
𝚐π”ͺ𝟾𝚑𝚑𝟾
@gm8xx8
how far could humans go with the perfect curriculum?
2 replies
0 recast
7 reactions