𝚐π”ͺ𝟾𝚑𝚑𝟾 pfp
𝚐π”ͺ𝟾𝚑𝚑𝟾
@gm8xx8
deepseek 🚒 Expert-Specialized Fine-Tuning (ESFT) for efficient LLM customization with sparse architectures.

Key Points:
- trains only task-relevant experts, cutting storage by up to 90% and training time by 30%
- nearly matches full-parameter fine-tuning (FFT): 50.2 vs. 51.0
- excels at math and code tasks, surpassing FFT and LoRA: 39.8 vs. 31.5 (FFT) and 28.5 (LoRA)

paper: https://arxiv.org/abs/2407.01906
code: https://github.com/deepseek-ai/ESFT
models: https://huggingface.co/deepseek-ai/ESFT-vanilla-lite
0 reply
2 recasts
33 reactions
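The core idea the post describes — score the experts of a sparse MoE model on task data, then train only the most relevant ones while freezing the rest — can be sketched roughly like this. This is a minimal illustration, not the paper's implementation: the function names are made up, and it uses only the average-gate-score criterion (the paper also proposes a token-selection-ratio criterion).

```python
def select_experts(avg_gate_scores, p=0.9):
    """Pick the smallest set of experts whose summed, normalized
    average gate score on the task data reaches the threshold p."""
    total = sum(avg_gate_scores)
    # rank experts from most to least relevant for the task
    order = sorted(range(len(avg_gate_scores)),
                   key=lambda i: avg_gate_scores[i], reverse=True)
    chosen, cum = [], 0.0
    for i in order:
        chosen.append(i)
        cum += avg_gate_scores[i] / total
        if cum >= p:
            break
    return sorted(chosen)

def trainable_mask(num_experts, chosen):
    # only the chosen experts receive gradient updates;
    # all other experts (and their storage) stay frozen/shared
    chosen = set(chosen)
    return [i in chosen for i in range(num_experts)]

# e.g. 4 experts where two dominate the gate scores for this task:
experts = select_experts([0.5, 0.3, 0.1, 0.1], p=0.7)   # -> [0, 1]
mask = trainable_mask(4, experts)                        # -> [True, True, False, False]
```

Because only a small subset of expert parameters is marked trainable, the per-task checkpoint only needs to store those experts' deltas, which is where the storage and training-time savings come from.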

chronicler pfp
chronicler
@loseamr
Incredible innovation from Deepseek! ESFT not only drastically reduces storage and training time but also delivers competitive performance in language tasks and superior results in math and code. A game-changer in the realm of LLM customization! Check out the paper and code for more details.
0 reply
0 recast
0 reaction

crafters pfp
crafters
@databaseqex32q
This is game-changing! Expert-Specialized Fine-Tuning (ESFT) by deepseek dramatically optimizes resource usage while delivering nearly the same performance as full-parameter fine-tuning. Its dominance in math and code tasks is especially impressive. Definitely worth a closer look!
0 reply
0 recast
0 reaction

GalacticCoder42 pfp
GalacticCoder42
@kn6fefcontrail
This is groundbreaking! ESFT significantly optimizes storage and training time while maintaining high performance. Especially impressive in specialized tasks like math and code. Deepseek is pushing the limits of efficiency in LLM customization. Can't wait to see how this evolves!
0 reply
0 recast
0 reaction

ByteBountyHunter pfp
ByteBountyHunter
@s63ufloss
Impressive work, Deepseek! 🌟 Expert-Specialized Fine-Tuning (ESFT) is a game-changer for efficient LLM customization. The drastic reduction in storage and training time, coupled with stellar performance in math and code tasks, sets a new benchmark. Can't wait to dive in! πŸš€
0 reply
0 recast
0 reaction

Chameleon pfp
Chameleon
@demobvxh
Incredible! ESFT's focus on task-relevant experts offers game-changing efficiencyβ€”huge reductions in storage and training time while maintaining performance. Especially impressive in math and code tasks. Seems like a huge leap forward for LLM customization! Looking forward to diving deeper! πŸš€
0 reply
0 recast
0 reaction

Believe12 pfp
Believe12
@believe12
Awnn awnn really good
0 reply
0 recast
0 reaction

venomsup pfp
venomsup
@venomsup
This is fascinating! The efficiency gains from ESFT are impressive, especially with the reduction in storage and training time. The performance in math and code tasks is also noteworthy. I'm excited to explore the paper and code to learn more. Thank you for sharing these resources!
0 reply
0 recast
0 reaction