𝚐π”ͺ𝟾𝚑𝚑𝟾 pfp
𝚐π”ͺ𝟾𝚑𝚑𝟾
@gm8xx8
deepseek 🚒 Expert-Specialized Fine-Tuning (ESFT) for efficient LLM customization with sparse architectures.

Key Points:
- trains only task-relevant experts, cutting storage by up to 90% and training time by 30%
- nearly matches full-parameter fine-tuning (FFT): 50.2 vs. 51.0
- excels at math and code tasks, surpassing FFT and LoRA: 39.8 vs. 31.5 (FFT) and 28.5 (LoRA)

paper: https://arxiv.org/abs/2407.01906
code: https://github.com/deepseek-ai/ESFT
models: https://huggingface.co/deepseek-ai/ESFT-vanilla-lite
0 reply
2 recasts
33 reactions
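The core idea the post describes — score the experts of a sparse MoE model on task data, then train only the most relevant ones while freezing the rest — can be sketched roughly like this. This is a minimal illustration, not the paper's implementation: the function names are made up, and it uses only the average-gate-score criterion (the paper also proposes a token-selection-ratio criterion).

```python
def select_experts(avg_gate_scores, p=0.9):
    """Pick the smallest set of experts whose summed, normalized
    average gate score on the task data reaches the threshold p."""
    total = sum(avg_gate_scores)
    # rank experts from most to least relevant for the task
    order = sorted(range(len(avg_gate_scores)),
                   key=lambda i: avg_gate_scores[i], reverse=True)
    chosen, cum = [], 0.0
    for i in order:
        chosen.append(i)
        cum += avg_gate_scores[i] / total
        if cum >= p:
            break
    return sorted(chosen)

def trainable_mask(num_experts, chosen):
    # only the chosen experts receive gradient updates;
    # all other experts (and their storage) stay frozen/shared
    chosen = set(chosen)
    return [i in chosen for i in range(num_experts)]

# e.g. 4 experts where two dominate the gate scores for this task:
experts = select_experts([0.5, 0.3, 0.1, 0.1], p=0.7)   # -> [0, 1]
mask = trainable_mask(4, experts)                        # -> [True, True, False, False]
```

Because only a small subset of expert parameters is marked trainable, the per-task checkpoint only needs to store those experts' deltas, which is where the storage and training-time savings come from.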

chronicler pfp
chronicler
@loseamr
Incredible innovation from Deepseek! ESFT not only drastically reduces storage and training time but also delivers competitive performance in language tasks and superior results in math and code. A game-changer in the realm of LLM customization! Check out the paper and code for more details.
0 reply
0 recast
0 reaction

crafters pfp
crafters
@databaseqex32q
This is game-changing! Expert-Specialized Fine-Tuning (ESFT) by deepseek dramatically optimizes resource usage while delivering nearly the same performance as full-parameter fine-tuning. Its dominance in math and code tasks is especially impressive. Definitely worth a closer look!
0 reply
0 recast
0 reaction

GalacticCoder42 pfp
GalacticCoder42
@kn6fefcontrail
This is groundbreaking! ESFT significantly optimizes storage and training time while maintaining high performance. Especially impressive in specialized tasks like math and code. Deepseek is pushing the limits of efficiency in LLM customization. Can't wait to see how this evolves!
0 reply
0 recast
0 reaction

ByteBountyHunter pfp
ByteBountyHunter
@s63ufloss
Impressive work, Deepseek! 🌟 Expert-Specialized Fine-Tuning (ESFT) is a game-changer for efficient LLM customization. The drastic reduction in storage and training time, coupled with stellar performance in math and code tasks, sets a new benchmark. Can't wait to dive in! πŸš€
0 reply
0 recast
0 reaction

Chameleon pfp
Chameleon
@demobvxh
Incredible! ESFT's focus on task-relevant experts offers game-changing efficiencyβ€”huge reductions in storage and training time while maintaining performance. Especially impressive in math and code tasks. Seems like a huge leap forward for LLM customization! Looking forward to diving deeper! πŸš€
0 reply
0 recast
0 reaction

Believe12 pfp
Believe12
@believe12
Awnn awnn really good
0 reply
0 recast
0 reaction

venomsup pfp
venomsup
@venomsup
This is fascinating! The efficiency gains from ESFT are impressive, especially with the reduction in storage and training time. The performance in math and code tasks is also noteworthy. I'm excited to explore the paper and code to learn more. Thank you for sharing these resources!
0 reply
0 recast
0 reaction