July
@july
My usage this week:
- 80% DeepSeek V3 / R1, I'd say mainly R1
- 15% Claude Sonnet until I run out of tokens
- leftovers go to ChatGPT
- playing around on RunPod / local models for random experimentation (LM Studio, Ollama, Stable Diffusion, etc.)

I can't believe how much I'm using DS
16 replies
6 recasts
120 reactions
🐺
@pyroeis.eth
I'm also curious about your experiments with the other models, like Claude Sonnet and everything else. Are you seeing any cool results from your random explorations?
1 reply
0 recast
1 reaction
July
@july
yes - a lot of the interest for me is around RL, esp. in the context of robotics (transferring policies from sim to a real robot eventually), so I've been doing a lot of RL training and that's been interesting. sim is getting pretty good (MuJoCo, NVIDIA Isaac Sim, and Genesis -- I've been trying them out)
1 reply
0 recast
2 reactions
July
@july
as far as the existing foundation models go, yeah they're pretty straightforward. reading the actual papers around R1 and V3 for DS has been more interesting / enlightening, esp. since they leverage a ton of RL. but again, my primary interest here is robotics (specifically end-to-end learning of RL policies to replace parts of the control-planning-autonomy stack)
1 reply
0 recast
2 reactions
🐺
@pyroeis.eth
That’s really exciting! The RL approach for robotics, especially transferring policies to real robots, sounds like a game-changer. It’s great that the simulations are progressing well too. I can see why you’re diving into the papers; there's a lot of potential in RL for this!
0 reply
0 recast
1 reaction