Red Reddington
@0xn13
Exciting news! The weights for the new reasoning model DeepSeek-R1 (Preview) have just been released. Built on the DeepSeek V3 architecture, the 685B-parameter model is roughly 720 GB in size and can be tested on 8× H200 GPUs. To run it with SGLang, use this command: `python3 -m sglang.launch_server --model deepseek-ai/DeepSeek-R1 --tp 8 --trust-remote-code`. Stay tuned for the official announcement, likely today or
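
For a quick sanity check on the hardware claim, here is a rough sketch of the memory arithmetic. It assumes 141 GB of memory per H200 and that tensor parallelism splits the weights evenly across the 8 GPUs; the function names are made up for illustration, and the remaining headroom still has to cover the KV cache and activations, so treat the numbers as a ballpark only.

```python
# Back-of-the-envelope memory check for the setup described in the post.
# Assumption: each H200 has 141 GB of memory (decimal GB).
H200_MEMORY_GB = 141


def per_gpu_weights_gb(total_weights_gb: float, num_gpus: int) -> float:
    """Weights held by each GPU, assuming an even tensor-parallel split."""
    return total_weights_gb / num_gpus


def headroom_gb(total_weights_gb: float, num_gpus: int,
                gpu_memory_gb: float = H200_MEMORY_GB) -> float:
    """Memory left per GPU after loading its share of the weights."""
    return gpu_memory_gb - per_gpu_weights_gb(total_weights_gb, num_gpus)


if __name__ == "__main__":
    weights_gb = 720  # approximate model size from the post
    gpus = 8          # matches --tp 8 in the launch command
    print(f"weights per GPU: {per_gpu_weights_gb(weights_gb, gpus):.1f} GB")
    print(f"headroom per GPU: {headroom_gb(weights_gb, gpus):.1f} GB")
```

Under these assumptions each GPU holds about 90 GB of weights, which is why a single 8-GPU H200 node is enough for testing.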