Javid Iqbal
@javidiqbal
To create a reward model for reinforcement learning, we needed to collect comparison data, which consisted of two or more model responses ranked by quality.
14 replies
1 recast
3 reactions
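
A minimal sketch of how ranked comparison data like this is commonly turned into a training signal, assuming a pairwise (Bradley-Terry style) objective. The post does not say which loss was used, and the names `ranking_to_pairs` and `pairwise_loss` are hypothetical, chosen only for illustration:

```python
import math
from itertools import combinations


def ranking_to_pairs(ranked_responses):
    """Expand a best-to-worst ranking into (preferred, rejected) pairs.

    combinations() keeps input order, so the first item of every pair
    is always the one ranked higher.
    """
    return list(combinations(ranked_responses, 2))


def pairwise_loss(score_preferred, score_rejected):
    """Assumed objective: -log(sigmoid(score_preferred - score_rejected)).

    The loss is small when the reward model scores the preferred
    response well above the rejected one.
    """
    margin = score_preferred - score_rejected
    return math.log1p(math.exp(-margin))


# Three responses to one prompt, ranked best to worst by a labeler.
pairs = ranking_to_pairs(["response_a", "response_b", "response_c"])
for preferred, rejected in pairs:
    print(preferred, ">", rejected)
```

A ranking of K responses yields K(K-1)/2 pairs, so one ranked list supplies several comparisons for the reward model.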

Zubair Sarim🎩⚡🎭Ⓜ️
@zubi765
🍖 🍖 🍖 🍖 🍖
0 reply
0 recast
0 reaction

Fiza Ansari
@fizaansari
🍖×90
0 reply
0 recast
0 reaction

Rizwan Ahmad🎩🍖🌈
@mrizwan
🍖 x 1057
0 reply
0 recast
0 reaction

Sibao
@sibgha
🍖×90
0 reply
0 recast
0 reaction

zubair bodla🎩🍖🎭Ⓜ️
@zubair765
🍖×16
0 reply
0 recast
0 reaction

zubair bodla🎩🍖🎭Ⓜ️
@zubair765
🍖×15
0 reply
0 recast
0 reaction

Ethan🎩🫂
@msabir
🍖×90
0 reply
0 recast
0 reaction

Asma Khan 🎩🎭
@malikg
🍖 × 443
0 reply
0 recast
0 reaction

Barbie 🎩🎭Ⓜ️ ✪
@hafsa
🍖
0 reply
0 recast
0 reaction

Farooq Bodla
@farooqaazad
🍖×99
0 reply
0 recast
0 reaction

Ayesha Khan
@usman786
🍖 x 209
0 reply
0 recast
0 reaction

Zakia Malik🎩🍖🔮
@zakiamalik
412 $DEGEN
0 reply
0 recast
0 reaction