July
@july
A question: What is not on the internet? A surprisingly thought-provoking question. One guess as to why ChatGPT was possible is that data about language, and about how we behave through language, was abundant on the internet at the time of training. It turns out there's a surprising amount that's not on the internet.
12 replies
3 recasts
16 reactions

July
@july
E.g. Why are 32 institutions and robotics labs from around the world banding together to try to create a data set for learning? It's because there are not a lot of robots in the world, or on the internet for that matter, and there isn't a lot of real, trainable robotics data. https://robotics-transformer-x.github.io/
1 reply
0 recast
4 reactions

rileybeans
@rileybeans
what's not on the internet? breath. life. the butterflies and nervousness I get walking up to someone. that feeling when someone is excited to see you.
1 reply
0 recast
6 reactions

Varun Kumar
@vkcs
What's not on the internet:
- The truth behind PR and marketing initiatives
- Physiological human events (eating, running, etc.)
- Identity of pets
- Police force records
- Potholes on the road (Indian thing)
- Number of trees
1 reply
0 recast
2 reactions

Eric Platon
@ic
Rumsfeld would say it better, yet pretty sure there are unknown unknowns that are not on the Internet.
2 replies
0 recast
0 reaction

greyseymour.eth
@greyseymour
my grandma - she’s dead. but also:
- some rare texts
- most outsider art
- much oral tradition/endangered languages, other cultural artifacts of value
- a meaningful/delightful “co-shopping” experience
- many family recipes
- lots of valuable content re arts/crafts techniques
I think about this a lot!
0 reply
0 recast
4 reactions

rafi
@rafi
Internet-free thought is not on the internet. Everyone here is living in a spaceless echo chamber of interconnectedness. Conversations with people who are fully offline and free of TV and mass media are ones I hold very dear, not only because they are so rare but because of the fresh insight they bring.
1 reply
0 recast
1 reaction

Catabolismo
@catabolismo
- Most books published before 2000
- 99% of the proprietary data used by companies and institutions
- TV and radio historical archives
- Almost all of the world's therapist notes about their patients
- Context-based knowledge, such as which one is the block's haunted house or who is the most popular kid in school
0 reply
0 recast
1 reaction

Bethany - countessellis.eth🎩
@ellis
1/ This. The scope of the data creates certain biases in the model, just as a statistical sample is biased by how broad and even the sampling is. There are really two parts to this:
1 reply
0 recast
1 reaction

Matthew Barton
@mbar
Intranets
Cash / commodities transfers
Snail mail
Diaries
0 reply
0 recast
0 reaction

maddieli
@maddieli
Most of the land registry around the world
0 reply
0 recast
0 reaction

wake
@wake
I don't think I've seen a single cat anywhere
1 reply
0 recast
0 reaction

sgt_slaughtermelon🎩
@sgt-sl8termelon
I think copyright law and lawsuits have stymied a lot of accessibility, e.g. https://en.wikipedia.org/wiki/Authors_Guild,_Inc._v._Google,_Inc. Also, lots of research is still paywalled, right? Protecting the interests of authors and researchers (theoretically) has meant that only free, public domain stuff is available? Maybe?
0 reply
0 recast
0 reaction