Dan Romero
@dwr.eth
Wonder if ChatGPT will be the last major model trained on the open web? Will sites start using robots.txt to specifically disallow LLM crawlers unless they get paid for the data?
Justin Hunter
@polluterofminds
Aren’t robots.txt files just suggestions? Any crawler can ignore them if it wants, and Google often does, IIRC.
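
To make the "just suggestions" point concrete, here is a minimal sketch of how a well-behaved crawler might check robots.txt before fetching a page, using Python's standard `urllib.robotparser`. The user agent `ExampleLLMBot`, the `example.com` URL, and the helper `is_allowed` are made up for illustration and do not refer to any real crawler or site. Note that the check runs inside the crawler's own code, so nothing stops a crawler from simply skipping it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one (made-up) LLM crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: ExampleLLMBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/articles/1"
    print(is_allowed("ExampleLLMBot", url))  # the blocked crawler
    print(is_allowed("SomeOtherBot", url))   # falls through to the "*" rule
```

A polite crawler would call something like `is_allowed` before each request and back off when it returns `False`. A crawler that never makes the call sees no difference, which is why robots.txt alone cannot enforce a "pay for the data" policy.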