Starting Andrej Karpathy's Zero to Hero today. Will be posting daily updates on progress and thoughts

 https://www.youtube.com/playlist?list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ

ok made it through the first video today

Neural nets at their core are not that complicated:
given inputs and goal outputs, keep tweaking the weights applied to the inputs until the outputs get close to the goal outputs

Everything else on top of that is efficiency optimization, increasing the number of inputs, and operationalizing
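
Here's a rough sketch of that "keep tweaking the weights" loop in plain Python. It's not the code from the video: the data, the single weight `w`, the bias `b`, and the `learning_rate` value are all made up for illustration, and the gradient is worked out by hand instead of with backprop.

```python
# Toy example: learn w and b so that w * x + b matches the goal outputs.
inputs = [0.0, 1.0, 2.0, 3.0]
targets = [1.0, 3.0, 5.0, 7.0]  # made-up goal outputs, generated by y = 2x + 1

w, b = 0.0, 0.0        # start from arbitrary weights
learning_rate = 0.05   # size of each tweak (assumed value)


def mean_squared_error(w, b):
    """Average squared gap between current outputs and goal outputs."""
    return sum((w * x + b - y) ** 2 for x, y in zip(inputs, targets)) / len(inputs)


for step in range(2000):
    # Gradient of the mean squared error with respect to w and b.
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(inputs, targets)) / len(inputs)
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(inputs, targets)) / len(inputs)
    # Tweak each weight a little in the direction that lowers the error.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(round(w, 2), round(b, 2), mean_squared_error(w, b))
```

After enough steps `w` and `b` settle near the values that generated the data. A real net does the same thing with many more weights and lets backprop compute the gradients.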

finished video 3 a few days ago

- mini-batches: randomly sample a small batch of inputs on each training loop to speed up training
- use more previous letters as context (instead of just 1 letter to predict the next)
- use 80% of the input data to train, 10% to develop (tune) against, and 10% to save for testing
- increase the param count & hidden layer neuron count
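
A minimal sketch of the first three bullets, in plain Python rather than the PyTorch used in the video. The word list, `block_size`, the batch size of 4, and the helper name `build_examples` are my own placeholders. Also, `random.sample` draws without replacement, which is a simplification on my part.

```python
import random

random.seed(42)

# Made-up sample of names; the real dataset is much larger.
words = ["emma", "olivia", "ava", "isabella", "sophia",
         "mia", "amelia", "harper", "evelyn", "ella"]
block_size = 3  # how many previous letters are used as context


def build_examples(word_list, block_size):
    """Turn each word into (context, next_letter) pairs, padded with '.'."""
    examples = []
    for word in word_list:
        context = "." * block_size
        for ch in word + ".":
            examples.append((context, ch))
            context = context[1:] + ch  # slide the window by one letter
    return examples


# 80% train, 10% dev, 10% test, split at the word level after shuffling.
random.shuffle(words)
n1 = int(0.8 * len(words))
n2 = int(0.9 * len(words))
train = build_examples(words[:n1], block_size)
dev = build_examples(words[n1:n2], block_size)
test = build_examples(words[n2:], block_size)

# One mini-batch: a small random sample of the training examples.
batch = random.sample(train, k=4)
print(len(train), len(dev), len(test), batch)
```

Each training loop would draw a fresh `batch` like this instead of running over all of `train`, which makes each step noisier but much cheaper.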

Making a pitstop to brush up on some Linear Algebra & Matrix Math. I've been a casual enjoyer of 3Blue1Brown, and love visualizations, so this has been pretty helpful!

https://www.3blue1brown.com/lessons/matrix-multiplication

video 2: makemore part 1, done

make more names from existing names

starting with bigrams (predicting the next letter based on the last letter)
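
For a concrete picture, here's a small sketch of pulling the bigrams out of one name. Using "." as the start/end marker is an assumption on my part, and `bigrams` is just a helper name I picked.

```python
def bigrams(word):
    """Return (previous_letter, next_letter) pairs, with '.' as start/end marker."""
    chars = ["."] + list(word) + ["."]
    return list(zip(chars, chars[1:]))


print(bigrams("emma"))
# [('.', 'e'), ('e', 'm'), ('m', 'm'), ('m', 'a'), ('a', '.')]
```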

two ways to get the weights:
- by counting them from the training data (intuitive, works for bigrams)
- start with random weights, use gradient descent / a neural net to train (loop) to minimize "loss" (less intuitive imo, works for more input data)

both ways end up at basically the same probabilities!
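
To convince myself, here's a rough plain-Python sketch comparing the two ways on a tiny made-up dataset. It's a simplification, not the video's code: the three names, `learning_rate`, and the step count are assumptions, and since a one-hot input times a weight matrix just picks out one row, I index rows of `W` directly.

```python
import math
import random

random.seed(0)

words = ["emma", "ava", "mia"]  # tiny made-up dataset
chars = sorted(set("".join(words)) | {"."})  # "." marks start/end of a name
index = {c: i for i, c in enumerate(chars)}
n = len(chars)

pairs = []
for word in words:
    seq = ["."] + list(word) + ["."]
    pairs += [(index[a], index[b]) for a, b in zip(seq, seq[1:])]

# Way 1: count bigrams, then normalize each row into probabilities.
# (Assumes every character appears at least once as a "previous" letter.)
counts = [[0] * n for _ in range(n)]
for a, b in pairs:
    counts[a][b] += 1
counted = [[c / sum(row) for c in row] for row in counts]


def softmax(row):
    top = max(row)  # subtract the max for numerical stability
    exps = [math.exp(v - top) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


# Way 2: random weights, gradient descent on the average negative log likelihood.
W = [[random.gauss(0, 1) for _ in range(n)] for _ in range(n)]
learning_rate = 5.0  # assumed value
for step in range(2000):
    probs = [softmax(row) for row in W]
    grad = [[0.0] * n for _ in range(n)]
    for a, b in pairs:
        for j in range(n):
            # Gradient of -log(softmax(W[a])[b]) with respect to W[a][j].
            grad[a][j] += (probs[a][j] - (1.0 if j == b else 0.0)) / len(pairs)
    for i in range(n):
        for j in range(n):
            W[i][j] -= learning_rate * grad[i][j]

learned = [softmax(row) for row in W]
max_diff = max(abs(learned[i][j] - counted[i][j]) for i in range(n) for j in range(n))
print(max_diff)
```

The learned probabilities creep toward the counted ones but never match exactly, because bigrams that never occur have probability 0 in the count table and softmax can only get close to 0.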

lots of matrix math. gotta be rotating some shapes in your head. I wish there were 3D visualizations of this