Brenner
@brenner.eth
Starting Andrej Karpathy's Zero to Hero today. Will be posting daily updates on progress and thoughts https://www.youtube.com/playlist?list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ
Brenner
@brenner.eth
ok made it through the first video today

Neural nets at their core are not that complicated: given inputs and goal outputs, continuously tweak the weights of the inputs until we reach the goal outputs.

Everything else on top of that is efficiency optimization, increasing the number of inputs, and operationalizing.
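A minimal sketch of that loop in plain Python (my own toy example, not the micrograd library the video builds; one made-up weight, mean squared error loss):

```python
xs = [1.0, 2.0, 3.0, 4.0]          # inputs
ys = [3.0, 6.0, 9.0, 12.0]         # goal outputs (here: y = 3x)
w = 0.0                            # start with an arbitrary weight

for step in range(100):
    # loss = mean squared error between predictions and goal outputs
    loss = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
    # gradient of the loss with respect to w
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= 0.01 * grad               # tweak the weight against the gradient

print(w)  # converges toward 3.0
```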
Brenner
@brenner.eth
video 2: makemore part 1, done

make more names from existing names, starting with bigrams (predicting the next letter based on the last letter)

two ways to get the weights:
- by counting them from the training data (intuitive, works for bigrams; see sketch below)
- start with random weights, use gradient descent / a neural net in a training loop to minimize "loss" (less intuitive imo, works for more input data)

both ways get to the same weights! lots of matrix math. gotta be rotating some shapes in your head. I wish there were 3D visualizations of this
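For reference, a condensed sketch of the counting approach (a toy word list stands in for the video's names dataset, and the +1 smoothing value is just one common choice):

```python
import torch

# Toy stand-in for the names dataset used in the video.
words = ["emma", "olivia", "ava", "isabella", "sophia"]
chars = sorted(set("".join(words)))
stoi = {c: i + 1 for i, c in enumerate(chars)}
stoi["."] = 0                      # '.' marks the start/end of a name
itos = {i: c for c, i in stoi.items()}

# Counting approach: tally how often each bigram occurs...
N = torch.zeros((len(stoi), len(stoi)), dtype=torch.int32)
for w in words:
    cs = ["."] + list(w) + ["."]
    for c1, c2 in zip(cs, cs[1:]):
        N[stoi[c1], stoi[c2]] += 1

# ...then normalize each row into a probability distribution
# over the next letter, and sample from it to "make more" names.
P = (N + 1).float()                # +1 smoothing so no bigram has zero probability
P /= P.sum(1, keepdim=True)

g = torch.Generator().manual_seed(42)
ix = 0
out = []
while True:
    ix = torch.multinomial(P[ix], num_samples=1, generator=g).item()
    if ix == 0:
        break
    out.append(itos[ix])
print("".join(out))
```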
Brenner
@brenner.eth
Making a pitstop to brush up on some Linear Algebra & Matrix Math. I've been a casual enjoyer of 3Blue1Brown, and love visualizations, so this has been pretty helpful! https://www.3blue1brown.com/lessons/matrix-multiplication
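The shape rule that makes the "rotating shapes in your head" part click, as a quick numpy sanity check (toy shapes, my own example):

```python
import numpy as np

# (m, n) @ (n, p) -> (m, p); the inner dimensions must match.
A = np.random.randn(2, 3)   # e.g. 2 examples, 3 features
B = np.random.randn(3, 4)   # e.g. 3 inputs -> 4 neurons
C = A @ B
print(C.shape)              # (2, 4)

# Each entry is a dot product of a row of A with a column of B.
assert np.isclose(C[0, 0], A[0, :] @ B[:, 0])
```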
Brenner
@brenner.eth
finished video 3 a few days ago

- batch processing (randomly sampling inputs each training loop) to speed up training (see sketch below)
- add more previous letters to train on (instead of just 1 letter to predict the next)
- use 80% of input data to train, 10% to develop against, and 10% to save for testing
- increase params & hidden layer neuron count
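A rough sketch of the mini-batching and 80/10/10 split bookkeeping (shapes and sizes here are made up; the video goes on to train a network on top of this):

```python
import torch

# Hypothetical dataset: X holds context letters, Y the letter to predict.
n = 1000
X = torch.randint(0, 27, (n, 3))   # 3 previous letters of context
Y = torch.randint(0, 27, (n,))

# 80/10/10 split: train / dev (tune against) / test (touch once at the end)
perm = torch.randperm(n)
n1, n2 = int(0.8 * n), int(0.9 * n)
Xtr, Ytr = X[perm[:n1]], Y[perm[:n1]]
Xdev, Ydev = X[perm[n1:n2]], Y[perm[n1:n2]]
Xte, Yte = X[perm[n2:]], Y[perm[n2:]]

# Mini-batching: each training step sees a random sample, not the whole set.
batch_size = 32
for step in range(10):
    ix = torch.randint(0, Xtr.shape[0], (batch_size,))
    xb, yb = Xtr[ix], Ytr[ix]      # forward/backward pass would go here
```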