Traditional Language Models Tutorial: Old-School Word Predictors Before Transformers
A guide to traditional language models, from n-grams to RNNs: the foundation for modern LLMs.
N-Grams Guide
- Count previous words to predict the next one: simple yet effective.
- Markov assumption: condition only on the last n-1 words so the counts stay tractable.
- Smoothing techniques: Laplace (add-one) smoothing handles unseen n-grams (see the sketch after this list).
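Here is a minimal sketch of these ideas in pure Python: a bigram model with Laplace smoothing. The toy corpus, function names, and vocabulary handling are illustrative assumptions, not from the post.

```python
from collections import Counter

def train_bigram_lm(tokens):
    """Count unigrams and bigrams from a token list (toy corpus, for illustration)."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    vocab = set(tokens)
    return unigrams, bigrams, vocab

def bigram_prob(prev, word, unigrams, bigrams, vocab):
    """P(word | prev) with Laplace (add-one) smoothing, so unseen bigrams get nonzero mass."""
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + len(vocab))

tokens = "the cat sat on the mat the cat slept".split()
unigrams, bigrams, vocab = train_bigram_lm(tokens)
print(bigram_prob("the", "cat", unigrams, bigrams, vocab))    # seen bigram
print(bigram_prob("mat", "slept", unigrams, bigrams, vocab))  # unseen bigram, still nonzero
```

The Markov assumption shows up in the signature: the probability of the next word depends only on the single previous word, and smoothing keeps zero counts from zeroing out whole sentence probabilities.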
Feedforward Neural Language Models
- Fixed-window neural nets: embed the previous n-1 words and concatenate them.
- Embeddings + MLP turn that window into a probability distribution over the next word (sketch below).
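A minimal PyTorch sketch of a fixed-window neural LM. The window size, embedding/hidden dimensions, and class name are illustrative assumptions, not something specified in the post.

```python
import torch
import torch.nn as nn

class FixedWindowLM(nn.Module):
    """Fixed-window neural LM: embed the previous words, concatenate,
    run an MLP, and output logits over the vocabulary."""
    def __init__(self, vocab_size, context_size=3, embed_dim=32, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.mlp = nn.Sequential(
            nn.Linear(context_size * embed_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, vocab_size),  # logits over the next word
        )

    def forward(self, context_ids):
        # context_ids: (batch, context_size) word indices
        embeds = self.embed(context_ids)            # (batch, context_size, embed_dim)
        flat = embeds.view(embeds.size(0), -1)      # concatenate the window
        return self.mlp(flat)                       # (batch, vocab_size)

# Toy usage: predict the next word from a 3-word context.
model = FixedWindowLM(vocab_size=100)
context = torch.tensor([[4, 17, 23]])               # one batch of 3 word indices
probs = torch.softmax(model(context), dim=-1)
print(probs.shape)  # torch.Size([1, 100])
```

The key limitation this design exposes is the fixed context: anything beyond the window is invisible to the model, which is exactly what RNNs address next.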
RNNs in Language Modeling
- LSTM/GRU: gating lets the hidden state carry long-range dependencies without vanishing gradients (see the sketch after this list).
- Seq2seq foundations for advanced tasks such as translation.
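A minimal PyTorch sketch of an RNN language model built on an LSTM: the recurrent state summarizes everything seen so far, and the model predicts the next word at every position. Dimensions and names are illustrative assumptions.

```python
import torch
import torch.nn as nn

class LSTMLanguageModel(nn.Module):
    """RNN language model: an LSTM reads the sequence and predicts
    the next word at each time step."""
    def __init__(self, vocab_size, embed_dim=32, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, input_ids, state=None):
        # input_ids: (batch, seq_len) word indices
        embeds = self.embed(input_ids)
        outputs, state = self.lstm(embeds, state)   # hidden state carries long-range context
        logits = self.out(outputs)                  # (batch, seq_len, vocab_size)
        return logits, state

# Toy usage: score a short sequence and get next-word logits at each position.
model = LSTMLanguageModel(vocab_size=100)
seq = torch.tensor([[4, 17, 23, 8]])
logits, _ = model(seq)
print(logits.shape)  # torch.Size([1, 4, 100])
```

Unlike the fixed-window model, the context length is unbounded in principle; the returned state can also be reused as the encoder summary in a seq2seq setup.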
Why Study Traditional LMs?
Understand the evolution to transformers! What's your take on RNNs vs. Transformers? Comment below!
Top Traditional LMs Resources
- Jurafsky SLP Book
- Karpathy on RNN Effectiveness
- LSTM Original Paper
- GRU Paper
- Colah's LSTM Post
- Yoav Goldberg Primer
Keywords: traditional language models tutorial, n-grams guide, RNN language models, pre-transformers LMs, AI word prediction