A short breakdown of the topics and sub-topics in LLMs
- Model Architecture
- GPT Architecture
- Llama Architecture
- Tokenization
- Attention
- Positional Encoding
- Rotary Positional Encoding (RoPE)
- Loss
- Agentic LLMs
- Methodology
- Datasets
- Pipeline
- Training
- Inference
- Prompting
- Fine-Tuning
- Quantized Fine-Tuning
- DPO
- ORPO
- Quantization
- Post-Training Quantization
- Static/Dynamic Quantization
- GPTQ
- GGUF
- LLM.int8()
- Quantization-Aware Training → 1-bit LLM
- RL in LLMs
- Coding
- Engineering
- Flash Attention 2
- KV Cache
- Inference → Batched?
- Python Advanced
- Decorators
- Context Managers
- Triton Kernels
- CUDA
- JAX / XLA JIT compilers
- Model Exporting (vLLM, llama.cpp, QLoRA)
- ML Debugging
- Benchmarks
- Modifications
- Model Merging
- Linear Mapping
- SLERP
- TIES
- DARE
- MoE
- Misc Algorithms
- Chained Matrix Unit
- Gradient Checkpointing
- Chunked Cross Entropy
- BPE
- Explainability
- Sparse Autoencoders
- Task Vectors
- Counterfactuals
- Multimodal Transformers
- Audio
- Whisper Models
- Diarization
- Adversarial methods