The Stochastic Parrot Problem: Why LLMs Fail When It Actually Matters
Aditya Bhatt, 30-Jan-2026
Abstract: Through two real conversations—one about backend architecture, one about a medical emergency—I discovered the hard boundary between what Large Language Models can and cannot do. While useful ...
Tinker: Might be huge for researchers and developers
Supervised Fine-Tuning for Symptom Classification: A Tinker API Case Study
Aditya Bhatt, 02-Jan-2026
Abstract: Trained Llama-3.2-1B on 30 mental health examples using LoRA (rank=32). Achieved 94% loss reduction (4.0 → 0.25) ove...
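The abstract above describes rank-32 LoRA fine-tuning of Llama-3.2-1B on a small symptom-classification dataset. Below is a minimal sketch of that kind of setup using Hugging Face transformers + peft, not the Tinker API the post is actually about; the dataset file name, prompt format, and hyperparameters are illustrative assumptions.

```python
# Minimal LoRA (rank=32) supervised fine-tuning sketch using Hugging Face
# transformers + peft, NOT the Tinker API used in the post. Dataset path,
# prompt format, and hyperparameters are illustrative assumptions.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)

base = "meta-llama/Llama-3.2-1B"
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# Attach rank-32 LoRA adapters to the attention projections.
model = get_peft_model(model, LoraConfig(
    r=32, lora_alpha=64, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM"))

# ~30 {"symptom": ..., "label": ...} examples (hypothetical file name).
ds = load_dataset("json", data_files="symptom_examples.jsonl")["train"]

def to_text(ex):
    # Fold each example into a single prompt/completion string for SFT.
    return tokenizer(f"Symptom: {ex['symptom']}\nCategory: {ex['label']}",
                     truncation=True, max_length=256)

ds = ds.map(to_text, remove_columns=ds.column_names)

Trainer(
    model=model,
    train_dataset=ds,
    args=TrainingArguments(output_dir="lora-symptoms", num_train_epochs=10,
                           per_device_train_batch_size=4, learning_rate=2e-4,
                           logging_steps=1),
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()
```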
Distillation Learnings
Why My First Knowledge Distillation Experiment Failed (And How I Fixed It)
Aditya Bhatt, 18-Dec-2025
Date: December 19, 2025. I've been reading the DeepSeek-V3.2 paper and wanted to understand knowledge distillation hands-on. S...
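For readers new to the topic, the core of a hands-on distillation experiment like this is the loss function. The sketch below shows the standard logit-distillation objective (temperature-scaled KL on soft targets plus hard-label cross-entropy); the temperature, weighting, and toy tensors are illustrative, not the settings from the post or from DeepSeek-V3.2.

```python
# Standard logit-distillation loss: temperature-scaled KL + hard-label CE.
# A generic sketch of the technique the post experiments with; alpha and
# temperature values here are illustrative.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    # Soft targets: KL between temperature-softened teacher and student
    # distributions, scaled by T^2 to keep gradient magnitudes comparable.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean") * temperature ** 2
    # Hard targets: ordinary cross-entropy against the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Toy usage with random logits for a 10-class problem.
s = torch.randn(8, 10, requires_grad=True)
t = torch.randn(8, 10)
y = torch.randint(0, 10, (8,))
loss = distillation_loss(s, t, y)
loss.backward()
```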
Continual Pretraining
Continual Pre-Training of Large Language Models: A Practitioner's Guide
Aditya Bhatt, 07-Dec-2025
Abstract: Large Language Models (LLMs) are typically trained on static datasets, leading to knowledge ...
Flash Attention + RoPE is Great
Training Language Models from Scratch: A Systematic Study of Scaling Laws and Modern Architectural Techniques
Aditya Bhatt, 21-Nov-2025
Abstract: We present a systematic empirical investigation of transformer language model tra...
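Since the post's title pairs FlashAttention with RoPE, here is a minimal sketch of how the two typically fit together: rotary embeddings applied to queries and keys, followed by PyTorch's fused scaled_dot_product_attention, which can dispatch to a FlashAttention kernel on supported GPUs. Shapes and base=10000 follow the common convention, not necessarily the exact configuration studied in the post.

```python
# Minimal rotary position embedding (RoPE) applied to queries/keys, followed
# by PyTorch's fused scaled_dot_product_attention (which may dispatch to a
# FlashAttention kernel on supported hardware). Shapes and base=10000 are
# the usual convention, assumed for illustration.
import torch
import torch.nn.functional as F

def rope(x, base=10000.0):
    # x: (batch, heads, seq, head_dim); rotate channel pairs by a
    # position-dependent angle.
    b, h, s, d = x.shape
    inv_freq = 1.0 / (base ** (torch.arange(0, d, 2, dtype=torch.float32) / d))
    angles = torch.arange(s, dtype=torch.float32)[:, None] * inv_freq[None, :]
    cos, sin = angles.cos(), angles.sin()            # (seq, head_dim/2)
    x1, x2 = x[..., 0::2], x[..., 1::2]              # even / odd channels
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

q = torch.randn(2, 8, 128, 64)
k = torch.randn(2, 8, 128, 64)
v = torch.randn(2, 8, 128, 64)
attn = F.scaled_dot_product_attention(rope(q), rope(k), v, is_causal=True)
```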
Verifying Scaling Laws
Empirical Validation of Scaling Laws: How Model Size and Training Impact Language Generation
Aditya Bhatt, 15-Nov-2025
Abstract: We present a systematic investigation of scaling behavior in transformer-based language models. Th...
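Empirical validation of scaling behavior usually comes down to fitting a saturating power law to (model size, loss) measurements. The sketch below fits L(N) = a * N^(-alpha) + c with scipy; the data points are placeholders, not the measurements reported in the post.

```python
# Fitting the saturating power law L(N) = a * N**(-alpha) + c to
# (parameter count, eval loss) pairs, the usual way scaling-law plots are
# checked empirically. The data points below are placeholders, not the
# measurements from the post.
import numpy as np
from scipy.optimize import curve_fit

def scaling_law(N, a, alpha, c):
    return a * N ** (-alpha) + c

# Hypothetical (model size in parameters, final validation loss) pairs.
N = np.array([1e6, 5e6, 2e7, 1e8, 5e8])
L = np.array([5.1, 4.4, 3.9, 3.5, 3.2])

(a, alpha, c), _ = curve_fit(scaling_law, N, L, p0=[10.0, 0.1, 2.0],
                             maxfev=10000)
print(f"L(N) ~ {a:.2f} * N^(-{alpha:.3f}) + {c:.2f}")

# Extrapolate to a larger model to sanity-check the fit.
print("predicted loss at 2B params:", scaling_law(2e9, a, alpha, c))
```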