Submissions from sebastianraschka.com

		Components of a Coding Agent (sebastianraschka.com)
		299 points by MindGods 10 days ago | 90 comments
		Claude Code's Real Secret Sauce Isn't the Model (sebastianraschka.com)
		6 points by ModelForge 14 days ago
		A Visual Guide to Attention Variants in Modern LLMs (sebastianraschka.com)
		9 points by Brajeshwar 22 days ago
		A Visual Guide to Attention Variants in Modern LLMs (sebastianraschka.com)
		23 points by Anon84 23 days ago | 1 comment
		LLM Architecture Gallery (sebastianraschka.com)
		586 points by tzury 30 days ago | 41 comments
		A Round Up and Comparison of 10 Open-Weight LLM Releases in Spring 2026 (sebastianraschka.com)
		4 points by MindGods 48 days ago
		Categories of Inference-Time Scaling for Improved LLM Reasoning (sebastianraschka.com)
		1 point by ibobev 78 days ago
		Understanding and Coding the Self-Attention Mechanism of LLMs from Scratch (sebastianraschka.com)
		1 point by onurkanbkrc 78 days ago | 1 comment
		The State of LLMs 2025: Progress, Problems, and Predictions (sebastianraschka.com)
		1 point by nsainsbury 3 months ago
		The State of LLMs 2025: Progress, Problems, and Predictions (sebastianraschka.com)
		3 points by ModelForge 3 months ago
		The State of LLMs 2025: Progress, Progress, and Predictions (sebastianraschka.com)
		4 points by ibobev 3 months ago
		The State of LLMs 2025: Progress, Progress, and Predictions (sebastianraschka.com)
		9 points by vismit2000 3 months ago
		New LLM Pre-Training and Post-Training Paradigms (sebastianraschka.com)
		2 points by lr0 3 months ago | 1 comment
		Understanding Encoder and Decoder LLMs (sebastianraschka.com)
		1 point by jeffjeffbear 3 months ago
		A Technical Tour of the DeepSeek Models from V3 to v3.2 (sebastianraschka.com)
		23 points by ibobev 4 months ago | 1 comment
		A Technical Tour of the DeepSeek Models from V3 to v3.2 (sebastianraschka.com)
		5 points by mzl 4 months ago | 1 comment
		Recommendations for Getting the Most Out of a Technical Book (sebastianraschka.com)
		2 points by naves 4 months ago
		A Technical Tour of the DeepSeek Models from V3 to v3.2 (sebastianraschka.com)
		8 points by giuliomagnifico 4 months ago
		Getting the Most Out of a Technical Book (sebastianraschka.com)
		4 points by quietlearning 5 months ago
		Beyond Standard LLMs (sebastianraschka.com)
		1 point by vismit2000 5 months ago
		Beyond Standard LLMs (sebastianraschka.com)
		1 point by ibobev 5 months ago
		A Researcher's Field Guide to Non-Standard LLM Architectures (sebastianraschka.com)
		2 points by ModelForge 5 months ago
		Understanding the 4 Main Approaches to LLM Evaluation (From Scratch) (sebastianraschka.com)
		1 point by ibobev 6 months ago
		Popular Attention Alternatives: GQA, MLA, SWA (sebastianraschka.com)
		4 points by ModelForge 6 months ago
		Multi-Head Latent Attention (sebastianraschka.com)
		4 points by ModelForge 6 months ago
		Understanding the 4 Main Approaches to LLM Evaluation (From Scratch) (sebastianraschka.com)
		2 points by ibobev 6 months ago
		LLM Evaluation from Scratch: Multiple Choice, Verifiers, Leaderboards, LLM Judge (sebastianraschka.com)
		4 points by ModelForge 6 months ago
		Understanding and Implementing Qwen3 from Scratch (sebastianraschka.com)
		1 point by ibobev 7 months ago
		GPT-OSS vs. Qwen3 and a detailed look how things evolved since GPT-2 (sebastianraschka.com)
		490 points by ModelForge 8 months ago | 97 comments
		From GPT-2 to GPT-OSS: Analyzing the Architectural Advances (sebastianraschka.com)
		3 points by mdp2021 8 months ago