
Knowledge Distillation

A model compression technique in which a smaller 'student' model is trained to replicate the behavior of a larger 'teacher' model, yielding a more efficient model that retains much of the original's capability.

Knowledge distillation, introduced by Hinton et al. in 2015, enables deploying powerful AI on resource-constrained devices such as smartphones. The student model learns from the teacher's output probabilities (soft labels) rather than only the ground-truth labels, capturing nuanced relationships between classes that one-hot targets discard. Modern distillation has produced models such as DistilBERT, which is 40% smaller and 60% faster than BERT while retaining 97% of its performance, as well as compact distilled variants of many recent large language models. The technique is central to making frontier AI capabilities accessible at lower cost.
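To make the soft-label idea concrete, here is a minimal sketch of a distillation training loss in PyTorch, following the temperature-scaled formulation from Hinton et al. (2015). The function name `distillation_loss` and the values `T=2.0` and `alpha=0.5` are illustrative assumptions, not fixed conventions:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend a soft-label term (match the teacher) with a hard-label term
    (match the ground truth). T softens both distributions so the student
    sees the teacher's relative probabilities across all classes; alpha
    weights the two terms.
    """
    # Soft targets: KL divergence between the temperature-scaled
    # student and teacher distributions. The T**2 factor keeps gradient
    # magnitudes comparable across temperatures (as in Hinton et al., 2015).
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T ** 2)
    # Hard targets: ordinary cross-entropy against the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```

Raising the temperature spreads probability mass onto the wrong-but-related classes, which is precisely the inter-class information the student cannot get from one-hot labels alone.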
