Aiconomy

Pre-Training

The initial phase of training an AI model on a large, general-purpose dataset to learn broad knowledge and patterns before it is fine-tuned for specific tasks.

Pre-training is the most expensive phase of building large AI models. GPT-4's pre-training reportedly cost $78–191 million in compute alone, processing trillions of tokens from books, websites, and code. The pre-training paradigm, popularized by BERT (2018) and GPT-2 (2019), enables transfer learning: train once on general data, then adapt cheaply to many downstream tasks. Pre-training datasets have grown from millions of documents to trillions of tokens, and the quality and composition of pre-training data are now recognized as being as important as model architecture and scale.
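The train-once, adapt-many-times idea can be illustrated with a toy sketch. The corpora, function names, and the bigram-count "model" below are hypothetical illustrations of the workflow, not a real training pipeline: an expensive general-purpose pass builds broad statistics once, and a cheap task-specific pass shifts the model's behavior toward the target domain.

```python
from collections import Counter, defaultdict

# Hypothetical tiny corpora; real pre-training uses trillions of tokens.
GENERAL_CORPUS = "the cat sat on the mat . the dog sat on the rug .".split()
TASK_CORPUS = "the model ran . the model ran .".split()

def count_bigrams(tokens):
    """Count next-token occurrences for each token (a stand-in for training)."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def finetune(base, task_counts):
    """Fine-tuning sketch: start from pre-trained counts, add task counts."""
    merged = defaultdict(Counter)
    for tok, c in base.items():
        merged[tok].update(c)
    for tok, c in task_counts.items():
        merged[tok].update(c)
    return merged

def predict(model, token):
    """Most frequent next token under the bigram counts."""
    return model[token].most_common(1)[0][0]

pretrained = count_bigrams(GENERAL_CORPUS)               # expensive, done once
finetuned = finetune(pretrained, count_bigrams(TASK_CORPUS))  # cheap, per task
```

After the general pass, `predict(pretrained, "sat")` returns `"on"` from the broad statistics; after the cheap task pass, `predict(finetuned, "the")` shifts to the task vocabulary (`"model"`). The pre-trained counts are never recomputed, which is the economic point of the paradigm.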
