Aiconomy

Model Collapse

A degradation phenomenon where AI models trained on data generated by other AI models progressively lose quality and diversity, potentially threatening the long-term viability of training on internet data.

Research published in Nature in 2024 demonstrated that training language models recursively on AI-generated text leads to progressive quality degradation, with outputs converging to a narrow distribution. As AI-generated content floods the internet — with estimates suggesting 50%+ of web content could be AI-generated by 2025 — the risk of model collapse increases for future models trained on web-scraped data. This has intensified the value of pre-AI training data and curated human-created datasets. Solutions under exploration include data provenance tracking, filtering AI-generated content from training data, and synthetic data quality controls.
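The core mechanism is easy to see in miniature. The sketch below (an illustrative toy, not the Nature study's setup) repeatedly fits a Gaussian to samples drawn from the previous generation's fitted Gaussian; because each finite-sample maximum-likelihood fit slightly underestimates variance, the distribution narrows generation after generation, mirroring how recursive training converges to a narrow distribution:

```python
import random

random.seed(0)

def fit_gaussian(samples):
    """Maximum-likelihood fit: mean and (biased) variance of the samples."""
    n = len(samples)
    mu = sum(samples) / n
    var = sum((x - mu) ** 2 for x in samples) / n  # MLE divides by n, biasing var low
    return mu, var

mu, var = 0.0, 1.0   # generation 0: the "real" human-data distribution
n = 20               # small sample size per generation exaggerates the effect
history = [var]

for generation in range(100):
    # Each generation "trains" only on the previous generation's output.
    samples = [random.gauss(mu, var ** 0.5) for _ in range(n)]
    mu, var = fit_gaussian(samples)
    history.append(var)

print(f"variance: gen 0 = {history[0]:.3f}, gen 100 = {history[-1]:.6f}")
```

The fitted variance shrinks toward zero, i.e. diversity is lost even though each individual fit looks reasonable. This is also why the mitigations listed above focus on keeping genuine human data in the loop: mixing even a fraction of original data into each generation's training set arrests the shrinkage.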
