Aiconomy

Model Collapse

A degradation phenomenon in which AI models trained on data generated by other AI models progressively lose quality and diversity, potentially threatening the long-term viability of training on internet data.

Research published in Nature in 2024 (Shumailov et al.) demonstrated that training language models recursively on AI-generated text leads to progressive quality degradation, with outputs converging to an ever-narrower distribution: rare patterns in the original data disappear first, and later generations forget more of the true distribution. As AI-generated content floods the internet — with some estimates suggesting that more than half of web content could be AI-generated by 2025 — the risk of model collapse increases for future models trained on web-scraped data. This has raised the value of pre-AI training data and of curated human-created datasets. Solutions under exploration include data provenance tracking, filtering AI-generated content out of training corpora, and quality controls on synthetic data.
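The narrowing dynamic can be illustrated with a toy simulation (a sketch for intuition, not a reproduction of the Nature experiment): fit a simple Gaussian "model" to a dataset, generate the next generation's training data by sampling from that fit, and repeat. Because each generation learns only from the previous generation's outputs, sampling noise and estimator bias compound, and the distribution's spread collapses over many generations. All names here (`fit_and_resample`, the generation counts) are illustrative choices, not anything from the cited study.

```python
import random
import statistics

def fit_and_resample(data):
    """One 'generation': fit a Gaussian to the data, then sample a
    same-sized dataset from the fitted model (training on model output)."""
    mu = statistics.fmean(data)
    sigma = statistics.pstdev(data)  # MLE estimator; slightly underestimates spread
    return [random.gauss(mu, sigma) for _ in range(len(data))]

random.seed(0)
# Generation 0: "human" data drawn from the true distribution N(0, 1).
data = [random.gauss(0.0, 1.0) for _ in range(100)]
stds = [statistics.pstdev(data)]

# Each generation trains only on the previous generation's samples.
for generation in range(2000):
    data = fit_and_resample(data)
    stds.append(statistics.pstdev(data))

print(f"std at generation 0: {stds[0]:.3f}")
print(f"std at generation 2000: {stds[-1]:.6f}")  # collapses toward zero
```

The standard deviation shrinks dramatically across generations: the tails of the original distribution are lost first, exactly the "convergence to a narrow distribution" described above. Real language models are far more complex, but the feedback mechanism is the same.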
