08.01.2026 17:04
A self-learning AI that sets its own tasks and evaluates itself without human involvement.
Modern artificial intelligence systems have achieved remarkable results, yet their core learning paradigm remains limited: most models are trained by imitating human behavior or solving predefined tasks. A new research direction proposes a fundamentally different path — AI that learns autonomously by generating its own questions and verifying answers without human involvement.
An international team from Tsinghua University, the Beijing Institute for General Artificial Intelligence (BIGAI), and Pennsylvania State University introduced Absolute Zero Reasoner (AZR) — an experimental architecture in which the model acts as both student and teacher. It independently creates tasks, solves them, validates results by executing code, and uses both successes and failures as feedback for continuous improvement.
The defining feature of AZR is its closed learning loop: the language model generates Python programming tasks, attempts to solve them, and checks correctness objectively by executing the code. Unlike traditional reinforcement learning, where evaluation depends on human judgment or predefined metrics, the source of truth here is the computational environment itself, that is, the code and its output.
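The propose, solve, and verify loop can be illustrated with a small Python sketch. This is not the researchers' implementation: `propose_task` and `solve_task` are hypothetical stand-ins for the language model (the solver is simulated with a fixed error rate), and the reward is only recorded rather than used to update a model. What the sketch does show is the key idea that the reward comes from running the code, not from a human label.

```python
import random


def run_program(source, func_name, arg):
    """Execute `source` in a fresh namespace and return func_name(arg)."""
    namespace = {}
    exec(source, namespace)  # assumption: a real system would sandbox this call
    return namespace[func_name](arg)


def propose_task(rng):
    """Hypothetical proposer: stands in for the model writing a task."""
    k = rng.randint(2, 9)
    source = f"def f(x):\n    return x * {k} + 1\n"
    return source, rng.randint(0, 20)


def solve_task(source, arg, rng, error_rate=0.3):
    """Hypothetical solver: stands in for the model predicting the output.

    Simulated by computing the true answer and corrupting it with
    probability `error_rate`, so that some attempts fail.
    """
    answer = run_program(source, "f", arg)
    if rng.random() < error_rate:
        answer += 1  # deliberately wrong prediction
    return answer


def verify(source, arg, predicted):
    """Reward 1.0 if the prediction matches what the code actually returns."""
    return 1.0 if run_program(source, "f", arg) == predicted else 0.0


def self_play(steps, seed=0):
    """Run the loop; a real system would feed each reward into an RL update."""
    rng = random.Random(seed)
    rewards = []
    for _ in range(steps):
        source, arg = propose_task(rng)
        predicted = solve_task(source, arg, rng)
        rewards.append(verify(source, arg, predicted))
    return rewards


if __name__ == "__main__":
    rewards = self_play(steps=20)
    print(f"solved {int(sum(rewards))} of {len(rewards)} self-generated tasks")
```

Both successes (reward 1.0) and failures (reward 0.0) are kept, mirroring the article's point that the model learns from each kind of outcome.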
Experiments show that this approach significantly enhances logical reasoning and programming skills, even in relatively compact models. Open-source Qwen models with 7 and 14 billion parameters demonstrated notable improvements in reasoning quality and, in some tasks, outperformed models trained on manually curated datasets. This highlights the potential of self-learning as an alternative to costly human supervision.
Researchers emphasize that this method closely mirrors how people learn: first by imitation, then through self-questioning and experimentation, and eventually by moving beyond what they were taught. The mechanism also scales naturally: as the model grows more capable, it sets itself increasingly complex challenges.
At present, the technology is limited to domains with easily verifiable outcomes, such as mathematics and programming. However, in the future it could extend to agent-based systems — autonomous AI capable of operating in browsers, office tools, or digital environments while independently evaluating the correctness of its actions.
Importantly, Absolute Zero is not an isolated effort. Similar ideas are emerging across major AI laboratories, pointing to a new learning paradigm in which AI evolves from a passive data consumer into an active problem solver and researcher.
Amid growing data scarcity and rising training costs, such approaches may become critical for the industry’s progress. If AI systems learn to improve effectively without human input, this could mark a decisive step toward truly autonomous intelligence — systems capable not only of replicating human solutions, but of going beyond them.