
October 2025

Dhanishtha-2.1: Introducing Adaptive Effort Intelligence

Varun Gupta · 3 min read

The problem with one-size-fits-all thinking

Traditional language models typically operate at a single cognitive intensity. Whether answering "What's 2+2?" or solving complex mathematical proofs, they apply the same computational effort. It's like using a sledgehammer for every task—regardless of whether you're cracking a nut or breaking concrete.

Dhanishtha-2.1 changes this paradigm.

Four levels of effort

At the heart of Dhanishtha-2.1 is our Adaptive Effort Architecture. The model supports four discrete effort modes so it can match compute to task complexity:

  • No effort — Instant responses with essentially no computational overhead. Unlike some approaches, we avoid sending empty "think" blocks, which eliminates token waste.
  • Low effort — Fast, efficient replies for straightforward queries (this was the default behavior of the original Dhanishtha).
  • Medium effort — Balanced, moderate-depth reasoning for typical, slightly complex tasks.
  • High effort — Deep, extended reasoning for problems that require comprehensive analysis.

This isn't only about speed. It's intelligent resource allocation: the model learns when to think harder and when to respond instinctively.
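To give a concrete sense of how an application might request a given effort mode, here is a minimal sketch. The endpoint URL, the `effort` request field, and the response shape are assumptions for illustration only; the post does not document the actual API surface.

```python
import requests

# Hypothetical endpoint and field names -- illustrative only, not a documented API.
API_URL = "https://api.example.com/v1/chat/completions"

def ask(prompt: str, effort: str = "low") -> str:
    """Send a prompt with an explicit effort mode.

    `effort` is assumed to accept one of "none", "low", "medium", "high",
    mirroring the four modes described above.
    """
    payload = {
        "model": "dhanishtha-2.1",
        "messages": [{"role": "user", "content": prompt}],
        "effort": effort,  # assumed request field; not confirmed by the post
    }
    response = requests.post(API_URL, json=payload, timeout=60)
    response.raise_for_status()
    # Assumed OpenAI-style response shape.
    return response.json()["choices"][0]["message"]["content"]

# Trivial query: skip thinking entirely.
print(ask("What's 2 + 2?", effort="none"))

# Hard problem: allow deep, extended reasoning.
print(ask("Prove that sqrt(2) is irrational.", effort="high"))
```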

Revolutionary training approach

Dataset generation — teaching models to think at different depths

We pioneered a dataset methodology that teaches the model how to respond at each effort level:

  • Every training example contains four distinct answers — one per effort level (see the sketch after this list).
  • Smaller Dhanishtha-2.1 models were used to generate datasets for larger models, creating a self-improving pipeline.
  • Our custom agentic framework used ensembles of five Dhanishtha-2.0 models working in concert to generate high-quality training data.
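To make the "four answers per example" idea concrete, here is a minimal sketch of what one such training record could look like. The field names, the `<think>` formatting, and the example content are assumptions, not the team's actual schema.

```python
# Illustrative only: one prompt paired with four answers, one per effort level.
training_example = {
    "prompt": "A train travels 120 km in 1.5 hours. What is its average speed?",
    "responses": {
        "none": "80 km/h.",
        "low": "Average speed = distance / time = 120 / 1.5 = 80 km/h.",
        "medium": (
            "<think>Speed is distance divided by time. "
            "120 km / 1.5 h = 80 km/h.</think>\n"
            "The average speed is 80 km/h."
        ),
        "high": (
            "<think>Average speed is total distance over total time. "
            "Distance = 120 km, time = 1.5 h, so 120 / 1.5 = 80. "
            "Sanity check: 80 km/h for 1.5 h covers 120 km. Consistent.</think>\n"
            "The train's average speed is 80 km/h."
        ),
    },
}
```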

Training innovations

Key technical choices we made to push performance:

  • Most models were trained for a single epoch (the 14B variant trained for 3 epochs).
  • An extremely small learning rate (1e-37) for very fine-grained updates.
  • A custom reinforcement-learning method: a modified GRPO (Group Relative Policy Optimization) adapted specifically for the 14B model (a generic sketch of the standard objective follows this list).
  • The 0.6B model reused and improved the regenerated dataset from Dhanishtha 2.0, extended to cover all four effort levels.
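For context, the sketch below shows the core group-relative advantage computation from standard GRPO, which scores each sampled response against the mean and standard deviation of its own group rather than using a learned value function. It is a generic illustration, not the modified variant used for the 14B model, which the post does not detail.

```python
import torch

def group_relative_advantages(rewards: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Compute group-relative advantages as in standard GRPO.

    rewards: shape (num_prompts, group_size), one scalar reward per sampled
    response for a given prompt. Returns advantages of the same shape.
    """
    mean = rewards.mean(dim=-1, keepdim=True)
    std = rewards.std(dim=-1, keepdim=True)
    return (rewards - mean) / (std + eps)

# Example: 2 prompts, 4 sampled responses each.
rewards = torch.tensor([[0.1, 0.9, 0.4, 0.6],
                        [1.0, 1.0, 0.2, 0.8]])
print(group_relative_advantages(rewards))
```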

What this means for developers

Dhanishtha-2.1 is more than an incremental improvement — it rethinks how models allocate compute. Practical benefits for developers include:

  • Build applications that automatically scale thinking effort based on query complexity (a toy routing sketch follows this list).
  • Reduce inference costs by up to ~80% on simple queries.
  • Maintain peak performance for complex reasoning tasks by escalating effort only when needed.
  • Deploy adaptive AI that behaves more like human problem-solving: efficient on the simple stuff, thorough on the hard stuff.
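As an illustration of the first point, here is a toy router that picks an effort level from a crude prompt heuristic and could be paired with the hypothetical `ask` helper sketched earlier. The heuristic is purely illustrative; a real application would more likely use a classifier, token counts, or task metadata.

```python
def pick_effort(prompt: str) -> str:
    """Toy heuristic for routing queries to an effort level."""
    reasoning_cues = ("prove", "derive", "step by step", "optimize", "debug")
    lowered = prompt.lower()
    if len(prompt) < 40 and "?" in prompt:
        # Short direct questions: answer instantly unless they ask for reasoning.
        return "medium" if any(c in lowered for c in reasoning_cues) else "none"
    if any(c in lowered for c in reasoning_cues):
        return "high"
    return "low"

# Pair with the hypothetical helper from earlier, e.g.:
# ask(prompt, effort=pick_effort(prompt))
for prompt in ["What's 2 + 2?",
               "Summarize this paragraph in one sentence.",
               "Prove that the sum of two even numbers is even."]:
    print(f"{pick_effort(prompt):>6}  <- {prompt}")
```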

The road ahead

Dhanishtha-2.1 is a first step toward our larger release, "Dhanishtha Max," scheduled for 26 March 2026. The Max model will include five effort modes and further improvements to adaptivity and efficiency.


Published by the Dhanishtha research team.