Claude Opus 4.7: adaptive thinking, task budgets and /ultrareview

Offizielles Anthropic-Headerbild zur Claude Opus 4.7 Ankündigung: abstrakte Illustration eines stilisierten Kopfprofils mit Netzwerk-Knoten

After the Desktop redesign, Anthropic released Claude Opus 4.7 today, the new flagship model. What changed in practice?

https://www.anthropic.com/news/claude-opus-4-7

A few features stand out from a QA and engineering point of view.

Adaptive Thinking, Task Budgets and /ultrareview as new everyday tools

The most defining change is Adaptive Thinking. Previously you had to hand the model a thinking budget manually, and finding the right size was largely guesswork. Opus 4.7 regulates that on its own and adjusts compute time to the complexity of the task. For teams this means less parameter fiddling and more consistent results without much upfront configuration.

Task Budgets are new in public beta: a soft token frame for entire agent loops. The model sees a countdown and prioritises by itself what still fits. If you run autonomous agent loops for regression testing, test case generation or document review, you finally have a reliable cost frame without a noticeable hit on quality.

/ultrareview is a new slash command in Claude Code. It starts a dedicated review session in which several agents go through the code in parallel and look specifically for subtle design weaknesses and overlooked logic gaps. You get a senior reviewer’s perspective before the merge, on top of your existing review process.

Memory, vision and behaviour: the quiet improvements

The model writes to and reads from scratchpads and notes files more reliably and holds context across multiple sessions better. In long test sessions or legacy migrations, where Claude needs the same context over several days, this is a noticeable step forward.

For images that Claude reads in and analyses, the resolution limit has tripled, now up to 3.75 megapixels. Detail-rich dashboards, dense code screenshots or UI mockups arrive without downsampling, where the previous limit used to cost detail.

The tone has become more direct: validation phrases and emojis are reduced, instructions are followed more literally. The raw numbers also moved (plus 11 points on SWE-bench Pro, a third fewer tool errors). In day-to-day use what stands out more is that tool failures no longer end the run; the model works through them.

First impression after a few hours: Adaptive Thinking and Task Budgets take configuration work off your plate and give you more control. /ultrareview will show up first in QA pipelines as a checkpoint before merging that goes beyond plain syntax checks.

Personal side note: it’s my birthday tomorrow, and with Tuesday’s Desktop redesign and today’s Opus 4.7 it feels a bit like an early present. Thanks Anthropic 😉

Share

QCT – Dein Experte für Testmanagement, Softwarequalität und digitale Transformation

QCT Logo in Negativ-Darstellung für dunkle Hintergründe