Four Models and a Manifesto Funeral

What happened when four AI models were asked to predict the Budget, and how the real Chancellor rewrote the script

Setting the Scene

Two weeks before Rachel Reeves delivered her first full Budget, I gave four leading AI models a challenge. Not a shortcut. Not a novelty. A real test.

Each model received the same ten-page research brief, with structured instructions, context, data and forecasting constraints. The goal was simple: to predict what the Chancellor was most likely to announce in the 2025 Budget, not in headlines or hypotheticals, but in fiscal policy.

The four models were:

  • ChatGPT: structured, disciplined, logic-led
  • Claude: nuanced, contextual, politically aware
  • Gemini: economic, behavioural, long-term
  • Perplexity: fast, reactive, wide-ranging

Each produced a full deep analysis followed by a public-facing summary. Some built from first principles, others spotted weak signals. What mattered was how they thought and whether they were right.

Then the Budget landed. And so did the judgement.

The Experiment at a Glance

Every model was held to the same standard:

  • They had to make explicit predictions about policy
  • They could offer hinted or inferred calls, clearly labelled
  • I scored them using four categories:
    • Accuracy
    • Insight
    • SME Usefulness
    • Confidence in Forecasting

This was not entertainment. The aim was to show what happens when you use AI not just as a helper, but as an analyst. This is multi-model thinking in practice. Different models, different strengths. Fewer blind spots.

Strictly-Style Scoring

Each prediction was rated using a familiar scale:

  • Bang on the Money: A direct, clear match to a Budget measure
  • Near So: Directionally correct, but missed on detail or delivery
  • Bit of a Stumble: Vague or partial connection to actual policy
  • No Mention: The model didn’t mention this area in its summary

The Predictions Table

| Policy Area | ChatGPT | Claude | Gemini | Perplexity | In Budget? |
|---|---|---|---|---|---|
| No Change to VAT or NI | Bang on the Money | Bang on the Money | Bang on the Money | Near So | Yes |
| Income tax threshold freeze | Bang on the Money | Bang on the Money | Bang on the Money | Bang on the Money | Yes |
| Higher taxes on landlords | Near So | No Mention | No Mention | Near So | Yes |
| Capital gains/dividend tax | Near So | Near So | No Mention | Near So | Partial |
| Non-dom/carried interest | Bang on the Money | No Mention | No Mention | No Mention | Yes |
| Road pricing for EVs | Bang on the Money | Bang on the Money | No Mention | No Mention | Yes |
| Mansion/property tax reform | Bang on the Money | Near So | Bang on the Money | Bang on the Money | Yes |
| Pension tax relief | Near So | Near So | Bang on the Money | Bang on the Money | Yes |
| Two-child benefit cap | Complete Misstep | Bang on the Money | Bang on the Money | Bang on the Money | Yes |
| Gambling tax | Complete Misstep | Near So | Near So | No Mention | Yes |
| Exit tax | Bang on the Money | Complete Misstep | No Mention | No Mention | No |
| Fuel Duty Rise | Complete Misstep | Bang on the Money | No Mention | No Mention | No |
| Infrastructure Investment | Bang on the Money | Near So | No Mention | Near So | Yes |
| Public Sector Pay | No Mention | No Mention | Bang on the Money | Complete Misstep | No |
| NHS Redundancy | No Mention | No Mention | Bang on the Money | No Mention | No |
| Levy on Banks | No Mention | Complete Misstep | Complete Misstep | No Mention | No |
| Tax Raid on LLPs | No Mention | Complete Misstep | Complete Misstep | No Mention | No |
| Inheritance Tax | No Mention | No Mention | No Mention | Near So | Yes |
| Business rates change | No Mention | No Mention | Bang on the Money | Bang on the Money | Yes |
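
For readers who want to total up the table themselves, here is a minimal Python sketch that counts how often each model received each rating. It is an illustration only, not the scoring method used for the final scores. The one-letter codes, the `ROWS` dictionary and the `tally` function are my own shorthand (assumptions, not part of the original experiment), and the data is simply the table above typed in by hand.

```python
from collections import Counter

MODELS = ["ChatGPT", "Claude", "Gemini", "Perplexity"]

# Hypothetical one-letter shorthand for the ratings that appear in the table
CODES = {
    "B": "Bang on the Money",
    "N": "Near So",
    "X": "Complete Misstep",
    "-": "No Mention",
}

# One string per policy area, one character per model, in MODELS order
ROWS = {
    "No Change to VAT or NI": "BBBN",
    "Income tax threshold freeze": "BBBB",
    "Higher taxes on landlords": "N--N",
    "Capital gains/dividend tax": "NN-N",
    "Non-dom/carried interest": "B---",
    "Road pricing for EVs": "BB--",
    "Mansion/property tax reform": "BNBB",
    "Pension tax relief": "NNBB",
    "Two-child benefit cap": "XBBB",
    "Gambling tax": "XNN-",
    "Exit tax": "BX--",
    "Fuel Duty Rise": "XB--",
    "Infrastructure Investment": "BN-N",
    "Public Sector Pay": "--BX",
    "NHS Redundancy": "--B-",
    "Levy on Banks": "-XX-",
    "Tax Raid on LLPs": "-XX-",
    "Inheritance Tax": "---N",
    "Business rates change": "--BB",
}


def tally(rows=ROWS):
    """Count how many times each model received each rating."""
    counts = {model: Counter() for model in MODELS}
    for ratings in rows.values():
        for model, code in zip(MODELS, ratings):
            counts[model][CODES[code]] += 1
    return counts


if __name__ == "__main__":
    for model, ratings in tally().items():
        print(model, dict(ratings))
```

On the table as transcribed here, this gives ChatGPT 7 "Bang on the Money" ratings, Claude 5, Gemini 8 and Perplexity 5. A raw count like this ignores the other three scoring categories (Insight, SME Usefulness and Confidence in Forecasting), so treat it as a quick cross-check rather than a ranking.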

ChatGPT – The Structured Analyst
Relied on clean logic and fiscal history. Solid on tax mechanics, weaker on bold predictions. Excellent for financial scenario building.

Claude – The Political Strategist
Interpreted the political moment clearly. Brought human-style reasoning and nuance. Great for public policy forecasting.

Gemini – The Economic Forecaster
Saw the macro picture. Focused on behaviour and signals over headlines. Quietly confident. Strong SME value in long-range planning.

Perplexity – The Fast Scanner
Tracked signals others missed. Less structured, more reactive, but made confident early calls. Useful for horizon scanning and trend detection.

Final Reflection: AI is Ready if You Are

This was not a party trick. It was a serious use of four powerful AI systems, used properly, with the right brief and the right expectations.

What we saw was clear:

  • AI can analyse fiscal signals
  • It can flag contradictions
  • It can help SMEs make faster, more confident decisions

And when used together, these tools give you a broader, more reliable view than any one source.

AI will not replace analysis. But it can amplify it. And for business leaders trying to plan ahead, that is no longer optional.

Want to Learn How to Do This?

This is the same method I teach in my practical AI sessions for business teams.

If you would like to learn how to:

  • Write deep, structured prompts
  • Compare outputs across models
  • Use AI for insight, not just admin

I would be happy to show you how.

Strictly Final Scores

The AI Line-up – Group Performance
Strictly AI Score: 7
A well-rehearsed group number with the occasional missed step.
Together, the four models gave a balanced, serious read on what the Budget was likely to deliver. They spotted the tax pressure, read the political mood and anticipated most of the key decisions. A few stumbled on nuance, and one or two called policies that never arrived. But as a group performance, this was a strong showing. Clear thinking, decent timing and a result any SME would be glad to have in the room. Given the shambles from the government, which made this one of the hardest Budgets in recent times to call, I feel the models did a fairly decent job. After all, some of the top journalists in the country could not predict what we witnessed over a few turbulent weeks.

The Chancellor – Solo Performance
Strictly Chancellor Score: 5
A cautious Foxtrot that played it safe but left the crowd unsure.
Rachel Reeves delivered a technically sound Budget, holding to fiscal constraints while introducing a number of targeted changes. But the real story was stealth. Threshold freezes, dividend changes and new road charges quietly raised revenue while maintaining the appearance of stability. For SMEs, it was a cautious routine that missed the tempo of what business really needed. Not a disaster, not a triumph, but a performance that may not linger in the public memory for long. Then add in the changes of direction we have witnessed over the past few weeks: at times it barely felt as though she would still be in post to deliver her Budget. To borrow a phrase from my dad, a head teacher for 20-plus years: "Try harder next time", Rachel Reeves!