Four Models and a Manifesto Funeral
What happened when four AI models were asked to predict the Budget, and how the real Chancellor rewrote the script
Setting the Scene
Two weeks before Rachel Reeves delivered her second Budget, I gave four leading AI models a challenge. Not a shortcut. Not a novelty. A real test.
Each model received the same ten-page research brief, with structured instructions, context, data and forecasting constraints. The goal was simple: to predict what the Chancellor was most likely to announce in the 2025 Budget, not in headlines or hypotheticals, but in fiscal policy.
The four models were:
- ChatGPT: structured, disciplined, logic-led
- Claude: nuanced, contextual, politically aware
- Gemini: economic, behavioural, long-term
- Perplexity: fast, reactive, wide-ranging
Each produced a full, in-depth analysis followed by a public-facing summary. Some built from first principles, others spotted weak signals. What mattered was how they thought and whether they were right.
Then the Budget landed. And so did the judgement.
The Experiment at a Glance
Every model was held to the same standard:
- They had to make explicit predictions about policy
- They could offer hinted or inferred calls, clearly labelled
- I scored them using four categories:
  - Accuracy
  - Insight
  - SME Usefulness
  - Confidence in Forecasting
This was not entertainment. It was designed to show what happens when you use AI not just as a helper, but as an analyst. This is multi-model thinking in practice. Different models, different strengths. Fewer blind spots.
Strictly-Style Scoring
Each prediction was rated using a familiar scale:
- Bang on the Money: A direct, clear match to a Budget measure
- Near So: Directionally correct, but missed on detail or delivery
- Bit of a Stumble: Vague or partial connection to actual policy
- No Mention: The model didn’t mention this area in its summary
The Predictions Table
| Policy Area | ChatGPT | Claude | Gemini | Perplexity | In Budget? |
| --- | --- | --- | --- | --- | --- |
| No change to VAT or NI | Bang on the Money | Bang on the Money | Bang on the Money | Near So | Yes |
| Income tax threshold freeze | Bang on the Money | Bang on the Money | Bang on the Money | Bang on the Money | Yes |
| Higher taxes on landlords | Near So | No Mention | No Mention | Near So | Yes |
| Capital gains/dividend tax | Near So | Near So | No Mention | Near So | Partial |
| Non-dom/carried interest | Bang on the Money | No Mention | No Mention | No Mention | Yes |
| Road pricing for EVs | Bang on the Money | Bang on the Money | No Mention | No Mention | Yes |
| Mansion/property tax reform | Bang on the Money | Near So | Bang on the Money | Bang on the Money | Yes |
| Pension tax relief | Near So | Near So | Bang on the Money | Bang on the Money | Yes |
| Two-child benefit cap | Complete Misstep | Bang on the Money | Bang on the Money | Bang on the Money | Yes |
| Gambling tax | Complete Misstep | Near So | Near So | No Mention | Yes |
| Exit tax | Bang on the Money | Complete Misstep | No Mention | No Mention | No |
| Fuel duty rise | Complete Misstep | Bang on the Money | No Mention | No Mention | No |
| Infrastructure investment | Bang on the Money | Near So | No Mention | Near So | Yes |
| Public sector pay | No Mention | No Mention | Bang on the Money | Complete Misstep | No |
| NHS redundancy | No Mention | No Mention | Bang on the Money | No Mention | No |
| Levy on banks | No Mention | Complete Misstep | Complete Misstep | No Mention | No |
| Tax raid on LLPs | No Mention | Complete Misstep | Complete Misstep | No Mention | No |
| Inheritance tax | No Mention | No Mention | No Mention | Near So | Yes |
| Business rates change | No Mention | No Mention | Bang on the Money | Bang on the Money | Yes |
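For readers who want to tally the table themselves, here is a minimal Python sketch. It only counts how often each model received each rating, copied row by row from the table above. It attaches no points to the ratings, because the scoring scale does not assign any. The names `MODELS`, `ROWS` and `tally` are illustrative choices for this sketch, not part of the original experiment.

```python
from collections import Counter

MODELS = ["ChatGPT", "Claude", "Gemini", "Perplexity"]

# Short aliases for the ratings that appear in the predictions table.
B = "Bang on the Money"
N = "Near So"
X = "No Mention"
M = "Complete Misstep"

# One entry per policy area; ratings are listed in MODELS order,
# transcribed from the predictions table.
ROWS = {
    "No change to VAT or NI": (B, B, B, N),
    "Income tax threshold freeze": (B, B, B, B),
    "Higher taxes on landlords": (N, X, X, N),
    "Capital gains/dividend tax": (N, N, X, N),
    "Non-dom/carried interest": (B, X, X, X),
    "Road pricing for EVs": (B, B, X, X),
    "Mansion/property tax reform": (B, N, B, B),
    "Pension tax relief": (N, N, B, B),
    "Two-child benefit cap": (M, B, B, B),
    "Gambling tax": (M, N, N, X),
    "Exit tax": (B, M, X, X),
    "Fuel duty rise": (M, B, X, X),
    "Infrastructure investment": (B, N, X, N),
    "Public sector pay": (X, X, B, M),
    "NHS redundancy": (X, X, B, X),
    "Levy on banks": (X, M, M, X),
    "Tax raid on LLPs": (X, M, M, X),
    "Inheritance tax": (X, X, X, N),
    "Business rates change": (X, X, B, B),
}


def tally(rows):
    """Return a Counter of ratings for each model."""
    counts = {model: Counter() for model in MODELS}
    for ratings in rows.values():
        for model, rating in zip(MODELS, ratings):
            counts[model][rating] += 1
    return counts


if __name__ == "__main__":
    for model, ratings in tally(ROWS).items():
        print(model, dict(ratings))
```

Running the script prints one line per model with its rating counts, which makes it easy to compare, say, how many direct hits each model landed against how many areas it never mentioned.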
ChatGPT – The Structured Analyst
Relied on clean logic and fiscal history. Solid on tax mechanics, weaker on bold predictions. Excellent for financial scenario building.
Claude – The Political Strategist
Interpreted the political moment clearly. Brought human-style reasoning and nuance. Great for public policy forecasting.
Gemini – The Economic Forecaster
Saw the macro picture. Focused on behaviour and signals over headlines. Quietly confident. Strong SME value in long-range planning.
Perplexity – The Fast Scanner
Tracked signals others missed. Less structured, more reactive, but made confident early calls. Useful for horizon scanning and trend detection.
Final Reflection: AI is Ready if You Are
This was not a party trick. It was a serious use of four powerful AI systems, used properly, with the right brief and the right expectations.
What we saw was clear:
- AI can analyse fiscal signals
- It can flag contradictions
- It can help SMEs make faster, more confident decisions
And when used together, these tools give you a broader, more reliable view than any one source.
AI will not replace analysis. But it can amplify it. And for business leaders trying to plan ahead, that is no longer optional.
Want to Learn How to Do This?
This is the same method I teach in my practical AI sessions for business teams.
If you would like to learn how to:
- Write deep, structured prompts
- Compare outputs across models
- Use AI for insight, not just admin
I would be happy to show you how.
Strictly Final Scores
The AI Line-up – Group Performance
Strictly AI Score: 7
A well-rehearsed group number with the occasional missed step.
Together, the four models gave a balanced, serious read on what the Budget was likely to deliver. They spotted the tax pressure, read the political mood and anticipated most of the key decisions. A few stumbled on nuance, and one or two called policies that never arrived. But as a group performance, this was a strong showing. Clear thinking, decent timing and a result any SME would be glad to have in the room. Given the shambles from the government, which made this one of the hardest Budgets in recent times to call, I feel the models did a fairly decent job. After all, some of the top journalists in the country could not predict what we witnessed over a few turbulent weeks.
The Chancellor – Solo Performance
Strictly Chancellor Score: 5
A cautious Foxtrot that played it safe but left the crowd unsure.
Rachel Reeves delivered a technically sound Budget, holding to fiscal constraints while introducing a number of targeted changes. But the real story was stealth. Threshold freezes, dividend changes and new road charges quietly raised revenue while maintaining the appearance of stability. For SMEs, it was a cautious routine that missed the tempo of what business really needed. Not a disaster, not a triumph, but a performance that may not linger in the public memory for long. Then add in the changes of direction we have witnessed over the past few weeks, when at times it almost felt as though she would not be in post to deliver her Budget. To borrow a phrase from my dad, a head teacher for 20-plus years: "try harder next time", Rachel Reeves!