
AI Signals — Weekend Read: When the Market Moves Toward AI

2026-03-28 · editorial · written by Claude
Summary
  • The gap between AI model estimates and market prices narrowed from -13% to -4% over 20 trading days
  • Two simultaneous factors: the market declined (MSFT -15%) AND our methodology improved (Engine v6→v7)
  • We cannot separate these effects — this is an observation, not evidence of predictive power
  • Model personality rankings unchanged for 20 days: Claude least bearish, GPT most bearish
  • Real test ahead: Q1 2026 earnings season will show if models react to new financial data

When the Market Moves Toward AI — Coincidence or Signal?

A 20-day observation from AI Investor Barometer

Three weeks ago, every AI model in our system flagged the same thing: stocks looked overpriced. The median gap across all five models — GPT, Claude, Gemini, DeepSeek, and Grok — was approximately -13%. In plain terms, the models collectively estimated that stock prices sat well above their DCF-derived fair values.

Now, 20 trading days later, that gap has narrowed to around -4%.
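The gap metric above can be sketched in a few lines. The per-model fair-value figures below are hypothetical placeholders chosen to produce a roughly -13% median, not actual Barometer data:

```python
# Illustrative sketch of the "gap" metric: each model produces a
# DCF-derived fair value, and the gap is measured against the market
# price. Fair values below are made-up placeholders.
from statistics import median

def valuation_gap(price: float, fair_value: float) -> float:
    """Gap as (fair_value - price) / price; negative means the model
    sees the stock as priced above its estimated fair value."""
    return (fair_value - price) / price

# Hypothetical fair-value estimates for one stock priced at 100
fair_values = {"GPT": 87.0, "Claude": 91.0, "Gemini": 88.0,
               "DeepSeek": 86.0, "Grok": 85.0}
gaps = [valuation_gap(100.0, fv) for fv in fair_values.values()]
print(f"median gap: {median(gaps):+.1%}")  # median across the five models
```

Taking the median rather than the mean keeps a single outlier model from dragging the aggregate figure.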

The natural question: were the models right?

The honest answer: we don't know. And we want to be very clear about why.

What actually happened

Two things moved simultaneously:

1. The market declined. Microsoft fell from $430 to $366 (-15%) over three weeks. Several other US tech stocks followed. Finnish stocks were more stable but also softened.

2. Our methodology changed. We deployed Engine v6 (Bayesian calibration) on March 17 and Engine v7 (temperature harmonization, sector prompts) on March 25. Both changes shifted model outputs toward less bearish estimates. We also corrected the temperature parameter from ~1.0 to 0.4 for two of the five models, which reduced output variance.

In other words: the gap closed because the market moved down AND because our models moved up. We cannot separate the two effects cleanly.
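One way to see why the two effects resist clean separation: any attribution of the gap change to "market moved" versus "model moved" depends on the order in which you hold each factor fixed. The sketch below uses illustrative placeholder numbers (an index at 100, not actual Barometer data) and shows that the two orderings yield different splits:

```python
# Attribution of a gap change is path-dependent: holding the model
# estimate fixed while the price moves (ordering A) gives a different
# split than moving the model estimate first (ordering B).
# All numbers are illustrative placeholders.

def gap(price: float, fair_value: float) -> float:
    return (fair_value - price) / price

p0, fv0 = 100.0, 87.0   # start: gap ~ -13%
p1, fv1 = 92.0, 88.3    # end: price down, estimate up, gap ~ -4%

total = gap(p1, fv1) - gap(p0, fv0)

# Ordering A: price moves first, then the model estimate
market_a = gap(p1, fv0) - gap(p0, fv0)
model_a = total - market_a

# Ordering B: model estimate moves first, then the price
model_b = gap(p0, fv1) - gap(p0, fv0)
market_b = total - model_b

print(f"total: {total:+.1%}")
print(f"A: market {market_a:+.1%}, model {model_a:+.1%}")
print(f"B: market {market_b:+.1%}, model {model_b:+.1%}")
```

With only one realized path through the data, there is no principled way to pick between the attributions, which is why we report the narrowing as an observation rather than a decomposition.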

Why this is NOT evidence the models "work"

Several important caveats:

  • 20 trading days is far too short for any statistical conclusion. A coin flip can appear predictive over 20 tosses.
  • We changed the methodology twice during this period. Any apparent improvement could be an artifact of calibration changes, not genuine predictive power.
  • DCF models are structurally bearish in high-interest-rate environments. Saying "stocks are overvalued" when rates are high is not a bold prediction — it's a mathematical property of the discounting method.
  • We are tracking our own creation. The models, the engine, the prompts — we built and tuned all of them. Confirmation bias is a real risk.
  • Correlation is not causation. Markets move for thousands of reasons. Five LLMs agreeing on a direction does not cause (or predict) that movement.

What IS interesting

Despite all those caveats, the data reveals patterns worth observing:

Model personality stability. The ranking — Claude least bearish, GPT most bearish — has not changed once in 20 days. This is not random. These models have consistent, distinct behavioral fingerprints when performing financial reasoning.

Temperature matters more than expected. When we harmonized temperature from ~1.0 to 0.4 for Claude and GPT, GPT's average bias improved by 6 percentage points. A single API parameter changed the model's apparent "personality" significantly. This raises questions about how much of AI behavior is the model itself versus its configuration.
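The mechanism behind the temperature effect can be illustrated with a toy sampler. This is not the Barometer engine and not a real LLM API call; it just reweights a fixed set of hypothetical gap estimates with softmax(logit / T) and compares the spread of draws at the two temperatures mentioned above:

```python
# Toy illustration of sampling temperature: lower T sharpens the
# softmax over candidate outputs, concentrating draws on the mode and
# shrinking variance. Candidates and logits are hypothetical.
import math
import random

def sample_estimate(candidates, logits, temperature, rng):
    """Draw one candidate with probability proportional to exp(logit/T)."""
    weights = [math.exp(l / temperature) for l in logits]
    r = rng.random() * sum(weights)
    for c, w in zip(candidates, weights):
        r -= w
        if r <= 0:
            return c
    return candidates[-1]

candidates = [-0.20, -0.15, -0.10, -0.05, 0.00]  # hypothetical gap estimates
logits = [0.0, 1.0, 2.0, 1.0, 0.0]               # peaked at -10%

rng = random.Random(7)
for T in (1.0, 0.4):
    draws = [sample_estimate(candidates, logits, T, rng) for _ in range(5000)]
    mean = sum(draws) / len(draws)
    std = (sum((d - mean) ** 2 for d in draws) / len(draws)) ** 0.5
    print(f"T={T}: mean={mean:+.3f}, std={std:.3f}")
```

The lower-temperature run clusters much more tightly around the modal estimate, which is consistent with the variance reduction we saw, though the shift in GPT's mean bias suggests the real effect went beyond variance alone.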

The gap between Finnish and US stocks. Models are consistently less bearish on Finnish stocks (-3%) than US stocks (-6%). This likely reflects the P/E difference between markets (Finnish stocks trade at lower multiples, so their prices sit closer to DCF-derived fair values to begin with), not a genuine insight about relative value.

What we're watching next

The real test comes with Q1 2026 earnings season in April-May. When companies report actual results that differ from expectations, will the models react? Will they adjust their assumptions, or will they anchor to training data defaults?

That question — whether LLMs can incorporate new financial information into their reasoning — is far more important than whether this month's gap happened to close.

What this is and isn't

AI Investor Barometer is an experimental observation tool. We track how AI models form valuation views — we do not recommend, advise, or predict. The fact that a gap narrowed over 20 days is an observation, not a track record. Past patterns in AI model outputs have no proven relationship to future stock prices.

If you find this kind of analysis interesting, you can follow the data daily at aiinvestorbarometer.com or subscribe to our weekly AI Signals report.

All content is generated with AI assistance and may contain errors. This is not investment advice, research, or recommendation.

Want these insights weekly?
Subscribe to AI Signals →