AI Signals — Weekly Model Behavior Summary
How five AI models' estimates and biases change — summarized weekly.
AI SIGNALS · 21 reports
2026-05-18 → 2026-05-22 · claude
AI Signals — Week 21, May 18–22, 2026
- Every single sector lost model favor this week — a rare, uniform bearish sweep that suggests macro anxiety is overriding stock-specific analysis.
- Grok turned sharply more bearish, its bias dropping **3.4 percentage points** in a single week — the most dramatic single-model shift in the dataset.
- **NOKIA** received the largest target price revision of the week at **+15.7%**, yet its consensus target still sits **38% below spot**: a dispersion story that exposes deep model disagreement.
- DeepSeek remains the cost anomaly of the panel: **17x cheaper than Claude** per thousand valuations while matching it on output validity and confidence.
2026-05-17 · editorial · written by Claude
AI Signals — Weekend Read: Which AI is best at investing?
- Calibration and backtesting are still in progress (Engine v8 / Prompt v11 due late May; meaningful 3-month accuracy data lands July 2026, 12-month March 2027). What we can compare today is observable behaviour, not predictive performance
- Behavioural wins by category: Claude is best calibrated (raw output clipped only 27 % of the time vs 42 % for Grok), DeepSeek is 100 % reliable and the cheapest ($2.07/1K), Grok is the fastest (7.7s end-to-end), Gemini is the most willing to take extreme calls, and GPT is the only model whose answers are partially uncorrelated with the rest
- The panel collapses statistically: the effective number of independent estimators dropped from 1.21 (early March) to 1.10 (early May). Herding is intensifying, not loosening
- But part of that 'agreement' is engine-produced: on 19 % of company-days all five models hit a cap (pre-cap raw spread on those days averages 15 pp, post-cap 0). On 41 % of days at least three models are capped. The site already shows raw vs calibrated agreement per company with a flag when the consensus is partly mechanical
- Bonus finding: every model's daily TP autocorrelation is negative (−0.14 to −0.31). AI does not anchor on yesterday's view — opposite of the +0.3 to +0.5 anchoring well documented in human analysts. Whether this helps or hurts predictive accuracy is a question only the 3- and 12-month backtests will answer
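The effective-estimator figures above can be related to an average pairwise correlation. The sketch below assumes the common equal-correlation approximation N_eff = N / (1 + (N − 1)·ρ). The source does not say which formula the observatory uses, so the formula and both function names are illustrative assumptions.

```python
def effective_n(n_models: int, mean_corr: float) -> float:
    """Equal-correlation approximation: N_eff = N / (1 + (N - 1) * rho)."""
    return n_models / (1 + (n_models - 1) * mean_corr)


def implied_mean_corr(n_models: int, n_eff: float) -> float:
    """Invert the approximation: average correlation implied by a given N_eff."""
    return (n_models / n_eff - 1) / (n_models - 1)


# Reported values for the five-model panel (assumed to follow the formula above).
for label, n_eff in [("early March", 1.21), ("early May", 1.10)]:
    rho = implied_mean_corr(5, n_eff)
    print(f"{label}: N_eff={n_eff:.2f} -> implied mean correlation {rho:.2f}")
```

Under this approximation, the reported drop from 1.21 to 1.10 would correspond to mean pairwise correlation rising from roughly 0.78 to 0.89.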
2026-05-11 → 2026-05-15 · claude
AI Signals — Week 20, May 11–15, 2026
- GPT is the only model bullish on the market this week, with a +2.8% average upside — every other model sees stocks as overvalued.
- Gemini made the sharpest sentiment reversal of the week, swinging from +0.9% to -0.6% bias, a -1.5 percentage point shift in a single week.
- Technology lost the most model favor of any sector, dropping 3.2 points to -1.1% consensus upside — the models are quietly souring on Big Tech.
- DeepSeek prices 115 valuations for just $0.25, making it 19x cheaper than Claude while maintaining comparable output volume — the cost gap is becoming impossible to ignore.
2026-05-10 · editorial · written by Claude
AI Signals — Weekend Read: Same earnings, five readings
- Q1 2026 is the observatory's first fully observed earnings season. Across 18 reporters, mean five-model spread did not shrink: 6.3pp before earnings, 6.1pp after. Seven companies tightened, eight widened, three held still
- Sampo is the dramatic exception — 16pp pre-earnings spread collapsed to 2pp on May 8. But four of five models were forced onto the analyst-TP floor (7.57 €); the consensus is partly engine-produced, not genuine agreement
- Microsoft, P&G, METSO and UPM all widened post-earnings. Reports with new capex programs or shifting assumptions split AI models the same way Stickel & Diether documented they split human analysts
- Direction hit rate: AI's pre-earnings consensus matched the stock's 1-day reaction in only 6 of 18 cases (33%), below the 50% coin flip. When AI predicted upside, the stock rose just 1 time in 7. The sample is thin but the pattern recurs
- Agreement is not accuracy. Five models can converge near truth, far from it, or pulled together by the same anchor. Real accuracy emerges in July when 3-month post-earnings prices are available
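The "below a coin flip" claim for the 6-of-18 hit rate can be checked with an exact binomial tail. This is a minimal sketch: it assumes the 18 cases are independent and uses a one-sided test against p = 0.5, and neither choice is stated in the source.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float = 0.5) -> float:
    """P(X <= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


hits, cases = 6, 18  # figures reported above
hit_rate = hits / cases
p_one_sided = binom_cdf(hits, cases)  # chance of 6 or fewer hits from pure coin flips

print(f"hit rate {hit_rate:.0%}, one-sided p = {p_one_sided:.3f}")
# prints: hit rate 33%, one-sided p = 0.119
```

Under these assumptions, coin flips alone would produce a result this weak about 12% of the time, which is consistent with the caution above that the sample is thin.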
2026-05-04 → 2026-05-08 · claude
AI Signals — Week 19, May 04–08, 2026
- GPT posted the sharpest sentiment reversal of the week, swinging from +4.5% to +1.1% average upside — the largest single-model bias collapse in the dataset.
- Gemini bucked every trend by turning bullish, flipping from -1.4% to +0.9% while all other models grew more pessimistic.
- Nokia's consensus target price fell 10.4% in a single week yet still carries the widest dispersion in the universe at 0.204 — the models cannot agree on what it's worth, only that it's falling.
- DeepSeek runs the entire 23-company universe for $0.25 — nineteen times cheaper than Claude — yet produces structurally identical terminal growth assumptions, raising hard questions about what premium inference actually buys.
2026-05-03 · editorial · written by Claude
AI Signals — Weekend Read: Same prompt, five answers
- On May 1, five AI models valued Meta on identical inputs. The spread between highest and lowest target price was 62 percentage points — and it is the rule, not the exception
- Where the prompt locks the answer (WACC mid-point), models comply within 0.4pp; where it leaves slack (CAGR), they diverge by 2.6pp — model character lives in the slack
- GPT calls 30-day direction correctly 63% of the time on US stocks but only 44% on Finnish ones (z=4.3, p<0.001). Sector mix, market cap, coverage, and training-data density all confound the geographic story
- AI consensus moved from −15% to −5% over 60 days, but ~80% of that is engine recalibration (v6, v7, prompt v10), not learning. DeepSeek's remaining −9% pessimism is the genuinely informative residual
- Across 44 days, five LLMs are not five independent estimators — they are five recognisable personalities. Standardisation makes the differences visible, it does not erase them
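A spread "in percentage points" such as the one quoted above for Meta can be read as the gap between the most and least optimistic implied upside. The sketch below assumes that definition. The model labels, target prices and spot price are hypothetical placeholders, not values from the reports.

```python
def upside_pct(target: float, spot: float) -> float:
    """Implied upside of a target price relative to spot, in percent."""
    return (target - spot) / spot * 100


def spread_pp(targets: dict, spot: float) -> float:
    """Gap between the highest and lowest implied upside, in percentage points."""
    upsides = [upside_pct(t, spot) for t in targets.values()]
    return max(upsides) - min(upsides)


# Hypothetical inputs for illustration only.
example_targets = {"model_a": 450.0, "model_b": 500.0, "model_c": 600.0}
example_spot = 500.0

print(f"spread: {spread_pp(example_targets, example_spot):.1f} pp")
# prints: spread: 30.0 pp
```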
2026-04-27 → 2026-05-01 · claude
AI Signals — Week 18, Apr 27–May 01, 2026
- GPT swung from near-neutral to the most bullish model in the panel, posting a +4.2 percentage point bias shift in a single week — the largest move of any model this year.
- Nokia's consensus target price surged 18.4% week-on-week yet the stock still sits 29% below that target, a gap that exposes deep model disagreement about a turnaround that may or may not be happening.
- DeepSeek remains the panel's perma-bear at -4.6% average upside, costs 19x less than Claude per thousand valuations, and has not changed its mind in two weeks — make of that what you will.
- Healthcare is the only sector where models collectively see meaningful upside (+22.5%), while energy has deteriorated further to -23.5% — the widest sector gap in the dataset.
- Apple and Tesla are the panel's highest-conviction sells: both carry zero dispersion across models, meaning every AI agrees the stocks are overvalued. That rare unanimity is itself a signal worth scrutinizing.
2026-04-20 → 2026-04-24 · claude
AI Signals — Week 17, Apr 20–24, 2026
- Four of five models turned more bearish this week — but GPT broke ranks, swinging from -0.7% to +0.3% average upside, the only model to move in the opposite direction.
- Technology lost model favor despite remaining the most-loved sector, with consensus upside slipping from 7.2% to 4.5% — a quiet but meaningful retreat.
- DeepSeek continues to deliver 100% parse validity at $2.29 per thousand valuations, roughly 18x cheaper than Claude while producing structurally coherent output every single time.
- Wärtsilä earned the week's most persistent model conviction signal: four consecutive days of rising consensus targets with zero down-days and a 6.6% range — unusual discipline for an industrial name.
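The Wärtsilä signal above (consecutive rising days, no down-days, a bounded range) can be sketched as a simple streak check. The range definition used here, (max − min) / min, is an assumption, and the daily consensus targets are hypothetical values chosen only to reproduce a four-day streak with a 6.6% range.

```python
def rising_streak(targets: list) -> int:
    """Number of consecutive day-over-day increases ending at the latest value."""
    streak = 0
    for prev, cur in zip(reversed(targets[:-1]), reversed(targets[1:])):
        if cur > prev:
            streak += 1
        else:
            break
    return streak


def range_pct(targets: list) -> float:
    """Range of the series relative to its minimum (assumed definition)."""
    return (max(targets) - min(targets)) / min(targets)


# Hypothetical daily consensus targets, for illustration only.
week = [20.0, 20.4, 20.8, 21.0, 21.32]

print(rising_streak(week), f"{range_pct(week):.1%}")
# prints: 4 6.6%
```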
2026-04-19 · editorial · written by Claude
AI Signals — Weekend Read: Do AI Models Think — or Just Pattern-Match?
- GPT rounds 99% of its margin assumptions to whole numbers — the same cognitive bias documented in human analysts (Herrmann & Thomas 2005)
- All five models correlate 0.81–0.95 despite different architectures — the 'panel of independent analysts' is closer to a group of like-minded colleagues
- Gemini and Grok form a temporal cluster: when one reverses direction, the other follows within 1–2 days. Claude is the most independent model
- WACC is the only parameter where rounding drops (to 10%) — because the prompt provides a decimal anchor. Prompt design directly affects output precision
- Five-model consensus is more than one opinion but less than five. Dispersion remains the most honest signal
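The rounding finding above boils down to counting how many assumptions land on whole numbers. A minimal sketch of that measurement follows. The function name and the sample margin values are hypothetical, and the source does not describe how its 99% figure was computed.

```python
def whole_number_share(values: list, tol: float = 1e-9) -> float:
    """Fraction of values (in percent units) that are whole numbers."""
    if not values:
        raise ValueError("values must be non-empty")
    whole = sum(1 for v in values if abs(v - round(v)) <= tol)
    return whole / len(values)


# Hypothetical margin assumptions in percent, for illustration only.
margins = [12.0, 15.0, 8.0, 22.5, 30.0]

print(f"{whole_number_share(margins):.0%} rounded to whole numbers")
# prints: 80% rounded to whole numbers
```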
2026-04-13 → 2026-04-17 · claude
AI Signals — Week 16, Apr 13–17, 2026
- Every single AI model turned bearish this week — a synchronized sentiment collapse that hasn't been seen in this dataset before.
- Grok made the sharpest pivot, swinging from +2.1% bullish bias to -2.6%, a shift of nearly 5 percentage points in one week.
- Technology lost half its model-assigned upside in seven days, falling from +14.6% to +7.2%, yet still leads all sectors — which tells you how bad everything else looks.
- DeepSeek remains the cost anomaly of the AI analyst world: 18x cheaper than Claude per thousand valuations, with comparable output validity.
2026-04-12 · editorial · written by Claude
AI Signals — Weekend Read: How Often and How Much Do AI Models Change Their Minds About Stocks?
- Claude and Grok are the most stable: uncapped estimates unchanged on 63–64% of days. GPT produces >10% daily moves once a week
- META is every model's problem child — GPT's temporal σ is 35.2%, more than double any other stock. NVDA's CAGR assumption range spans 9–55%
- Technology sector runs 3× more volatile than healthcare in DCF terms — a structural property of the model, not a quality issue
- DeepSeek has never crossed zero bias in 29 trading days. Training data pessimism, anchoring, or correct market view? We don't know yet
- A temperature change from 1.0 to 0.4 shifted GPT's median bias from -23% to near-neutral overnight, one of the first empirical observations of the temperature-sentiment link in financial LLMs
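The stability measures above (share of days with an unchanged estimate, frequency of moves above 10%) can be computed from a daily target-price series. The sketch below shows one way to do it. The series is hypothetical, and the exact definitions the observatory uses are not given in the source.

```python
def daily_changes(prices: list) -> list:
    """Day-over-day relative changes of a target-price series."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]


def stability_stats(prices: list, big_move: float = 0.10) -> tuple:
    """Return (share of unchanged days, count of moves larger than big_move)."""
    changes = daily_changes(prices)
    if not changes:
        raise ValueError("need at least two prices")
    unchanged_share = sum(1 for c in changes if c == 0) / len(changes)
    big_moves = sum(1 for c in changes if abs(c) > big_move)
    return unchanged_share, big_moves


# Hypothetical daily target prices, for illustration only.
series = [100.0, 100.0, 100.0, 112.0, 112.0, 98.0]

share, big = stability_stats(series)
print(f"unchanged on {share:.0%} of days, {big} moves above 10%")
# prints: unchanged on 60% of days, 2 moves above 10%
```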
2026-04-06 → 2026-04-10 · claude
AI Signals — Week 15, Apr 06–10, 2026
- Every single AI model turned more bearish this week — GPT led the retreat with a bias shift of -5.6 percentage points, the sharpest single-week sentiment collapse in the dataset.
- DeepSeek is the only model with a negative average upside (-1.6%), making it the lone structural bear in a panel of cautious bulls.
- Energy staged the week's most dramatic rehabilitation: model consensus upside improved by +9.2 points, yet the sector still sits at a deeply negative -26.6% — rescued from the basement, not yet off the floor.
- Gemini's 82.6% validity rate is a persistent reliability gap that no amount of CAGR optimism can paper over — one in six valuations simply fails to parse.
- XOM's consensus target price jumped +48.9% week-on-week, the single largest target revision in the dataset and a number that raises more questions than it answers.
2026-04-04 · editorial · written by Claude
AI Signals — Weekend Read: One Month In — What 2,760 AI Valuations Taught Us
- After 24 trading days and 2,760 estimates, we cannot separate methodology effects from genuine model behavior — every engine or prompt change moved the numbers
- Five distinct model personalities emerged: Claude is the only optimist (+1.0%), GPT has best directional accuracy (52.7%) but highest volatility, DeepSeek achieves 100% reliability at 1/15th the cost
- XOM's consensus target dropped 31% in one day after all models reacted to Iran de-escalation signals, while 9 major banks raised their price targets. DCF amplifies short-term sentiment for cyclical stocks
- Directional accuracy is 47-53% at 1-day horizon — statistically a coin flip. The real test begins at 3 months (July) and 12 months (March 2027)
- Model-specific calibration is coming in late April, when 30 days of v7 data are available
2026-03-30 → 2026-04-03 · claude
AI Signals — Week 14, Mar 30–Apr 03, 2026
- Four out of five models turned more bullish this week, yet the average consensus upside across 23 companies barely moved — the optimism is concentrated, not broad.
- DeepSeek flipped from mildly bullish to the panel's only bear, even as every other model grew more constructive: a rare and meaningful divergence.
- ExxonMobil's consensus target price collapsed by 30% in a single week — the sharpest single-name revision in the dataset's history and a stress test the framework did not handle gracefully.
- Gemini's bullish bias jumped by 3 full percentage points week-on-week, the largest single-model shift recorded, while its terminal growth rate remains locked at exactly 2.00% for every single company it covers.
- DeepSeek prices 115 valuations for $0.26 — seventeen times cheaper than Claude for outputs that, this week at least, told a meaningfully different story.
2026-03-28 · editorial · written by Claude
AI Signals — Weekend Read: When the Market Moves Toward AI
- The gap between AI model estimates and market prices narrowed from -13% to -4% over 20 trading days
- Two simultaneous factors: the market declined (MSFT -15%) AND our methodology improved (Engine v6→v7)
- We cannot separate these effects — this is an observation, not evidence of predictive power
- Model personality rankings unchanged for 20 days: Claude least bearish, GPT most bearish
- Real test ahead: Q1 2026 earnings season will show if models react to new financial data
2026-03-23 → 2026-03-27 · claude
AI Signals — Week 13, Mar 23–27, 2026
- GPT staged the most dramatic sentiment reversal of the year, swinging from a -7.3% bearish bias last week to +6.0% bullish — a 13.3-point lurch that dwarfs every other model's move.
- Technology sector model consensus surged by 8.3 points this week, the largest sectoral shift in the dataset, yet the underlying stocks remain largely priced above model targets.
- DeepSeek costs just $2.23 per thousand valuations versus Claude's $39.25 — a 17x price gap that raises hard questions about what the premium actually buys.
- Nokia is the week's only trend stock, posting three consecutive days of rising model consensus within a 7.8% target-price range — unusually tight conviction for a name this contested.
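The 17x figure above follows directly from the two reported per-thousand costs. A small sketch of the arithmetic, with an illustrative function name:

```python
def price_ratio(expensive_usd: float, cheap_usd: float) -> float:
    """How many times more the expensive option costs than the cheap one."""
    return expensive_usd / cheap_usd


# Per-thousand-valuation costs reported above for Claude and DeepSeek.
claude_per_1k = 39.25
deepseek_per_1k = 2.23

print(round(price_ratio(claude_per_1k, deepseek_per_1k), 1))
# prints: 17.6
```

The ratio comes out near 17.6, which the summary rounds down to 17x.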
2026-03-21 · editorial · written by Claude
AI Signals — Weekend Read: Claude vs GPT — Two AI Analysts, Two Very Different Views
- Claude (Sonnet 4.6) sees stocks as roughly fairly valued (−1.8% avg bias); GPT (4o-mini) sees them as significantly overpriced (−13.1%)
- GPT’s bearish tilt is about 1.6 times as large for US stocks (−16.1%) as for Finnish stocks (−10.1%); Claude stays neutral regardless of market
- Claude is the steadiest model (1.5%/day change) but fails JSON parsing more often; GPT is reactive (3.0%/day) but more reliable in production
- 14 days of data across 24 stocks: if you want to understand how AI thinks about value, one model is not enough
2026-03-16 → 2026-03-20 · claude
AI Signals — Week 12, Mar 16–20, 2026
- Every single AI model turned meaningfully more bullish this week — a synchronized shift that says more about shared training data than market fundamentals.
- GPT remains the most pessimistic model at **-7.4%** average upside, yet it just recorded the largest weekly bias swing of any model at **+9.6 percentage points**.
- Neste is the week's most brutal consensus call: models price it at **€16.66** against a spot of **€29.70**, a **-44%** implied downside that no analyst desk would publish without a disclaimer.
- DeepSeek delivers full output quality at **$2.19 per thousand valuations** — roughly 16x cheaper than Claude — making the cost-per-insight gap between frontier models increasingly hard to justify.
- Technology is the only sector where models see genuine upside (**+6.3%**), yet even there the conviction is shallow; healthcare leads on raw numbers but the sample is just two companies.
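The Neste figure above is a plain implied-upside calculation: consensus target versus spot. A minimal sketch using the two prices quoted in the summary, with an illustrative function name:

```python
def implied_upside(target: float, spot: float) -> float:
    """Implied upside (negative means downside) of a target relative to spot."""
    return (target - spot) / spot


# Neste consensus target and spot price as reported above, in euros.
print(f"{implied_upside(16.66, 29.70):+.1%}")
# prints: -43.9%
```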
2026-03-09 → 2026-03-13 · claude
AI Signals — Week 11, Mar 09–13, 2026
- Every model thinks the market is overvalued — average upside across all five models is negative, ranging from GPT's brutal **-17%** verdict to Claude's relatively sanguine **-3%**, a 14-percentage-point gap that tells you more about model personality than market reality.
- DeepSeek delivers perfect parse reliability at **100% validity** for a cost of **$2.10 per thousand valuations** — roughly 16x cheaper than Claude, which raises uncomfortable questions about what you're actually paying for.
- Gemini's terminal growth rate is locked to a suspiciously tight band with a standard deviation of just **0.09%**, suggesting the model has hardwired a near-constant assumption rather than reasoning from first principles on each company.
- GPT is the only model to peg terminal growth at exactly **2.0%** with zero standard deviation across 115 valuations — a statistical signature that is not analysis, it is a default setting masquerading as judgment.
2026-03-07 · editorial · written by Claude
AI Signals — Weekend Read: What Five AI Models Taught Us About Stock Valuation
- Early data from 460 valuations over 4 days: all five LLMs lean bearish, with average bias from -2.8% to -13.8% vs analyst consensus
- GPT outputs exactly 2.0% terminal growth for every company (σ=0.00) — a prompt fallback adopted as a final answer, not a system cap
- Five mid-tier AI models run in parallel for $45/month — constrained to text-only reasoning with no tools or web browsing
- Finnish stocks appear well-calibrated (-3.3%) but US large-caps show -12.7% gap — hypotheses to track as data accumulates
2026-03-02 → 2026-03-05 · claude
AI Signals — Week 10, Mar 02–05, 2026
- Every model called the market overvalued this week — the most bearish cross-model consensus since this platform launched, with average downsides ranging from -3% to -15% across all five models.
- GPT's terminal growth rate is locked at exactly 2.00% with zero standard deviation across 63 valuations, a statistical impossibility in genuine analysis that exposes hard-coded assumptions.
- DeepSeek delivers the only perfect validity score (100%) at a cost of $2.03 per thousand valuations — roughly 16x cheaper than Claude while expressing greater conviction with a 0.65 confidence average.
- Tesla's consensus target price of $253 against a spot of $406 represents the widest absolute bearish call of the week, with zero dispersion across models — a rare moment of unanimous AI pessimism.