AI Investor Barometer
What is AI Investor Barometer?
AI Investor Barometer is an experimental comparison tool that uses five different AI models to generate valuation assumptions for listed stocks daily. Each model — GPT, Claude, Gemini, DeepSeek and Grok — independently produces its own estimate from the same company's public data. Results are presented side by side.
Currently tracking approximately 12 Finnish OMXH stocks and 12 US S&P 500 companies. The pipeline runs automatically on business days.
The project does not provide investment recommendations or advice. It is an experimental research tool: what do different AI models estimate about the same stocks, and do they differ from each other?
Why Compare AI Models?
AI is increasingly involved in investment decisions, either directly or indirectly through analysis tools, news aggregators and chatbots. The problem is that a single model can appear reliable even while it systematically over- or underestimates certain stocks.
When five different models estimate the same company on the same day, you immediately see:
- Do the models agree — or do estimates diverge dramatically?
- Is any model consistently higher or lower than others?
- Does any model change its estimates daily, while another remains stable?
- Does any model stay closer to analyst consensus than another?
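The four questions above map directly onto simple statistics over the daily estimates. A minimal sketch (all numbers are hypothetical, invented for illustration):

```python
from statistics import mean

# Hypothetical fair-value estimates (EUR) for one stock on two consecutive
# days. Model names match the platform; the figures are made up.
estimates = {
    "GPT":      [41.0, 41.5],
    "Claude":   [38.2, 38.0],
    "Gemini":   [44.7, 40.1],
    "DeepSeek": [39.5, 39.4],
    "Grok":     [42.0, 42.3],
}
analyst_consensus = 40.0  # assumed consensus target, also hypothetical

latest = {m: vals[-1] for m, vals in estimates.items()}

# 1) Do the models agree? Spread of the latest estimates.
spread = max(latest.values()) - min(latest.values())

# 2) Is a model consistently high or low? Deviation from the cross-model mean.
cross_mean = mean(latest.values())
bias = {m: v - cross_mean for m, v in latest.items()}

# 3) Does a model change its estimate daily? Day-over-day absolute change.
stability = {m: abs(vals[-1] - vals[-2]) for m, vals in estimates.items()}

# 4) Which model stays closest to analyst consensus?
closest = min(latest, key=lambda m: abs(latest[m] - analyst_consensus))

print(f"spread={spread:.1f} EUR, closest to consensus: {closest}")
```

With these invented inputs, the spread alone already shows a 4+ EUR disagreement on the same company on the same day.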
These questions are essential for understanding how AI models actually differ. This tool makes the model-specific comparison visible.
How AI Models Generate Estimates
Most AI outputs are black boxes: the model gives a number directly, without you knowing how it arrived at it. In this project, the approach is different: each model must state the valuation assumptions behind its estimate, so the final number can be traced back to those assumptions.
This approach makes model-specific differences transparent: if GPT and Claude arrive at different estimates, it's due to different growth assumptions — not because one "calculated incorrectly".
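The assumption-driven idea can be sketched as follows. The platform's actual valuation formula is not shown here, so this uses a simple Gordon growth model as a stand-in; the point is that every model shares the same arithmetic and contributes only its assumptions:

```python
# Simplified sketch: each model supplies assumptions, the formula is shared.
def fair_value(eps: float, growth: float, discount: float) -> float:
    """Gordon growth model: value earnings growing forever at `growth`,
    discounted at `discount` (requires discount > growth)."""
    return eps * (1 + growth) / (discount - growth)

eps = 2.50  # hypothetical earnings per share, identical input for all models

# Hypothetical model-specific assumptions (not real platform output).
assumptions = {
    "GPT":    {"growth": 0.04, "discount": 0.09},
    "Claude": {"growth": 0.03, "discount": 0.09},
}
values = {m: fair_value(eps, **a) for m, a in assumptions.items()}
# If GPT and Claude disagree, the cause is visible in `assumptions`,
# not hidden inside the model.
```

Because the formula is fixed, any gap between two models' estimates is fully explained by the assumption table.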
Limitations and Disclaimer
- Not investment advice. All content consists of AI models' computational estimates, not investment advice. Draw your own conclusions.
- AI models can be systematically wrong. Models learn from historical data that may be biased or incomplete. A high confidence score does not mean being correct.
- Public data only. Models use only publicly available financial data and market information.
- Experimental tool. This is an experimental AI comparison and measurement tool. Data may be incomplete, the scheduler may fail, results may be incorrect.
Pipeline Diagnostics
| Model | Avg Latency | Cost / Run | Valid % |
|---|---|---|---|
| Claude | 30.6s | $0.954 | 100% |
| DeepSeek | 16.8s | $0.055 | 100% |
| Gemini | 16.6s | $0.263 | 100% |
| GPT | 8.7s | $0.391 | 100% |
| Grok | 6.3s | $0.369 | 100% |
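The table above can be turned into a simple cost-effectiveness check. A minimal sketch using the published figures (with validity expressed as a fraction, so cost per valid estimate degrades automatically if a model starts failing):

```python
# Diagnostics from the table: (avg latency s, cost per run USD, valid fraction)
runs = {
    "Claude":   (30.6, 0.954, 1.00),
    "DeepSeek": (16.8, 0.055, 1.00),
    "Gemini":   (16.6, 0.263, 1.00),
    "GPT":      ( 8.7, 0.391, 1.00),
    "Grok":     ( 6.3, 0.369, 1.00),
}

# Cost per valid estimate; at 100% validity this equals cost per run.
cost_per_valid = {m: cost / valid for m, (_, cost, valid) in runs.items()}
cheapest = min(cost_per_valid, key=cost_per_valid.get)
fastest = min(runs, key=lambda m: runs[m][0])
print(cheapest, fastest)  # DeepSeek is cheapest, Grok is fastest
```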
Contact & Feedback
Have feedback, a collaboration idea, or a question? We'd love to hear from you.