Run AI Stock Analysis Locally — FinSignal with Ollama, LM Studio, and Claude
FinSignal is a Chrome extension (and standalone web app) that runs 8 specialist financial agents — technical, fundamental, sentiment, risk, earnings, and more — through a single LLM call and returns a BUY / SELL / HOLD signal with cited sources and a confidence score.
It supports three LLM backends out of the box:
- Claude API — cloud, web-search grounded, highest quality
- Ollama — fully local, runs on your machine, no data leaves your network
- LM Studio — fully local, great GUI for model management
This post walks through setting up each one.
Install the Extension
- Grab it from the Chrome Web Store
- Pin it via the puzzle-piece menu so the ⬡ icon stays in your toolbar
The extension is free to install — the source repo is private, but everything you need is bundled in the published extension.
Option A — Claude API (Cloud, Recommended for Best Results)
Claude has live web_search access, so every analysis point gets grounded in real headlines and filings from the last few hours. This is the highest-fidelity path.
Setup:
- Get an API key at console.anthropic.com/settings/api-keys
Keys start with
sk-ant-api03- - Click the ⬡ icon → paste your key → click Connect
That's it. Your key is stored in chrome.storage.session and cleared automatically when Chrome closes — it never leaves your browser except in direct calls to api.anthropic.com.
Settings → Provider should show Claude (Sonnet) selected. Add a ticker like NVDA, hit Run all, and you'll get a full multi-agent report in ~10 seconds.
Note: Claude is the only provider with live web search. For Ollama and LM Studio, the extension swaps in a different prompt that drops web-search references — more on what local models actually bring to the table below.
Option B — Ollama (Local, Privacy-First)
Ollama lets you run open-weight models entirely on your machine. No API key, no usage costs, no data leaving your network.
1. Install Ollama
# macOS
brew install ollama
# Or download from https://ollama.com
Start the server:
ollama serve
# Runs at http://localhost:11434
2. Pull a model
Gemma 3 worked really well for this use case — it follows the JSON schema reliably and produces coherent multi-section financial analysis:
ollama pull gemma3:4b # ~3GB, runs on most laptops
ollama pull gemma3:12b # better quality, needs ~8GB VRAM
Other good options:
ollama pull llama3.2:3b # fast, lighter
ollama pull mistral:7b # solid instruction following
ollama pull qwen2.5:7b # strong at structured output
3. Configure in FinSignal
- Open the extension → Settings tab
- Provider → select Ollama
-
Ollama URL →
http://localhost:11434(default, leave as-is) -
Model → type the model name exactly as pulled, e.g.
gemma3:4b - Click Save
Now run analysis — it'll hit your local Ollama server instead of any cloud API.
Troubleshooting Ollama
CORS error in the extension popup?
The extension popup is on a chrome-extension:// origin. You need to tell Ollama to allow it:
OLLAMA_ORIGINS="chrome-extension://*" ollama serve
Or set it permanently:
# macOS launchd
launchctl setenv OLLAMA_ORIGINS "chrome-extension://*"
Model returns garbled or non-JSON output?
Smaller models sometimes fail to adhere to a strict JSON schema on the first try. Hit Retry — the orchestrator strips markdown fences and re-parses. If it fails repeatedly, try a larger variant (gemma3:12b over gemma3:4b).
Option C — LM Studio (Local, Great for Model Discovery)
LM Studio gives you a GUI for browsing, downloading, and running GGUF models. If you prefer not to use the CLI, this is the smoothest local experience.
1. Install LM Studio
Download from lmstudio.ai — available for macOS, Windows, and Linux.
2. Load a model
In LM Studio:
- Go to the Discover tab → search
gemma-3ormistral - Download a Q4 or Q5 quantization (good balance of size vs quality)
- Go to Local Server tab → select your model → click Start Server
LM Studio runs an OpenAI-compatible server at http://localhost:1234 by default.
3. Configure in FinSignal
- Open the extension → Settings tab
- Provider → select LM Studio
-
LM Studio URL →
http://localhost:1234 -
Model → paste the model identifier shown in LM Studio's server tab (e.g.
lmstudio-community/gemma-3-4b-it-GGUF) - Click Save
Provider Comparison
| Claude API | Ollama | LM Studio | |
|---|---|---|---|
| Web search grounding | ✅ Live headlines & filings | ❌ Training data only | ❌ Training data only |
| Fundamental depth | Strong | Strong | Strong |
| Recency (last earnings, news) | ✅ Current | ⚠️ Cutoff-limited | ⚠️ Cutoff-limited |
| Privacy | Data sent to Anthropic | 100% local | 100% local |
| Cost | Pay per token | Free | Free |
| Setup | Paste API key | CLI + model pull | GUI download |
| Best model for this | claude-sonnet-4 | gemma3:4b / 12b | Gemma 3 Q4/Q5 |
| JSON schema adherence | Excellent | Excellent (gemma3) | Excellent (gemma3) |
How the Analysis Works
All 8 agents run in a single LLM call — not 8 separate requests. The orchestrator builds a combined system prompt assigning each agent role, sends one message, and parses the structured JSON response.
User → orchestrator.js
↓
buildSystemPrompt() ← 8 agent roles combined
buildUserMessage() ← ticker + JSON schema
↓
callClaude() | callOllama() | callLMStudio()
↓
Parse JSON → normalize signal
↓
Zustand store → React UI
Every analysis point in the response must include a source field. The UI silently drops any point without one — a basic anti-hallucination guardrail. Confidence is capped at 99 and calibrated to drop when agents produce conflicting signals.
When running locally (Ollama / LM Studio), the prompt drops web-search instructions and adds:
"You have NO live web access. Base analysis on your training knowledge. Prefix uncertain values with 'approximately' or 'estimated'."
This is an honesty instruction, not a capability ceiling. Models like Gemma 3 are trained on enormous amounts of financial data — SEC filings, earnings transcripts, analyst reports, 10-Ks, financial news. For well-documented tickers, that's years of synthesized coverage baked into the weights.
What the 8-agent framework does with a local model is structured knowledge extraction — forcing the model to surface what it already knows across technical, fundamental, sentiment, risk, and compliance lenses simultaneously. The result can be genuinely high-quality analysis, especially for fundamentals, sector context, business moat, and historical risk patterns.
The gap vs. Claude is specifically recency: last quarter's earnings beat, an analyst downgrade from last week, yesterday's macro event. For longer-horizon views where the fundamental picture matters more than last week's news, local models hold up well.
Links
Not financial advice. FinSignal is for informational and educational purposes only.













