Why Sonar Has More Insights — and How to Close the Gap

Root cause analysis of the source volume disparity, with a concrete expansion plan to reach Sonar-level insight density

Prototype v2 — Dual-Stream Competitive Gap Data Pipeline · Updated Mar 2026

✅ March 2026 Update — Dual-Stream Pipeline Live

The original analysis (Feb 28) was based on a Steam-only prototype with 80 insights per game. The pipeline has been rebuilt with dual-stream (press + player), multi-source data, and GitHub Actions automation. Here's what changed:

- ~150 · avg insights per game now (was 80, Steam only)
- 1,518 · total insights across 10 games (was ~800: 100 Steam reviews × 10)
- 2 · separate sentiment streams: press (articles) + player (Steam)
| Change | Before (Feb 28) | After (Mar 2026) | Status |
|---|---|---|---|
| Data streams | Steam only | Press articles + Steam reviews | ✅ Done |
| Sources per game | 100 (Steam reviews only) | 100–122 (Steam + press articles) | ✅ Done |
| Category framework | 8 categories (keyword match) | 26 categories (AI extraction) | ✅ Done |
| Reference score | None | Steam All-Time % + Metacritic | ✅ Done |
| OpenCritic | Planned | ❌ Auth now required (was "free public") | Blocked |
| Reddit | Planned | ❌ HTTP 403 from GitHub Actions IPs (Azure block) | Blocked |
| YouTube transcripts | Phase 2 | ⏸ YT_BLOCKED=True (needs transcript proxy) | Backlog |
| GitHub Actions automation | None (local only) | workflow_dispatch — run from any browser | ✅ Done |
📊 Current gap (Balatro): Our prototype: 112 insights · 103 sources  →  Sprung Sonar: 370 insights · 56 sources  →  3.3× gap (down from 4.6× in Feb baseline). Main remaining drivers: YouTube (disabled) and Reddit (IP blocked).

1. The Numbers — Our Prototype vs. Sonar

Using Balatro as the benchmark: Sonar publishes its Balatro numbers in its press coverage, giving a clean apples-to-apples comparison point.

Our Prototype — Balatro (Mar 2026): 112 total insights (was 80 in the Feb baseline) · 2 source types · 100 Steam reviews + 3 press articles · 103 sources total

Sonar — Balatro: 370 insights from 56 sources · 3+ source types · YouTube + articles + community

That's a 3.3× insight gap on a single game — down from 4.6× in the Feb baseline. The remaining gap is almost entirely explained by YouTube (disabled) and Reddit (blocked on GitHub Actions IPs).

Across all 10 games (Mar 2026 run)

| Game | Press % | Player % | Total Ins | Sources | Steam All-Time | Metacritic | Sprung Sonar |
|---|---|---|---|---|---|---|---|
| Black Myth: Wukong | 58% | 58% | 151 | 102 | 94% | 81 | — |
| Palworld | N/A | 84% | 77 | 100 | 95% | TBD | — |
| Helldivers 2 | 93% | 56% | 139 | 103 | 83% | 82 | — |
| Balatro | 100% | 77% | 112 | 103 | 98% | 90 | 88% · 370 ins |
| Hell Is Us | 54% | 49% | 243 | 101 | 87% | 77 | — |
| Outer Wilds | 100% | 76% | 144 | 122 | 95% | 85 | 81% · 610 ins |
| Disco Elysium | 89% | 72% | 131 | 102 | 92% | 91 | — |
| Pentiment | 95% | 67% | 173 | 102 | 95% | 86 | — |
| Citizen Sleeper | 86% | 67% | 184 | 102 | 94% | 82 | 75% · 389 ins |
| Signalis | 84% | 78% | 154 | 102 | 97% | 81 | — |

Insight gap by dimension (updated Mar 2026)

| Dimension | Our prototype | Sprung Sonar |
|---|---|---|
| Total source docs (Balatro) | 103 | 56 |
| Insights per source doc | ~1.1 | ~6.6 |
| Category depth (# categories) | 16 | 26 |
| Total insights (Balatro) | 112 | 370 |

⚠ We now have MORE source documents than Sprung (103 vs 56) — but 100 of ours are short Steam reviews, while theirs are long-form YouTube videos, articles, and community threads. Quality > quantity here.

Category coverage is up from 8 to 16 for Balatro, and 19–25 of 26 for the other games. Balatro's lower coverage (16/26) reflects its focused card-game design rather than open-world variety.

2. Root Cause Breakdown

The insight gap comes from three separable problems, not one.

🔴 Source diversity
We use only Steam reviews. Sonar pulls from YouTube videos, written articles, and community boards. Each source type has different signal density — a 15-minute YouTube video can yield 30+ extractable insights; a 60-word Steam review yields ~1.
Impact: ~40×
🔴 Source volume
We cap at 100 reviews per game. Sonar had 56 distinct source documents for Balatro. Long-form sources (YouTube, articles) are fewer in number but far richer per piece — each one is independently curation-worthy.
Impact: ~56×
🟡 Extraction density
Our keyword matcher tags a whole review to one category max. Sonar extracts multiple discrete insights per source — a single article might contribute 8 insights across 5 categories. This is an AI extraction quality problem, not a data access problem.
Impact: ~6–8×
🟡 Category resolution
Sonar has 26 categories vs our 8. More categories = more surfaces for insights to land on = higher raw insight count even from the same source material. Their categories include things like Monetization, Player Agency, Replay Value, Social Features — all missed by us.
Impact: ~3×

3. Source-by-Source Breakdown

What each source type provides, what's available for free, and estimated implementation effort.

| Source | Sonar Uses? | We Use? | Insights / Source Doc | API Access | Effort |
|---|---|---|---|---|---|
| Steam Reviews | ❓ Unconfirmed | ✅ Yes (100/game) | ~1.1 (AI extraction) | Free, no key | Done |
| Gaming Press Articles | ✅ Confirmed | ✅ Yes (3–5/game) | 5–15 per article | Google Search + scrape | Done |
| YouTube Videos (transcripts) | ✅ Primary source | ⏸ Disabled | 10–40 per video | YouTube Data API v3 (free quota) | YT_BLOCKED=True — needs proxy |
| Reddit (r/gaming, game-specific) | ✅ Likely (community boards) | ❌ Blocked | 2–8 per thread | Blocked on GH Actions | — |
| Metacritic User Reviews | ❓ Unconfirmed | ❌ No | ~1 per review (like Steam) | Scrape only (no API) | Medium |
| OpenCritic (critic reviews) | ❓ Unconfirmed | ❌ Blocked | 8–15 per full review | Now requires API key (was listed as free) | Email developers@opencritic.com |
| Metacritic Critic Reviews | ❓ Unconfirmed | ❌ No | 8–15 per review | Scrape only | Medium |
| Steam Discussion Forums | ❓ Unconfirmed | ❌ No | 2–6 per thread | ISteamPublishedFile API (free) | Easy |
| Twitch/Streaming VODs | ❓ Unlikely (mentioned videos) | ❌ No | Variable (reaction-heavy) | — | Hard (transcription cost) |

4. Our Current Output — Top 5 by Category Mentions

These figures are from the Feb Steam-only baseline (1,000 reviews across 10 games, keyword matching) and show how mentions distributed across the original 8 categories. Note: these are keyword matches, not discrete insights per Sonar's definition.

| Rank | Source → Category | Total Mentions | % of All Matches | Gap vs. Sonar Equivalent |
|---|---|---|---|---|
| #1 | Steam Reviews → Gameplay | 532 | 32.4% | Well-covered — Sonar's equivalent likely similar |
| #2 | Steam Reviews → Player Experience | 393 | 24.0% | Comparable, but emotionally shallow (short text) |
| #3 | Steam Reviews → Aesthetics | 323 | 19.7% | YouTube is massively better here (visual reactions) |
| #4 | Steam Reviews → Mechanics | 100 | 6.1% | Undercounted — Steam reviews rarely break down loops |
| #5 | Steam Reviews → Game & World Systems | 91 | 5.5% | Highly undercounted — better in long-form articles |
⚠ The bottom 3 categories — Accessibility (52), Technical (74), UX-UI (55) — are structurally undercounted from Steam reviews. Players rarely articulate accessibility or UI problems in short-form text. Long-form YouTube and critic reviews cover these substantially better.

5. Remaining Expansion Plan — From 112 to 370+ Insights Per Game

Four concrete phases, prioritized by ROI (insight gain per hour of dev work). Fully executable with the existing Python pipeline.

Phase 1 — 2–3 hours
Max out Steam + add OpenCritic
Est. insight lift: ~2.5×
The simplest wins: pull 500 reviews per game instead of 100, and add OpenCritic, which aggregates full written critic reviews — these are 500–2,000 words each and yield 8–15 insights per review vs. ~0.8 from a Steam blurb. Caveat: this phase was written when OpenCritic had a free public API; per the Mar 2026 update, an API key is now required.

What to do:

1. Change max_reviews=100 to max_reviews=500 in analyze.py

2. Add fetch_opencritic_reviews(game_name) — search opencritic.com/api/game/search?criteria={name}, then fetch reviews at opencritic.com/api/review?game={id}. An API key is now required (see the Mar 2026 update).

3. Feed critic review text into the same classify pipeline. Each review is long — let AI extract multiple insights per review.

Sources active after Phase 1: Steam ×500 · OpenCritic
Phase 2 — 4–6 hours
Add YouTube transcripts
Est. insight lift: ~4–6×
This is likely Sonar's primary edge. A 15-minute YouTube video review is ~2,000–3,000 words and touches every dimension of a game. YouTube Data API v3 is free (10,000 units/day default quota). Transcripts are fetchable via the youtube-transcript-api Python library (no key needed).

What to do:

1. pip install youtube-transcript-api google-api-python-client

2. Search YouTube: f"{game_name} review" site:youtube.com — target video reviews, not "let's play" content. Filter by: duration 8–25 min (full reviews), view count >10k, published within 2 years.

3. Fetch auto-captions via youtube-transcript-api. Clean the transcript, chunk into 800-token blocks.

4. Run each chunk through Claude with a multi-insight extraction prompt: "Extract all distinct player experience insights from this transcript. For each insight, identify the category, sentiment, and a 1-sentence quote."

5. Target 5–10 videos per game → 150–400 insights per game from YouTube alone.
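Steps 3's fetch-and-chunk stage could be sketched as below, assuming the classic `youtube-transcript-api` interface (`get_transcript`; newer releases have changed the API surface) and a rough 4-characters-per-token heuristic for the 800-token blocks:

```python
def fetch_transcript_text(video_id: str) -> str:
    """Join a video's auto-caption segments into one cleaned string."""
    # Imported lazily so the pure chunker below works without the package.
    from youtube_transcript_api import YouTubeTranscriptApi  # pip install youtube-transcript-api

    segments = YouTubeTranscriptApi.get_transcript(video_id)
    return " ".join(seg["text"].replace("\n", " ") for seg in segments)


def chunk_text(text: str, max_tokens: int = 800, chars_per_token: int = 4) -> list[str]:
    """Greedy word-level chunking; ~4 chars/token is a rough heuristic."""
    max_chars = max_tokens * chars_per_token
    chunks: list[str] = []
    current: list[str] = []
    length = 0
    for word in text.split():
        if current and length + len(word) + 1 > max_chars:
            chunks.append(" ".join(current))
            current, length = [], 0
        current.append(word)
        length += len(word) + 1
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each chunk then goes through the multi-insight extraction prompt in step 4; word-level splitting keeps quotes intact across chunk boundaries better than fixed character slicing would.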

Sources active after Phase 2: Steam ×500 · OpenCritic · YouTube (5–10 videos/game)
Phase 3 — 3–4 hours
Add Reddit + Steam Discussions
Est. insight lift: ~1.5×
Reddit houses the most in-depth mechanical discussions anywhere — game-specific subreddits often have pinned threads specifically about UX, accessibility, and bugs. Steam Discussions are similarly dense for technical issues. Both are low-cost to add.

What to do:

1. Reddit: use PRAW (Python Reddit API Wrapper). Free tier allows 60 req/min. Search r/{game_subreddit} for top posts tagged "criticism", "feedback", "ux", "accessibility". Also search r/gaming with game name + keywords.

2. Steam Discussions: note that https://store.steampowered.com/appreviews/{appid} is Steam's review endpoint, not a forum endpoint; public discussion threads live under steamcommunity.com/app/{appid}/discussions/ and generally require scraping (the IGetDiscussionList API mentioned here could not be verified). No auth is needed for public games.

3. Filter threads for signal: sort by upvotes, minimum comment count >5. Each quality thread = 4–8 extractable insights.
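The Reddit side of the steps above could look like this with PRAW. The environment variable names are assumptions (the plan stores credentials as GitHub Secrets), and the `keep_signal_threads` score threshold is illustrative; the comment-count floor of 5 comes from step 3.

```python
def fetch_reddit_threads(subreddit_name: str, query: str, limit: int = 25) -> list[dict]:
    """Search a subreddit's top posts via PRAW (needs Reddit app credentials)."""
    import os
    import praw  # pip install praw

    reddit = praw.Reddit(
        client_id=os.environ["REDDIT_CLIENT_ID"],          # assumed secret name
        client_secret=os.environ["REDDIT_CLIENT_SECRET"],  # assumed secret name
        user_agent="competitive-gap-pipeline/0.1",
    )
    posts = reddit.subreddit(subreddit_name).search(query, sort="top", limit=limit)
    return [
        {"title": p.title, "text": p.selftext, "score": p.score, "comments": p.num_comments}
        for p in posts
    ]


def keep_signal_threads(threads: list[dict], min_score: int = 10, min_comments: int = 5) -> list[dict]:
    """Apply the plan's signal filter: sorted-by-upvotes input, comment count > 5."""
    return [t for t in threads if t["score"] >= min_score and t["comments"] > min_comments]
```

Each surviving thread (title + body + top comments, if fetched) is then a single source document for the extraction step, at the plan's estimated 4–8 insights per quality thread.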

Sources active after Phase 3: Steam ×500 · OpenCritic · YouTube · Reddit (r/{game}) · Steam Discussions
Phase 4 — 2 hours
Upgrade to multi-insight extraction
Est. insight lift: ~3× on same data
This is an AI prompt change, not a data change. Currently our classifier assigns one category per review. Sonar extracts multiple discrete insights per source. A single 200-word Steam review can contain observations about gameplay, technical performance, and UI — we're collapsing all that to one label.

What to do:

1. Change the classification prompt from: "Which category does this review primarily discuss?" to: "Extract all distinct insights from this review. For each, return: category, sentiment (positive/negative/mixed), severity (minor/notable/major), and a verbatim quote of 1–2 sentences."

2. Update the schema from one object per review to a list of insight objects. Update the HTML renderer to aggregate insights per category rather than reviews per category.

3. Expected output: 2–4 insights per Steam review, 6–12 per OpenCritic piece, 15–40 per YouTube transcript chunk.
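A sketch of the schema change in step 2: the prompt wording comes from step 1, while the validator shape (field names, allowed values, JSON-list output) is an assumption about how the model's response would be structured.

```python
import json

# Prompt text from step 1; the "Respond with a JSON list" suffix is an addition
# so the output can be machine-parsed.
EXTRACTION_PROMPT = (
    "Extract all distinct insights from this review. For each, return: "
    "category, sentiment (positive/negative/mixed), severity "
    "(minor/notable/major), and a verbatim quote of 1-2 sentences. "
    "Respond with a JSON list of objects."
)

VALID_SENTIMENTS = {"positive", "negative", "mixed"}
VALID_SEVERITIES = {"minor", "notable", "major"}


def parse_insights(raw_model_output: str) -> list[dict]:
    """Parse the model's JSON list and drop malformed insight objects.

    Downstream aggregation then groups insights per category instead of
    assigning one label per review.
    """
    try:
        items = json.loads(raw_model_output)
    except json.JSONDecodeError:
        return []
    if not isinstance(items, list):
        return []
    insights = []
    for item in items:
        if not isinstance(item, dict):
            continue
        if item.get("sentiment") not in VALID_SENTIMENTS:
            continue
        if item.get("severity") not in VALID_SEVERITIES:
            continue
        if not item.get("category") or not item.get("quote"):
            continue
        insights.append(item)
    return insights
```

Validating and silently dropping malformed objects (rather than failing the batch) keeps one bad model response from stalling a 1,000-review run.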

Sources active after Phase 4: all sources · multi-insight extraction

6. Projected Insight Count After Full Expansion

Projected insights per game after all 4 phases

- 112 · current (Mar 2026; was ~80 in the Feb baseline)
- ~370 · Sprung Sonar benchmark (Balatro)
- ~500–800 · our target (remaining phases)
| After Phase | Sources Active | Estimated Insights/Game | Δ vs Current | Δ vs Sonar |
|---|---|---|---|---|
| Baseline — Mar 2026 ✅ | Steam ×100 + press articles | 112 | +1.4× vs Feb | −70% |
| Next: Reddit OAuth | + Reddit (GitHub Secret) | ~145 | +29% | −61% |
| Next: YouTube | + YouTube (5–10 vids/game) | ~350–450 | +3–4× | ≈ parity or above |
| Future: OpenCritic | + OpenCritic (API key req.) | ~400–500 | +3.6–4.5× | +8–35% |
| Full build | All sources + higher Steam cap | ~600–1000 | +5–9× | +60–170% vs Sonar |

7. Recommended Next Steps

- Biggest remaining win: YouTube transcripts. This is almost certainly Sonar's primary data source — their emphasis on "videos" is explicit in every press mention. A single 10-minute video review is ~2,000–3,000 words and touches every dimension of a game. Adding 5–10 videos per game would likely close the gap to parity or beyond. Blocker: youtube-transcript-api returns empty results on GitHub Actions; needs a self-hosted proxy or the /youtube-scrape skill.
- Easy win: Reddit OAuth credentials as GitHub Secret. Reddit HTTP 403s are caused by GitHub Actions' Azure IP range being blocked. Adding a Reddit app client ID/secret as secrets would restore ~30 additional insights per game at no API cost.
- OpenCritic is no longer "free public": The API now returns {"message":"API key is required. Email developers@opencritic.com"}. This was documented as a free public API in the original plan — that's no longer the case. Contact them for a key, or pivot to scraping Metacritic critic reviews directly.
- Press score caveat: Our press scores consistently run 10–15 points higher than Metacritic for well-reviewed games (Balatro 100% vs MC 90, Outer Wilds 100% vs MC 85). This is because our 3–5 article sample skews toward positive coverage. A larger, more diverse article sample — especially including mixed/negative reviews — would improve calibration.