Tactics Journal Research Pipeline

The autonomous research arm of Tactics Journal. An end-to-end pipeline that ingests football content, detects emerging tactical trends, and generates citation-backed research reports.

Architecture

ingest → backfill → detect → rescore → report
  • ingest — Pulls RSS feeds and YouTube transcripts into the database
  • backfill — Fills gaps in source content
  • detect — Finds emerging tactical trends via novelty scoring (frontier-gap detection)
  • rescore — Recalculates novelty scores with latest data
  • report — Generates structured research reports from top candidates using multi-agent flow (planner, parallel OODA subagents, synthesis, citation verification, final revision)
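The core of the detect stage is frontier-gap novelty scoring. As a minimal sketch (function and variable names here are illustrative, not the actual `detect_detectors.py` API): a candidate's novelty is its cosine distance to the nearest embedding already in the corpus, so a large gap means the idea sits far from anything previously covered.

```python
from math import sqrt

def cosine_distance(a, b):
    """Cosine distance between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return 1.0 - dot / (na * nb)

def novelty_score(candidate, corpus):
    """Frontier-gap novelty: distance from the candidate embedding to its
    nearest neighbour in the existing corpus of source embeddings."""
    return min(cosine_distance(candidate, vec) for vec in corpus)
```

In this framing, rescore is just re-running `novelty_score` against the corpus after new sources arrive, which can shrink a candidate's gap.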

Repo & Local

GitHub · ~/research · ~/code/research

Key files: main.py (pipeline), server.py (dashboard), detect_detectors.py, detect_policy_config.json, report_policy_config.json, config.json

Infrastructure

  • Railway — production cron jobs (ingest hourly, detect every 6h, report daily). CLI: railway status, railway logs --service <name>. Project: research, Environment: production.
  • Cloudflare AI Gateway — LLM routing for all pipeline calls
  • Cloudflare Dynamic Workers Gateway — article fetches and YouTube transcript resolution
  • Postgres (pgvector) — database for sources, embeddings, candidates, reports
  • Paperclip — agent orchestration system managing the research team
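With pgvector, nearest-neighbour lookups happen in SQL. A sketch of the query shape (the `sources` table and column names are assumptions; `<=>` is pgvector's cosine-distance operator):

```python
def to_pgvector(embedding):
    """Format a Python list as a pgvector literal, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(str(x) for x in embedding) + "]"

# Hypothetical schema: sources(id, title, embedding vector(1536))
NEAREST_SOURCES_SQL = """
    SELECT id, title, embedding <=> %s::vector AS distance
    FROM sources
    ORDER BY embedding <=> %s::vector
    LIMIT %s
"""
# Usage with a psycopg cursor (not run here):
# cur.execute(NEAREST_SOURCES_SQL, (to_pgvector(vec), to_pgvector(vec), 10))
```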

Models

  • anthropic/claude-sonnet-4-6 — lead, synthesis, summary, revision
  • workers-ai/@cf/meta/llama-3.3-70b-instruct-fp8-fast — default model, signal, citation, eval
  • openai/text-embedding-3-small — embeddings

All routed through Cloudflare AI Gateway. No OpenRouter or other providers.
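Routing through the gateway means each provider SDK is pointed at a Cloudflare base URL instead of the provider's own host. A sketch of the documented endpoint pattern (account and gateway IDs are placeholders):

```python
GATEWAY = "https://gateway.ai.cloudflare.com/v1/{account}/{gateway}/{provider}"

def gateway_base_url(account_id, gateway_id, provider):
    """Base URL for routing a provider's API through Cloudflare AI Gateway.
    The provider slug (e.g. 'anthropic', 'openai', 'workers-ai') selects
    the upstream; the SDK's own auth headers pass through unchanged."""
    return GATEWAY.format(account=account_id, gateway=gateway_id, provider=provider)
```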

Autoresearch

Karpathy-style experiment loop for tuning each pipeline stage:

python autoresearch/<stage>/prepare.py   # freeze benchmark
python autoresearch/<stage>/train.py     # edit mutable surface, run, keep improvements

Stages: ingest, detect, report. Production eval: make eval-report, make optimize-ingest-policy, make benchmark-report.
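The prepare/train split above can be sketched as a simple hill climb: freeze a benchmark, then repeatedly mutate the stage's mutable surface and keep only changes that score better. This is an illustration of the loop's shape, not the actual train.py (the `score` and `mutate` callables stand in for the stage's eval harness and edit step):

```python
import random

def train(score, mutate, surface, iterations=20, seed=0):
    """Evaluate the current surface against a frozen benchmark, propose a
    mutation, and keep it only if the score improves."""
    rng = random.Random(seed)
    best = surface
    best_score = score(best)
    for _ in range(iterations):
        candidate = mutate(best, rng)
        candidate_score = score(candidate)
        if candidate_score > best_score:  # keep improvements, discard the rest
            best, best_score = candidate, candidate_score
    return best, best_score
```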

Publishing Flow

  1. Pipeline drafts report → report.md + sources.json + metadata.json
  2. Report artifacts saved to report_runs/<timestamp>-<slug>/
  3. PR opened against GitHub repo for Kyle's review
  4. Never auto-merge
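Steps 1–2 amount to writing three artifacts into a run directory. A minimal sketch (slug and timestamp formats are assumptions; the real layout is whatever main.py produces):

```python
import json
import re
from datetime import datetime, timezone
from pathlib import Path

def save_report_run(report_md, sources, metadata, title, root="report_runs"):
    """Write report.md, sources.json, and metadata.json into
    report_runs/<timestamp>-<slug>/ and return the run directory."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S")
    run_dir = Path(root) / f"{stamp}-{slug}"
    run_dir.mkdir(parents=True, exist_ok=True)
    (run_dir / "report.md").write_text(report_md)
    (run_dir / "sources.json").write_text(json.dumps(sources, indent=2))
    (run_dir / "metadata.json").write_text(json.dumps(metadata, indent=2))
    return run_dir
```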

Known Pitfalls

  • Trajectory analysis is unreliable — keyword matching returns "Insufficient history"
  • Double novelty scoring — rescore calls compute_novelty_score twice (known bug)
  • Source title fuzzy matching — LLMs paraphrase titles, candidates silently dropped
  • wrangler.jsonc is empty — no Cloudflare Worker deploy until proper wrangler.toml exists
  • Dashboard on Railway — DATABASE_URL pointing to *.railway.internal won't work locally
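The last pitfall is worth guarding against explicitly: `*.railway.internal` hosts only resolve inside Railway's private network, so a local run with that DATABASE_URL hangs or fails obscurely. A sketch of a fail-fast check (not currently in the codebase):

```python
from urllib.parse import urlparse

def check_database_url(url):
    """Raise early when a Railway-internal DATABASE_URL leaks into a
    local run instead of the public proxy URL."""
    host = urlparse(url).hostname or ""
    if host.endswith(".railway.internal"):
        raise RuntimeError(
            "DATABASE_URL points at %s, which is unreachable outside "
            "Railway; use the public proxy URL locally." % host
        )
    return url
```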

Related

  • Tactics Journal — the publication this pipeline serves
  • Paperclip — agent orchestration managing the research team
  • Kyle Boas — reviews and publishes all reports