Congressional Record Explorer

About

This project is building toward a foundational data layer for AI-powered journalism, starting with the U.S. Congressional Record. Over time, additional data sources will be onboarded. The long-term goal is to make primary political debate directly accessible to both AI systems and the public, so reporting, analysis, and education can be grounded in original, verifiable sources rather than summaries or secondhand commentary by legacy media outlets.

Transparency of original sources is particularly important in today's political climate. Both people and AI systems need the ability to trace claims back to the original words, context, and evidence.

Improvements Over the Existing Congressional Record

Current versions of the Congressional Record are difficult to navigate because of an outdated website and a large volume of administrative filler and boilerplate. This project restructures the record to improve clarity, transparency, and accessibility. The main changes are:

Non-substantive material is removed, including administrative boilerplate, votes, and procedural text, so the focus remains on actual debate.
All content links back to original sources, including the official Congressional Record, related legislation, and floor video.
Debate is organized by topic, making it possible to see full back-and-forth discussions instead of isolated remarks.
Member information is clearly displayed, providing context on who is speaking.
Legislation mentioned on the floor is directly linked, allowing readers to move from debate to bill text.
Artifacts entered into the record are extracted, separating supporting documents from spoken remarks.
Floor audio and video are provided for spoken statements, making debate accessible to users who prefer or rely on audio/visual formats.

This structure also improves access for people using assistive technologies by reducing clutter, improving navigation, and offering audio-first options.
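As a rough illustration of the restructuring described above, a single topic-organized debate could be modeled as shown below. This is only a sketch, not the site's actual schema: the class names (Remark, Debate), every field name, and the example.com URLs in the usage note are assumptions made up for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Remark:
    """One spoken statement (hypothetical shape, not the site's real schema)."""
    speaker: str                     # member name shown alongside the remark
    text: str                        # spoken words, with boilerplate already removed
    source_url: str                  # link back to the official Congressional Record
    video_url: Optional[str] = None  # floor video clip, when one exists


@dataclass
class Debate:
    """All remarks on one topic, kept together as a back-and-forth."""
    topic: str
    date: str                                           # ISO date of the floor session
    remarks: List[Remark] = field(default_factory=list)
    bills: List[str] = field(default_factory=list)      # legislation mentioned on the floor
    artifacts: List[str] = field(default_factory=list)  # documents entered into the record

    def speakers(self) -> List[str]:
        """Return each speaker once, in order of first appearance."""
        return list(dict.fromkeys(r.speaker for r in self.remarks))
```

For example, adding remarks from "Member A", "Member B", and "Member A" again to a Debate and calling speakers() returns the two names in the order they first spoke. Keeping artifacts and bills in their own fields, rather than inline in the remark text, mirrors the separation of supporting documents from spoken remarks.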

Current Status (Beta)

This site is actively under construction. The goal is to create something that is easy to use for humans while also exposing the underlying content in a form that AI models can directly consume. As models become more advanced and offer customized user experiences, the data will already be structured and ready.

Records for most days in 2025 have been uploaded and processed.
Basic keyword search is available at the individual day level.
Floor audio and video clips are available for many January 2025 sessions.
Automation is underway to ensure 2026 data is ingested continuously as new records are published.
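To make "basic keyword search at the individual day level" concrete, here is a minimal sketch of one way such a search could work: a case-insensitive substring match over the remarks of a single day. The function name search_day and the dictionary keys "speaker" and "text" are hypothetical, and the site's own search may be implemented differently.

```python
from typing import Dict, List


def search_day(remarks: List[Dict[str, str]], keyword: str) -> List[Dict[str, str]]:
    """Return the remarks from one day whose text contains the keyword.

    Matching is a plain case-insensitive substring test (an assumption for
    this sketch); a blank keyword matches nothing.
    """
    needle = keyword.strip().casefold()
    if not needle:
        return []
    return [r for r in remarks if needle in r["text"].casefold()]
```

Because the function only sees one day's remarks at a time, cross-day search would need either a loop over every day or a shared index, which is consistent with it being listed separately as a future feature.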

Not Yet Available, But Coming Soon

Complete coverage of Congressional statements for all years
Cross-day keyword search
Semantic (vector-based) search
Public API or MCP access

Short-Term Roadmap

Complete automation so new Congressional Record data stays current
Backfill 2024 and earlier years of the Congressional Record
Expand floor audio and video coverage beyond January 2025
Enable full keyword search across all days
Add semantic search for topic-based and meaning-based discovery
Release an API and model-facing MCP interface for AI-powered journalism, educational, and summarization tools