Dwarkesh Context Window

This project takes the full transcript of a Dwarkesh Patel podcast episode and prompts a frontier LLM to act as a third guest: summarizing the discussion, questioning its claims, and extending it into new directions for AI research.
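For concreteness, here is a minimal sketch of what that core step might look like, assuming an OpenAI-compatible chat API. The model name, file path, and prompt wording are illustrative placeholders, not the project's actual setup:

```python
# Minimal sketch: feed an episode transcript to a model acting as a "third guest".
# Assumes the OpenAI Python SDK; model, path, and prompt text are hypothetical.
from pathlib import Path

from openai import OpenAI

SYSTEM_PROMPT = (
    "You are a third guest on this podcast episode. Read the full transcript, "
    "then: (1) summarize the key claims, (2) ask the questions you would have "
    "raised in the room, and (3) extend the discussion into new directions "
    "for AI research."
)


def third_guest(transcript_path: str, model: str = "gpt-4o") -> str:
    """Send a full episode transcript to the model and return its response."""
    transcript = Path(transcript_path).read_text(encoding="utf-8")
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": transcript},
        ],
    )
    return response.choices[0].message.content


if __name__ == "__main__":
    print(third_guest("transcripts/latest_episode.txt"))
```

The only real requirement is a model with a context window long enough to hold an entire episode, which is what makes long-form interviews a natural fit for this kind of experiment.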

One motivating idea is that if models ever become genuinely great researchers, they should be able to follow (and contribute to) expert conversations. A long-form interview—dense with context, claims, and open questions—is a surprisingly strong “test harness” for that.

Over time, the site can serve as a historical record of how well different models “think along” with top researchers, and maybe even evolve into a lightweight benchmark.

Inspiration: Theo.gg’s “Skatebench” benchmark (naming skateboarding tricks from English descriptions). It’s a great reminder that weird, random benchmarks can be insightful: early on, some Chinese models lagged behind American ones, and later some models regressed on Skatebench even as they improved on other benchmarks.

I am not affiliated with Dwarkesh Patel in any way.
