Staying ahead of the curve is part of the job. Every week we read frontier research from the AI labs, primary papers from academia, and the major firms' annual outlooks — and translate them into something an operator can actually use.
This page collects the work that has shaped how we think — annual reports we re-read every year, frontier research notes from the labs, and the historical case studies we still cite in client engagements.
Our edge is that we read the source material — directly, regularly, and with the question every operator actually has in mind: what does this change about how the work gets done?
OpenAI, Anthropic, and the rest — read across their product announcements, interpretability work, and formal research. The leading indicator for what's about to be deployable — and what's about to be unsafe — in client workflows.
Read with an operator's eye, not a strategist's. We look past the trend lists for the operating-model gap that determines who actually captures the gain.
Some operating-model lessons travel decades. We catalog the historical scale events, Tesla's 2018 among them, that still inform how we design for clients today.
Working notes on papers from OpenAI, Anthropic, and Stanford's CGRI — read for what they imply about deploying agents and rebuilding workflows in real organizations.
Anthropic on a method that surfaces what's actually different between two LLMs without being told where to look. The operating-model implication for every team about to swap a model in production.
Anthropic's interpretability team on emotional patterns inside frontier models — and why that matters when one is sitting inside a client workflow.
OpenAI's own playbook for catching their internal coding agents going off-task — and what it tells every operator about deploying agents inside a real workflow.
Treating an LLM as a measurement instrument — and what it changes about how you instrument a workflow.
Reading OpenAI's Graviton paper for what it implies about the cost curve of running agents in production — not just training models.
Anthropic on what changes when an agent runs for hours instead of seconds — and why most workflows haven't caught up to it.
What it means when a frontier AI lab grounds its product decisions in 81,000 user interviews — and what it tells operators about disciplined research practice.
Stanford's Corporate Governance Research Initiative on what trust does — and what its absence costs — inside an organization.
We read the major firms' tech outlooks the same way an operator does — looking past the trend lists for the operating-model gap that determines who actually captures the gain. One synthesis per year, 2021 through 2025.
McKinsey, Bain, and Deloitte's 2025 outlooks tell three versions of the same story: the unit of automation moved from a task to a workflow to an agent — and most companies are about to find out their workflows weren't built for software that acts.
McKinsey and Deloitte's 2024 outlooks both ratified what operators had already learned the hard way: the AI dividend was real, but it lived inside the operating model — not the model.
Three of the year's defining reports converged on the same uncomfortable truth: most companies didn't have the operational shape to absorb the technology they were buying.
Two reports, one operator-grade conclusion: the firms that survive a downturn aren't the ones that cut hardest — they're the ones whose systems were already lean.
Tech got two years of demand in twelve months — and most operating models didn't catch up.
Some operating-model lessons travel decades. These are the ones we return to most often when a leadership team is about to take their organization through the next jump in scale.
How Anthropic restructured to ship enterprise product at scale without losing the research identity that differentiated it — and the operating-model lessons for any company trying to be two things at once.
How Replit recognized that the unglamorous, integrated infrastructure it had been compounding since 2016 (Repls, Nix environments, hosting, deploys, database, object storage, auth) was the asset that made AI-agent app generation actually shippable, and how it restructured the company around that recognition without abandoning its existing 30M+ user base.
An operator's reading of Netflix's 2022 — the first subscriber loss in over a decade, the layoffs, the ad-tier reversal, the password-sharing crackdown, and the operating-model rebuild that, within eighteen months, produced the largest subscriber additions in years.
An operator's reading of Airbnb's 2020 — when ~80% of revenue evaporated in eight weeks, and how the leaner, more focused company that emerged IPO'd at a higher valuation than its pre-pandemic one.
An operator's reading of Tesla's 2018 — the production crisis, the financial near-miss, and the operating-model lessons that apply to every team scaling faster than its systems.
Most of these reports describe a gap. We close gaps. The first conversation is short, specific, and free.