Field Notes
When One Data Point Costs $600 and Ten Weeks
Machine learning assumes data is cheap and abundant. In manufacturing, a single dataset can cost thousands of dollars and take months to collect, and the lessons that matter most can take years to come back from the field. It turns out Lean Six Sigma has been solving the scarce-data problem for a century, and modern ML is converging on the same answers. Here's the map between the two, and why the real win is closing the loop over the long run.
We've Been Having the Wrong Conversation About AI
While the tech industry debates 10x productivity and headcount reduction, manufacturing, test labs, and traditional engineering firms are sitting on enormous untapped potential. The real AI story isn't about replacing people. It's about giving them the tools to close decades of digital debt and focus on work that actually matters.

Building With Two LLM Agents in Deliberately Separated Roles
I'm rebuilding my materials engineering RAG system, and as part of the build I decided to run an experiment with one strict workflow constraint: Claude implements the code, Codex writes the tests and reviews the work, and ownership never overlaps. The premise is grounded in recent research showing that agents, much like humans, probably shouldn't grade their own work. This post covers what the workflow actually felt like across the first five reviewed milestones of the build (twenty-two issues caught, ten of them blockers): where the discipline held, where it got inconvenient, and where I think this approach is a tax most teams shouldn't pay.
AI Will Tell You You're Right Even When You're Wrong
In engineering organizations where confidence drives design decisions, sycophancy is the AI failure mode we're not prepared to handle. Hallucination is at least verifiable: a wrong claim can be caught by a competent reviewer. Sycophancy is harder to catch because it confirms what you already believe, and the higher up the org chart the believer sits, the less likely anyone is to push back. A junior engineer who over-trusts a model's agreement wastes a design cycle. A CEO who does it triggers a $250 million lawsuit. Here's what the research says about why RLHF produces sycophancy, what it looks like when it reaches the courtroom, and how I've changed my prompting habits to work against it.
Building a Search Engine That Actually Works
I vibe-coded a RAG system for a corpus of engineering handbooks and watched the search function return garbage. Here's how I diagnosed the root causes, ran a 40-configuration parameter sweep, benchmarked four backend architectures, and tested a knowledge-graph hypothesis, improving NDCG@5 by nearly 400% along the way.