The Feynman Companion

May 08, 2026

Part 2

The Vision AI Application Stack

From Captured Frame to Production Decision

Everything AFTER the camera captures a frame and BEFORE the model internals (Part 3)

"Study hard what interests you the most

in the most undisciplined, irreverent and original manner possible."

— Richard Feynman

Prologue: The Map of Part 2

In Part 1, we followed a photon from the camera sensor through the USB stack to userspace. The frame now sits in a buffer in your application’s memory: a 48 MP YUYV blob of 96 megabytes, at 2 bytes per pixel. Now what?
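The 96-megabyte figure is easy to check by hand, since YUYV is a 4:2:2 packed format in which each pair of pixels shares one U and one V sample, averaging 2 bytes per pixel. The short Python sketch below reproduces the arithmetic. The 8000 × 6000 geometry and the helper name `yuyv_frame_bytes` are assumptions made for illustration; the text only says "48 MP". The sketch also assumes a tightly packed buffer with no row padding.

```python
# YUYV (also called YUY2) is a packed 4:2:2 format: every pair of pixels
# shares one U and one V sample, so the average cost is 2 bytes per pixel.
YUYV_BYTES_PER_PIXEL = 2


def yuyv_frame_bytes(width: int, height: int) -> int:
    """Return the size in bytes of one tightly packed YUYV frame."""
    if width % 2:
        raise ValueError("YUYV stores pixels in pairs, so width must be even")
    return width * height * YUYV_BYTES_PER_PIXEL


# Assumed 4:3 sensor geometry that gives 48 MP; the text does not state it.
WIDTH, HEIGHT = 8000, 6000

frame_bytes = yuyv_frame_bytes(WIDTH, HEIGHT)
print(f"{WIDTH * HEIGHT / 1e6:.0f} MP -> {frame_bytes / 1e6:.0f} MB")
```

Real drivers may pad each row to an alignment boundary, so the buffer a capture API hands you can be somewhat larger than this lower bound.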

This part covers everything that happens to that image before, during, and after AI inference: the entire application stack. We deliberately exclude model internals (convolution layers, attention, loss functions); those belong to Part 3. Here we focus on the engineering: how do you take a raw camera frame and turn it into a production-line pass/fail decision in under 100 milliseconds?