The Feynman Companion
Part 2
The Vision AI Application Stack
From Captured Frame to Production Decision
Everything AFTER the camera captures and BEFORE model internals (Part 3)
"Study hard what interests you the most
in the most undisciplined, irreverent and original manner possible."
— Richard Feynman
Prologue: The Map of Part 2
In Part 1, we followed a photon from the camera sensor through the USB stack to userspace. The frame is now sitting in a buffer in your application’s memory: a 48 MP YUYV blob which, at 2 bytes per pixel, weighs 96 megabytes. Now what?
This part covers everything that happens to that image before, during, and after AI inference: the entire application stack. We deliberately exclude model internals (convolution layers, attention, loss functions); those belong to Part 3. Here we focus on the engineering: how do you take a raw camera frame and turn it into a production-line pass/fail decision in under 100 milliseconds?
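To get a feel for the numbers, here is a minimal sketch of the arithmetic behind the 96-megabyte frame and the 100-millisecond budget. It assumes a hypothetical 8000 x 6000 sensor layout (the text only says "48 MP") and uses decimal megabytes; the function name is made up for illustration.

```python
def yuyv_frame_bytes(width: int, height: int) -> int:
    """Size in bytes of one packed YUYV (YUV 4:2:2) frame.

    YUYV stores two pixels in four bytes (Y0 U Y1 V),
    which averages out to 2 bytes per pixel.
    """
    return width * height * 2


# Assumption: 8000 x 6000 is one plausible layout for a 48 MP sensor.
WIDTH, HEIGHT = 8000, 6000
BUDGET_SECONDS = 0.100  # the "under 100 milliseconds" target from the text

frame_bytes = yuyv_frame_bytes(WIDTH, HEIGHT)
frame_mb = frame_bytes / 1_000_000  # decimal megabytes

# Throughput needed merely to touch every byte once inside the budget.
min_throughput_mb_s = frame_mb / BUDGET_SECONDS

print(f"{frame_mb:.0f} MB per frame")
print(f"{min_throughput_mb_s:.0f} MB/s just to read the frame once in 100 ms")
```

Under these assumptions, a single pass over the raw frame already demands roughly 960 MB/s of sustained memory throughput, and that is before any conversion, resizing, or inference has happened. This is why so much of the engineering ahead is about not copying and not touching pixels more often than necessary.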