& the death of the Architect
…at least as we knew her.
The concept of “application architecture” has been around for decades, but only recently did an actual “architect” role finally enter our collective org charts. Thank the cloud. The idea that one person could be responsible for the correctness of the technology and for planning how other engineers would “fill in the blanks” was, rightfully, attractive. With cloud infrastructure, tooling, and practice all converging to create only a handful of meaningful getting-started choices, the Architect became a shining icon of risk-reduction. And for a brief and magical time, software-focused R&D turned into just “D”.
As AI took center stage, organizations were quick to file it under the heading “cloud,” and the Architect’s purview expanded. But then, as if in slow motion, the rules of engagement changed in ways that the Architect could never have anticipated.
Prior to 2012, the compute used in the largest AI training runs closely tracked Moore’s Law, doubling roughly every two years. Since 2012, it has been doubling every 3.4 months (1), contributing to an explosion in AI research and a new state of the art every week.
Image credit: OpenAI (1)
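To get a feel for how different those two doubling periods are, here is a minimal sketch that compounds each one over a few years. It assumes clean exponential growth with a fixed doubling period, which is a simplification of the trend reported in (1); the function and constant names are mine, chosen for illustration.

```python
def growth_factor(elapsed_months: float, doubling_months: float) -> float:
    """Multiplicative growth after `elapsed_months`, assuming a fixed doubling period."""
    return 2.0 ** (elapsed_months / doubling_months)


# Doubling periods quoted in the text (assumed constant for this sketch).
MOORE_DOUBLING_MONTHS = 24.0  # "every two years"
AI_DOUBLING_MONTHS = 3.4      # post-2012 figure cited in (1)

for years in (1, 2, 5):
    months = 12 * years
    moore = growth_factor(months, MOORE_DOUBLING_MONTHS)
    ai = growth_factor(months, AI_DOUBLING_MONTHS)
    print(f"{years} yr: two-year doubling ~{moore:.1f}x, 3.4-month doubling ~{ai:,.0f}x")
```

Over a single year, the two-year doubling period yields roughly 1.4x, while the 3.4-month period yields about 11.5x. The gap only widens from there.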
Meanwhile, a strong competing insurgency was also taking place in application development. During that same period, sensors exploded in number and capability. Data went from being sparse (think web requests) to incredibly dense and numerical. Software shifted away from deterministic business logic toward non-deterministic, computational codes (2). Finally, and maybe most importantly, the applications of highest value kept popping up in places where Moore’s Law couldn’t do all the heavy lifting. Specialized devices – edge compute – eclipsed virtual machines as the key vehicles of value delivery. In the end, the very nature of AI applications prevented the Architect from making use of her most powerful weapon: the persistent doubling of centralized computing power.
The fall of the Architect’s standard meant an immediate redistribution of risk to technical contributors who weren’t even on the field in the previous campaign. “Scientist” reappeared on the org chart, along with a data-munging support staff. Developers started sharpening metal and writing C code like it was 1999 again.
It’s a different world, and yet here you are looking for a champion among the rubble. Why?
Fog of war
The problem is that modern AI application teams are still missing their battle captain. Day in and day out, a lack of situational awareness is tangible and pervasive. No single contributor is ever sure that their piece of the technology is correct. Critical things are falling between roles, incentivizing competition over cooperation, and generally spewing fog of war (3):
Software optimization. Data scientists drive hard to meet accuracy requirements, picking algorithms and training models on their desktops or in the cloud. What they can’t know is whether the Python snippets they’re cooking up in those Jupyter notebooks will run efficiently on the production hardware. Who takes those Python snippets and optimizes them? Without intimate knowledge of the algorithms, a performance engineer has no guarantee that meaningful optimization is even possible. No self-respecting bit-meddler is going to agree to an optimization path under these conditions. Nor is any modern data scientist going to agree to write hardware-ready code from day one. Stalemate.
Hardware acceleration. The struggles individual developers face in working with hardware accelerators are thoroughly documented elsewhere (4), but the problem just gets worse at the level of the application and the team. The hardware market is growing like crazy, creating a whole host of problems:
- Unpredictable results – It’s extremely difficult to know in advance what the actual latency and power consumption benefits of any given hardware accelerator will be, especially if algorithm selection isn’t finalized.
- Curse of choice – Without a detailed knowledge of both the algorithms and the larger application architecture, hardware acceleration is opaque and undifferentiated. No one person on the team has enough context to evaluate the hundreds of seemingly identical options.
- Disincentive to act – Anecdotally, teams spend six to twelve months just on the offload problem (5). Consider that in that time new general-purpose (CPU) hardware will likely come to market that offers 2x the performance benefit. The team might as well have done nothing.
No architecture. The lack of situational awareness is not only stressful for the team but masks an even deeper, more existential dread: the possibility that no combination of contemporary hardware and software will successfully unlock the application. The team is looking for the proverbial needle in a haystack without any certainty that the needle even exists.
The stuff of legends
Enter the mythical “AI application architect,” a seer capable of cutting through the fog of war. She is the keeper of secret, arcane knowledge:
- Which pieces of the software should the team optimize, how much time will optimization take, and, critically, what will the gains be in terms of latency and power consumption?
- Which hardware should the team use, how many engineering hours will it likely take, and, critically, what will the gains be in terms of latency and power consumption?
- Which combinations of software optimization and hardware acceleration, if any, will unlock which application features?
The bad news is that this Architect doesn’t exist. Given the current pace of technological change, no human can reliably navigate these trade-offs. Most application teams know this implicitly, but nonetheless want to believe in a champion that can lead them confidently forward.
The good news is that we’re no longer in an age of superstition. These trade-offs are modelable – mathematically and computationally – with sufficient data. And the sages have been thinking about exactly that problem for a long time.
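To make “modelable” a little more concrete, here is a toy sketch of the third question above: which software/hardware combinations unlock which features. Everything in it is hypothetical. The option names, the scaling factors, the feature budgets, and the assumption that software and hardware gains simply multiply are all invented for illustration. A real co-design model would presumably be fit to measurements rather than typed in by hand.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Option:
    name: str
    latency_scale: float  # multiplier on baseline latency (lower is faster)
    power_scale: float    # multiplier on baseline power draw


# All numbers below are invented placeholders, not measurements.
BASELINE_LATENCY_MS = 100.0
BASELINE_POWER_W = 10.0

SOFTWARE = [Option("unoptimized", 1.0, 1.0), Option("vectorized", 0.5, 0.9)]
HARDWARE = [Option("cpu", 1.0, 1.0), Option("accelerator", 0.25, 1.5)]

# Hypothetical features with (max latency in ms, max power in W) budgets.
FEATURES = {
    "batch_report": (200.0, 20.0),
    "real_time_tracking": (20.0, 15.0),
}


def predict(sw: Option, hw: Option) -> tuple[float, float]:
    """Toy model: software and hardware effects are assumed to multiply independently."""
    latency = BASELINE_LATENCY_MS * sw.latency_scale * hw.latency_scale
    power = BASELINE_POWER_W * sw.power_scale * hw.power_scale
    return latency, power


def unlocked_features() -> dict[tuple[str, str], list[str]]:
    """Map each (software, hardware) pair to the features whose budgets it meets."""
    results = {}
    for sw, hw in product(SOFTWARE, HARDWARE):
        latency, power = predict(sw, hw)
        results[(sw.name, hw.name)] = [
            feature
            for feature, (max_latency, max_power) in FEATURES.items()
            if latency <= max_latency and power <= max_power
        ]
    return results


for combo, features in unlocked_features().items():
    print(combo, "->", features)
```

In this made-up example, only the combination of the optimized software and the accelerator meets the real-time budget; neither change is enough alone. That is the kind of interaction a single role-bound contributor has no way to see.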
Long live the architect!
Hardware/software co-design is an old idea (6). At its core is a hypothesis that the relationship between hardware, software, and end-to-end application performance can be modeled using statistical techniques. Despite solid science to back it up, co-design nonetheless sounds fantastical to some technical people. This AI thing is “just engineering as usual,” they say. But it’s not.
Imagine you are a team of mechanical engineers trying to build the next-generation jet engine, and your directive is to balance maximum speed, fuel consumption, and manufacturing costs. Would you go down to the hangar, get a bunch of arc welders and sheet metal, and start building jet engines to figure out which jet engine to build? Absolutely not. Your R&D process would look much different. Your “engineering as usual” almost certainly involves some kind of virtual environment – a CAD environment – that allows you to model the trade-offs and simulate your designs before ever putting heat to metal.
The benefits of modeling and simulation to jet engines are obvious. The impact of CAD on navigating knotty physical constraints is undeniable. So it will be with hardware/software co-design. It’s a no-brainer, and it’s the future of AI application architecture.
FØCAL (7) is the first co-design company of the AI era. We’re taking on the momentous challenge of adapting the technology of co-design to the expectations of modern – read: agile – engineering teams. FØCAL brings co-design directly to your engineering process, inserting it into the AI software life cycle as part of your normal testing, integration, and deployment workflows. Find us on GitHub (8)!
- (1) Amodei, D. & Hernandez, D. (2018-05-16) “AI and Compute.” OpenAI.com.
- (2) Rossa, B. (2018-02-20) “Computer vision: The process problem.” LinkedIn.com.
- (3) Ozkaya, I. (2019-08) “Are DevOps and automation our next silver bullet?” Computer.org.
- (4) Rossa, B. (2019-11-27) “Deep divides between AI chip startups, developers” – A developer’s perspective. f0cal.com
- (5) Hemsoth, N. (2019-10-29) “Deep divides between AI chip startups, developers.” TheNextPlatform.com
- (6) Bailey, B. (2019-07-25) “Hardware-software co-design reappears.” SemiconductorEngineering.com
- (7) FØCAL website.
- (8) FØCAL GitHub organization page.
About the author
Imagine a premature graybeard with a bad Emacs habit who has been talking trash on neural networks since the 90s. Then throw in a couple of big DARPA programs with GPUs, Beowulf clusters, and LIDARs on A-10 Warthogs. Since 2013, I’ve been a “chief vision officer” for hire, helping companies industry-wide tackle their computer vision R&D and HR challenges. Oh, and I also launched a startup.
Brian F. Rossa – C*O, FØCAL