VOOZH about

URL: https://thenewstack.io/ai-hardware-software-dilemma/

⇱ Does Artificial Intelligence Require Specialized Processors? - The New Stack


TNS
SUBSCRIBE
Join our community of software engineering leaders and aspirational developers. Always stay in-the-know by getting the most important news and exclusive content delivered fresh to your inbox to learn more about at-scale software development.
REQUIRED
It seems that you've previously unsubscribed from our newsletter in the past. Click the button below to open the re-subscribe form in a new tab. When you're done, simply close that tab and continue with this form to complete your subscription.
The New Stack does not sell your information or share it with unaffiliated third parties. By continuing, you agree to our Terms of Use and Privacy Policy.
Welcome and thank you for joining The New Stack community!
Please answer a few simple questions to help us deliver the news and resources you are interested in.
REQUIRED
REQUIRED
REQUIRED
REQUIRED
REQUIRED
Great to meet you!
Tell us a bit about your job so we can cover the topics you find most relevant.
REQUIRED
REQUIRED
REQUIRED
REQUIRED
REQUIRED
Welcome!

We’re so glad you’re here. You can expect all the best TNS content to arrive Monday through Friday to keep you on top of the news and at the top of your game.

What’s next?

Check your inbox for a confirmation email where you can adjust your preferences and even join additional groups.

Follow TNS on your favorite social media networks.

Become a TNS follower on LinkedIn.

Check out the latest featured and trending stories while you wait for your first TNS newsletter.

PREV
1 of 2
NEXT
VOXPOP
As a JavaScript developer, what non-React tools do you use most often?
Angular
0%
Astro
0%
Svelte
0%
Vue.js
0%
Other
0%
I only use React
0%
I don't use JavaScript
0%
Thanks for your opinion! Subscribe below to get the final results, published exclusively in our TNS Update newsletter:
NEW! Try Stackie AI
From clobbered drafts to real-time sync
Apr 14th 2026 10:00am, by David Moore
TypeScript 6.0 RC arrives as a bridge to a faster future
Mar 14th 2026 9:00am, by Darryl K. Taft
Mastra empowers web devs to build AI agents in TypeScript
Jan 28th 2026 11:00am, by Loraine Lawson
2017-10-20 03:00:40
Does Artificial Intelligence Require Specialized Processors?
analysis,contributed,research,science,
Cloud Native Ecosystem

Does Artificial Intelligence Require Specialized Processors?

Oct 20th, 2017 3:00am by Massimiliano Versace
👁 Featued image for: Does Artificial Intelligence Require Specialized Processors?
Images via Neurala and DARPA.
Massimiliano Versace
Massimiliano Versace is the co-founder and CEO of Neurala Inc. and the company visionary. After his pioneering research in brain-inspired computing and deep networks, he continues to inspire and lead the world of autonomous robotics. He has spoken at a number of events and venues, including TedX, NASA, Los Alamos National Labs, Air Force Research Labs, HP, iRobot, Samsung, LG, Qualcomm, Ericsson, BAE Systems, Mitsubishi, ABB and Accenture. In addition, his work has been featured in IEEE Spectrum, New Scientist, Geek Magazine, CNN and MSNBC. Versace is a Fulbright scholar, has authored dozens of academic publications and holds several patents. He also holds two Ph.D.s: in Experimental Psychology from the University of Trieste and in Cognitive and Neural Systems from Boston University.

Artificial intelligence: everybody is talking about it, and the as-of-yet unrealized possibilities of the technology are fueling a renaissance in the hardware and software industry. Hardware and software companies — including Intel, NVidia, Google, IBM, Microsoft, Facebook, Qualcomm, ARM and many others — are racing to build the next AI hardware platform or fighting to maintain their lead.

AI, and deep learning (a sub-field of neural networks) in particular is an inherently non-Von Neumann process, and the prospect of having a processor more closely tailored to the specific needs of neural networks is appealing.

But, I like to think before acting, especially before diving into a potentially very expensive hardware project. And I would like to follow the very same paradigm that deep learning successfully uses to compute — learn from past examples — to determine the best path forward when considering this important question:

Should the AI industry build a specialized deep learning chip, and, if so, what should it look like?

Let’s first look at the past to predict the future.

The World Is Administered by the Wise but Progresses Because of the Crazy

The U.S. Defense Advanced Research Projects Agency (DARPA) is an agency created in 1958 as the Advanced Research Projects Agency (ARPA) by U.S. President Dwight Eisenhower with the goal of the developing “crazy innovative” technologies for use by the military. The agencies played a vital role in the development of defense (and non-defense) technology, such as ARPANET, the basis for what would become the internet; GPS technology; and many others.

In 2008, DARPA launched the Systems of Neuromorphic Adaptive Plastic Scalable Electronics (SyNAPSE). The goal was to create a new breed of non-traditional brain-inspired processors that could perform perception and behavioral tasks while simultaneously achieving significant energy savings.

Biology makes no distinction between memory and computation.

The program, which originally involved my lab at Boston University and researchers at Hewlett-Packard, HRL Laboratories, and IBM, had the goal of achieving energy efficiency by intelligently locating data and computation on the chip, alleviating the need to move data over large distances. The idea was to do this from a single synapse and scale up to a whole brain (see figure).

👁 Image

Some versions of the chip employed sophisticated memristive device technologies to implement dense connectivity between neurons and asynchronous computation — processing and transmitting data only as required, similar to how neurons process and communicate information.

Why SyNAPSE?

SyNAPSE was a complex, multi-faceted project, but it was rooted in two fundamental problems. First, traditional algorithms perform poorly in the complex, real-world environments in which biological agents thrive. Biological computation, in contrast, is highly distributed and deeply data-intensive. Second, traditional microprocessors are extremely inefficient at executing highly distributed, data-intensive algorithms. SyNAPSE sought both to advance the state-of-the-art in biological algorithms and to develop a new generation of nanotechnology necessary for the efficient implementation of those algorithms.

👁 Image

Biology makes no distinction between memory and computation. Virtually every synapse of every neuron simultaneously stores information and uses this information to compute. Standard computers, in contrast, separate memory and processing into two nice, neat boxes. Biological computation assumes these boxes are the same thing. Understanding why this assumption is such a problem requires stepping back to the core design principles of digital computers.

The vast majority of current-generation computing devices are based on the Von Neumann architecture. This core architecture is wonderfully generic and multi-purpose, attributes which enabled the information age. Von Neumann architecture comes with a deep, fundamental limit, however. A Von Neumann processor can execute an arbitrary sequence of instructions on arbitrary data, enabling re-programmability, but the instructions and data must flow over a limited capacity bus, connecting the processor and main memory. Thus, the processor cannot execute a program faster than it can fetch instructions and data from memory. This limit is known as the “Von Neumann bottleneck.”

Multi-core and massively multi-core architectures are partial solutions but still fit within the same general theme. Extra transistors are swapped for higher performance. Rather than relying on automatic mechanisms alone, though, multi-core chips give programmers much more direct control of the hardware. This works beautifully for many classes of algorithms, but not all and certainly not for data-intensive bus-limited ones.

Unfortunately, the exponential transistor density growth curve cannot continue forever without hitting basic physical limits, and these transistors will not serve the purpose of enabling brains to run efficiently in silicon. Enter the DARPA SyNAPSE program.

The Outcome of SyNAPSE

In 2014, after about six years of effort, one of the groups — IBM in San Jose, California — announced the “birth” of its SyNAPSE chip, loaded with more than five billion transistors and more than 250 million synapses (programmable logic points). The chip was built on Samsung Foundry’s 28nm process technology, and its power consumption was less than 100 milliwatts during operation. Despite this still being orders of magnitude fewer than the actual count of brain synapses, it was an important step toward building more brain-like chips.

👁 Image

So, after DARPA poured tens of millions of dollars into SyNAPSE, where are neuromorphic—“brain-inspired”—chips today? Can I find them at Best Buy, as I can find the other processors used today to run AI?

They are nowhere to be found.

So, what can be found? Which hardware can AI run on?

CPU, GPU, VPU, TPU, [whatever you want]PU

While you can’t buy a fancy SyNAPSE processor to run your AI, you have surely heard of — and maybe have even bought — CPUs or GPUs (graphics processors). GPU-accelerated computing leverages GPUs to reduce the time in computing complex applications: it offloads compute-intensive portions of the application to the GPU, while the remainder of the code still runs on the CPU. An intuitive way to understand why this is faster is to look into how CPUs and GPUs compute: while CPUs are designed to have a few cores optimized for sequential serial processing, GPUs have been built by design as massively parallel processors consisting of thousands of smaller cores which are best at handling multiple tasks simultaneously.

Movidius, now part of Intel (as the FPGA company Altera and the AI hardware company Nervana), built and successfully commercialized a Visual Processing Unit (VPU). VPUs differ from GPUs as they were designed from the ground up to allow mobile devices to process intensive computer vision algorithms (whereas GPUs were adapted from processing graphics to machine vision).

Aside from NVidia (GPUs) and Intel (CPUs and VPUs), Google has joined the battle with the Tensor Processing Unit (TPU). This is a custom ASIC specifically built for Google’s TensorFlow framework. The energy/speed savings are achieved by being more lenient to compute precision losses: by allowing the chip to be more tolerant of reduced computational precision, the TPU requires fewer transistors per operation. Since neural networks can tolerate such a precision loss by gracefully degrading performance (rather than “STOP WORKING”), it’s a processor that simply works.

How about Qualcomm? The company has talked up, and then apparently abandoned, its Zeroth AI software platform for mobile computing, but it’s hard to get a clear sense of Qualcomm’s efforts in this area, although recent efforts on Snapdragon seem to indicate a resurgence of interest.

Finally, we should not forget ARM: the chip-licensing firm — whose claim to fame is low-power processing — is heavily researching and investing in AI. And, to be honest, we can run Neurala’s algorithms pretty well on those tiny and inexpensive chips already!

So, there is plenty of hardware available to run your AI. What is the common thread of this hardware, and is it the time for the hardware industry to jump into another “specialized AI chip?”

3 Lessons Learned from Brain-Inspired Hardware

Programmability, programmability, programmability.

The success of NVidia is only partially due to performance. There are alternatives to NVidia processors that are either faster, leaner in power requirements or both. But NVidia’s sweet spot is a judicious intersection of performance, price and ease of use. The company has done a very good job of making the development, testing and fielding of neural network algorithms seamless.

So, why is programmability so important for AI?

Because AI is still a very young and dynamic field. Neural networks, deep learning and reinforcement learning: these are all research-intensive enterprises. In research, flexibility and ease of use are key to shortening the time for a research cycle. A new paradigm or a new variation in an AI algorithm may modify the underlying mathematical and computational assumptions enough to make any specialized hardware hopelessly obsolete.

I truly believe that, in the best interest of advancing AI, hardware and software should be somehow factorized, as to allow maximum flexibility to algorithm developers, in particular at a time when AI is quickly innovating and finding its path.

I believe that the hardware company that understands this trend and provides flexibility as the first and foremost feature will succeed in capturing the heart — and brains — of the field.

TRENDING STORIES
Max is the co-founder and CEO of Neurala Inc., and the company visionary. After his pioneering research in brain-inspired computing and deep networks, he continues to inspire and lead the world of autonomous robotics. He has spoken at dozens of...
Read more from Massimiliano Versace
SHARE THIS STORY
TRENDING STORIES
Google is a sponsor of The New Stack.
TNS owner Insight Partners is an investor in: Unit.
SHARE THIS STORY
TRENDING STORIES
TNS DAILY NEWSLETTER Receive a free roundup of the most recent TNS articles in your inbox each day.
The New Stack does not sell your information or share it with unaffiliated third parties. By continuing, you agree to our Terms of Use and Privacy Policy.