
AI and Machine Learning Chip Sets: Is the Battle Against FPGAs and ASICs One That GPUs Ultimately Cannot Win?

By Dick Weisinger

GPUs have accelerated the development of machine learning and artificial intelligence in recent years.  GPUs are optimized for bandwidth, which lets them churn through large amounts of matrix multiplication and convolution.  CPUs, by contrast, are highly programmable and less specialized: they are optimized to complete individual tasks very quickly, but they are limited in how much data they can process at a time.
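
To make the contrast concrete, the workhorse operation behind most deep learning is dense matrix multiplication. The short Python/NumPy sketch below (the layer sizes are purely illustrative) shows how even one fully connected layer turns into roughly a billion multiply-accumulate operations, the kind of arithmetic a bandwidth-oriented GPU can stream through in bulk while a latency-oriented CPU handles far less at once.

```python
# A minimal, illustrative sketch (not from any specific framework): one fully
# connected neural-network layer is just a matrix multiplication.
import numpy as np

batch_size, input_dim, output_dim = 64, 4096, 4096   # hypothetical sizes

x = np.random.rand(batch_size, input_dim).astype(np.float32)   # input activations
w = np.random.rand(input_dim, output_dim).astype(np.float32)   # layer weights

# One layer: 64 x 4096 x 4096 ≈ 1.07 billion multiply-accumulates.
y = x @ w

# A CPU pushes these multiply-accumulates through a handful of cores; a GPU
# streams them across thousands of arithmetic units fed by high-bandwidth
# memory, which is why it dominates this kind of workload.
print(y.shape)   # (64, 4096)
```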

Competing with CPUs and GPUs are two other types of chips that are becoming more commonly used for AI and ML tasks: FPGAs and ASICs.  The advantages of FPGAs are that they run at very low power and allow the designer to reconfigure the underlying hardware to match the software algorithm being run.  FPGAs tend to be used for smaller, more specialized tasks.  But while FPGAs have the advantage of being able to morph into near-optimal hardware, they have the huge disadvantage of requiring a skilled designer to make those changes.  Further, the performance of an FPGA is generally not as good as that of a high-end GPU, although the gap depends heavily on the algorithm being compared.

Jensen Huang, CEO of NVIDIA, said that “FPGA is not the right answer. FPGA is really for prototyping. If you want the [self-driving] car to be perfect, I would build myself an ASIC because self-driving cars deserve it.”

The other alternative to GPUs and CPUs is the ASIC (application-specific integrated circuit).  Many analysts predict that ASICs will likely become the preferred chip set for machine learning.  Tractica predicts that the deep learning chip set market will grow from $1.6 billion in 2017 to $66.3 billion by 2025.

FPGAs and ASICs are generally faster than GPUs and CPUs because they need no operating system and run on “bare metal.”

ASICs are designed to perform fixed operations and are optimized for efficiency; it is like implementing a software algorithm directly in hardware.  The chip’s logic is dedicated to running a very narrow type of problem.  ASICs can also be incredibly expensive to manufacture and require many highly trained engineers to design and implement.  But if an ASIC can be deployed in volume, the costs are worth it.

A custom ASIC can perform its fixed operations extremely fast because the chip’s entire logic area is dedicated to a narrow set of functions. Chips like the Google TPU lend themselves well to a high degree of parallelism, and processing neural networks is an “embarrassingly parallel” workload.
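
“Embarrassingly parallel” here means that each output value can be computed without waiting on any other. The rough Python sketch below (illustrative only, not TPU code) spells that out for a matrix multiply: every element of the result is an independent dot product, so fixed-function hardware can compute as many of them side by side as it has multiply-accumulate units.

```python
# Illustrative only: why matrix multiplication is "embarrassingly parallel".
import numpy as np

A = np.random.rand(128, 256).astype(np.float32)
B = np.random.rand(256, 512).astype(np.float32)

# Every C[i, j] depends only on row i of A and column j of B; no output element
# depends on any other, so all 128 x 512 dot products could, in principle, be
# computed at the same time on hardware with enough multiply-accumulate units.
C = np.empty((128, 512), dtype=np.float32)
for i in range(A.shape[0]):
    for j in range(B.shape[1]):
        C[i, j] = np.dot(A[i, :], B[:, j])

assert np.allclose(C, A @ B, atol=1e-3)   # same result as the library routine
```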

Chris Rolland, an investment analyst, wrote that GPU leader “Nvidia was in the right place at the right time. However, our discussions with numerous thought leaders in the industry suggest ASICs may replace much of today’s GPU infrastructure over time… 2017 was the year of Artificial Intelligence GPU but might 2018 be the year of the ASIC?”

