Tesla patent hints at Hardware 3’s neural network accelerator for faster processing

(Photo: Tesla)

During the recently held fourth-quarter earnings call, Elon Musk all but stated that Tesla holds a notable lead in the self-driving field. While responding to Loup Ventures analyst Gene Munster, who inquired about Morgan Stanley’s estimated $175 billion valuation for Waymo and its self-driving tech, Musk noted that Tesla actually has an advantage over other companies involved in the development of autonomous technologies, particularly when it comes to real-world miles.

“If you add everyone else up combined, they’re probably 5% — I’m being generous — of the miles that Tesla has. And this difference is increasing. A year from now, we’ll probably go — certainly from 18 months from now, we’ll probably have 1 million vehicles on the road with — that are — and every time the customers drive the car, they’re training the systems to be better. I’m just not sure how anyone competes with that,” Musk said.

To carry its self-driving systems towards full autonomy, Tesla has been developing its own custom hardware. Designed by Apple alumnus Pete Bannon, Tesla’s Hardware 3 upgrade is expected to provide the company’s vehicles with a 1000% improvement in processing capability compared to current hardware. Tesla has released only a few hints about HW3’s capabilities over the past months. That said, a patent application from the electric car maker has recently been published by the US Patent Office, hinting at an “Accelerated Mathematical Engine” that would most likely be utilized for Tesla’s Hardware 3.

An illustration for Tesla’s Accelerated Mathematical Engine, as depicted in a recent patent application. (Credit: US Patent Office)

Tesla notes that there is a need to develop “high-computational-throughput systems and methods that can perform matrix mathematical operations quickly and efficiently,” particularly in computationally demanding applications such as convolutional neural networks (CNN), which are used in image recognition and processing. CNNs use deep learning to perform descriptive and generative tasks, usually utilizing machine vision that involves image and video recognition. These processes, which are invaluable for the development and operation of driver-assist systems like Autopilot, require a lot of computing power.

Considering the large amount of data involved in applications such as CNNs, the computational resources and the rate of calculations become limited by the capabilities of existing hardware. This becomes particularly evident in computing devices and processors that execute matrix operations, which encounter bottlenecks during heavy operations, resulting in wasted computing time. To address these limitations, Tesla’s patent application hints at the use of a custom matrix processor architecture. 

“In operation according to certain embodiments, system 200 accelerates convolution operations by reducing redundant operations within the systems and implementing hardware specific logic to perform certain mathematical operations across a large set of data and weights. This acceleration is a direct result of methods (and corresponding hardware components) that retrieve and input image data and weights to the matrix processor 240 as well as timing mathematical operations within the matrix processor 240 on a large scale.”

By adopting its custom matrix processor architecture, Tesla expects its hardware to be capable of supporting larger amounts of data. In terms of formatting alone, the electric car maker notes that its design would allow the system to reformat data on the fly, making it immediately available for execution. Tesla also notes that its architecture would result in improvements in processing speed and efficiency. 

“Unlike common software implementations of formatting functions that are performed by a CPU or GPU to convert a convolution operation into a matrix-multiply by rearranging data to an alternate format that is suitable for a fast matrix multiplication, various hardware implementations of the present disclosure re-format data on the fly and make it available for execution, e.g., 96 pieces of data every cycle, in effect, allowing a very large number of elements of a matrix to be processed in parallel, thus efficiently mapping data to a matrix operation. In embodiments, for 2N fetched input data 2N² compute data may be obtained in a single clock cycle. This architecture results in a meaningful improvement in processing speeds by effectively reducing the number of read or fetch operations employed in a typical processor architecture as well as providing a paralleled, efficient and synchronized process in performing a large number of mathematical operations across a plurality of data inputs.”
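For readers unfamiliar with the software approach the patent application contrasts itself with, the sketch below shows the general idea of turning a convolution into a matrix multiplication by rearranging the input (often called "im2col"). It is only an illustration in plain Python and is not taken from Tesla's filing; the function names, the tiny 4x4 image, and the 2x2 kernel are all assumptions made for the example.

```python
# Illustrative sketch only: names and data are hypothetical, not from the patent.

def conv2d_direct(image, kernel):
    """Sliding-window 2D convolution as used in CNNs (stride 1, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def im2col(image, kh, kw):
    """Copy every kh x kw patch of the image into one row of a new matrix."""
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [image[r + i][c + j] for i in range(kh) for j in range(kw)]
        for r in range(out_h) for c in range(out_w)
    ]


def conv2d_as_matmul(image, kernel):
    """Same convolution, expressed as (patch matrix) x (flattened kernel)."""
    kh, kw = len(kernel), len(kernel[0])
    out_w = len(image[0]) - kw + 1
    patches = im2col(image, kh, kw)              # the "reformatting" step
    weights = [w for row in kernel for w in row]  # kernel flattened to a vector
    flat = [sum(p * w for p, w in zip(patch, weights)) for patch in patches]
    # Fold the flat result back into rows of the output image.
    return [flat[k:k + out_w] for k in range(0, len(flat), out_w)]


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
kernel = [
    [1, 0],
    [0, -1],
]

# Both routes give the same answer.
assert conv2d_direct(image, kernel) == conv2d_as_matmul(image, kernel)
```

Note that the rearranged patch matrix repeats most input values several times (here 16 pixels become 9 rows of 4 values). That copying step is the kind of formatting work the application says its hardware performs on the fly rather than in software.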

It should be noted that Tesla’s patent application for its Accelerated Mathematical Engine is but one aspect of the company’s upcoming hardware upgrade to its fleet of electric cars. The full capabilities of Tesla’s Hardware 3, at least for now, remain to be seen. Ultimately, while Tesla did not provide concrete updates on the development and release of Hardware 3 to the company’s fleet of vehicles during the fourth-quarter earnings call, Musk stated that some full self-driving features would likely be ready towards the end of 2019.

Back in October, Musk noted that Hardware 3 would be installed in all new production cars in around 6 months, which translates to a rollout date of around April 2019. Musk stated that transitioning to the new hardware will not involve any changes to vehicle production, as the upgrade is simply a replacement of the Autopilot computer installed on all electric cars today. In a later tweet, Musk mentioned that Tesla owners who bought Full Self-Driving would receive the Hardware 3 upgrade free of charge. Owners who have not ordered Full Self-Driving, on the other hand, would likely pay around $5,000 for the FSD suite and the new hardware.

Tesla’s patent application for its Accelerated Mathematical Engine can be accessed here.
