Tesla AI Infra & AI Platform Engineering Manager Tim Zaman recently stated that the electric vehicle maker now has the world’s seventh-largest supercomputer by GPU count, and that’s even before the company deploys its custom Dojo supercomputer. The massive GPU cluster hints at Tesla’s focus on its data- and compute-intensive projects, such as Autopilot and Full Self-Driving.
Zaman posted about Tesla’s new hardware on social media platforms such as Twitter and LinkedIn. In his post, the AI and Autopilot lead noted that the EV maker is sponsoring the MLSys Conference. He then stated that Tesla had upgraded its GPU supercomputer to 7,360 A100 (80GB) GPUs. This effectively makes Tesla’s supercomputer one of the world’s largest by GPU count.
As noted in a Data Center Dynamics report, the A100 GPUs are produced by Nvidia, which, interestingly enough, was where Zaman worked prior to his employment at Tesla. Each processor has 80 GB of graphics memory and boasts a memory bandwidth of 2 TB per second. That’s some serious hardware, though such power (and likely more) is needed for Tesla’s ambitious projects.
Tesla’s current supercomputer is a precursor cluster for Dojo. Tesla noted last year that the company’s pre-Dojo supercomputer was already the fifth most powerful in the world with its 5,760 Nvidia A100s. The company appears to have added 1,600 GPUs to the system since then, an increase of about 28%.
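The growth figure can be checked with a few lines of Python. This is only a back-of-the-envelope sketch using the GPU counts reported above; the aggregate memory figure additionally assumes that every GPU in the cluster is the 80 GB A100 variant, and the variable names are illustrative.

```python
# GPU counts as reported: last year's cluster vs. the upgraded one.
previous_gpus = 5_760
current_gpus = 7_360

# Absolute and relative growth of the cluster.
added_gpus = current_gpus - previous_gpus
growth_pct = added_gpus / previous_gpus * 100

# Aggregate GPU memory, assuming all GPUs are 80 GB A100s (an assumption).
memory_per_gpu_gb = 80
total_memory_tb = current_gpus * memory_per_gpu_gb / 1000  # decimal terabytes

print(f"Added GPUs: {added_gpus}")
print(f"Growth: {growth_pct:.1f}%")
print(f"Aggregate GPU memory: {total_memory_tb:.1f} TB")
```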
Since Tesla’s Dojo supercomputer is designed in-house, the massive machine will not be reliant on Nvidia A100 chips. Instead, it will use Tesla’s custom D1 chip, which will support the FP32, BFP16, and CFP8 number formats. Dojo would be optimized for machine learning workloads, particularly Tesla’s Autopilot and Full Self-Driving efforts. With Dojo in the picture, improvements in Autopilot and FSD would likely be accelerated.
What’s quite remarkable about Dojo is that its beastly specs are designed to do just one thing: make autonomous vehicles possible. During the AI Day presentation last year, Tesla highlighted that Dojo is a pure learning machine with more than 500,000 training nodes built together. Tesla also highlighted that Dojo is a work in progress, so even the impressive specs and features that the company teased at AI Day will be improved when the supercomputer is deployed.