News
SpaceX ships Starship’s 200th upgraded Raptor engine
A day after revealing the completion of the 200th Falcon upper stage and Merlin Vacuum engine, SpaceX has announced that it also recently finished building Starship’s 200th upgraded Raptor engine.
Starship – and Raptor, by extension – has yet to reach orbit and is likely years away from scratching the surface of the established success and reliability of the Falcon upper stage and MVac. But compared to MVac, Raptor is more complex, more efficient, and more than twice as powerful; it also experiences far more stress and has been in development for only about a third as long.
And Raptor 2 isn’t the first version of the engine. Before SpaceX shipped its first Raptor 2 prototype, it manufactured 100 Raptor 1 engines between the start of full-scale testing in February 2018 and July 2021. By late 2021 or early 2022, when Raptor 2 took over, the total number of Raptor 1 engines produced likely reached somewhere between 125 and 150 – impressive, but it pales in comparison to SpaceX’s Raptor 2 ambitions.
From the start, Raptor 2’s purpose was to make future Raptors easier, faster, and cheaper to manufacture. The ultimate goal is to reduce the cost of Raptor 2 production to $1,000 per ton of thrust, or $230,000 at Raptor 2’s current target of 230 tons (~510,000 lbf) of thrust. As of mid-2019, Musk reported that each early Raptor 1 prototype cost “more” than $2 million for what would turn out to be 185 tons of thrust (roughly $11,000 per ton). It’s not clear if that ever appreciably changed.
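For a quick back-of-the-envelope check of those figures, here is the same arithmetic spelled out as a simple illustrative calculation, using only the numbers reported above:

```python
# Cost per ton of thrust: early Raptor 1 vs. the Raptor 2 target,
# using only the figures reported above.

raptor1_cost_usd = 2_000_000           # "more" than $2 million per early Raptor 1 prototype
raptor1_thrust_tons = 185              # Raptor 1's eventual thrust

raptor2_cost_per_ton_target = 1_000    # SpaceX's stated goal: $1,000 per ton of thrust
raptor2_thrust_tons = 230              # Raptor 2's current thrust target

raptor1_cost_per_ton = raptor1_cost_usd / raptor1_thrust_tons
raptor2_cost_target = raptor2_cost_per_ton_target * raptor2_thrust_tons

print(f"Raptor 1: ~${raptor1_cost_per_ton:,.0f} per ton of thrust")  # ~$10,811 per ton
print(f"Raptor 2 target: ${raptor2_cost_target:,} per engine")       # $230,000
```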
To hit that cost target, SpaceX strove to make Raptor 2 simpler wherever possible, removing a large part of the maze of primary, secondary, and tertiary plumbing. In 2022, CEO Elon Musk confirmed that SpaceX had even removed a complex torch igniter system for Raptor 2’s main combustion chamber. All of that simplification made Raptor 2 much easier to build in theory, and SpaceX’s production figures have more than confirmed that theory. Despite those simplifications, SpaceX was also able to boost Raptor 2’s thrust by roughly 25% while sacrificing just 1% of Raptor 1’s efficiency.

Beginning with its first delivery in February 2018, SpaceX produced the first 100 Raptor 1 engines in about 36 months. In the first 11 to 12 months of Raptor 2 production, SpaceX has delivered 200 engines. That translates to at least six times the average throughput, and the true peak rate is even higher. In June 2019, Musk stated that SpaceX was “aiming [to build a Raptor] engine every 12 hours by end of year.” As is usually the case, that progress took far longer to realize. But in October 2022, a senior NASA Artemis Program official revealed that SpaceX had recently sustained production of one Raptor 2 engine per day for a full week.
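The “at least six times” figure follows directly from the production numbers above; the short illustrative calculation below shows the arithmetic (the timeframes are the article’s estimates):

```python
# Average Raptor production throughput, version 1 vs. version 2,
# using the engine counts and timeframes reported above.

raptor1_engines, raptor1_months = 100, 36     # "about 36 months" for the first 100 Raptor 1s
raptor2_engines, raptor2_months = 200, 11.5   # ~11-12 months for 200 Raptor 2s

raptor1_rate = raptor1_engines / raptor1_months   # ~2.8 engines per month
raptor2_rate = raptor2_engines / raptor2_months   # ~17.4 engines per month

print(f"Raptor 1: {raptor1_rate:.1f} engines/month")
print(f"Raptor 2: {raptor2_rate:.1f} engines/month ({raptor2_rate / raptor1_rate:.1f}x)")  # ~6.3x
```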
Such a high rate – likely making Raptor one of the fastest-produced orbital-class rocket engines in history – is required because SpaceX’s next-generation Starship rocket needs a huge number of engines. The Starship upper stage currently requires three sea-level-optimized Raptors and three vacuum-optimized Raptors, and SpaceX plans to increase that to nine engines total. Starship’s Super Heavy booster is powered by 33 sea-level Raptors.

Orbital-class versions of Starship and Super Heavy have never flown, let alone demonstrated successful recovery or reuse, so SpaceX has to operate under the assumption that every orbital test flight will consume 39 Raptors. Even after the reuse of Super Heavy boosters or Starships becomes viable and takes significant strain off Raptor demand, SpaceX wants to manufacture a fleet of hundreds or even thousands of Starships and a similarly massive number of boosters. To outfit that fleet, SpaceX would have to mass-produce orbital-class Raptor engines at a scale that’s never been attempted.
But it will likely be years – if not a decade or longer – before SpaceX is in a position to attempt to create that mega-fleet. If the Raptor 2 engines SpaceX is already building are modestly reliable and reusable, and it doesn’t take more than 5-10 orbital test flights to begin reusing Starships and Super Heavy boosters, a production rate of one engine per day is arguably good enough to support the next few years of realistic engine demand.
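As a rough illustration of that claim, assuming fully expendable flights (39 engines each), the article’s 5-10 test-flight range, and the reported rate of one engine per day:

```python
# Rough engine-demand estimate for an early, fully expendable orbital
# test campaign, using the figures reported above.

engines_per_flight = 6 + 33   # Starship (3 sea-level + 3 vacuum Raptors) + Super Heavy (33)
production_per_day = 1        # sustained rate reported in October 2022

for test_flights in (5, 10):
    engines_needed = test_flights * engines_per_flight
    days_of_production = engines_needed / production_per_day
    print(f"{test_flights} expendable flights: {engines_needed} engines "
          f"(~{days_of_production / 30:.1f} months of production)")
```

In other words, even a fully expendable 10-flight campaign would consume roughly a year’s worth of output at the current rate.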
SpaceX’s first orbital Starship launch attempt could occur as early as December 2022, although Q1 2023 is more likely. SpaceX currently has permission for up to five orbital Starship launches per year out of its Starbase, Texas facilities and will likely try to take full advantage of that with several back-to-back test flights in a period of 6-12 months.
News
NVIDIA Director of Robotics: Tesla FSD v14 is the first AI to pass the “Physical Turing Test”
After testing FSD v14, Fan stated that his experience with FSD felt magical at first, but it soon started to feel like a routine.
NVIDIA Director of Robotics Jim Fan has praised Tesla’s Full Self-Driving (Supervised) v14 as the first AI to pass what he described as a “Physical Turing Test.”
After testing FSD v14, Fan stated that his experience with FSD felt magical at first, but it soon started to feel like a routine. And just like smartphones today, removing it now would “actively hurt.”
Jim Fan’s hands-on FSD v14 impressions
Fan, a leading researcher in embodied AI who currently works on Physical AI at NVIDIA and spearheads the company’s Project GR00T initiative, noted that he was actually late to the Tesla game. He was, however, one of the first to try out FSD v14.
“I was very late to own a Tesla but among the earliest to try out FSD v14. It’s perhaps the first time I experience an AI that passes the Physical Turing Test: after a long day at work, you press a button, lay back, and couldn’t tell if a neural net or a human drove you home,” Fan wrote in a post on X.
Fan added: “Despite knowing exactly how robot learning works, I still find it magical watching the steering wheel turn by itself. First it feels surreal, next it becomes routine. Then, like the smartphone, taking it away actively hurts. This is how humanity gets rewired and glued to god-like technologies.”
The Physical Turing Test
The original Turing Test was conceived by Alan Turing in 1950 to determine whether a machine could exhibit behavior equivalent to, or indistinguishable from, that of a human. By focusing on text-based conversations, the original Turing Test set a high bar for natural language processing and machine learning.
This test has been passed by today’s large language models. However, the capability to converse in a humanlike manner is a completely different challenge from performing real-world problem-solving or physical interactions. Thus, Fan introduced the Physical Turing Test, which challenges AI systems to demonstrate intelligence through physical actions.
Based on Fan’s comments, Tesla has demonstrated these intelligent physical actions with FSD v14. Elon Musk agreed with the NVIDIA executive, stating in a post on X that with FSD v14, “you can sense the sentience maturing.” Musk also praised Tesla AI, calling it the best “real-world AI” today.
News
Tesla AI team burns the Christmas midnight oil by releasing FSD v14.2.2.1
The update was released just a day after FSD v14.2.2 started rolling out to customers.
Tesla is burning the midnight oil this Christmas, with the Tesla AI team quietly rolling out Full Self-Driving (Supervised) v14.2.2.1 just a day after FSD v14.2.2 started rolling out to customers.
Tesla owner shares insights on FSD v14.2.2.1
Longtime Tesla owner and FSD tester @BLKMDL3 shared some insights following several drives with FSD v14.2.2.1 in rainy Los Angeles conditions with standing water and faded lane lines. He reported zero steering hesitation or stutter, confident lane changes, and maneuvers executed with precision that evoked the performance of Tesla’s driverless Robotaxis in Austin.
Parking performance also impressed, with most spots nailed perfectly on the first attempt, including those requiring tight, sharp turns, and no shaky steering. The only minor offset was caused by another vehicle parked over the line, which FSD accommodated by shifting a few extra inches. In rain that typically erases road markings, FSD visualized lanes and turn lines better than a human driver could make them out, positioning itself flawlessly when entering new streets as well.
“Took it up a dark, wet, and twisty canyon road up and down the hill tonight and it went very well as to be expected. Stayed centered in the lane, kept speed well and gives a confidence inspiring steering feel where it handles these curvy roads better than the majority of human drivers,” the Tesla owner wrote in a post on X.
Tesla’s FSD v14.2.2 update
Just a day before FSD v14.2.2.1’s release, Tesla rolled out FSD v14.2.2, which was focused on smoother real-world performance, better obstacle awareness, and precise end-of-trip routing. According to the update’s release notes, FSD v14.2.2 upgrades the vision encoder neural network with higher resolution features, enhancing detection of emergency vehicles, road obstacles, and human gestures.
New Arrival Options also let users select preferred drop-off styles, such as Parking Lot, Street, Driveway, Parking Garage, or Curbside, with the navigation pin automatically adjusting to the ideal spot. Other refinements include pulling over for emergency vehicles, real-time vision-based detours around blocked roads, improved gate and debris handling, and Speed Profiles for customized driving styles.
Elon Musk
Elon Musk’s Grok records lowest hallucination rate in AI reliability study
Grok achieved an 8% hallucination rate, 4.5 customer rating, 3.5 consistency, and 0.07% downtime, resulting in an overall risk score of just 6.
A December 2025 study by casino games aggregator Relum has identified Elon Musk’s Grok as one of the most reliable AI chatbots for workplace use, boasting the lowest hallucination rate at just 8% among the 10 major models tested.
In comparison, market leader ChatGPT registered one of the highest hallucination rates at 35%, just behind Google’s Gemini at 38%. The findings highlight Grok’s factual prowess despite the model’s lower market visibility.
Grok tops hallucination metric
The research evaluated chatbots on hallucination rate, customer ratings, response consistency, and downtime rate. Each chatbot was then assigned a reliability risk score from 0 to 99, with higher scores indicating greater reliability problems.
Grok achieved an 8% hallucination rate, a 4.5 customer rating, 3.5 consistency, and 0.07% downtime, resulting in an overall risk score of just 6. DeepSeek followed closely with 14% hallucinations and zero downtime, earning an even lower risk score of 4. ChatGPT’s high hallucination and downtime rates gave it the worst risk score of 99, followed by Claude and Meta AI, which earned reliability risk scores of 75 and 70, respectively.
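Relum has not published how the four metrics are weighted into the 0-99 score, but purely as an illustration, a composite risk score of this kind could be built along the following lines. The weights, scales, and normalization below are invented for this sketch and will not reproduce the study’s published values:

```python
# Hypothetical composite "reliability risk" score built from the four
# metrics named in the study. All weights and scales below are
# illustrative assumptions; Relum's actual methodology is not published.

def risk_score(hallucination_pct: float, customer_rating: float,
               consistency: float, downtime_pct: float) -> int:
    """Return a 0-99 score where higher means a riskier chatbot."""
    risk = (
        0.5 * (hallucination_pct / 100)       # hallucinations: higher is worse
        + 0.2 * (1 - customer_rating / 5)     # rating (assumed out of 5): lower is worse
        + 0.2 * (1 - consistency / 5)         # consistency (assumed out of 5): lower is worse
        + 0.1 * min(downtime_pct / 1.0, 1.0)  # downtime, capped at 1%
    )
    return round(99 * risk)

# With the article's reported Grok figures (the result differs from the
# study's published score of 6 because the weights here are made up):
print(risk_score(hallucination_pct=8, customer_rating=4.5, consistency=3.5, downtime_pct=0.07))
```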

Why low hallucinations matter
Relum Chief Product Officer Razvan-Lucian Haiduc shared his thoughts about the study’s findings. “About 65% of US companies now use AI chatbots in their daily work, and nearly 45% of employees admit they’ve shared sensitive company information with these tools. These numbers show well how important chatbots have become in everyday work.
“Dependence on AI tools will likely increase even more, so companies should choose their chatbots based on how reliable and fit they are for their specific business needs. A chatbot that everyone uses isn’t necessarily the one that works best for your industry or gives accurate answers for your tasks.”
In a way, the study reveals a notable gap between AI chatbots’ popularity and performance, with Grok’s low hallucination rate positioning it as a strong choice for accuracy-critical applications. That is despite Grok seeing far less usage than more mainstream AI applications such as ChatGPT.