News
SpaceX dropped a Crew Dragon mockup to save a helicopter and its passengers
SpaceX says it encountered an issue that forced it to drop a Crew Dragon spacecraft mockup during parachute testing — not a failure of the vehicle or its parachutes, to be clear, but a problem nonetheless.
This is now the second significant hurdle SpaceX’s Crew Dragon astronaut spacecraft has faced in the last few days, following the revelation that NASA will not permit the company to launch astronauts until it completes an investigation into an in-flight rocket engine failure during its March 18th Starlink launch. That failure likely has no technical bearing on the new Falcon 9 rockets that will launch NASA astronauts, but existing Commercial Crew Program (CCP) contract rules still require that SpaceX’s internal investigation be completed before it can proceed. With lives on the line, caution – within reason – is unequivocally preferable to the alternative.
Thankfully, SpaceX’s parachute test article anomaly should have a much smaller impact on Crew Dragon’s astronaut launch debut schedule, but it’s unlikely to have zero impact.
“During a planned parachute drop test [on Tuesday], the test article suspended underneath the helicopter became unstable. Out of an abundance of caution and to keep the helicopter crew safe, the pilot pulled the emergency release. As the helicopter was not yet at target conditions, the test article was not armed, and as such, the parachute system did not initiate the parachute deployment sequence. While the test article was lost, this was not a failure of the parachute system and most importantly no one was injured. NASA and SpaceX are working together to determine the testing plan going forward in advance of Crew Dragon’s second demonstration mission.”
SpaceX — March 24th, 2020
On March 24th, SpaceX says it was preparing for one of the last system-level Crew Dragon parachute tests planned before the spacecraft can be declared ready for human spaceflight. These final tests are reportedly focused on corner cases, referring to unusual but not impossible scenarios the spacecraft might encounter during operational astronaut landing attempts. Those likely include parachute deployment conditions far more stressful than anything a nominal reentry, descent, and landing would produce.
Regardless, things did not go as planned during Tuesday’s test attempt. SpaceX primarily uses cargo planes, helicopters, and large balloons to carry its Crew Dragon test articles (not actual functional spacecraft) to the altitudes and speeds needed to achieve certain test conditions. On March 24th, SpaceX was using a helicopter – either a civilian Black Hawk or a much larger Skycrane.
For unknown reasons, the helicopter carrying the Crew Dragon test article on March 24th began to experience “instability”, likely referring to some sort of resonance (wobble, sway, oscillation, etc.). Out of an abundance of caution, the pilot – likely highly trained – decided the instability had become an unacceptable risk and chose to drop the cargo load (a Crew Dragon mockup). Because the test article was not yet armed, it plummeted to Earth without any kind of parachute deployment, likely pancaking on the desert floor shortly thereafter.
Again, it needs to be noted – as SpaceX did above – that the loss of the Crew Dragon parachute test article was entirely unrelated to the performance of the spacecraft or the parachutes it was testing. The mockup destroyed in the incident is essentially just a boilerplate mass simulator shaped like a Crew Dragon capsule to achieve more aerodynamically accurate test results. As such, it’s far simpler and cheaper than an actual Dragon spacecraft and shouldn’t take long at all to replace if SpaceX doesn’t already have a second similar mockup ready to go.

Thankfully, that means that the loss of the test article should have next to no serious impact on Crew Dragon’s inaugural astronaut launch schedule. With the launch planned no earlier than (NET) mid-to-late May according to NASA’s latest official statement, SpaceX and the space agency still have at least a month and a half to work through a final parachute test campaign, complete an investigation into Starlink L6’s Falcon booster engine failure, and finish several trees’ worth of paperwork and reviews. Delays remain likely, but they shouldn’t amount to more than a few weeks, barring any future surprises.
News
NVIDIA Director of Robotics: Tesla FSD v14 is the first AI to pass the “Physical Turing Test”
After testing FSD v14, Fan stated that his experience with FSD felt magical at first, but it soon started to feel routine.
NVIDIA Director of Robotics Jim Fan has praised Tesla’s Full Self-Driving (Supervised) v14 as the first AI to pass what he described as a “Physical Turing Test.”
After testing FSD v14, Fan stated that his experience with FSD felt magical at first, but it soon started to feel routine. And just like smartphones today, removing it now would “actively hurt.”
Jim Fan’s hands-on FSD v14 impressions
Fan, a leading researcher in embodied AI who is currently working on Physical AI at NVIDIA and spearheading the company’s Project GR00T initiative, noted that he was actually late to the Tesla game. He was, however, one of the first to try out FSD v14.
“I was very late to own a Tesla but among the earliest to try out FSD v14. It’s perhaps the first time I experience an AI that passes the Physical Turing Test: after a long day at work, you press a button, lay back, and couldn’t tell if a neural net or a human drove you home,” Fan wrote in a post on X.
Fan added: “Despite knowing exactly how robot learning works, I still find it magical watching the steering wheel turn by itself. First it feels surreal, next it becomes routine. Then, like the smartphone, taking it away actively hurts. This is how humanity gets rewired and glued to god-like technologies.”
The Physical Turing Test
The original Turing Test was conceived by Alan Turing in 1950, and it was aimed at determining whether a machine could exhibit behavior indistinguishable from that of a human. By focusing on text-based conversations, the original Turing Test set a high bar for natural language processing and machine learning.
Today’s large language models have arguably passed this test. However, the capability to converse in a humanlike manner is a completely different challenge from performing real-world problem-solving or physical interactions. Thus, Fan introduced the Physical Turing Test, which challenges AI systems to demonstrate intelligence through physical actions instead.
Based on Fan’s comments, Tesla has demonstrated these intelligent physical actions with FSD v14. Elon Musk agreed with the NVIDIA executive, stating in a post on X that with FSD v14, “you can sense the sentience maturing.” Musk also praised Tesla AI, calling it the best “real-world AI” today.
News
Tesla AI team burns the Christmas midnight oil by releasing FSD v14.2.2.1
The update was released just a day after FSD v14.2.2 started rolling out to customers.
Tesla is burning the midnight oil this Christmas, with the Tesla AI team quietly rolling out Full Self-Driving (Supervised) v14.2.2.1 just a day after FSD v14.2.2 started rolling out to customers.
Tesla owner shares insights on FSD v14.2.2.1
Longtime Tesla owner and FSD tester @BLKMDL3 shared some insights following several drives with FSD v14.2.2.1 in rainy Los Angeles conditions with standing water and faded lane lines. He reported zero steering hesitation or stutter, confident lane changes, and maneuvers executed with precision that evoked the performance of Tesla’s driverless Robotaxis in Austin.
Parking performance impressed as well, with most spots nailed in a single attempt and without shaky steering, including ones requiring tight, sharp turns. The only minor offset happened because another vehicle was parked over the line, which FSD accommodated by shifting over a few extra inches. In rain that typically erases road markings, FSD visualized lanes and turn lines better than most human drivers could, positioning itself flawlessly when entering new streets as well.
“Took it up a dark, wet, and twisty canyon road up and down the hill tonight and it went very well as to be expected. Stayed centered in the lane, kept speed well and gives a confidence inspiring steering feel where it handles these curvy roads better than the majority of human drivers,” the Tesla owner wrote in a post on X.
Tesla’s FSD v14.2.2 update
Just a day before FSD v14.2.2.1’s release, Tesla rolled out FSD v14.2.2, which was focused on smoother real-world performance, better obstacle awareness, and precise end-of-trip routing. According to the update’s release notes, FSD v14.2.2 upgrades the vision encoder neural network with higher resolution features, enhancing detection of emergency vehicles, road obstacles, and human gestures.
New Arrival Options also allow users to select preferred drop-off styles, such as Parking Lot, Street, Driveway, Parking Garage, or Curbside, with the navigation pin automatically adjusting to the ideal spot. Other refinements include pulling over for emergency vehicles, real-time vision-based detours around blocked roads, improved gate and debris handling, and Speed Profiles for customized driving styles.
Elon Musk
Elon Musk’s Grok records lowest hallucination rate in AI reliability study
Grok achieved an 8% hallucination rate, a 4.5 customer rating, a 3.5 consistency score, and 0.07% downtime, resulting in an overall risk score of just 6.
A December 2025 study by casino games aggregator Relum has identified Elon Musk’s Grok as one of the most reliable AI chatbots for workplace use, boasting the lowest hallucination rate at just 8% among the 10 major models tested.
In comparison, market leader ChatGPT registered one of the highest hallucination rates at 35%, just behind Google’s Gemini at 38%. The findings highlight Grok’s factual prowess despite the model’s lower market visibility.
Grok tops hallucination metric
The research evaluated chatbots on hallucination rate, customer ratings, response consistency, and downtime rate. The chatbots were then assigned a reliability risk score from 0 to 99, with higher scores indicating bigger problems.
Grok achieved an 8% hallucination rate, a 4.5 customer rating, a 3.5 consistency score, and 0.07% downtime, resulting in an overall risk score of just 6. DeepSeek posted 14% hallucinations but zero downtime, earning an even lower risk score of 4. At the other end of the scale, ChatGPT’s high hallucination and downtime rates gave it the worst risk score of 99, followed by Claude and Meta AI, which earned reliability risk scores of 75 and 70, respectively.
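Relum has not published the formula behind its 0–99 score, so the sketch below is purely illustrative: the weights, scales, and normalization are invented assumptions, and it will not reproduce the study’s exact numbers. It simply shows how four metrics like these could be folded into a single composite risk score.

```python
# Hypothetical composite scoring sketch. Relum's real formula is unpublished;
# every weight and scale below is an assumption for illustration only.

def risk_score(hallucination_pct: float, customer_rating: float,
               consistency: float, downtime_pct: float) -> int:
    """Fold four reliability metrics into a 0-99 risk score (higher = riskier)."""
    # Normalize each metric to 0..1, where 1 is the worst case.
    h = hallucination_pct / 100        # hallucination rate, 0-100%
    r = 1 - customer_rating / 5        # assume ratings are out of 5, 5 is best
    c = 1 - consistency / 5            # assume consistency is also out of 5
    d = min(downtime_pct, 1.0)         # assume downtime is capped at 1%
    # Invented weights: hallucinations count most, downtime least.
    combined = 0.5 * h + 0.2 * r + 0.2 * c + 0.1 * d
    return round(combined * 99)

# Grok's reported metrics: 8% hallucinations, 4.5 rating, 3.5 consistency, 0.07% downtime.
print(risk_score(8, 4.5, 3.5, 0.07))   # prints 13 with these made-up weights, not Relum's 6
```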

Why low hallucinations matter
Relum Chief Product Officer Razvan-Lucian Haiduc shared his thoughts about the study’s findings. “About 65% of US companies now use AI chatbots in their daily work, and nearly 45% of employees admit they’ve shared sensitive company information with these tools. These numbers show well how important chatbots have become in everyday work.
“Dependence on AI tools will likely increase even more, so companies should choose their chatbots based on how reliable and fit they are for their specific business needs. A chatbot that everyone uses isn’t necessarily the one that works best for your industry or gives accurate answers for your tasks.”
In a way, the study reveals a notable gap between AI chatbots’ popularity and performance, with Grok’s low hallucination rate positioning it as a strong choice for accuracy-critical applications – despite Grok seeing far less use than more mainstream AI applications such as ChatGPT.