Overnight, the release notes for Tesla’s next FSD Beta software update made their way to the internet. In the post shared by @WholeMarsBlog, there’s a lot of great detail provided in these bullet points.
The official build number for FSD Beta v10.11 is 2022.4.5.15, and it should be one of the final releases before the big FSD Beta V11 update that moves the programming to a single stack, allowing highway and city street autonomy to run on the same code base, one predominantly made up of AI rather than hand-crafted code.
The update is due today for US customers, while new Canadian customers who have completed a week of Safety Score should also get it shortly. Originally the US release was scheduled for Tuesday and Canada's for this weekend. Now that the US release has been delayed till this weekend (a few hours left), it's not clear if the release for Canada will come at the same time as the US one, or also be delayed by a few days.
Regardless of when Canada gets the FSD Beta release, expanding to a country outside the US is a really big step for the program, and the world will watch closely to see how well region-specific differences can be accommodated.
Now for the detail on the release notes.
Upgraded modeling of lane geometry from dense rasters (“bag of points”) to an autoregressive decoder that directly predicts and connects “vector space” lanes point by point using a transformer neural network.
This enables us to predict crossing lanes, allows computationally cheaper and less error prone post-processing, and paves the way for predicting many other signals and their relationships jointly and end-to-end.
This item suggests Tesla is making a really big move in the vision system, moving to a neural network to determine lane geometry. ArcGIS is a Geospatial service that actually has a great visualisation of what Tesla is talking about with this ‘bag of points’ reference.
When the raw camera data is fed into the system, Autopilot and FSD make a determination about the environment it sees and therefore applies a confidence (or inference) level to that determination. The goal here is to change the interpretation and prediction techniques to improve the confidence in that assessment, so the car can make the right decision more often.
While Tesla does a great job of tracking lanes where you have white lines on both sides, there can be issues where lane markings are missing or where lane widths change significantly, like roads that go from single lanes to multiple lanes. There can often be turning lanes which also mean the lane lines are not present and the system needs to determine which lane to be in.
Adding to the complexity is the curvature of the road. Naturally, the smoothest experience for your passengers comes from minimising deviation from the current trajectory, but if that trajectory is actually incorrect based on the road rules, say a turning-lane marking on the road that only gets detected late, it could force the path planning system to make a rapid adjustment.
If Tesla is able to accurately and rapidly transform the road ahead from the camera inputs into a vector space, then we can imagine their path and route planning occurring just as well as it does in a video game. Video games work in vector space and are then rendered to raster; consider something like the route a safety car takes around a circuit. Games do this without issue and Tesla is hoping to do the same, except rather than the output being purely digital, FSD converts it into commands for the vehicle control system to execute.
All of this should improve the feeling of being in the vehicle, delivering a much smoother experience, fewer errors, and getting us closer to reliable automation in a Tesla.
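To make the "vector space, point by point" idea a little more concrete, here's a heavily simplified sketch of what an autoregressive lane-point decoder can look like in PyTorch. This is purely illustrative and not Tesla's code; the LanePointDecoder class, its dimensions, the stand-in "scene features" and the stopping threshold are all invented for the example.

```python
# Illustrative sketch only, NOT Tesla's implementation. Instead of outputting a dense
# raster ("bag of points"), a decoder emits lane points one at a time, each conditioned
# on the points already produced, until it predicts the lane has ended.

import torch
import torch.nn as nn

class LanePointDecoder(nn.Module):
    """Toy transformer decoder that predicts the next (x, y) lane point."""
    def __init__(self, d_model=64, nhead=4, num_layers=2):
        super().__init__()
        self.point_embed = nn.Linear(2, d_model)            # embed (x, y) points
        layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers)
        self.next_point = nn.Linear(d_model, 2)             # regress the next (x, y)
        self.end_of_lane = nn.Linear(d_model, 1)            # probability the lane ends here

    def forward(self, points_so_far, scene_features):
        # points_so_far: (batch, n_points, 2); scene_features: (batch, n_tokens, d_model)
        tokens = self.point_embed(points_so_far)
        decoded = self.decoder(tokens, scene_features)
        last = decoded[:, -1]                               # use the latest decoded position
        return self.next_point(last), torch.sigmoid(self.end_of_lane(last))

def decode_lane(model, scene_features, start_point, max_points=20):
    """Greedy autoregressive roll-out of a single lane polyline."""
    points = start_point.unsqueeze(1)                       # (batch, 1, 2)
    for _ in range(max_points):
        next_xy, p_end = model(points, scene_features)
        points = torch.cat([points, next_xy.unsqueeze(1)], dim=1)
        if p_end.item() > 0.5:                              # stop once the lane is predicted to end
            break
    return points

# Tiny usage example with random "scene" tokens standing in for camera-derived features.
model = LanePointDecoder()
scene = torch.randn(1, 16, 64)
lane = decode_lane(model, scene, start_point=torch.zeros(1, 2))
print(lane.shape)   # a polyline of lane points in "vector space", e.g. torch.Size([1, N, 2])
```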
Andrej Karpathy has since shared more detail on this point.
Use more accurate predictions of where vehicles are turning or merging to reduce unnecessary slowdowns for vehicles that will not cross our path.
With Tesla’s Computer Vision system, they not only monitor the road ahead for potential drivable space, but also the trajectory of other objects in the environment. This monitoring isn't static; it forecasts the future trajectories that a person, an animal, or in this case, other cars will take.
Where a potential trajectory overlaps with the path planned for the Tesla vehicle to travel, it makes sense that the brakes would be applied to avoid an accident. The issue here is that where a potential trajectory doesn't turn out to be the actual trajectory, you've slowed unnecessarily, causing delays for the occupants and failing to respond to the environment the way a human would.
This is obviously incredibly important to get right, and it's important to know that Tesla isn't guessing about where a vehicle will be 1-2 seconds from now; it makes a determination based on the best data available at the time. Remember the advantages that computer vision has over humans: it samples the environment many times per second, much faster than humans can, and adapts accordingly. The car doesn't need to blink or turn its head to look in other directions, and it isn't distracted by other passengers or the music on the radio, so if Tesla gets this right, unnecessary slowdowns should become a thing of the past.
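As a thought experiment (and definitely not Tesla's actual planner), the core logic being described, only slow down if a predicted trajectory actually comes close to the ego vehicle's planned path at around the same time, can be sketched in a few lines of Python. The sampling horizon, speeds and safety margin below are invented numbers.

```python
# Hypothetical sketch: compare the ego car's planned path with another agent's
# predicted trajectory at matching timesteps, and only flag a conflict (and hence
# a slowdown) if they come within a safety margin of each other.

import numpy as np

def trajectories_conflict(ego_path, agent_path, safety_margin_m=1.5):
    """ego_path, agent_path: arrays of shape (T, 2) -- (x, y) at matching timesteps."""
    distances = np.linalg.norm(ego_path - agent_path, axis=1)
    return bool(np.any(distances < safety_margin_m))

# Ego drives straight at 15 m/s; the other car merges but stays well ahead of us in time.
timesteps = np.linspace(0.0, 2.0, 21)                                     # 2 s horizon
ego = np.stack([15.0 * timesteps, np.zeros_like(timesteps)], axis=1)
agent = np.stack([15.0 * timesteps + 30.0, 3.5 - 1.0 * timesteps], axis=1)

if trajectories_conflict(ego, agent):
    print("predicted conflict -> plan a slowdown")
else:
    print("no conflict -> keep speed, avoid an unnecessary slowdown")
```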
Improved right-of-way understanding if the map is inaccurate or the car cannot follow the navigation. In particular, modeling intersection extents is now entirely based on network predictions and no longer uses map-based heuristics.
There has been ongoing feedback from FSD Beta users that where map data is incorrect, the car behaves badly, making decisions a human wouldn't. This is essentially a conflict resolution issue: FSD is being told one thing by the map data, to route a particular way, but where something prevents that (local road works, etc.), the car needs to be able to dynamically adjust and make alternate arrangements.
The second part of this statement confirms that Tesla was using map-based heuristics as a technique to guess its way out of trouble, but this is now being replaced. Tesla will now use network predictions, which should provide a much broader dataset to make routing decisions from. Rather than looking only at the specific intersection and searching for potential exits, it can draw on a vast array of data from similar intersections and, by understanding how cars navigate similar environments, make an alternate route plan to move around the obstacle (like a road closed sign).
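To picture the underlying conflict-resolution problem in the simplest possible terms, here's a tiny hypothetical sketch, not Tesla's approach (and simpler than what the release notes describe, since intersection extents are now said to be entirely network-predicted): prefer the live, network-predicted view of the intersection and only lean on stored map data when no confident prediction exists. The function name and confidence threshold are made up.

```python
# Hypothetical illustration of resolving a conflict between live perception and map data.

def intersection_extent(network_prediction, map_extent, min_confidence=0.7):
    """network_prediction: (extent_m, confidence) or None; map_extent: extent from map data."""
    if network_prediction is not None:
        extent, confidence = network_prediction
        if confidence >= min_confidence:
            return extent          # trust what the cameras see right now
    return map_extent              # fall back to (possibly stale) map data

# Example: the map says the intersection is 12 m deep, but the network confidently sees 18 m.
print(intersection_extent((18.0, 0.9), 12.0))   # -> 18.0
```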
Improved the precision of VRU detections by 44.9%, dramatically reducing spurious false positive pedestrians and bicycles (especially around tar seams, skid marks, and rain drops).
This was accomplished by increasing the data size of the next-gen autolabeler, training network parameters that were previously frozen, and modifying the network loss functions. We find that this decreases the incidence of VRU-related false slowdowns.
VRU stands for Vulnerable Road User, and what's really promising about this bullet point in the release notes is the strong 44.9% improvement in detection precision. Previously we've seen single-digit gains in specific areas of the system, but to improve any part by this much is significant.
Often after FSD Beta testers get the new version of the software in their cars, they’ll take it for a few drives to determine if they can notice any difference. There are often subtle improvements, but a change this large should absolutely be noticeable by all users.
Tesla was clearly having difficulty with false positives in the system, pointing to spurious detections of pedestrians and bikes. A false positive of either could result in the brakes being applied when they are not necessary.
The next part of the statement is really telling: 'especially around tar seams, skid marks, and rain drops'. The first two of those relate to the system being able to determine depth from the camera images and understand whether they are something the car needs to brake for or not.
Clearly, road fixes like tar seams and skid marks are a few millimetres high and are not something you would need to brake for. It is easy to imagine that in particular lighting conditions these may reflect light in a way that makes them appear as larger objects than they really are, and if it were something like tyre debris on the road, you'd absolutely want to avoid it.
The last part of the statement focuses on rain drops, which I assume is not a reference to raindrops on the windscreen obscuring the camera; the system (and Rain AI) does a pretty great job in that area. What I imagine the rain drop reference to be is when rain starts and the road surface is only partially wet. This would form lighter and darker patches on the road, which was potentially resulting in false positives, and it sounds like something Tesla has now resolved.
So how did they resolve this? Well, it's interesting that Tesla goes into some great detail here, something we often don't get. The release notes suggest they increased the data size of the next-gen auto labeler. First off, auto-labeling is still a very new technique for Tesla, replacing human-labeled images and offering dramatic efficiency improvements. Seeing a reference to a next-gen auto labeler suggests Tesla has already evolved it (potentially even rewritten it) since introducing it just a few months ago.
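For readers curious what 'training network parameters that were previously frozen' and 'modifying the network loss functions' can look like in practice, here's a generic PyTorch sketch. It is not Tesla's training code; the toy model, the choice of which layer to un-freeze, and the class weights are stand-ins purely to show the two levers being described.

```python
# Purely illustrative: un-freeze parameters that previously received no gradients, and
# re-weight the loss so that spurious "VRU" calls on background examples cost more.

import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(128, 64),   # stand-in "backbone"
    nn.ReLU(),
    nn.Linear(64, 2),     # stand-in detection head: [not-VRU, VRU]
)

# Earlier, the backbone might have been frozen (excluded from gradient updates).
for param in model[0].parameters():
    param.requires_grad = False

# 1) Un-freeze it so those parameters are trained as well.
for param in model[0].parameters():
    param.requires_grad = True

# 2) Modify the loss: weighting the background class higher makes mis-classifying a
#    background example as a VRU (a false positive) more costly during training.
class_weights = torch.tensor([2.0, 1.0])
criterion = nn.CrossEntropyLoss(weight=class_weights)

features = torch.randn(8, 128)           # fake batch of image features
labels = torch.randint(0, 2, (8,))       # fake ground-truth labels
loss = criterion(model(features), labels)
loss.backward()                          # gradients now reach the un-frozen backbone too
print(float(loss))
```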
Reduced the predicted velocity error of very close-by motorcycles, scooters, wheelchairs, and pedestrians by 63.6%. To do this, we introduced a new dataset of simulated adversarial high speed VRU interactions. This update improves autopilot control around fast-moving and cutting-in VRUs.
Again we see an incredibly large gain: a 63.6% reduction in predicted velocity error for nearby motorbikes, scooters, wheelchairs and pedestrians. Velocity prediction most likely plays a big role when accelerating away from a traffic light, around speed zone changes, or during overtakes (like on NoA).
The update reports improved Autopilot control around fast-moving and cutting-in vulnerable road users. For cars and bikes, I get it, but when I think about the times the car, and particularly Autopilot, would interact with scooters and wheelchairs, it's likely at crossings, so hopefully Tesla can find the balance of safely waiting for these VRUs to pass without being overly generous and taking too long to proceed.
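For a sense of what a 'predicted velocity error' number refers to, here's a simple hypothetical metric, not Tesla's actual evaluation code, comparing predicted and observed velocities for nearby VRUs. The figures are invented.

```python
# Hypothetical metric: average the magnitude of the difference between the velocity the
# network predicted for each nearby VRU and the velocity actually observed.

import numpy as np

def mean_velocity_error(predicted_velocities, actual_velocities):
    """Both arrays have shape (n_samples, 2) -- per-object (vx, vy) in m/s."""
    return float(np.mean(np.linalg.norm(predicted_velocities - actual_velocities, axis=1)))

predicted = np.array([[4.0, 0.2], [1.5, -0.1], [0.3, 0.0]])   # e.g. scooter, pedestrian, wheelchair
actual    = np.array([[4.6, 0.0], [1.4,  0.1], [0.3, 0.1]])

baseline_error = mean_velocity_error(predicted, actual)
print(f"mean velocity error: {baseline_error:.2f} m/s")
# A 63.6% reduction would mean a new model bringing this error down to roughly a third of its old value.
```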
Improved creeping profile with higher jerk when creeping starts and ends.
Creeping happens when the car’s view of oncoming traffic is somewhat obscured. As humans do, it can creep forward to get a better view of the traffic, to then select an appropriate gap to move into. The creeping profile relates to the profile of how fast this creep occurs and the rate of slow down to stop if the car needs to wait.
This should be smooth to keep occupants comfortable; however, the reference to higher jerk when creeping starts is a little confusing. Jerk is the rate of change of acceleration, so this potentially suggests the car will be a little more sprightly as it starts and stops the creep while negotiating with other traffic, something definitely to watch for in the inevitable YouTube videos after the release reaches customers.
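With that definition of jerk in mind, a creeping profile can be pictured as a jerk-limited velocity ramp. The sketch below is purely illustrative with made-up limits, not Tesla's controller, but it shows why a higher jerk value makes the start and end of the creep feel snappier.

```python
# Hypothetical jerk-limited creep: acceleration ramps up (and back down) no faster than
# the jerk limit allows, and speed is capped at a gentle creep speed.

def creep_profile(jerk=1.0, max_accel=0.4, max_speed=0.8, dt=0.05, duration=4.0):
    """Returns a list of (time_s, speed_mps) samples for a jerk-limited creep forward."""
    t, v, a, profile = 0.0, 0.0, 0.0, []
    while t < duration:
        if v < max_speed:
            a = min(a + jerk * dt, max_accel)   # ramp acceleration up, limited by jerk
        else:
            a = max(a - jerk * dt, 0.0)         # ramp acceleration back down to hold speed
        v = min(v + a * dt, max_speed)
        profile.append((round(t, 2), round(v, 3)))
        t += dt
    return profile

gentle = creep_profile(jerk=0.5)
snappy = creep_profile(jerk=2.0)   # "higher jerk": reaches creep speed noticeably sooner
print(gentle[20], snappy[20])      # compare speed one second into each creep
```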
Improved control for nearby obstacles by predicting continuous distance to static geometry with the general static obstacle network.
A general static obstacle network is likely a reference to a neural network that determines the distance of objects over time, based on a set of pixels not moving over a series of frames. This is something Andrej Karpathy has spoken about before when highlighting Tesla’s approach to depth perception based solely on Computer Vision, in a world without Radar.
Reduced vehicle “parked” attribute error rate by 17%, achieved by increasing the dataset size by 14%. Also improved brake light accuracy.
There has been an issue highlighted by a number of FSD Beta participants where the car would attempt to overtake a car that is stopped. The system does need to do this where, say, a car is double-parked, or you'd be waiting all day for it to move, but done incorrectly, it could see the car cross over double lines in an attempt to move around a vehicle.
Where this occurs at a red light, an incorrect determination has been made, and this bullet point suggests things should improve significantly in this respect. It's hard to say if this means the issue is solved, but it should certainly result in a noticeable improvement.
Improved clear-to-go scenario velocity error by 5% and highway scenario velocity error by 10%, achieved by tuning loss function targeted at improving performance in difficult scenarios.
Clear-to-go scenarios are when there are no objects (or future trajectories) that overlap with the projected path for the car to follow, therefore it is clear to go. It seems there were times the car was not going, which was often resolved with a quick tap of the accelerator, but remember, all input is error, so this fix should help resolve that.
The second point around highway velocity errors being improved by 10% suggests that FSD or AP was setting the speed incorrectly on highways. Tesla says they solved this by tuning their loss function, a measure of how good a prediction model is, with the change focused on difficult scenarios. Potentially this could improve driving for those who have seen speed signs on overpasses being incorrectly applied to their current speed.
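To show what 'tuning a loss function to target difficult scenarios' can mean in general terms (this is not Tesla's training code), here's a tiny PyTorch sketch where samples flagged as hard simply contribute more to the loss. The weighting factor and the hard-sample flag are invented for illustration.

```python
# Hypothetical sketch: up-weight difficult samples so the model improves most where it was weakest.

import torch

def weighted_velocity_loss(predicted, target, is_hard, hard_weight=3.0):
    """predicted, target: (batch,) velocity estimates; is_hard: (batch,) bool flags."""
    per_sample = (predicted - target) ** 2                    # plain squared error
    weights = 1.0 + (hard_weight - 1.0) * is_hard.float()     # hard samples weigh more
    return (weights * per_sample).mean()

predicted = torch.tensor([14.8, 27.0, 9.5])
target    = torch.tensor([15.0, 25.0, 9.4])
is_hard   = torch.tensor([False, True, False])                # the second sample is a hard case
print(float(weighted_velocity_loss(predicted, target, is_hard)))
```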
Improved detection and control for open car doors.
We are all challenged by this task as drivers: when passing stationary cars, particularly ones that have recently stopped, there is a chance the occupants will open their doors into your path. Without action, this is likely to create an accident, but Tesla's ability to check available space in the lane (and the lane next to it) and move the car over to avoid a collision could definitely help here.
While there's no debate that the person opening the door has the obligation to check the path is clear, if Tesla can avoid an accident, it should. It seems this update does a better job of detecting, and then controlling for (read: avoiding), doors opening in front of the car.
Improved smoothness through turns by using an optimization based approach to decide which road lines are irrelevant for control given lateral and longitudinal acceleration and jerk limits as well as vehicle kinematics.
While this is near the end of the release notes, this may get FSD Beta users the most excited of all. Smoothing out the turn radius of the vehicle around corners should deliver a more human-like and potentially even super-human feel to the drive.
At intersections, there can often be a large volume of superfluous lines on the road, most of which we ignore as drivers, while FSD is busily working out if any are meaningful to its path planning network. Most of the time, we create these trajectories around corners in our minds and adapt according to the specific corner we're taking. It seems Tesla is taking a similar approach in the way it guides the car through a particular space, understanding that many lines are indeed irrelevant.
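One plausible, entirely hypothetical way to read 'irrelevant for control given lateral and longitudinal acceleration and jerk limits' is as a physical-feasibility filter: if obeying a painted line would demand lateral acceleration beyond a comfort limit at the current speed, it probably isn't meant for our path. The thresholds and numbers below are invented, and this is not Tesla's optimiser.

```python
# Hypothetical filter: keep only the lines the car could physically and comfortably follow.

def lateral_accel_required(speed_mps, lateral_offset_m, distance_ahead_m):
    """Rough lateral acceleration needed to shift by `lateral_offset_m` within `distance_ahead_m`."""
    time_available = distance_ahead_m / speed_mps
    # constant-acceleration approximation: offset = 0.5 * a * t^2  ->  a = 2 * offset / t^2
    return 2.0 * lateral_offset_m / (time_available ** 2)

def relevant_lines(candidate_lines, speed_mps, comfort_limit=2.5):
    """candidate_lines: list of (name, lateral_offset_m, distance_ahead_m)."""
    kept = []
    for name, offset, distance in candidate_lines:
        if lateral_accel_required(speed_mps, offset, distance) <= comfort_limit:
            kept.append(name)        # physically reasonable to follow -> keep it
    return kept

lines = [("lane edge", 0.5, 30.0), ("old turn marking", 3.5, 8.0)]
print(relevant_lines(lines, speed_mps=15.0))   # the sharp, nearby marking is deemed irrelevant
```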
Improved stability of the FSD UI visualizations by optimizing the ethernet data transfer pipeline by 15%.
The visualisations on screen have improved dramatically since the first versions of FSD Beta, and it seems we're getting another significant boost in this release. As Tesla talks about the stability of the FSD UI visualisation, it's easy to imagine this improvement as fewer cars flicking between different object types (car, truck, back to a car, etc.). The other issue that is getting better is the 'dancing' cars that are obviously stationary when you look out the window, while the visualisation is busy assessing and re-assessing their position and orientation.
The whole idea of visualising the outside environment for the vehicle occupants is to provide confidence that the system is accurately seeing the world around it. When objects move and morph, this reduces confidence, so any improvements here are very welcome.
Improved recall for vehicles directly behind ego, and improved precision for vehicle detection network.
Finally, the last point relates to improved recall of vehicles directly behind. Vehicles behind may not seem all that important on the surface, but improving the accuracy of identifying the following vehicle can be incredibly important when forecasting (and hopefully avoiding) a potential rear-end collision.
If Tesla is able to monitor the distance to the following vehicle more accurately, the car could potentially check available space ahead and accelerate to avoid an accident.
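As a closing note on terminology, recall and precision, the two metrics named in this final bullet point, can be summarised in a couple of lines: recall is the share of real vehicles the network actually found, and precision is the share of its detections that were real vehicles. The counts below are invented and are only there to show how the two differ.

```python
# Generic precision/recall refresher, not Tesla-specific.

def precision_recall(true_positives, false_positives, false_negatives):
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall

# e.g. 90 real vehicles behind ego detected, 5 phantom detections, 10 real vehicles missed
p, r = precision_recall(true_positives=90, false_positives=5, false_negatives=10)
print(f"precision: {p:.2f}, recall: {r:.2f}")   # precision: 0.95, recall: 0.90
```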