
    A peek inside Tesla Cortex, the AI supercluster that’ll deliver autonomous cars and robots

    Tesla CEO Elon Musk has shared a new video tonight from inside the Gigafactory in Austin, Texas. The recent extension to the south side of the factory now houses tens of thousands of high-performance AI chips, combining to make one of the largest compute clusters in the world.

    Musk famously made the decision a few months back to redirect Nvidia H100 GPUs from Tesla to xAI, which sparked considerable discussion about resource allocation across his enterprises.

    Initially, 12,000 GPUs intended for Tesla were redirected to xAI, with Musk emphasizing that Tesla lacked the immediate infrastructure to utilize these chips effectively.

    It seems the outrage was not really necessary, with Tesla's turn now arriving to expand its AI training datacenter. Between Nvidia hardware and Tesla's own Dojo chips, the equivalent of between 85,000 and 100,000 H100s is expected to be in place by year-end.

    In Musk's post, he says this supercluster is being built at Tesla HQ (in Austin, TX) to solve real-world AI.

    Tesla's description of "real-world AI" primarily revolves around its approach to developing autonomous driving technology. An understanding of the real world (and concepts like physics) is also particularly useful in other applications, like humanoid robots.

    Tesla's Data-Driven Approach
    By using data collected from the real world, Tesla's approach differs from traditional methods that rely heavily on simulations or pre-defined scenarios.

    When it comes to autonomous driving, Tesla's system learns from the vast amount of data collected from vehicles on the road. Each car is equipped with a series of cameras used to collect the data, which is fed back to the mothership for analysis and for training the next version of the AI model. The model is then wrapped up in a software update and released to customer cars, and the cycle continues.
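The collect-train-redeploy cycle described above can be sketched as a simple loop. This is purely illustrative: the class and function names are hypothetical stand-ins, and Tesla's actual data pipeline is not public.

```python
# Illustrative sketch of the fleet-learning "data flywheel" described above.
# All names are hypothetical; this is not Tesla's actual pipeline.

class Car:
    def __init__(self):
        self.model_version = 0

    def capture_camera_clips(self):
        # Stand-in for footage captured by the car's camera suite.
        return ["clip"]

    def install_update(self, version):
        # Stand-in for receiving a software update with a new model.
        self.model_version = version


def fleet_learning_cycle(fleet, rounds=3):
    """Collect fleet data, train the next model version, redeploy, repeat."""
    version = 0
    for _ in range(rounds):
        # 1. Gather camera clips from every car in the fleet.
        clips = [c for car in fleet for c in car.capture_camera_clips()]
        # 2. "Training" on the new data produces the next model version.
        version += 1
        # 3. Ship the new model back to customer cars as an update.
        for car in fleet:
            car.install_update(version)
    return version


fleet = [Car() for _ in range(5)]
final = fleet_learning_cycle(fleet, rounds=3)
# After three cycles, every car runs the latest model version.
assert all(car.model_version == final == 3 for car in fleet)
```

The point of the sketch is the feedback loop itself: each deployment puts a better model on the road, which in turn gathers better data for the next round of training.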

    End-to-End Learning
    Tesla employs what’s often described as an end-to-end learning model.

    This means the AI system doesn’t just process sensory data to make decisions based on pre-programmed rules. Instead, it aims to directly translate raw input (like camera images) into driving actions.

    This approach reduces the need for human-defined rules and allows the system to learn driving behaviors more akin to how a human might learn to drive.
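The contrast between the two approaches can be sketched in a few lines. This is an illustrative toy, not Tesla's architecture: the rule-based stack hard-codes each stage, while the end-to-end version replaces the whole pipeline with one learned function from raw input to controls.

```python
# Toy contrast: hand-coded driving pipeline vs an end-to-end learned mapping.
# Purely illustrative; not Tesla's actual software stack.

def detect_objects(image):
    # Perception stage with a hand-written "rule".
    return ["car_ahead"] if "car" in image else []

def plan_path(objects):
    # Planning stage built on pre-defined scenarios.
    return "slow" if "car_ahead" in objects else "cruise"

def plan_to_controls(plan):
    # Control stage mapping the plan to actuator commands.
    return {"accel": -0.2} if plan == "slow" else {"accel": 0.1}

def rule_based_drive(image):
    """Traditional stack: perception -> planning -> control, all hand-defined."""
    return plan_to_controls(plan_path(detect_objects(image)))

def end_to_end_drive(image, network):
    """End-to-end: one learned function maps raw pixels to driving actions."""
    return network(image)

# Stand-in for a trained network: any callable from image to controls.
network = lambda image: {"accel": -0.2 if "car" in image else 0.1}

# Both produce driving actions, but only one required humans to write the rules.
assert rule_based_drive("car ahead") == end_to_end_drive("car ahead", network)
```

In the real system the "network" is a large neural model trained on fleet data, and the behaviours it learns are never spelled out as explicit rules.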

    Scalability and Generalization
    Tesla’s approach aims for an AI that can generalize from the data it’s been trained on to handle new, unseen situations. This is crucial for autonomous systems where the system must adapt to infinite variations in real-world conditions.

    What did we see in the video?

    Videos of this kind of infrastructure are rare, so what can we learn from the video we just got? Let’s break it down, frame by frame.

    1. In construction

    Musk wasn't kidding when he said this is 'being built'. The cable trays at the top of the server cabinets are largely empty, and there are a few safety bollards, ladders and safety vests visible throughout the video.

    2. With great AI comes great reliance on power.

    While many server racks will feature high-capacity power cables, we can see the infrastructure from above providing the power to these hungry GPUs. Given the current state of the cabinets (doors off), it's unlikely the power draw is much right now, but expect the lights to dim in the surrounding suburbs when it is turned up to 11.
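The "lights dimming" quip isn't far off. A back-of-envelope estimate using the figures from earlier in the article: an H100 SXM has a TDP of around 700 W, and assuming a typical datacenter PUE (total facility power divided by IT power) of roughly 1.3, the cluster's full-tilt draw lands in the tens of megawatts. Both figures are rough assumptions, not Tesla-confirmed numbers.

```python
# Rough power estimate for an 85,000-100,000 H100-equivalent cluster.
# Assumptions: ~700 W TDP per H100 SXM, PUE of 1.3 (cooling/overhead).

H100_TDP_W = 700   # per-GPU thermal design power (approximate)
PUE = 1.3          # total facility power / IT power (assumed)

def cluster_power_mw(gpu_count, tdp_w=H100_TDP_W, pue=PUE):
    """Estimated total facility draw in megawatts at full load."""
    return gpu_count * tdp_w * pue / 1e6

low = cluster_power_mw(85_000)    # ~77 MW
high = cluster_power_mw(100_000)  # ~91 MW
print(f"Estimated draw: {low:.0f}-{high:.0f} MW")
```

For scale, that's comparable to the electricity demand of a small city, which is why facilities like this need dedicated substation-grade power infrastructure.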

    3. GPUs as far as the eye can see.

    I watched the video multiple times and tried to count the aisles, but after about 8, it's really difficult to see. We can spot some tiny humans well off into the distance.

    From this it is clear this takes an immense amount of physical space, and it'd be easy to clock up your step count each day if you worked here.

    4. Hot/cold aisles

    When you build a modern datacenter, you have to think about heating and cooling (a lot). This aisle is a cold aisle (or more likely a room-temperature aisle), with the front of the server blades facing inwards. They will exhaust their hot air out the back through the hot aisles, where the heat is taken away from the expensive chips.

    The floor looks like it'll catch any dust, with the room likely engineered with positive pressure to ensure the dust stays away from the expensive bits.

    The roof is clearly mid-install, and the worker appears to be using a lift to install the server blades. The front of the racks will also get covers/dust filters once it is configured.

    On the roof there’s a CCTV camera to monitor the servers 24/7 and the whole room gets enclosed by glass doors at either end to complete the control over the airflow.

    5. High tech/low-tech

    While Tesla’s new massive compute cluster is a multi-billion dollar investment, there’s still something cute about a low-tech solution that works.

    Meet Aisle #4, as indicated by the very low-tech cardboard sign, held up with duct tape… whatever works, right?

    You can watch the full video from Elon below.

    Jason Cartwright
    Creator of techAU, Jason has spent the dozen+ years covering technology in Australia and around the world. Bringing a background in multimedia and passion for technology to the job, Cartwright delivers detailed product reviews, event coverage and industry news on a daily basis. Disclaimer: Tesla Shareholder from 20/01/2021
