Software Engineer, Model Scaling, Autopilot AI

Palo Alto, California


Employer: Tesla Motors
Industry: AI
Salary: Competitive
Job type: Full-Time

As a Software Engineer on Tesla's Autopilot AI team, you will play a crucial role in optimizing and scaling our neural network training infrastructure. You will join a specialized team of machine learning experts and have access to one of the world's largest model training clusters. Your primary focus will be to design, implement, and maintain high-performance applications for neural network training, evaluation, and data processing pipelines. Additionally, you will build supporting applications for profiling and debugging, and work on optimizing training and evaluation code to maximize efficiency and minimize resource usage.



Responsibilities
  • Design and Implement Large-Scale Data Pipelines: Build and maintain robust data processing pipelines that handle petabytes of autonomous vehicle data, including images, videos, and auto-generated labels, ensuring scalability and reliability
  • Optimize Neural Network Training Processes: Support neural network training by optimizing code and data formats for faster data loading, orchestrating auto-labeling jobs, and debugging bottlenecks to enhance overall training efficiency
  • Enhance System Performance: Develop and implement automation, monitoring, and optimization tools to improve system performance, including resource utilization, parallelism, and data I/O
  • Collaborate with Machine Learning Researchers: Work closely with researchers to understand and execute their data and infrastructure requirements, providing solutions that facilitate rapid experimentation and production-scale model deployment
  • Develop Evaluation Tools and Dashboards: Create and maintain evaluation metrics, tools, visualizations, and dashboards to support the development and refinement of neural networks
  • Implement Low-Level Integrations: Write efficient, low-level code that integrates with high-level training frameworks to enhance performance across various hardware platforms, including Dojo, Tesla's supercomputer
  • Stay Updated with ML Advancements: Keep abreast of the latest advancements and technologies in machine learning engineering to continually improve Tesla's AI infrastructure


Requirements
  • Strong Software Engineering Skills: Extensive experience with Python and software engineering best practices, including code optimization and system-level programming
  • Experience with Deep Learning Frameworks: Proficiency in one or more deep learning frameworks, such as PyTorch or TensorFlow, with hands-on experience in optimizing model training processes
  • Data Manipulation and Analysis Expertise: Proficiency with data manipulation tools, including Jupyter notebooks, numpy, scipy, matplotlib, and scikit-learn, and experience handling large-scale data processing
  • System Optimization and Debugging: Demonstrated experience in profiling and optimizing CPU/GPU code and debugging complex system-level software to ensure high performance and reliability
  • Distributed Systems Experience: Proven track record of building and managing large-scale distributed systems, particularly in AI/ML workflows, with a deep understanding of parallel computing, resource utilization, and data handling
  • Knowledge of Storage and Data Formats: Strong understanding of underlying storage mechanisms and experience designing and optimizing data formats for machine learning workflows
  • Familiarity with High-Performance Networking: Experience with high-performance networking technologies, such as InfiniBand, RDMA, and NCCL, is a plus
  • Passion for AI and Machine Learning: A deep understanding of machine learning concepts and a passion for staying current with the latest advancements in AI research and engineering


Compensation and Benefits
Benefits

Along with competitive pay, as a full-time Tesla employee, you are eligible for the following benefits from day 1 of hire:
  • Aetna PPO and HSA plans: 2 medical plan options with $0 payroll deduction
  • Family-building, fertility, adoption and surrogacy benefits
  • Dental (including orthodontic coverage) and vision plans, both with options at a $0 paycheck contribution
  • Company-paid Health Savings Account (HSA) contribution when enrolled in the High Deductible Aetna medical plan with HSA
  • Healthcare and Dependent Care Flexible Spending Accounts (FSA)
  • LGBTQ+ care concierge services
  • 401(k) with employer match, Employee Stock Purchase Plans, and other financial benefits
  • Company-paid basic life, AD&D, short-term and long-term disability insurance
  • Employee Assistance Program
  • Sick and vacation time (flex time for salaried positions), and paid holidays
  • Back-up childcare and parenting support resources
  • Voluntary benefits, including critical illness, hospital indemnity, accident insurance, theft & legal services, and pet insurance
  • Weight Loss and Tobacco Cessation Programs
  • Tesla Babies program
  • Commuter benefits
  • Employee discounts and perks program


Expected Compensation

$116,000 - $360,000/annual salary, depending on level + cash and stock awards + benefits

Pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. The total compensation package for this position may also include other elements dependent on the position offered. Details of participation in these benefit plans will be provided if an employee receives an offer of employment.

Created: 2024-09-20
Reference: 224526
Country: United States
State: California
City: Palo Alto

