AIML - Senior Software Engineer, Model Optimization Library, On-Device Machine Learning

Cupertino, California


Employer: Apple
Industry: Machine Learning and AI
Salary: Competitive
Job type: Full-Time

Summary
Are you excited about the impact that compressing deep learning models can have on enabling transformative user experiences? The field of ML compression research continues to grow rapidly, and new techniques for quantization, pruning, etc. are increasingly available to be ported and adopted by ML developers looking to ship more models within a constrained memory budget and make them run faster. While there are thousands of ML models across Apple's internal and external apps, running locally on millions of Apple devices, only a small fraction of these currently make use of such optimization techniques. Bridging this gap is the mission of this team. We work on a Python library that implements a variety of training-time and post-training quantization algorithms, provides them to developers as simple-to-use, turnkey APIs, and ensures that these optimizations work seamlessly with the Core ML inference stack and Apple hardware.

We are a small core team that collaborates heavily with researchers at Apple and with the ML deployment, ML compiler, and hardware architecture teams. If you are excited about making a big impact and playing a critical role in growing the user base and driving the adoption of a relatively new framework, this is a great opportunity for you.

We are looking for someone who is highly self-motivated. If you are passionate about writing modular, extensible code and clean, simple-to-use APIs, and have a proven track record of shipping software that has been adopted by a large base of developers, we strongly encourage you to apply for this role.



Description
We work on developing, prototyping, and productizing state-of-the-art algorithms for neural network model compression. Our algorithms are implemented using PyTorch, and optimizations are geared towards efficient deployment via Core ML. We optimize models across domains, including NLP, vision, text, and large generative models. Our APIs are available to Core ML users, both internal to Apple and external developers, via the Core ML Tools optimization submodule. As a successful engineer on our team, you will:

Implement the latest algorithms from research papers for model compression
Set up training jobs, datasets, evaluation, and performance benchmarking pipelines
Run detailed experiments to profile algorithms on various models and across different sizes, and maintain model cards
Collaborate with ML practitioners across the company to co-develop and implement algorithms customized for Apple hardware
Manage Python releases and API backward compatibility
Provide user support via various channels, including engagement via open source GitHub
Expand documentation with examples and benchmark data to drive user adoption and address user pain points
Self-prioritize and adjust to changing priorities and asks
Be a team player who enjoys doing things that make the overall project successful and help others on the team excel. This includes working on detailed reviews of PRs and technical docs, bug fixes, test infra, etc.




Created: 2024-09-29
Reference: 200565770
Country: United States
State: California
City: Cupertino

About Apple

Founded in: 1976
Number of Employees: 154000

