ASE - Site Reliability Engineer
Cupertino, California
Summary
The Apple Service Engineering - SRE team is looking for Site Reliability Engineers with experience in developing processes, tools, and automation for managing distributed systems in production environments. Our SRE team combines software and systems engineering and system administration practices to build and run large-scale, massively distributed, fault-tolerant systems. Our software ensures that Apple's services are reliable, scalable and secure, and we leverage both open source and home-grown technologies to provide managed data infrastructure services.
You will help build next-generation search infrastructure and platform services, collaborating cross-functionally with various ASE teams, from store and commerce to search and recommendations. You'll create platforms that can rapidly scale to serve personalized and non-personalized data with very low latencies. You should be someone who is not afraid to question assumptions, is a standout colleague under tight deadlines, and can take on problems with elegant technical solutions.
Key Qualifications
Demonstrated expertise developing database systems, storage engines, distributed systems, or performance engineering.
Experience developing critical internet services and/or platform infrastructure.
Proficient in one or more of the following programming languages: Java, Go (golang), Python.
Optional experience managing services that run on Kubernetes.
Optional experience with EC2, EBS, and Terraform.
Description
The ASE SRE team develops applications and tooling that are safe, reliable, scalable, and fast. This work is difficult and requires an innovative spirit and an extraordinary degree of engineering care. Team members contribute to all major components of the Redis deployment infrastructure, including maintenance automation, the backup service application, monitoring and alerting tooling and dashboards, and deployment architecture, with a focus on stability, performance, and scaling.
Success in this role requires expertise in several of the following:
- Understanding of core SRE concepts (monitoring, alerting, incident management).
- Understanding of database concepts (consistency models, isolation levels, crash and recovery semantics).
- Performance engineering (design concepts, profile-guided optimization).
- Service management across bare-metal, virtualized (EC2), and Kubernetes platforms.
- Fundamentals of system-level hardware and networking components (storage devices and controllers, network interfaces, CPU and memory layout in server-class systems).
- Operating systems concepts (process scheduling, disk and network I/O, performance).
- Datacenter architecture (networking topologies, host placement strategies, and failure modes); design of multi-datacenter systems; failure domains; and wide-area networking.
This role also requires excellent communication and a high degree of customer focus when engaging with internal platform customers. Because the team is distributed, the ability to work effectively with colleagues based in other locations is also essential; prior experience in this area is a plus. Prior experience with development or maintenance of distributed databases or storage systems is recommended.
Apple values craftsmanship, and performance is a key ingredient. Come join us at Apple Services Engineering and help us deliver services and applications that are fluid and responsive. You will collaborate with engineers from across Apple to define metrics, set targets, uncover optimization opportunities, define quality guardrails, and ship a product or service that will delight our customers. This role is for engineers who enjoy deep technical engineering that spans large cross-organizational projects. Your openness to learning and implementing new technologies will contribute to the continuous evolution of our organization.
Education & Experience
BS or MS in Computer Science or a related field, or equivalent work experience.
Created: 2024-10-02
Reference: 200549871
Country: United States
State: California
City: Cupertino
About Apple
Founded in: 1976
Number of Employees: 154000
Website: https://www.apple.com/
Career site: https://www.apple.com/careers/us/
Wikipedia: https://en.wikipedia.org/wiki/Apple_Inc.
Instagram: https://www.instagram.com/apple/
LinkedIn: https://www.linkedin.com/company/apple
Similar jobs:
- Software Build and Release EPM ASE
  Apple in Cupertino, California
- Sr. Software Enginer - Cloud Platform, Kubernetes (ASE)
  Apple in Cupertino, California
- Software Build and Release EPM - ASE
  Apple in Cupertino, California
- Lead Software Engineer - Data Platform (ASE)
  Apple in Cupertino, California
- Mechanic B- ASE Certified
  RATP Dev in Visalia, California
- Senior Software Engineer - Data Infrastructure (ASE)
  Apple in Cupertino, California
- Sr. Engineering Program Manager -Cloud Infrastructure- ASE
  Apple in Cupertino, California
- Senior Data Science Manager, ASE iCloud Data
  Apple in Cupertino, California
- Client Software Engineer - iOS/macOS - ASE
  Apple in Cupertino, California
- Sr. Engineering Program Manager - Apple Service Engineering (ASE)
  Apple in Cupertino, California
- Senior Data Scientist, ASE iCloud Data Organization [Executive Communications]
  Apple in Cupertino, California
- Sr. Software Engineer - Traffic (ASE)
  Apple in Cupertino, California
- Sr. Software Engineer - Data Platform, Flink (ASE)
  Apple in Cupertino, California
- Software Technical Writer, ASE
  Apple in San Diego, California
- Sr. Software Engineer (ASE Framework)
  Apple in Cupertino, California
- Principal Software Engineer - Traffic (ASE)
  Apple in Cupertino, California
- Lead Software Engineer - Data Platform ML (ASE)
  Apple in Cupertino, California
- Site Reliability Engineer - ASE
  Apple in Cupertino, California
- ASE Site Reliability Engineer
  Apple in Cupertino, California
- Sr. Software Engineer - Data Services (ASE)
  Apple in Cupertino, California