Senior Site Reliability Engineer - Apple Services Engineering (ASE)
Santa Clara Valley (Cupertino), California
Summary
Do you love engineering and running systems and infrastructure that will delight millions of customers? Imagine what you could do here. At Apple, new ideas have a way of becoming extraordinary products, services, and customer experiences very quickly! Bring passion and dedication to your job, and there's no telling what you could accomplish.
The iCloud SRE team is looking for a Service Reliability Engineer (SRE) to design and build tools and to support our large-scale content system for iCloud. The best candidates will have proven software development skills and strong Linux / Systems expertise, understand SRE, and know what it will take to run services at Apple scale with high operational precision. We play a critical role in the day-to-day operations of services relied upon across Apple while partnering with engineering teams to ensure everyone is successful!
Key Qualifications
3+ years in a Site Reliability Engineering, DevOps, or infrastructure-focused role
Experience supporting internet-facing production services and distributed systems
Ability to implement and coordinate telemetry using monitoring and observability tools such as Splunk, Grafana, and Prometheus
Coding experience using a high-level programming language like Java, Golang, or Python
Familiarity with cloud infrastructure concepts (region, AZ, VPC, LB, compute, etc.) and experience with Infrastructure as Code (IaC) processes
Experience with GitOps, CI/CD tools, and deployment strategies
Experience with configuration management solutions such as Puppet, Ansible, or SaltStack and orchestration solutions like Kubernetes or Mesos
General understanding of security and networking concepts
Description
Our team is collaborative; we work closely with partner teams to deliver the best results for Apple. We strive to balance the best solution with the need to get things done for each engineering challenge we face. Good ideas are heard, and results are rewarded.
As an SRE at Apple, you will:
• Operate, monitor, and triage all aspects of our production and non-production environments.
• Pioneer and implement the next-generation telemetry system.
• Prepare alert-handling procedures and runbooks, and collaborate with the offshore SRE teams.
• Automate deployment and orchestration of services into the cloud environment as well as other routine processes.
• Actively participate in capacity planning, scale testing, and disaster recovery exercises.
• Interact with and support partner teams, including engineering, QA, and program management.
• Cultivate and maintain relationships with internal and external third-party vendors.
Education & Experience
Bachelor's Degree in Computer Science, an engineering-related field, or equivalent related experience. Advanced Degree preferred.
Additional Requirements
- Strong verbal and written communication skills
- Automation advocate - you truly believe in removing operational load via software.
- A strong sense of ownership. At the same time, you're a great teammate who communicates clearly and transparently
- Self-motivated, inquisitive, and always looking to learn more.
- Experience managing, scaling, and troubleshooting Java applications
Created: 2024-06-05
Reference: 200503087
Country: United States
State: California
City: Santa Clara Valley (Cupertino)
About Apple
Founded in: 1976
Number of Employees: 154000
Website: https://www.apple.com/
Career site: https://www.apple.com/careers/us/
Wikipedia: https://en.wikipedia.org/wiki/Apple_Inc.
Instagram: https://www.instagram.com/apple/
LinkedIn: https://www.linkedin.com/company/apple