Senior Site Reliability Engineer
Seattle, Washington
Summary
The Apple Services Engineering team is one of the most exciting examples of Apple's long-held passion for combining art and technology. Join the Apple Services Engineering Cloud Service Infrastructure team as a Site Reliability Engineer to help support and scale cloud services for millions of Apple users.
We build and support new and existing critical infrastructure systems and frameworks that provide services such as structured and unstructured storage, caching, queueing, and search at hyperscale. These form the platform upon which many iCloud and other backend systems at Apple are built. The team is responsible for the next-generation platform that will power Apple's infrastructure services. These services operate at extremely large scale and store exabytes of data. The platform will support a variety of services based on open-source software, such as Kubernetes, Cassandra, ZooKeeper, Kafka, and Redis, alongside internally developed services.
Key Qualifications
Strong emphasis on SRE as an engineering subject area, with proficiency in at least one of the following languages: Golang, Rust, Python, Swift
Successful track record and proven experience as a backend internet services software developer
Knowledge of SDLC, including continuous integration, testing methodologies, TDD and agile development methodologies
Understanding of base internet infrastructure services, including DNS, DHCP, LDAP, server virtualization, and server monitoring, plus experience with critical, large-scale distributed systems that combine hardware, operating systems, and software
Understanding of SRE principles, including monitoring, alerting, error budgets, fault analysis, and other common reliability engineering concepts, with a keen eye for opportunities to eliminate toil through code and process improvements
Description
The Apple Services Engineering Cloud Services SRE organization is looking for a strong, enthusiastic developer to join as a member of this group. This person will have a tremendous amount of individual responsibility and influence over the direction the core platform of many critical Apple internet services takes for years to come. You are someone with ideas and real passion for software delivered as a service to improve reuse, efficiency, and simplicity. This engineer's work will affect hundreds of millions of users and be essential to the success of some of the most visible current and future Apple features.
We are domain experts in fleet management, systems, and software engineering. We build automations, instrument reliability tools, and respond to alerts and incidents that may pose a risk to the reliability of the platform. The team's focus is on infrastructure capabilities and processes, improving the reliability and efficiency of the systems at scale.
Desired Skills:
• Experience with large-scale server provisioning and maintenance (OpenStack Ironic, Metal3, MAAS, xCAT, NetBox, Tinkerbell)
• Experience with development within the Kubernetes ecosystem, including the operator framework, controllers, and CRDs
• Experience with UI frameworks such as React, Angular, or jQuery
Some exposure to the following:
• Hardware bootstrap and associated security (PXE, BIOS, TPM, secure boot, trusted computing)
• Structured or unstructured storage and caching
• Automating operations processes via services and tools
• Configuration management and fleet orchestration via Puppet, Chef, Ansible, or others
• Cloud Services (AWS S3/EC2/CloudFront or equivalent)
Education & Experience
Bachelor's or Master's in Computer Science, Computer Engineering, or equivalent experience.
Created: 2024-04-19
Reference: 200536456
Country: United States
State: Washington
City: Seattle
ZIP: 98109
About Apple
Founded in: 1976
Number of Employees: 154000
Website: https://www.apple.com/
Career site: https://www.apple.com/careers/us/
Wikipedia: https://en.wikipedia.org/wiki/Apple_Inc.
Instagram: https://www.instagram.com/apple/
LinkedIn: https://www.linkedin.com/company/apple