Atlassian Services Site Reliability Engineer
San Diego, California
Summary
The Atlassian Services Site Reliability Engineer (SRE) role resides within the Software Delivery organization, which is at the core of the Apple software release process. This role is responsible for applying SRE practices to maintain the Atlassian services that software engineers and project managers use to develop Apple software for delivery to customers around the world. The Atlassian Services team drives reliability and performance engineering of data center applications, instruments observability of services, responds to incident alerts, and reports on SLI/SLO metrics for visibility across the organization. This SRE role is essential in maintaining the production systems of Bitbucket, Confluence, and Jira that are used to deliver state-of-the-art operating systems, applications, and firmware to Apple customers.
Key Qualifications
Passion for building reliable, scalable, and performant distributed systems
Understanding of distributed systems w.r.t. application, networking, and security
SRE or DevOps experience in managing customer-facing systems in a 24/7 environment
Experience in managing and monitoring fleets of *nix systems or container platforms
Excellent judgment and integrity with ability to make timely and sound decisions
Ability to anticipate the needs of others and adapt to changing conditions
Description
Responsibilities of an Atlassian Services Site Reliability Engineer include:
- Configure and monitor on-prem and cloud-based dependencies
- Automate continuous integration (CI) and continuous delivery (CD) pipelines
- Maintain staging and production environments with the goal of maximizing uptime
- Implement observability of systems for monitoring, alerting, and metrics reporting
- Generate reports on service metrics for performance, availability, and reliability
- Champion practices regarding change control management and incident response
A successful Atlassian Services Site Reliability Engineer will be expected to:
- Proactively communicate status of Atlassian services to stakeholders and follow through on time-sensitive tasks
- Demonstrate willingness to ask for clarification and increase awareness of the larger context
- Explore solutions to problems, evaluate risk vs. reward, then execute the best approach
- Communicate asynchronously with a global team across multiple time zones
- Document new processes or update existing documentation pages
- Show eagerness and curiosity to learn across multiple technology stacks
Education & Experience
B.S. in Computer Science or related work experience
Additional Requirements
Desired, but not required, skills and experience:
- Experience as an SCM administrator (e.g. GitHub, or similar)
- Experience with container platforms (e.g. Docker, or similar)
- Experience with monitoring and alerting (e.g. Prometheus, Grafana, or similar)
- Experience with data analysis (e.g. Splunk, or similar)
Created: 2024-05-31
Reference: 200552021
Country: United States
State: California
City: San Diego
ZIP: 92109
About Apple
Founded in: 1976
Number of Employees: 154000
Website: https://www.apple.com/
Career site: https://www.apple.com/careers/us/
Wikipedia: https://en.wikipedia.org/wiki/Apple_Inc.
Instagram: https://www.instagram.com/apple/
LinkedIn: https://www.linkedin.com/company/apple