Site Reliability Engineer
San Francisco, California
Swish Analytics is a sports analytics, betting, and fantasy startup building the next generation of predictive sports analytics data products. We believe that oddsmaking is a challenge rooted in engineering, mathematics, and sports betting expertise, not intuition. We're looking for team-oriented individuals with an authentic passion for accurate and predictive real-time data who can execute in a fast-paced, creative, and continually evolving environment without sacrificing technical excellence. Our challenges are unique, so we hope you are comfortable in uncharted territory and passionate about building systems to support products across a variety of industries and enterprise clients.
About the team
The Swish Analytics DevSecOps and Infrastructure team is looking for an experienced Site Reliability Engineer who will support our enterprise infrastructure. In addition, you will help optimize incident response and observability, and work with technical teams to improve overall workload resiliency.
Responsibilities
- Support production systems and help triage issues during live sporting events
- Monitor the system and respond to incidents to maintain system SLOs/SLAs; review and follow up on production incidents
- Write and review code, develop documentation, and debug problems, live, on complex distributed systems
- Optimize and facilitate incident response, conduct root cause analysis and blameless retrospectives
- Work closely with technical teams to implement, optimize, maintain, scale and debug workloads on Kubernetes using CI/CD, automation tools and scripting languages to deliver tools/software to improve the reliability and scalability of services
Requirements
- 3+ years of experience working in SRE-leaning DevOps or full SRE roles
- 3+ years building CI/CD pipelines with GitHub Actions, GitLab CI/CD, or similar
- Extensive experience with Kubernetes
- Experience managing customer-facing systems in a 24/7 environment, including escalations
- Experience with triage and escalation policies/protocols
- Strong communication and documentation skills
- Comfortable with scripting languages like Bash, Python, or similar
- Networking and routing experience
- Terraform in AWS to support global-scale services
- Improving observability in an engineering organization
- Past experience with PagerDuty or similar tools
Created: 2024-10-05
Reference: jsSLuLSNSpt3
Country: United States
State: California
City: San Francisco
ZIP: 94130