Site Reliability Engineer, Cloud Native Platform

San Jose, California


Employer: TikTok
Industry: R&D
Salary: Competitive
Job type: Full-Time

Responsibilities

TikTok is the leading destination for short-form mobile video. Our mission is to inspire creativity and bring joy. TikTok has global offices including Los Angeles, New York, London, Paris, Berlin, Dubai, Singapore, Jakarta, Seoul and Tokyo.

Why Join Us
At TikTok, our people are humble, intelligent, compassionate and creative. We create to inspire - for you, for us, and for more than 1 billion users on our platform. We lead with curiosity and aim for the highest, never shying away from taking calculated risks and embracing ambiguity as it comes. Here, the opportunities are limitless for those who dare to pursue bold ideas that exist just beyond the boundary of possibility. Join us and make impact happen with a career at TikTok.

Our infrastructure team is seeking experienced site reliability engineers to build a globally distributed edge platform for provisioning and deploying edge services. Our team operates a large network of edge POPs around the world to accelerate site traffic and cache CDN content. We use Kubernetes to manage on-prem and cloud nodes and build an ecosystem around it, including tools for monitoring, alerting, logging, and CI/CD, as well as various services with automated deployment and scaling, to maximize day-to-day operational efficiency. On top of the Kubernetes infrastructure, we build the edge computing platform (PaaS) to help deploy and manage global edge services.

Responsibilities
• Deploy and administer Kubernetes clusters both on-prem and in the cloud (AWS, GCP, etc.).
• Collaborate with software engineers to build an enterprise-level edge computing platform (PaaS) with cutting-edge Cloud Native Computing Foundation (CNCF) technologies.
• Design, develop, automate, and continuously improve platform services and pipelines, such as monitoring, alerting, logging, tracing, and CI/CD.
• Improve Kubernetes system efficiency and debug issues related to networking, storage, scheduling, etc.
• Collaborate with open-source communities to advance Kubernetes and edge computing technologies.

Qualifications

Minimum Qualifications
• Master's degree (or Bachelor's degree with 3+ years of experience) in Computer Engineering, Computer Science, or related fields.
• 1+ years of experience in Kubernetes administration.
• 3+ years of experience in Unix/Linux systems from kernel to shell and beyond.
• Experience with Kubernetes CNI deployment and troubleshooting, including (but not limited to) the following CNIs: Cilium, Kube-Router, Calico, Flannel.
• Experience in designing, analyzing, and building automation tools for large-scale, complex systems.

Preferred Qualifications
• CKA (Certified Kubernetes Administrator) certification.
• Experience in using and contributing to open-source projects in the Kubernetes ecosystem, e.g. Kubespray, CNI, Helm, Istio/Linkerd, Prometheus, ArgoCD, OPA, Harbor, Envoy.
• Experience in networking technologies such as TCP/IP, BGP, DNS, and load balancers.
• Experience in CI/CD pipeline design and development.
• Experience in Kubernetes API, Operator, and Custom Resource Definition (CRD) development.

TikTok is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe and so does our workplace. At TikTok, our mission is to inspire creativity and bring joy. To achieve that goal, we are committed to celebrating our diverse voices and to creating an environment that reflects the many communities we reach. We are passionate about this and hope you are too.

TikTok is committed to providing reasonable accommodations during our recruitment process. If you need assistance or accommodation, please reach out to us at USRC@tiktok.com.

Created: 2024-05-18
Reference: DVVP
Country: United States
State: California
City: San Jose
ZIP: 95118

