Staff DevOps Engineer, gRED - Computational Sciences

South San Francisco, California


Employer: ASK Staffing
Industry: 
Salary: Competitive
Job type: Full-Time

Job Title: Staff DevOps Engineer, gRED - Computational Sciences

Location: South San Francisco, CA 94080

Duration: 12-month contract

Summary:

Computational technologies are increasingly a core part of drug discovery and development. We are on a mission to leverage big data, massive computing power, and advanced AI technologies to provide far more therapies at far less cost to society.
The AI & Cloud Engineering (ACE) group within the gRED Computational Sciences department is seeking a Cloud DevOps Engineer reporting to the Cloud Infrastructure Tech Lead. The ACE group provides cloud computing infrastructure for a wide range of business and research use cases, including data engineering platforms, artificial intelligence, and business applications.
The incumbent will be responsible for designing, developing, and managing the security architecture of our cloud-based systems and services. This role requires a deep understanding of security principles and controls, familiarity with cloud security and operations tooling, and experience working with various cloud platforms (AWS, Azure, GCP), all of which contribute significantly to the advancement of the Client's drug development pipeline. The role will require cross-functional interaction and collaboration with various business and technology partners to influence and execute the gRED ACE Cloud & AI Infrastructure strategy.
Responsibilities include:
Utilize DevOps principles and practices to deploy, operate, and manage AWS Well-Architected infrastructure to facilitate workflows and applications for Data Scientists and ML Engineers.
Design and manage CI/CD processes, focusing on high availability, low latency, and scalability with zero-downtime goals.
Develop and refine automation tools for efficient monitoring, alerting, and logging in large-scale environments, integrating AWS services with third-party monitoring and analytics tools.
Collaborate with Data Science and Engineering teams to convert requirements into technical solutions.
Serve as a primary Site Reliability Engineer (SRE), optimizing cloud architecture, addressing ad hoc requests, and coordinating with third-party support for performance and uptime goals.

The Staff DevOps Engineer must have:
4 years of DevOps experience, with a minimum of 3 years specializing in AWS solutions architecture, platform engineering, or site reliability engineering.
Well-versed in building and maintaining cloud infrastructure according to AWS "Well-Architected" principles.
Hands-on experience with AWS services such as IAM, VPC, EKS, EC2, S3, SageMaker, MWAA, Redshift, and Lambda.
Skilled in the deployment, management, and orchestration of applications using Kubernetes, including containerization support for internal development teams.
Strong grasp of Infrastructure as Code principles using Terraform.
Adept at creating and managing CI/CD pipelines.
Demonstrated ability in deploying, managing, and integrating observability stacks.
Solid experience in Linux-based distributed production environments.
Familiarity with Single Sign-On (SSO) technologies, such as SAML and OIDC.
Competency in Bash scripting and coding, preferably in Python or Go.
Exceptional interpersonal and communication skills, adept at fostering collaborative environments, influencing without authority, and building strong internal and external relationships.
Prior experience in MLOps and data science enablement is a plus.
Knowledge of observability (monitoring, logging and tracing) implementation and administration is a plus.
Familiarity with Zero Trust security principles is a plus.
AWS, HashiCorp, or Kubernetes certifications are a plus.

Created: 2024-05-04
Reference: 232474
Country: United States
State: California
City: South San Francisco