Site Reliability Engineer - Lead
Atlanta, Georgia
Employer: Saxon Global
Industry:
Salary: Competitive
Job type: Part-Time
Job Description:
Synopsis of the role:
Seeking creative, high-energy, diverse and driven software engineers with hands-on development skills to work on a variety of meaningful projects. Our software engineering positions provide you the opportunity to join a team of talented engineers working with leading-edge technology. You are ideal for this position if you are a forward-thinking, committed, and enthusiastic software engineer who is passionate about technology.
What you'll do:
• Work with teams across the organization to ensure the reliability of core services and keep an eye on capacity and performance.
• Run blameless postmortems and proactively identify potential outages, factoring the findings into iterative improvement.
• Work closely with development and operations teams to build highly available, cost-effective systems with extremely high uptime metrics.
• Hands-on experience configuring and administering SCM (Git, SVN), build tools (CMake, Makefiles, Maven), Nexus, CI (Jenkins), and CD automation tools.
• Establish end-to-end monitoring and alerting on all critical aspects of all systems to ensure SLAs are met and to get proactive notification of possible issues.
• Work with the cloud operations team to resolve trouble tickets, develop and run scripts, and troubleshoot.
• Participate in 24x7x365 on-call support for multiple core platforms globally. Using a "Follow the Sun" model, we expect working patterns to include on-call duty and weekend and holiday-season cover.
• Participate in release cycles of our offerings, deploying code to integration, staging and production environments, integrating with continuous integration (CI) and continuous delivery (CD) tools, monitoring, and change management
• Build automation: work with Agile development teams to ensure smooth promotion of code, configuration, and Docker images to production
• Oversee and adapt monitoring and alerting systems. Interact with automated monitoring and healing infrastructure to ensure healthy environments
• Develop automation to auto-correct or completely prevent issues in our solutions
• Perform software updates, peer code reviews, testing, and Common Vulnerabilities and Exposures (CVE) analysis; respond to security threats
• Identify single points of failure and other high-risk architecture issues; propose and implement more resilient resolutions
• Create and maintain standard operating procedures (SOPs) for performing maintenance tasks, applying configuration changes, and remediating problems in our environment
• Identify potential process improvements across the entire engineering organization
• Define and drive architectural enhancements into the system to mitigate potential failure points
• Provide impact assessment and mitigation plan for changes going into the production environment
• Investigate the root cause of severe and systemic outages and identify corrective actions
• As we transition to the public cloud (Google or AWS), create new build and deployment patterns.
What experience you need:
• A minimum of 10 years of experience as a Developer/Lead/Architect.
• Bachelor's degree in Computer Science, Information Management, or another STEM major
• Experience with configuring, customizing, and extending monitoring tools (AppDynamics, Apica, Sensu, Grafana, Prometheus, Graphite, Splunk, Zabbix, Nagios, etc.)
• 10+ years' experience with all stages of an agile software development lifecycle (CI/CD) supporting Java/JavaScript UI applications (e.g., AngularJS) and SaaS applications.
• 5 years of experience building Java EE applications using tools like Maven/Ant, Subversion, JIRA, Jenkins, Bitbucket, and Chef
• 8+ years' experience in continuous integration tools (Jenkins, SonarQube, JIRA, Nexus, Confluence, Git/Bitbucket, Maven, Gradle); Rundeck is a plus
• 3+ years' experience with configuration management and automation (Ansible, Puppet, Chef, Salt)
• 3+ years' experience deploying and managing infrastructure on public clouds (AWS, GCP, Azure, or Pivotal)
• 3+ years' experience working with Kubernetes and related applications.
• Experience working with Nginx, Tomcat, HAProxy, Redis, Elasticsearch, MongoDB, RabbitMQ, Kafka, and ZooKeeper.
• 3+ years' experience in Linux environments (CentOS).
• Knowledge of TCP/IP networking, load balancers, high-availability architecture, and zero-downtime production deployments. Comfortable with network troubleshooting (tcpdump, routing, proxies, firewalls, load balancers, etc.)
• Demonstrated ability to script around repeatable tasks (Go, Ruby, Python, Bash)
• Experience with large-scale cluster management systems (Mesos, Kubernetes)
• Experience with Docker-based containers is a plus
• Able to dive into any level of a modern internet service (schedulers, containers, Linux kernel, caching, object storage, distributed filesystems, RDBMS, NoSQL, etc.)
Required Skills: GCP, Kubernetes, Jenkins, cloud.
Basic Qualification: Looking for a Lead SRE.
Additional Skills: Looking for a Lead SRE.
Background Check: Yes
Drug Screen: Yes
Notes:
Selling points for candidate:
Project Verification Info:
Candidate must be your W2 Employee: Yes
Exclusive to Apex: No
Face to face interview required: No
Candidate must be local: Yes
Candidate must be authorized to work without sponsorship: No
Interview times set: No
Type of project: Development/Engineering
Master Job Title: Eng: Other
Branch Code: St. Louis
Created: 2024-04-30
Reference: SG - 78368
Country: United States
State: Georgia
City: Atlanta
Similar jobs:
- Electrical Reliability Engineer, Koch Industries in Brunswick, Georgia
- AVP, Reliability Engineer, Synchrony Financial in Alpharetta, Georgia
- Reliability/Maintenance Engineer, Gables Search Group in Augusta, Georgia
- Reliability Engineer, INSPYR Solutions in Carrollton, Georgia ($87,000 - $105,000 per year)
- Engineering Reliability Director, Coca-Cola Company in Atlanta, Georgia
- AWS Site Reliability Engineer, NLB Services in Atlanta, Georgia
- Sr. Site Reliability Engineer, Apex Systems in Dunwoody, Georgia
- Site Reliability Engineer, Saxon Global in Alpharetta, Georgia
- Maintenance/Reliability Engineer, Koch Industries in Brunswick, Georgia
- Reliability Engineer - skv92gm0i5rt, Gables Search Group in Carrollton, Georgia
- AWS Site Reliability Engineer, Insight Global in Alpharetta, Georgia
- Site Reliability Engineer IV, LexisNexis Risk Solutions Group in Alpharetta, Georgia
- Reliability Engineer - IPG, Saint Gobain in Athens, Georgia
- Sr. Reliability Engineer, Gables Search Group in Perry, Georgia
- Cloud Site Reliability Engineer, Insight Global in Atlanta, Georgia
- Engineering Group Lead - Reliability, General Dynamics Corporation in Savannah, Georgia
- Site Reliability Engineer IV, LexisNexis Risk Solutions Group in Alpharetta, Georgia