ID
#46074828
Job type
Contract
Salary
$40 - $50
Source
Talent Hires
Date
2022-09-28
Deadline
2022-11-26

Site Reliability Engineer - Onsite (Atlanta, GA and St. Louis, MI)

Georgia, Atlanta, 30301

Contract

Vacancy expired!

Hi, Greetings from Humetis

Job Title: Site Reliability Engineer

Location: Atlanta, GA (Onsite); St. Louis, MI (Onsite)

Job Description

Experience: 3 to 5 years

Responsibilities as a Site Reliability Engineer are as follows:

Objectives of this Role

- Run the production environment by monitoring availability and taking a holistic view of system health.
- Build software and systems to manage platform infrastructure and applications.
- Improve reliability, quality, and time-to-market of our suite of software solutions.
- Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve.
- Provide primary operational support and engineering for multiple large, distributed software applications.
- Influence and design infrastructure, architecture, standards, and methods to build large-scale systems.
- Architect strategies and implementation plans to support end-to-end monitoring, alerting, troubleshooting, and dashboard development to ensure SLAs are met and notifications are proactive.
- Identify business problems through deep-dive discussions with business owners and devise suitable solutions using SRE principles, CI/CD, virtual services, and SDLC methodologies.
- Explore and work on the latest DevOps, Infrastructure as Code, Site Reliability, and Cloud technologies to manage continuous integration and delivery, configuration, and infrastructure orchestration.
- Engage with Product Managers and Scrum Masters to identify dependencies and risks, and suggest remediation with a corrective action plan via sustainable, preventive, and automated reconciliation.
- Work with development and operations teams to build highly available, cost-effective systems with extremely high uptime metrics.
- Guide the team on reliability practices through activities like architecture reviews, code reviews, creating platforms and frameworks, capacity planning, and chaos testing.
- Automate system scalability and recovery, and continually work to improve system resiliency, performance, and efficiency.
- Facilitate blameless postmortems and proactively identify potential outage factors, feeding both into iterative improvement.
- Contribute to developing Service Level Objectives (SLOs), identifying Service Level Indicators (SLIs), and setting error budgets based on organization-level SLAs.

Required Skills and Qualifications

- Bachelor's degree in computer science or another highly technical, scientific discipline.
- Ability to program (structured and OO) in one or more high-level languages, such as Python, Java, C/C++, Ruby, and JavaScript.
- Experience with distributed storage technologies such as NFS, HDFS, Ceph, and S3, as well as dynamic resource management frameworks (Mesos, Kubernetes, YARN).
- A proactive approach to spotting problems, areas for improvement, and performance bottlenecks.

Preferred Qualifications

- Previous success in technical engineering
- Coding
