ID: #46046656
Job type: Contract
Salary: $70 - $85
Source: Khayainfotech
Date: 2022-09-27
Deadline: 2022-11-25

Monitoring Lead with Site Reliability Engineer (must be on-site once a month)

New York, New York City, 10001

Contract

Vacancy expired!

Job Title: Monitoring Lead with SRE

Duration: Long-term contract

Location: New York (must be on-site once a month)

Site Reliability Engineer proficient with Splunk and Dynatrace, able to set up alerts and dashboards and tune Splunk queries.

Must Have:

Root Cause Analysis
Dynatrace
Splunk
Prod Support

As part of the Platform Services Program, we are seeking a highly motivated, detail-oriented Senior Software Engineer / Site Reliability Engineer. The position requires you to build and support the API Gateway that provides API traffic routing for the entire enterprise.

Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant API systems. SRE ensures that Client’s API services, both our internally critical and our externally visible systems, have reliability and uptime appropriate to users' needs and a fast rate of improvement. Additionally, SREs keep an ever-watchful eye on our systems' capacity and performance.

This is a high-visibility team that requires you to build scalable, resilient, fault-tolerant, and secure features onto the API Gateway, ensuring we meet our availability and performance SLAs.

Responsibilities

Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement.
Provide guidance to other team members on managing availability and performance of mission critical services, on building automation to prevent problem recurrence, and on building automated responses for non-exceptional service conditions.
Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning, and launch reviews.
Maintain services once they are live by measuring and monitoring availability, latency, and overall system health.
Scale systems sustainably through mechanisms like automation; evolve systems by pushing for changes that improve reliability and velocity.
Practice sustainable incident response and blameless postmortems.
Adhere to Client security standards, change management and quality controls, enabling automations where required.

Minimum Qualifications

Experience working in Computer Science (e.g. networking, distributed systems, infrastructure, cloud)
Experience with Unix/Linux operating systems internals, administration and networking
Experience designing, building, and maintaining production services, and experience analyzing and troubleshooting systems
Experience in building REST, gRPC and SOAP APIs
Understanding of API authentication mechanisms such as OAuth, mTLS, JWT, and SAML
Understanding of PKI, the certificate management lifecycle, and symmetric and asymmetric encryption
Experience with one or more of the following: Java, JavaScript, Python, Lua.
Bachelor's degree in Computer Science, a related technical field involving software/systems engineering, or equivalent practical experience.
A proactive approach to spotting problems, areas for improvement, and performance bottlenecks
Understanding of enterprise workloads
Experience with algorithms and data structures
Ability to debug and optimize code and to automate routine tasks
