ID
#49579707
Job type
Contract
Source
Strategic Staffing Solutions
Date
2023-03-29
Deadline
2023-05-28

Site Reliability Engineer (Onsite)

Illinois, Chicago, 60015

Vacancy expired!

STRATEGIC STAFFING SOLUTIONS (S3) HAS AN OPENING!

S3 is seeking a Site Reliability Engineer for one of our partners in the Finance industry. Candidates MUST BE local to the Riverwoods/Chicago, IL area.

Job Title: Site Reliability Engineer (Onsite)

Location: Riverwoods, IL

Role Type: W2 Only, No C2C

Contract Length: 6 months, contract to hire

Pay Rate: $60.00-$80.00/hr.

How to Apply: Please send resume and contact information to Keena Leo, Sourcing Specialist, at

and reference job #220503.

JOB DESCRIPTION/RESPONSIBILITIES:

Develop and run the SRE team's own tooling and observability using automation such as CI/CD and Kubernetes.
Build monitoring that alerts on symptoms rather than on outages.
Document every action so your findings turn into repeatable actions and then into automation.
Debug production issues across services and levels of the stack.
Plan the growth and reliability of services.
Be on an on-call rotation to respond to “Code Red” incidents and help restore customer-impacting services.
Build automation such as CI/CD, self-healing of services, and end-to-end or performance testing
Improve monitoring (Datadog, AppD, etc.) and build new smart metrics
Develop a relationship with a product group and help define their SLOs/SLIs
Work directly with AppDev to improve the product's non-functional qualities and production readiness
Improve operability, latency, capacity planning, change management, and MTTR (Mean Time to Repair)
Lead and contribute to the scope and design of issues, epics, and OKRs (Objectives and Key Results)
Contribute to the Handbook, create and update runbooks and general documentation, and write blogs
Complete Root Cause Analysis (RCA) investigations and perform readiness reviews
Improve team practices through code reviews and handoffs of work and incidents
Provide emergency response, either by being on-call or by reacting to symptoms according to monitoring, and escalate when needed
Propose ideas and solutions to debug and optimize code and to automate tasks.
Plan, design, and execute solutions to reach specific goals agreed within the team.
Plan and execute configuration change operations both at the application and the infrastructure levels.
Actively look for opportunities to improve the availability and performance of the system by applying the learnings from monitoring and observation
Experience designing, analyzing, and debugging distributed systems

You may be a fit for this role if you have some of these inclinations:

Have an urge for delivering quickly and effectively and iterating fast.
Think about systems: edge cases, failure modes, behaviors, specific implementations.
As an engineer, when you see something broken, you cannot help but fix it.
Have an urge to document all the things so you do not need to learn the same thing twice.
Strong knowledge of SDLC (System Development Life Cycle)
Strong knowledge of git, Docker, Kubernetes, Jenkins, AWS (Amazon Web Services) or similar technologies
Know how to use configuration management systems like Chef or Ansible
Have strong programming skills in one or more of the following languages: C, Ruby, Python, Java
Good understanding of hybrid infrastructure

REQUIRED SKILLS/QUALIFICATIONS:

Configuration management: experience with Chef and Ansible to effectively manage infrastructure
Infrastructure as code: experience with Terraform and GitLab CI/CD for automation, containerized environments (Kubernetes), and cloud technologies
Systems: manage, configure, and troubleshoot operating system issues, storage (block and object), and networking (VPC (Virtual Private Cloud), proxies, and CDN (Content Delivery Network)); administer high-availability PostgreSQL and Redis clusters
Monitoring and instrumentation: implement metrics in Prometheus and Grafana, log management and related systems, and Slack/PagerDuty integrations
Engineering practices: availability, reliability, and scalability, as well as disaster recovery
Use git and contribute code to git repositories
Experience coding in one or more of the following languages: C, Ruby, Python, Shell, Java
Planning: familiar with agile methodologies; use epics and issues to drive projects
Organization: workload organization, OKR (Objective and Key Result) leadership
Management: a manager of one, able to self-organize and report asynchronously
General knowledge of 4 technical expertise areas, with deep knowledge in 1 area:
- AWS Cloud Practitioner, resources provisioning and configuration through CLI/API
- Chef (basic syntax, recipes, cookbooks) or Ansible (basic syntax, tasks, playbooks)
- Working knowledge of CI/CD, Jenkins, Nexus, pipelines, jobs
- Kubernetes basic understanding, CLI (Command Line Interface), service re-provisioning
- Provision and set up metrics in AppD, Grafana, or Datadog
- Provision and set up logs and queries for frequent questions
- Networking: VPC, proxies, and CDN (Content Delivery Network)
Working knowledge of git
Mandatory: 5+ years experience and a BE/B.Sc

The S3 Difference

The global mission of S3 is to build trusting relationships and deliver solutions that positively impact our customers, our consultants, and our communities.

The four pillars of our company are to:

Set the bar high for what a company should do
Create jobs
Offer people an opportunity to succeed and change their station in life
Improve the communities where we live and work through volunteering and charitable giving

As an S3 employee, you’re eligible for a full benefits package that may include:

Medical Insurance
Dental Insurance
Vision Insurance
401(k) Plan
Vacation Package
Life & Disability Insurance Plans
Flexible Spending Accounts
Tuition Reimbursement
