ID: #49579707
Job type: Contract
Salary: USD0 - USD0
Source: Strategic Staffing Solutions
Date: 2023-03-29
Deadline: 2023-05-28
Site Reliability Engineer (Onsite)
Chicago, Illinois 60015, USA (Contract)
Vacancy expired!
- Develop and run the SRE team's own tooling and observability using automation such as CI/CD and Kubernetes.
- Build monitoring that alerts on symptoms rather than on outages.
- Document every action so your findings turn into repeatable actions and then into automation.
- Debug production issues across services and levels of the stack.
- Plan the growth and reliability of services.
- Join an on-call rotation to respond to “Code Red” incidents and help restore customer-impacting services.
- Build automation such as CI/CD, self-healing of services, and end-to-end or performance testing
- Improve monitoring (Datadog, AppD, etc.) and build new smart metrics
- Develop a relationship with a product group and help define their SLOs/SLIs
- Work directly with AppDev to improve products through non-functional requirements and production readiness
- Improve operability, latency, capacity planning, change management, and MTTR (Mean Time to Repair)
- Leading and contributing to scope and designs for issues, epics, and OKRs (Objectives and Key Results)
- Contributing to the Handbook, creating and updating runbooks and general documentation, and writing blog posts
- Completing Root Cause Analysis (RCA) investigations and performing readiness reviews
- Improving team practices through code reviews, handoffs of work and incidents
- Provide emergency response, either by being on call or by reacting to symptoms surfaced by monitoring, escalating when needed
- Propose ideas and solutions to debug and optimize code and to automate tasks.
- Plan, design, and execute solutions to reach specific goals agreed within the team.
- Plan and execute configuration change operations both at the application and the infrastructure levels.
- Actively look for opportunities to improve the availability and performance of the system by applying the learnings from monitoring and observation
- Experience designing, analyzing, and debugging distributed systems
- Have an urge for delivering quickly and effectively and iterating fast.
- Think about systems: edge cases, failure modes, behaviors, specific implementations.
- As an engineer, when you see something broken, you cannot help but fix it.
- Have an urge to document all the things so you do not need to learn the same thing twice.
- Strong knowledge of SDLC (System Development Life Cycle)
- Strong knowledge of git, Docker, Kubernetes, Jenkins, AWS (Amazon Web Services) or similar technologies
- Know how to use configuration management systems such as Chef or Ansible
- Have strong programming skills in one or more of the following languages: C, Ruby, Python, Java
- Good understanding of hybrid infrastructure
- Configuration management: experience with Chef and Ansible to effectively manage infrastructure
- Infrastructure as code: experience with Terraform and GitLab CI/CD for automation, containerized environments (Kubernetes), and cloud technologies
- Systems: manage, configure, and troubleshoot operating systems, storage (block and object), and networking, including VPC (Virtual Private Cloud), proxies, and CDN (Content Delivery Network); administer high-availability PostgreSQL and Redis clusters
- Monitoring and instrumentation: implement metrics in Prometheus and Grafana, log management and related systems, and Slack/PagerDuty integrations
- Engineering practices: availability, reliability, and scalability, as well as disaster recovery
- Use git and contribute code to git repositories
- Experience coding in one or more of the following languages: C, Ruby, Python, Shell, Java
- Planning: familiar with agile methodologies; use epics and issues to drive projects
- Organization: workload organization, OKR (Objectives and Key Results) leadership
- Management: a manager of one, able to self-organize and report asynchronously
- General knowledge of 4 technical expertise areas, with deep knowledge in 1 area:
- AWS Cloud Practitioner, resources provisioning and configuration through CLI/API
- Chef (basic syntax, recipes, cookbooks) or Ansible (basic syntax, tasks, playbooks)
- Working knowledge of CI/CD, Jenkins, Nexus, pipelines, jobs
- Kubernetes basic understanding, CLI (Command Line Interface), service re-provisioning
- Provision and set up metrics in AppD, Grafana, or Datadog
- Provision and set up logs and queries for frequent questions
- Networking VPC, proxies and CDN (Content Delivery Network)
- Working knowledge of git
- Mandatory: 5+ years of experience and a BE/B.Sc. degree
- Set the bar high for what a company should do
- Create jobs
- Offer people an opportunity to succeed and change their station in life
- Improve the communities where we live and work through volunteering and charitable giving
- Medical Insurance
- Dental Insurance
- Vision Insurance
- 401(k) Plan
- Vacation Package
- Life & Disability Insurance Plans
- Flexible Spending Accounts
- Tuition Reimbursement