  • ID
    #6176737
  • Job type
    Contract
  • Salary
    BASED ON EXPERIENCE
  • Source
    Talent Software Services, Inc
  • Date
    2020-11-27
  • Deadline
    2021-01-26

Site Reliability Engineer

Minneapolis / St. Paul, Minnesota 55401, USA
 
Contract

Vacancy expired!

SITE RELIABILITY ENGINEER

Job Summary: Talent Software Services is in search of a Site Reliability Engineer for a contract-to-hire position in Minneapolis, MN.

Primary Responsibilities/Accountabilities:

  • Take responsibility for designing solutions that correspond to non-functional requirements such as availability, performance, security, and maintainability.
  • Leverage your expertise in coding, algorithms, complex analysis, enterprise incident coordination, and large-scale system design.
  • Model SRE culture of intellectual curiosity, problem solving, openness, collaboration, reasonable risk taking, and big thinking in a self-directed environment.
  • Build highly scalable platforms and fault tolerant systems across a range of technologies
  • Define service level objectives, and drive their adoption and enforcement, at both the service and experience levels
  • Perform root-cause analysis of complex scaling and performance problems that span multiple integrated systems and services, networks, hardware, and software
  • Set standards for deployments at scale, infrastructure reliability and scalability
  • Influence engineering teams through customer focus, world-class quality, effective communication, decisive and fast-moving solutions, and quick, constructive resolution of conflicts
  • Manage service availability and scalability through process, tools, and automation
  • Perform post-mortems and optimize incident response processes
  • Lead incident response for production incidents: drive investigation, analysis, and troubleshooting to resolve them, and systematically drive down detection and mitigation times
  • Bring a strong engineering focus to operations, putting your energy into incident prevention, automation frameworks, self-service infrastructure, logging and metrics, and operational scorecards
  • Develop CI/CD processes to improve cadence
  • Identify or utilize existing tools for logging, monitoring, event management, notification, runbook automation, and root cause analysis
  • Partner with security engineers to develop plans and automation that aggressively and safely respond to new risks and vulnerabilities.
  • Develop, communicate, and monitor standard processes to promote the long-term health and sustainability of operational development tasks.
  • Define and propose SRE tasks, expectations, training, and measurable outcomes to establish the foundations for SRE client.
  • Define and propose an implementation approach for establishing PIM, based on ITIL 4. This should include tasks, expectations, training plans, and measurable outcomes.
  • Actively participate in the definition and adoption of cloud migration strategies for high reliability.
  • Lead the definition of requirements and implementation of processes and procedures for operations management in a hybrid cloud environment.
Qualifications:
  • 2+ years of experience related to IT Site Reliability Engineering, such as configuration, monitoring, information management, AIOps, DevOps, technical architecture, cloud management systems, ITOM/HDIM, incident coordination, or other components of experience-centric operations.
  • Experience:
    • Experience building and maintaining application stacks in a hybrid cloud environment; expertise with Microsoft Azure is a plus.
    • Thought leader and mentor for internal and external technical talent
    • 3-5 years or more building and scaling distributed systems leveraging web scale technologies like Linux, Apache, MongoDB, Python, Oracle RDBMS, Redis, Postgres and Hadoop
    • Experience with Linux/Unix internals and system services like DNS, DHCP, TFTP, iptables, and SMTP, as well as networking protocols such as TCP, UDP, and HTTP.
    • Programming experience in one or more of the following languages: Go, Java, Python, Ruby, Shell, PowerShell; familiarity with JSON, YAML, REST, CLI tooling, and CI/CD tools such as Travis, Drone, Jenkins, and Azure DevOps.
    • Hands-on experience using source control (Git, GitHub) and feature branching strategies
  • Preferred Technical and Professional Expertise:
    • Experience with containers, such as Docker, Kubernetes, and OpenShift
    • Experience with monitoring and observability tools such as New Relic, Nagios, Icinga, or Sysdig
    • Experience automating infrastructure, configuration management, testing, and deployments using tools like Ansible and Chef, and the ability to explain the Infrastructure as Code paradigm
    • Participate in security compliance efforts; experience drafting and/or reviewing IT policies.
  • Excellent interpersonal, written, and verbal communication skills.
  • Ability to:
    • Adapt to changing priorities, demands, and timelines.
    • Champion change throughout the organization
    • Establish and maintain effective working relationships with all levels of the organization and contribute in a team environment
    • Work as a leader in a team environment ensuring customer satisfaction and technical excellence.
If this job is a match for your background, we would be honored to receive your application! Providing consulting opportunities to TALENTed people since 1987, we offer a host of opportunities including contract, contract to hire and permanent placement. Let's talk!
