Site Reliability Engineer in Vancouver at Procom

Date Posted: 5/11/2018

Job Description

Job ID: 253693
Site Reliability Engineer
On behalf of our client, Procom is actively seeking a Site Reliability Engineer.
Site Reliability Engineer Job Details 
  • Engage in and improve the entire lifecycle of services, from inception and design through deployment, operation and refinement
  • Develop self-service tooling to assist in freeing our development teams from being bottlenecked by operations support
  • Identify problems in critical services, develop automated processes for eliminating future occurrences where possible, and propose changes to existing configurations to form the base for automation
  • Support services before they go live through activities such as system design consulting and launch reviews
  • Maintain services once they are live by considering all aspects of supportability, reliability and performance
  • Practice sustainable incident response and blameless postmortems to drive continuous improvement
  • Assist with migrating complex, multi-tier applications to cloud environments
  • Design and deploy enterprise-wide scalable operations on mixed architectures
Site Reliability Engineer Mandatory Skills 
  • Proven ability to write programs in JavaScript
  • Experience running large, diverse architectures with configuration management systems such as Ansible (preferred), Puppet, Chef, or Salt
  • Deep understanding of the Linux Operating System, including Kernel, Memory, Process, Threads, Static / Shared Libraries, IPC, Signals
  • Understanding of standard networking protocols and concepts such as HTTP, DNS, TCP/IP, ICMP, the OSI Model, Subnetting and Load Balancing
  • Familiarity with distributed systems paradigms such as the CAP Theorem, Microservices, and the Twelve Factor App
  • Familiarity with the Operational Excellence pillar of the AWS Well-Architected Framework
  • Passion for eliminating repetitive manual processes using automation
  • Systematic problem-solving approach, coupled with strong communication skills, ownership and drive
  • Ability to debug and optimize code and automate routine tasks
  • Experience with continuous integration and continuous delivery pipelines
  • Interest in designing, analyzing and optimizing large-scale distributed systems
  • Able to design and implement simple, secure solutions for complex problems in distributed systems
  • Strong sense of ownership, customer service, and integrity proven through clear communication
  • Bachelor's or Master's degree, or equivalent experience
Site Reliability Engineer Start Date 
Site Reliability Engineer Location
Vancouver
Site Reliability Engineer Duration 
12 months