Senior Site Reliability Engineer

Sorry, this job was removed at 11:02 a.m. (PST) on Tuesday, December 14, 2021

At PitchBook we work to provide global professionals with comprehensive data on the entire venture capital, private equity and M&A landscape so they can discover and execute opportunities with confidence. We credit our success and rapid growth to our cutting-edge products, customer-centered attitude and ability to embrace and drive change.
In just over a decade, PitchBook has reached over 1,300 global employees with offices worldwide, and we're not slowing down! Consistently recognized as a Best Place to Work, our culture is at the heart of our success and is driven by excellence, inclusion and fun. At PitchBook we're committed to fostering an open and collaborative work environment.
About the Role: 

As a Senior Site Reliability Engineer in PitchBook's engineering division, you will be creating and evolving systems to automatically run our suite of products and services reliably and consistently. As part of a team of site reliability engineers and platform engineers, and in conjunction with group leadership, you will help define service level objectives (SLOs) which determine success and build systems to achieve those objectives. 
You will utilize your strong background deploying, managing, and maintaining production systems, working with developers to operate and monitor large-scale services with complex distributed systems and data integrations. You will incorporate observability tools (monitoring, telemetry, tracing, alerting), perform incident management, conduct root cause analyses, eliminate single points of failure, build reliability and redundancy into our infrastructure, establish and test our recoverability, mitigate failures, and do all of these things through automation and tools. 
As a senior SRE, you will take independent responsibility for building and managing large subsets of our systems. You will help build our best practices for infrastructure-as-code and your code will exemplify our quality controls. You will mentor and train other SREs, platform engineers, and software engineers in reliability topics.  
Your ability to collaborate with colleagues, exhibit poise and adaptability in stressful situations, communicate effectively, and build resilient systems that can be consistently relied upon will be critical to your success. You will solicit feedback, learn constantly, engage others with empathy, and help create a culture of belonging, teamwork, and purpose. 
If you love building customer-centric solutions, strive for excellence every day, are adaptable and focused, and believe work should be fun, come join us! 
Primary Job Responsibilities:

  • Establishes service level objectives (SLOs), error budgets, and service level indicators (SLIs) as success criteria, and ensures that our systems and processes consistently meet or exceed these targets
  • Builds recoverability into our services and systems, including disaster recovery (DR), backup and recovery, and incorporation of multi-AZ and multi-region designs into cloud constructs
  • Manages connectivity (CIDRs, VPCs, Subnets), latency, and availability across distributed systems
  • Establishes clustering and load balancing techniques for high availability and scalability in containerized cloud native environments
  • Builds observability systems and services (monitoring, telemetry, tracing) for reuse in our platform architecture, creating alerting for fault identification and building dashboards for metrics
  • Operates and continuously improves our services' reliability, scalability, performance, security, and uptime
  • Learns constantly, including about available cloud managed services (PaaS/SaaS/IaaS), libraries, frameworks, and platforms (commercial and open source)
  • Participates in the company's application of Agile, Lean, and principles of fast flow to engineering department efficiency and productivity, and owns certain tasks in process automation to achieve fluidity

Skills and Qualifications:

  • B.S. in Computer Science, Software Engineering, or a related field (M.S. preferred)
  • 5+ years' experience building and maintaining Linux/UNIX based systems, primarily in cloud environments (preferably GCP & AWS)
  • 3+ years' experience in a Reliability Engineering, DevOps, or infrastructure role, where infrastructure-as-code tools (e.g. Terraform, Puppet, Ansible, Chef) were used as a primary job function
  • 4+ years' experience coding in an object-oriented language, such as Java, Python, Go, or Kotlin
  • 2+ years' experience with containers and orchestration platforms, including Kubernetes and Docker
  • Deep knowledge of infrastructure systems, networking, and security, including in a cloud environment
  • Experience owning operational reliability, scalability, recoverability (backups, disaster recovery, failover) and capacity planning
  • Experience performing operational activities, including batch processing, system backups, maintenance, and monitoring, as well as providing first-tier on-call support as part of a 24x7 response team
  • Experience with distributed, scalable microservices and event-driven architectures
  • Experience with data storage, replication, caching, and search technologies, such as PostgreSQL, MySQL, MS SQL Server, Amazon RDS, GCP CloudSQL, Redis, Elasticsearch, and Lucene/Solr
  • Holds at least one professional certification in AWS or GCP (DevOps or SysOps Engineer preferred) 

If you are ready to start the conversation about how you might contribute to all the happenings at PitchBook, submit your resume today! PitchBook appreciates and respects diversity, and as such, we are an equal opportunity employer.


Location

Located just blocks away from Pike Place Market and historic Pioneer Square, our headquarters boast stunning mountain views.
