Topstep

Staff Site Reliability Engineer

Reposted 5 Days Ago
Remote
Hiring Remotely in United States
200K-250K Annually
Expert/Leader
As a Staff Site Reliability Engineer, you will shape reliability practices, optimize AWS infrastructure, lead incident response, and mentor engineers.
The summary above was generated by AI

Summary 

Are you a systems-minded engineer who thrives on building resilient infrastructure, driving operational excellence, and enabling teams to move fast with confidence? As a Staff Site Reliability Engineer at Topstep, you'll play a foundational role in shaping how we approach reliability, observability, and infrastructure at scale. You'll be instrumental in building out our SRE practice, defining our incident response culture, closing observability gaps, and optimizing our AWS infrastructure for both performance and cost. This role is ideal for someone who brings both deep technical expertise and a builder's mindset: someone who's excited to establish best practices from the ground up, embed reliability into engineering culture, and create the foundations that let teams ship with speed and confidence. Join us and help define what operational excellence looks like at Topstep.

Key Responsibilities 

  • Set technical direction for reliability and observability across the entire engineering organization, influencing architectural decisions.
  • Build and mature our SRE practice, defining SLOs, incident response protocols, and on-call standards
  • Own the observability stack using Datadog (primary platform for metrics, APM, logging) and CloudWatch (AWS-native monitoring), instrumenting distributed tracing and closing gaps that currently prevent diagnosis of production issues
  • Partner with engineering teams to embed reliability principles early in the design process and improve system resilience
  • Lead incident response and blameless post-mortems, turning outages into opportunities for systematic improvement
  • Mentor engineers across the organization on reliability practices, operational thinking, and production ownership
  • Champion a culture of transparency, continuous improvement, and shared ownership of production systems

Required Qualifications and Key Competencies

  • 7+ years of professional experience in SRE, infrastructure, or platform engineering, with demonstrated impact building practices that scaled across multiple teams
  • Proven track record of either starting an SRE function from scratch or scaling an existing practice with measurable improvements to MTTR, MTTD, change failure rate, or availability
  • Strong proficiency with Datadog for end-to-end observability (metrics, APM, logs, distributed tracing) and building alerting that catches real issues without causing fatigue
  • Deep expertise with AWS infrastructure (EKS, ECS, EC2, and RDS) running production services at scale, and hands-on experience optimizing for both reliability and cost
  • Solid foundation in distributed systems, networking, database performance, and debugging complex system failures across service boundaries
  • Comfortable reading code, writing automation scripts, and contributing to infrastructure tooling when needed
  • Proficiency with infrastructure as code (Terraform) and GitOps practices
  • Track record of influencing engineering culture through documentation, tooling, mentorship, and technical leadership
  • Excellent communication skills with the ability to explain complex system behavior and trade-offs to varied audiences
  • Comfortable making pragmatic trade-offs between long-term platform vision and immediate business needs

Company Culture & Perks

  • Topstep is an engaging working environment that ranges from fully remote to hybrid. We foster a culture of collaboration with cameras on during meetings and a robust Slack environment for communication.
  • 10 company-paid holidays and generous family leave. Paid time off is accrued monthly.
  • Competitive 401(k) matching and health, dental, and vision insurance are offered to full-time employees.
  • Vacations are encouraged with a bonus for taking 5 consecutive days. Employee referrals earn a bonus. Topstep offers a food and groceries budget and contributes towards health and wellness.

New Hire Base Salary Range

  • $200,000-$250,000
  • Bonus: This position is eligible for a performance-based bonus as provided by the plan terms and governing documents.
  • The compensation offered will take into account internal compensation structure and may vary depending on the candidate's geographic region, job-related knowledge, skills, and experience among other factors.

Equal Opportunity Employer

Topstep is an Equal Opportunity Employer. We are committed to fostering an inclusive environment where all employees and applicants are valued. All qualified candidates will receive consideration for employment without regard to race, color, religion, gender, gender identity or expression, sexual orientation, national origin, age, disability, or veteran status, in compliance with applicable federal, state, and local laws.

Interested in the role? Apply today with your resume and cover letter!

At this time, immigration sponsorship is not available for this position (including H-1B, STEM OPT training plans, etc.).

Top Skills

AWS
Datadog
GitOps
Terraform
