Maximum of 25 job preferences reached.
Top Remote Site Reliability Engineer Jobs in Seattle, WA
Information Technology • Legal Tech
The Senior Technology Site Reliability Engineer is responsible for maintaining and optimizing infrastructure and applications, ensuring reliability and performance while automating processes and collaborating with teams.
Top Skills:
AWSChefDatadogGoGrafanaJavaPrometheusPuppetPythonSaltTerraform
Artificial Intelligence • Information Technology • Consulting
The Linux Systems Administrator will maintain and troubleshoot Linux systems, support network services, and work on systems integration while collaborating with infrastructure teams.
Top Skills:
DhcpDnsLinuxNtpPython
Information Technology • Cryptocurrency
The Site Reliability Engineer will lead technical initiatives, architect solutions, troubleshoot issues, mentor team members, and improve observability practices.
Top Skills:
ArgocdBashElk StackGCPGoGrafanaHelmKubernetesPrometheusPythonTerraform
Blockchain • Financial Services • Cryptocurrency • Web3
Design, build, and operate scalable, observable infrastructure for AI agent workflows. Build platform services, APIs, and SDKs; manage cloud, Kubernetes, and model-serving compute; implement IaC, CI/CD, monitoring, incident response, security controls, and runbooks; collaborate with AI and data teams to productionize agent prototypes.
Top Skills:
AWSBashCi/CdDockerKubernetesPythonTerraform
Information Technology • Internet of Things • Software • Virtual Reality
Lead reliability, availability, and resiliency strategies for large-scale systems, drive operational excellence, and provide technical mentorship across engineering teams.
Top Skills:
AWSCi/CdJavaMongoDBRabbitMQZookeeper
Cloud • Security • Software • Cybersecurity
Join the GOV/Sovereign Cloud SRE team to maintain and improve reliability for the Veeam Data Cloud. Responsibilities include incident response, SLIs/SLOs, observability (monitoring, alerting, dashboards), runbooks and documentation, IaC and CI/CD work in compliance-restricted environments, and participation in on-call rotations. Collaborate with engineering, security, and compliance teams to implement high availability and automation.
Top Skills:
ArgocdAzureAzure DevopsAzure GovernmentC#Elk StackGithub ActionsGitlab CiGoGrafanaJavaJavaScriptKubernetesOpentelemetryPrometheusPulumiTerraformTerragruntTypescript
HR Tech • Information Technology • Professional Services • Sales • Software
Own and operate production-grade Kubernetes infrastructure on AWS, build GitOps CI/CD with GitHub Actions and ArgoCD, develop AI agents and internal DevOps tooling, maintain Datadog-based observability, and manage on-call incident response while collaborating with engineering teams to improve reliability and delivery speed.
Top Skills:
Ai/LlmArgocdAWSCi/CdDatadogGithub ActionsGitopsGoKubernetesPython
AdTech • Digital Media • Information Technology • Other
As a Software Engineer in the Tooling and Reliability Platforms team, you'll develop AI services, manage incident tools, and utilize Infrastructure as Code for high-availability systems. You'll focus on integrating AI workflows and improving operational resilience for Yahoo's brands.
Top Skills:
AWSCloudFormationDockerGCPGoJavaKubernetesPythonTerraform
Artificial Intelligence • Healthtech • HR Tech • Software
Own the Heroku-to-GCP migration, maintain Postgres and data pipelines, optimize high‑traffic code paths, build monitoring/alerting, lead incident response and post‑mortems, reduce costs and scale proactively, and coach other infrastructure engineers.
Top Skills:
AppsignalBigQueryBugsnagCannyClaude CodeFivetranGoogle Cloud PlatformHerokuHexHotwireInfrastructure-As-CodePostgresRuby On Rails
Real Estate • Travel • PropTech
The Engineering Manager for Storage SRE will lead a team to ensure reliable database operations, improve developer experience, and expand tooling and operational models, focusing on mission-critical systems.
Top Skills:
Cloud InfrastructureDatabasesSite Reliability EngineeringStorage Systems
Artificial Intelligence • Fintech • Software • Financial Services
The SRE will own reliability for a cloud-native platform, optimizing performance, availability, and observability, while mentoring engineering teams.
Top Skills:
AWSClickhouseGoKafkaKubernetesPulumiPythonTerraform
Aerospace • Manufacturing
As a Site Reliability Engineer, you'll build and manage observability platforms for satellite communications, define SLOs/SLIs, and collaborate on incident response and deployment automation.
Top Skills:
ArgocdAWSElkGCPGoGrafanaIstioJaegerKubernetesLinkerdLokiOpentelemetryPrometheusPythonTempoTerraform
New
Cut your apply time in half.
Use ourAI Assistantto automatically fill your job applications.
Use For Free
Automotive
Design and implement scalable cloud infrastructure, monitor performance, automate processes, ensure security and compliance, and lead a DevOps team.
Top Skills:
AWSBashCi/CdDockerElk StackGCPGrafanaKubernetesPrometheusPythonTerraform
Cloud • Software
The Site Reliability Engineer will ensure reliable cloud operations by applying Python for infrastructure automation, managing OpenStack and Kubernetes, and practicing devsecops in a fast-paced environment.
Top Skills:
KubernetesLinuxOpenstackPython
Cloud • Security • Software • Generative AI
Design, build, and automate large-scale multi-cloud infrastructure and internal SRE tools. Improve host lifecycle, observability, alerting, and reliability; operate containerized workloads; participate in on-call rotations, incident response, runbooks, postmortems, code reviews, and mentoring.
Top Skills:
AnsibleArgo CdArgo WorkflowsCueDockerElastic StackGoGraphiteInfluxKubernetesLinuxPrometheusPuppetTerraformUbuntuUbuntu Live Patch
Cloud • Software • Database
Lead design, build, and operate the YugabyteDB DBaaS infrastructure. Drive architecture, automate lifecycle and maintenance, manage incidents and on-call rotations, implement security/encryption processes, and optimize reliability using SRE principles and observability.
Top Skills:
AksAnsibleAWSAzureBashDockerEksGCPGitGithub ActionsGkeJavaKubernetesLinuxPostgresPrometheusPythonShellTerraform
Other • Social Impact
As a Senior Site Reliability Engineer, you will design, develop, and maintain reliable infrastructure for Wikimedia's API services, ensuring performance and availability while driving reliability engineering practices and improving developer experience.
Top Skills:
AnsibleArgocdAWSAzureGCPGitlabGoKubernetesOpentelemetryPrometheusPythonTerraform
Other • Social Impact
The Senior Site Reliability Engineer is responsible for maintaining Wikimedia's infrastructure, improving reliability, automating processes, and collaborating with teams. The role involves troubleshooting, managing deployments, and leading incident responses while working remotely.
Top Skills:
AnsibleBashCassandraDebianGoGrafanaHhvmKubernetesMariadbMemcachedPHPPrometheusPuppetPythonRedisRubyShell
Financial Services
Prototype, write, test, document, and deploy release automation across environments. Build and maintain pipelines, collaborate with engineers and product teams, troubleshoot issues, participate in on-call rotation, and improve software delivery, configuration, monitoring, and operations.
Top Skills:
AnsibleBashDockerGitlabJenkinsKubernetesMssqlPostgresPowershellPythonRedisTeamcity
Cloud • Security • Software • Cybersecurity
The Site Reliability Engineer II - Database ensures the integrity, security, and performance of MySQL databases while collaborating with development and operations teams to address database issues and improve reliability.
Top Skills:
MySQLSQL
Artificial Intelligence • Cloud • Information Technology • Software
The Site Reliability Engineer will provision and manage Kubernetes clusters, build automation tools, debug customer issues, and improve infrastructure reliability.
Top Skills:
AnsibleBashDatadogGoGrafanaHelmKubernetesLokiPrometheusPythonTerraform
Logistics • Software • Transportation
Lead and mentor teams in DevOps and SRE, architect scalable Azure Cloud infrastructure, implement CI/CD and IaC, ensure database reliability, and drive cross-functional collaboration.
Top Skills:
Azure CloudAzure DevopsCi/CdCosmosdbDockerElkGrafanaKubernetesMySQLPostgresPrometheusRedisSQL ServerTerraform
3 Days AgoSaved
Blockchain • Fintech • Software • Cryptocurrency • Metaverse
Design, build, and maintain internal monitoring and alerting for high-load real-time systems; automate production testing; troubleshoot and resolve performance issues; coordinate cross-team incident resolution; recommend architectural and process improvements; research vendor solutions and enforce security best practices.
Top Skills:
AWSGCPJavaScriptLinuxNode.jsRest ApiWebsockets
Healthtech • Social Impact • Software
Own the operational lifecycle of cloud-native data infrastructure: design and automate reliable deployments, observability, incident response, SLIs/SLOs, autoscaling and IaC, and improve platform efficiency and data freshness across GKE and Cloud Run.
Top Skills:
BashBigQueryCloud BuildCloud MonitoringCloud RunDatadogDockerGCPGithub ActionsGkeGoGrafanaJIRAKubernetesPrometheusPulumiPythonSentrySlackSnykSonarqubeTerraform
Aerospace • Defense
Lead design, implementation, and operation of scalable, secure hybrid-cloud infrastructure for satellite ground systems. Improve developer experience, automate CI/CD and IaC, own observability, troubleshoot reliability issues, and collaborate with developers and satellite operators to advance SatDevOps practices.
Top Skills:
C/C++Ci/CdGCPGoGrafanaInfrastructure As Code (Iac)JavaKubernetesLokiPrometheusPythonRustSoftware Defined Networking (Sdn)
Let Your Resume Do The Work
Upload your resume to be matched with jobs you're a great fit for.
Success! We'll use this to further personalize your experience.
Top Seattle, WA Companies Hiring Remote Site Reliability Engineers
See AllPopular Job Searches
All Filters
Total selected ()
No Results
No Results





_1.png)















