Top Remote Site Reliability Engineer Jobs in Seattle, WA
Software • Analytics
The role involves automating and managing AWS infrastructure, ensuring reliability and scalability of stateful systems, and optimizing deployment processes. You'll also handle incident responses and improve operational tooling.
Top Skills:
AWS, Kubernetes, Terraform, Terragrunt
Healthtech • Other • Software
The Database Site Reliability Engineer will ensure the reliability and performance of PostgreSQL services, manage incidents, and automate tasks while collaborating with cross-functional teams in a 24x7 SaaS environment.
Top Skills:
Ansible, Bash, Datadog, ETL, Grafana, Postgres, Powershell, Prometheus, Python, Terraform
3D Printing • Artificial Intelligence • Software • Design
The role involves building reliable platforms for 3D/4D content delivery to AR/VR devices, monitoring system health, and improving operational practices in collaboration with the engineering team.
Top Skills:
Aws Fargate, Coreweave, Grafana, Kubernetes, Prometheus, Terraform
Analytics
The Site Reliability Engineer will ensure the reliability and performance of IaaS services, perform incident resolution, and enhance system reliability through automation while supporting mobility across hybrid infrastructures and collaborating extensively with various teams.
Top Skills:
Ansible, AWS, Azure, Bash, Gitlab Ci, Jenkins, Kubernetes, Linux, Openshift, Python, Terraform, Vmware Vsphere
Energy • Manufacturing • Solar • Renewable Energy
The Platform System Reliability Engineer manages and optimizes EKS Kubernetes environments, focusing on security, scalability, and performance, while automating processes and troubleshooting complex issues.
Top Skills:
Alb, Ansible, AWS, Datadog, Dynatrace, Ec2, Eks, Go, Grafana, Kubernetes, Msk, Prometheus, Python, Rds, S3, Splunk, Terraform
Fintech • Software
The Senior Site Reliability Engineer ensures fast, stable SaaS products through automation, collaboration, monitoring, and implementing AI tools to enhance performance and reliability.
Top Skills:
Ai Tools, Ansible, Appdynamics, AWS, Azure, Azure Devops, Bash, C# .Net, Cosmos, Datadog, Dynatrace, Harness, Java, Jenkins, Kubernetes, New Relic, Powershell, Python, SaaS, SQL, Terraform
Cloud • Software
The Site Reliability Engineer (SRE) will manage reliable, scalable systems, focusing on software development, infrastructure automation, and incident response. Responsibilities include monitoring, CI/CD pipeline management, security compliance, and cost optimization while collaborating with various teams.
Top Skills:
AWS, Azure, Docker, Elk Stack, GCP, Git, Grafana, Java, Kubernetes, PHP, Prometheus, Python, Shell, Terraform
Security • Software • Analytics
Design, operate, and automate scalable, secure infrastructure for Axiom Cloud. Define SLOs, plan disaster recovery and capacity, tune performance, improve deployment practices, build reliability tooling, respond to incidents, and promote monitoring and observability across teams.
Top Skills:
Amazon Eks, AWS, CircleCI, Docker, Github Actions, Gitlab, Go, Kubernetes, Linux, Llms, Monitoring And Observability Tools, Pulumi, Terraform
Blockchain • Fintech • Social Media • Cryptocurrency • NFT • Web3
Design, build, and operate scalable, highly available infrastructure and platform software for Zora's blockchain services (indexer, APIs, data pipelines). Automate workflows, maintain core systems, improve developer experience, participate in on-call rotation, and contribute strategic technical direction.
Top Skills:
Asyncio, Base, Bridges, Ceph, Cloudflare Pages Functions, Datadog, Docker, Ethereum, Go, Ipfs, Kubernetes, MongoDB, Opentelemetry, Optimism, Optimistic Rollups, Plasma, Polygon, Postgres, Python, Rpc Nodes, Sidechains, Vercel, Zk-Rollups
Blockchain • Financial Services • Cryptocurrency • Web3
As an SRE/DevOps Engineer at Kraken, you will build infrastructure, support tools, drive standardization, and guide engineers in an efficient remote environment.
Top Skills:
Bash, Continuous Integration, Docker, Git, Grafana, Linux, Prometheus, Python, Rust, Terraform
Blockchain • Financial Services • Cryptocurrency • Web3
As a Senior Site Reliability Engineer, you will manage the reliability and efficiency of Kraken's Data platform, working with multiple teams to ensure high performance and scalability. Responsibilities include designing data governance mechanisms, managing CI/CD pipelines, implementing monitoring solutions, and collaborating on various data projects.
Top Skills:
Apache Airflow, Spark, AWS, Debezium, Docker, Kafka, Kubernetes, Python, Terraform
Cloud • Information Technology • Security • Software
As a DevOps Architect, you'll lead automation efforts for SaaS services, mentor junior team members, and set strategic directions for CI/CD and monitoring solutions.
Top Skills:
AI, AWS, Azure, Flux, GCP, Go, Grafana, Java, Jenkins, Kubernetes, Oci, Programming Languages: C/C++, Prometheus, Python
Artificial Intelligence • Big Data • Computer Vision • Machine Learning • Natural Language Processing • Software • Cybersecurity
Maintain and improve the internal developer platform, observability stack, and AWS infrastructure (Terraform); manage Kubernetes at scale; troubleshoot distributed systems; drive security, reliability, cost and performance improvements; partner with product teams and participate in on-call support.
Top Skills:
AWS, Cka, Containers, Go, Kubernetes, Lgtm Stack, Linux, Opensearch, Python, Serverless, Tcp/Ip, Terraform
Information Technology • Software
The Site Reliability Engineer manages system reliability, performance, and scalability for end-user services, leading software deployments, incident management, and service quality improvements. Responsibilities include collaboration with teams, maintaining a product roadmap, and automation of processes.
Top Skills:
Agile, Aternity, Devsecops, Itil, Powershell, Python
Aerospace • Manufacturing
As a Site Reliability Engineer, you'll build and manage observability platforms for satellite communications, define SLOs/SLIs, and collaborate on incident response and deployment automation.
Top Skills:
Argocd, AWS, Elk, GCP, Go, Grafana, Istio, Jaeger, Kubernetes, Linkerd, Loki, Opentelemetry, Prometheus, Python, Tempo, Terraform
Security • Software
Maintain, automate, and improve operational tools and customer deployment processes; monitor and ensure service SLOs, backup/restore, alerting, and incident response; drive GitOps/IaC practices, cost tracking, and automation of repetitive tasks while supporting outages and upgrades.
Top Skills:
Ansible, AWS, Azure, Bash, GCP, Gitops, Grafana, Helm, Kubernetes, Prometheus, Python, Terraform
Fintech
The Staff Site Reliability Engineer role involves leading architecture, automating the GCP environment, defining SLIs and SLOs, mentoring teammates, and enhancing system reliability and performance.
Top Skills:
Argocd, Datadog, GCP, Go, Helm, JavaScript, Kubernetes, Python, Terraform, Typescript
Cloud • Software • Database
Lead the design, build, and operation of the YugabyteDB DBaaS infrastructure. Drive architecture, automate lifecycle and maintenance, manage incidents and on-call rotations, implement security/encryption processes, and optimize reliability using SRE principles and observability.
Top Skills:
Aks, Ansible, AWS, Azure, Bash, Docker, Eks, GCP, Git, Github Actions, Gke, Java, Kubernetes, Linux, Postgres, Prometheus, Python, Shell, Terraform
Database
The Site Reliability Engineer will oversee the Digital Realty interconnection fabric network infrastructure, focusing on network operations, automation, and development. Responsibilities include maintaining global network infrastructure, responding to alerts, and working with various cloud platforms and automation tools.
Top Skills:
Ansible, AWS, Azure, Git, GCP, Ibm Cloud, Jenkins, Linux, Oracle Cloud, Python, Terraform
Aerospace • Big Data • Greentech • Hardware • Social Impact
The Site Reliability Engineer will build, deploy, and operate computing services for satellite imaging, ensuring reliable and scalable infrastructure while collaborating with cross-functional teams.
Top Skills:
Alloy, Ansible, Bash, Cloud-Native Infrastructure, Grafana, Helm, K3S, Kubernetes, Kustomize, Opentelemetry, Prometheus, Proxmox, Python, Rke2, Talos, Terraform
Artificial Intelligence • Cloud • Machine Learning • Software • Database • App development • Generative AI
As a Site Reliability Engineer at Replit, you'll enhance system reliability through observability, automation, incident management, and performance optimization, serving millions globally.
Top Skills:
Ansible, Datadog, Go, Google Cloud Platform, Grafana, Kubernetes, Prometheus, Pulumi, Python, Terraform
Artificial Intelligence • Cloud • Machine Learning • Software • Database • App development • Generative AI
As a Staff Site Reliability Engineer at Replit, you will ensure infrastructure reliability, drive automation, lead incident management, and mentor the engineering team while enhancing system performance and observability.
Top Skills:
Datadog, Go, Google Cloud Platform, Grafana, Kubernetes, Opentelemetry, Prometheus, Python, Terraform
Logistics • Software • Transportation
Lead and mentor teams in DevOps and SRE, architect scalable Azure Cloud infrastructure, implement CI/CD and IaC, ensure database reliability, and drive cross-functional collaboration.
Top Skills:
Azure Cloud, Azure Devops, Ci/Cd, Cosmosdb, Docker, Elk, Grafana, Kubernetes, MySQL, Postgres, Prometheus, Redis, SQL Server, Terraform
Healthtech • Software
Maintain reliability, performance, and scalability of cloud-hosted services and databases. Implement SRE best practices, define SLIs/SLOs, respond to incidents, build monitoring and automation, perform DBA tasks (backups, restores, tuning), support CI/CD and DB migrations, and document runbooks and procedures.
Top Skills:
Amazon Rds, Azure Sql Database, Bash, Ecs Fargate, Flyway, Gitlab, Jenkins, Kubernetes, Liquibase, Octopus Deploy, Oracle, Postgres, Powershell, Python, Redis, Solarwinds Dpa, SQL Server
Software
The role involves managing compute infrastructure for decentralized applications, requiring critical thinking, documentation skills, and experience in Kubernetes and blockchain management.
Top Skills:
Blockchain, Gitops, Infrastructure-As-Code, Kubernetes, Programming Languages