OXIO Jobs

Site Reliability Engineer

Posted 21 Days Ago

Remote

Hiring Remotely in USA

Mid level

As a Site Reliability Engineer, you will design cloud platforms, automate operations, maintain infrastructure, and support engineering teams in delivering reliable services.

Site Reliability Engineer
OXIO is the first NeoTelco. We are building the world’s largest, most accessible, and most insightful telecom network. Our platform lets anyone spin up their own carrier from a browser and supports you as you scale your network to millions of users.

We ensure that users and devices are connected, and stay connected, wherever they go: across countries, carriers, and cellular technologies. We help them pay less for mobile data. This technology is provided through our Carrier-as-a-Service platform, BrandVNO, a fully customizable telecom service. In addition, we enable clients of our service to extract value from telecom data, enriching their customer experience, business intelligence, and product understanding in the many markets in which we operate.

Come join us in creating a modern technology platform with a group of engineers dedicated to advancing our vision. Our team is passionate about what we build, open to new ideas and challenges, and has its sights set on the future of connectivity.

Responsibilities

Design and implement a cloud platform to support OXIO backend services
Automate technical operations: deployments, scaling, recovery, etc.
Monitor and maintain mission-critical production infrastructure to ensure maximum uptime
Participate in an on-call rotation and a culture of continuous improvement through blameless postmortems
Enable the Engineering, Telecom, and Data Engineering teams by providing them with the tools to operate the services they build

Essentials

Understanding of Linux/Unix systems (most of our systems are Linux-based).
Familiarity with Linux/Unix system internals like process management, filesystems, memory management, and networking.
Proficiency in at least one programming language (Python, Go, or Ruby) and strong skills in scripting (Bash, Perl).
Experience with infrastructure provisioning tools such as Terraform, CloudFormation, or Ansible.
Familiarity with containerization (Docker) and orchestration tools (Kubernetes).
Familiarity with monitoring tools like Prometheus, Grafana, or Datadog.
Knowledge of setting up alerts, analyzing logs, and creating dashboards for observability.
Familiarity with incident management practices (e.g., runbooks, postmortems).
Experience participating in an on-call rotation and handling incidents.
Experience setting up and maintaining Continuous Integration/Continuous Delivery pipelines (Jenkins, GitLab CI, CircleCI, etc.).
Hands-on experience with cloud providers (AWS, Google Cloud, Azure).
Knowledge of virtualization technologies (VMware, KVM) and cloud-native architecture.
Understanding of TCP/IP, DNS, HTTP/HTTPS, load balancing, and firewalls.

Nice to have

Strong understanding of deployment strategies (canary releases, blue-green deployments, etc.).
Familiarity with high availability and failover mechanisms.
Familiarity with IAM (Identity and Access Management) and zero trust principles.
Experience working with distributed systems (e.g., Kafka, Cassandra, Elasticsearch).
Experience building custom monitoring tools or writing complex automation scripts.
Functional knowledge of database management (SQL and NoSQL).
Familiarity with distributed tracing (Jaeger, OpenTelemetry) and advanced log aggregation strategies (ELK stack, Splunk).
Familiarity with performance profiling tools and optimizing application performance under heavy load.
Familiarity with load testing and identifying bottlenecks.
Familiarity with configuration management using SaltStack for maintaining server configurations.

Similar Jobs

MongoDB

Site Reliability Engineer

36 Minutes Ago

Easy Apply

Remote or Hybrid

New Jersey, USA

127K-249K Annually

Senior level

Big Data • Cloud • Software • Database

Maintain and improve multi-cloud Kubernetes infrastructure, CI/CD (Argo Workflows/ArgoCD), observability, and networking. Build reliable continuous deployment tooling and onboarding flows, provide internal support, collaborate across Platform Engineering, contribute upstream (open-source/operators), and participate in a 24/7 on-call rotation to resolve deployment infrastructure issues.

Top Skills: Alerting, Argo Workflows, ArgoCD, AWS, Azure, CI/CD, Containers, DNS, GCP, Go, Kubernetes, Linux, Load Balancer, Observability, Python, Service Mesh, TCP/IP, TLS

Domino Data Lab

Site Reliability Engineer

Yesterday

Easy Apply

Remote or Hybrid

200K-230K Annually

Senior level

Artificial Intelligence • Machine Learning

Lead development of AI-assisted reliability tooling, own incident response end-to-end, improve observability and SLO/SLI frameworks, scale single-tenant SaaS operations, mentor engineers, and reduce recurring operational toil through engineering and automation.

Top Skills: Cloud Platforms, Go, Kubernetes, Linux, LLM/AI Tooling, Logs and Tracing, Observability Tooling, Python, SLO/SLI Frameworks

Coinbase

Site Reliability Engineer

8 Days Ago

Easy Apply

Remote

USA

218K-257K Annually

Senior level

Artificial Intelligence • Blockchain • Fintech • Financial Services • Cryptocurrency • NFT • Web3

Own reliability, monitoring, and incident response for AI infrastructure; build automation and CI/CD tooling; manage Kubernetes/Docker production workloads; partner with infrastructure, security, and compliance; improve observability and documentation; develop internal full‑stack tooling in Go or Python.

Top Skills: Ansible, AWS, Bash, Chef, CI/CD, Docker, EC2, Git, Go, Kubernetes, Linux, Log Aggregation, Network Security, Puppet, Python, Ruby, Salt, Terraform

What you need to know about the Seattle Tech Scene

Home to tech titans like Microsoft and Amazon, Seattle punches far above its weight in innovation. Its surrounding mountains, dotted with world-famous hiking trails and climbing routes, also make the city a destination for outdoorsy types. Established as a logging town before shifting to shipbuilding and logistics, the Emerald City is now known for its contributions to aerospace, software, biotech, and cloud computing. Its status as a thriving tech ecosystem is attracting out-of-town companies looking to establish new tech and engineering hubs.

Key Facts About Seattle Tech

Number of Tech Workers: 287,000; 13% of overall workforce (2024 CompTIA survey)
Major Tech Employers: Amazon, Microsoft, Meta, Google
Key Industries: Artificial intelligence, cloud computing, software, biotechnology, game development
Funding Landscape: $3.1 billion in venture capital funding in 2024 (Pitchbook)
Notable Investors: Madrona, Fuse, Tola, Maveron
Research Centers and Universities: University of Washington, Seattle University, Seattle Pacific University, Allen Institute for Brain Science, Bill & Melinda Gates Foundation, Seattle Children’s Research Institute