Voltage Park

Infrastructure Engineer (Observability)

Reposted 24 Days Ago
Remote
Hiring Remotely in USA
140K-180K Annually
Expert/Leader
The Infrastructure Engineer will design and maintain observability platforms, such as metrics and alerting systems, collaborating closely with various teams to enhance operational insights and reliability.

Voltage Park is seeking an Infrastructure Engineer with a focus on Observability to join our Infrastructure Engineering team. Our engineers design and operate the systems that manage thousands of bare-metal servers, GPUs, and high-performance networks across multiple data centers.

This role combines the breadth of a core infrastructure engineer with a specialty in observability and telemetry. You’ll design and operate metrics, logs, traces, and alerting pipelines that provide actionable insights for both internal teams and external customers, helping to ensure reliability and transparency at scale.

This is a fully remote position, although candidates must be based in the continental United States. Unfortunately, we are unable to provide sponsorship for this role.

Responsibilities
  • Design, build, and maintain observability platforms spanning metrics, logs, traces, and events.

  • Create dashboards and alerts for internal stakeholders (InfraOps, Engineering, Customer Success) and provide scoped visibility for external customers.

  • Ingest and correlate telemetry from GPUs, CPUs, networking (Ethernet & InfiniBand), containers, APIs, and BMC/Redfish.

  • Implement noise-resistant alerting pipelines that improve detection and reduce operational load.

  • Collaborate with infrastructure, platform, and customer-facing teams to embed observability into workflows.

  • Contribute to broader infrastructure engineering projects beyond observability.

Qualifications
  • 8+ years in infrastructure engineering, SRE, or observability roles.

  • Strong experience with monitoring systems (Prometheus, Grafana, ELK, VictoriaMetrics, or similar).

  • Proficiency in Python, Go, or Bash for automation and data integration.

  • Familiarity with container/Kubernetes observability.

  • Understanding of streaming telemetry pipelines (Kafka, OTEL, Promtail, or equivalent).

  • Strong written and verbal communication skills.

Ideal Experiences
  • Experience with GPU observability, particularly NVIDIA DCGM.

  • Designing multi-tenant observability solutions with RBAC and scoped queries.

  • Prior work with correlation engines for root cause analysis (RCA), forecasting, or predictive alerting.

  • Broader exposure to infrastructure domains (networking, storage, provisioning).

Culture
  • You enjoy working with a small, highly motivated team.

  • You’re comfortable balancing autonomy with company-wide priorities.

  • You value clarity, documentation, and actionable insights in observability systems.

  • You’re excited to specialize in observability while contributing as a core infrastructure engineer.

Voltage Park is an equal opportunity employer and makes employment decisions on the basis of merit. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, protected veteran status, or any other characteristic protected under federal, state, or local law. If you require an accommodation during the job application process, please notify your recruiter.

Top Skills

Bash
ELK
Go
Grafana
Kafka
Kubernetes
OTEL
Prometheus
Promtail
Python
VictoriaMetrics

Voltage Park Redmond, Washington, USA Office

15809 Bear Creek Pkwy Suite 300, Redmond, WA, United States, 98052
