ElastixAI

AI Software Engineer

Posted Yesterday
Hybrid
Seattle, WA, USA
Mid level
Design and optimize a low-level AI inference serving stack: customize open-source frameworks, build model partitioning/scheduling, integrate with proprietary accelerators, profile and optimize across Python orchestration to C++ kernels and drivers, and enable PyTorch-native deployment tooling.
The summary above was generated by AI
About Elastix AI

We are building the next-gen AI inference platform.

Description

Job Title: Software Engineer, AI Inference Platform

Company: ElastixAI, Inc.

Location: Seattle, WA (Hybrid - 3 days/week in office)

About ElastixAI

ElastixAI is an early-stage startup building the next-generation AI inference infrastructure — co-designed across ML software and custom accelerator hardware. Our platform dynamically optimizes inference efficiency and scalability across diverse deployments, enabling adaptive, high-performance AI serving.

Role Summary

We’re looking for a systems-minded AI Software Engineer to join our core inference platform team. You’ll design and extend the low-level serving stack — hacking open-source frameworks like vLLM, SGLang, and TensorRT-LLM, building new model sharding and scheduling logic, and integrating deeply with our proprietary AI accelerator. This role sits at the intersection of ML systems, compiler/runtime engineering, and hardware-software co-design.

Key Responsibilities
  • Architect, extend, and optimize core components of our AI serving platform for throughput, latency, and scalability.

  • Customize open-source serving frameworks (e.g., vLLM) for proprietary model ingestion and accelerator integration.

  • Develop efficient model partitioning, scheduling, and memory management strategies for multi-device inference.

  • Collaborate with ML engineers on model export and runtime optimization (quantization, graph transforms).

  • Work closely with hardware engineers to influence accelerator interface design and performance tuning.

  • Build APIs and runtime tools enabling flexible, PyTorch-native model deployment on our infrastructure.

  • Profile, debug, and optimize across the full stack — from Python orchestration to C++ kernels and PCIe drivers.

Required Qualifications
  • BS/MS/PhD in Computer Science, Electrical/Computer Engineering, or related field.

  • 3+ years of professional experience in systems programming, ML infrastructure, or distributed inference.

  • Proficient in C++ and Python, with strong debugging and performance analysis skills.

  • Deep familiarity with one or more LLM serving frameworks (vLLM, SGLang, TensorRT-LLM, DeepSpeed-Inference, etc.).

  • Understanding of model deployment internals — token scheduling, KV caching, batching, and pipelined inference.

  • Comfortable working close to the hardware abstraction layer — CUDA, PCIe, memory management, or runtime scheduling.

  • Strong collaboration and communication skills; ability to work cross-functionally in a fast-paced startup environment.

Preferred / Bonus
  • Experience with hardware-aware ML optimization, compiler/runtime integration, or accelerator SDKs.

  • Hands-on experience profiling GPU/accelerator workloads.

  • Familiarity with containerized deployments (Docker/Kubernetes).

  • Exposure to distributed systems or large-scale inference clusters.

  • Contributions to open-source ML or serving frameworks.

What We Offer

  • A chance to be a foundational engineer in an innovative AI startup

  • A dynamic and collaborative work environment and the chance to have a significant impact on new technology

  • The opportunity to work on challenging problems at the intersection of ML, software, and systems.

  • Competitive compensation and startup equity package

  • Comprehensive medical, dental, and vision coverage (100% paid by employer)

  • Life insurance and AD&D

  • Flexible Time Off (FTO)

  • 12 paid holidays

  • Paid parental leave

  • Gym or fitness benefit

  • Commuter benefit

  • Weekly catered lunches in the office

  • Investment in employee learning & development
