ProRata.ai

Senior Software Engineer, Inference

Posted 23 Days Ago
In-Office
Bellevue, WA
160K-200K Annually
Senior level
Develop and optimize scalable inference systems for RAG applications, focusing on performance, latency, and cross-functional collaboration.
The summary above was generated by AI


Role

We’re looking for a Senior Software Engineer to join our Inference Team, where you’ll lead the design and development of our Retrieval-Augmented Generation (RAG) infrastructure. In this role, you will work closely with ML engineers, research scientists, and product teams to power both web search and API-based experiences for millions of users with fast, accurate, and context-aware responses. 

You will architect scalable systems that combine LLMs and vector retrieval, optimizing for relevance, recall, latency, and cost. This is a high-impact role focused on AI/ML inference, retrieval performance, and significant ownership in both technical decision-making and long-term architecture. 

  

Responsibilities

  • Design, build, and scale a production-grade inference stack for RAG-based applications.

  • Develop efficient retrieval pipelines using OpenSearch or similar vector databases, with a focus on high recall and response relevance. 

  • Optimize performance and latency for both real-time and batch queries. 

  • Identify and address bottlenecks in the inference stack to improve response times and system efficiency. 

  • Ensure high reliability, observability, and monitoring of deployed systems. 

  • Collaborate with cross-functional teams to integrate LLMs and retrieval components into user-facing applications. 

  • Evaluate and integrate modern RAG frameworks and tools to accelerate development. 

  • Guide architectural decisions, mentor team members, and uphold engineering excellence. 

  

Qualifications

  • Bachelor’s degree in Computer Science or related field, or equivalent practical experience. 

  • 8+ years of experience in software engineering, with a focus on AI/ML systems or distributed systems. 

  • Hands-on experience building and deploying retrieval-augmented generation (RAG) systems. 

  • Deep knowledge of OpenSearch, Elasticsearch, or similar search engines. 

  • Strong coding skills in Python and/or other backend languages (e.g., Rust, Java). 

  • Experience with vector search, embedding pipelines, and dense retrieval techniques. 

  • Proven ability to optimize inference stacks for latency, reliability, and scalability. 

  • Excellent problem-solving, analytical, and debugging skills. 

  • Strong sense of ownership, ability to work independently, and a self-starter mindset in fast-paced environments. 

  • Passion for building impactful technology aligned with our mission. 

  

Preferred Qualifications 

  • Experience with frameworks like LlamaIndex or LangChain. 

  • Familiarity with vector databases such as Pinecone, Qdrant, or FAISS. 

  • Exposure to LLM fine-tuning, semantic search, embeddings, and prompt engineering. 

  • Previous work on systems handling millions of users or queries per day. 

  • Familiarity with cloud infrastructure (AWS, GCP, or Azure) and containerization tools (Docker, Kubernetes). 


Work Environment

Location: This position is on-site. The role is based at our Bellevue, WA (or Pasadena, CA) office, and employees are expected to work on-site during regular business hours.

  

Compensation

The compensation for this position will be competitive and commensurate with experience. The estimated salary range for this role is 160,000 - 200,000 USD. 

What We Offer

  • Opportunity to work at the forefront of AI technology 

  • Collaborative and innovative work environment 

  • Competitive salary and benefits package 

  • Professional development and growth opportunities 

  • Chance to make a significant impact on the company's success 

 

Equal Employment Opportunity

  • ProRata is an Equal Opportunity Employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. All employment decisions are made based on qualifications, merit, and business needs.  

   

California Specific Notices

  • At-Will Employment: Employment at ProRata.ai is at-will. This means that either the employee or the employer may terminate employment at any time, with or without cause or prior notice.

  • Salary Disclosure: In compliance with California law, salary information is provided to ensure transparency and fairness. 

  • California Consumer Privacy Act (CCPA): ProRata.ai complies with the CCPA. Personal information collected during the recruitment process will be used for employment purposes only.

Top Skills

AWS
Azure
Docker
Elasticsearch
GCP
Java
Kubernetes
OpenSearch
Python
Rust
