Adaptive ML

GPU Performance Engineer

Posted 4 Days Ago
In-Office
2 Locations
Mid level
The GPU Performance Engineer will optimize GPU code for LLM performance, contribute to the product roadmap, and engage in asynchronous communication within a collaborative team.
The summary above was generated by AI
About the team

Adaptive ML is building a reinforcement learning platform to tune, evaluate, and serve specialized language models. We are pioneering the development of task-specific LLMs using synthetic data, creating the foundational tools and products needed for models to self-critique and self-improve based on simple guidelines. Adaptive Engine enables companies to build and deploy the best LLMs for their business. Our founders previously worked together to create state-of-the-art open LLMs. We closed a $20M seed with Index & ICONIQ in early 2024 and are live with our first enterprise customers (e.g., AT&T).

Our Technical Staff develops the foundational technology that powers Adaptive ML in alignment with requests and requirements from our Commercial and Product teams. We are committed to building robust, efficient technology and conducting at-scale, impactful research to drive our roadmap and deliver value to our customers.

About the role

As a GPU Performance Engineer in our Technical Staff, you will help ensure that our LLM stack (Adaptive Harmony) delivers state-of-the-art performance across a wide variety of settings, from latency-bound regimes, where serving requests with sub-second response times is key, to throughput-bound regimes during training and offline inference. You will help build the foundational technology powering Adaptive ML by delivering performance improvements directly to our clients as well as to our internal workloads. We are looking for self-driven, business-minded, and ambitious individuals interested in supporting real-world deployments of a highly technical product. As this is an early role, you will have the opportunity to shape our research efforts and product as we grow.

This is an in-person role based at our Paris or New York office.

Your responsibilities
  • Build and maintain fast and robust GPU code, focusing on delivering performance improvements in real-world applications;

  • Write high-quality software in CUDA, CUTLASS, or Triton with a focus on performance and robustness;

  • Profile dedicated GPU kernels, optimizing across latency/compute-bound regimes for complex workloads;

  • Contribute to our product roadmap by identifying promising trends that can improve performance;

  • Report clearly on your work to a distributed collaborative team, with a bias for asynchronous written communication.

Your (ideal) background

The background below offers only a few pointers we believe could be relevant. We welcome applications from candidates with diverse backgrounds; do not hesitate to get in touch if you think you could be a great fit, even if the description below doesn't fully match you.

  • An M.Sc./Ph.D. in computer science, or demonstrated experience in software engineering, preferably with a focus on GPU optimization;

  • Strong programming skills, preferably with a focus on systems and general-purpose GPU programming;

  • A track record of writing high-performance kernels, preferably with a demonstrated ability to reach state-of-the-art performance on well-defined tasks;

  • Contributions to relevant open-source projects, such as CUTLASS, Triton and MLIR;

  • A passion for the future of generative AI, and an eagerness to build foundational technology to help machines deliver more singular experiences.

Benefits
  • Comprehensive medical (health, dental, and vision) insurance;

  • 401(k) plan with 4% matching (or equivalent);

  • Unlimited PTO — we strongly encourage at least 5 weeks each year;

  • Mental health, wellness, and personal development stipends;

  • Visa sponsorship if you wish to relocate to New York or Paris.

Top Skills

CUDA
CUTLASS
Triton

