AllCloud

GPU Engineer

Sorry, this job was removed at 04:09 a.m. (PST) on Thursday, Jun 26, 2025
Remote
Hiring Remotely in United States

Description

GPU Engineer

Location: US / Canada (Eastern Time) - Home based

Job Type: Full-time, Permanent 

About AllCloud

AllCloud is a global professional services company providing organizations with cloud enablement and transformation tools. As an AWS Premier Consulting Partner and audited MSP, a Salesforce Platinum Partner, and a Snowflake Premier Partner, AllCloud helps clients connect their front and back offices by building a new operating model that harnesses the benefits of cloud technology, data, and analytics.

Job Summary

We are seeking an experienced GPU Engineer to join our innovative AI team at AllCloud. This role will be responsible for designing, implementing, and optimizing GPU-based infrastructure for large-scale LLM training and inference. The ideal candidate will have deep expertise in GPU architecture, parallel computing, and performance optimization for machine learning workloads. You'll work closely with our LLM Architects and ML Engineers to build and maintain the high-performance computing environment required for training our custom transformer-based language models.

Responsibilities

  • Design and implement scalable GPU clusters on AWS infrastructure for distributed LLM training
  • Optimize GPU memory usage, computational throughput, and inter-node communication for transformer model training
  • Configure and tune GPU acceleration libraries (CUDA, cuDNN, NCCL) for maximum performance
  • Implement mixed precision training and other optimization techniques to improve training efficiency
  • Architect and deploy GPU-based inference solutions that balance latency, throughput, and cost
  • Create benchmarking tools to measure and improve model training and inference performance
  • Establish monitoring and management systems for GPU resources to maximize utilization and reliability
  • Collaborate with LLM Architects to implement parallelization strategies (model, data, pipeline parallelism)
  • Troubleshoot hardware and software issues affecting GPU performance
  • Keep current with advancements in GPU technology and AI accelerator hardware


Requirements

Summary of Key Requirements

  • 5+ years of experience optimizing GPU infrastructure for machine learning workloads
  • Advanced knowledge of NVIDIA GPU architecture and CUDA programming
  • Strong understanding of high-performance computing (HPC), AI network architecture, and physical layer management
  • Experience with AWS GPU instances (e.g., P4d, P5, G5) and AWS Batch for ML workloads
  • Strong background in distributed computing and parallel processing techniques
  • Familiarity with transformer architecture and deep learning frameworks like PyTorch or TensorFlow
  • Expertise in performance profiling and bottleneck identification in GPU workloads
  • Experience with containerization (Docker) and orchestration (Kubernetes)
  • Understanding of memory optimization techniques for large language models
  • Bachelor's degree in Computer Science, Electrical Engineering, or related field (Master's preferred)

Certifications

  • AWS Certified Solutions Architect - Professional (Strongly Preferred)
  • NVIDIA-Certified Professional: Accelerated Data Science (Preferred)
  • NVIDIA-Certified Professional: AI Infrastructure or AI Networking (NCP-AIN) (Preferred)

Why work for us? 

Our team inspires progress in each other and in our customers through our relentless pursuit of excellence; you will work with leaders who promote learning and personal development.


AllCloud is an Equal Opportunity Employer and considers applicants for employment without regard to race, color, religion, sex, orientation, national origin, age, disability, genetics or any other basis forbidden under federal, provincial, or local law.


What you need to know about the Seattle Tech Scene

Home to tech titans like Microsoft and Amazon, Seattle punches far above its weight in innovation. But its surrounding mountains, sprinkled with world-famous hiking trails and climbing routes, make the city a destination for outdoorsy types as well. Established as a logging town before shifting to shipbuilding and logistics, the Emerald City is now known for its contributions to aerospace, software, biotech and cloud computing. And its status as a thriving tech ecosystem is attracting out-of-town companies looking to establish new tech and engineering hubs.

Key Facts About Seattle Tech

  • Number of Tech Workers: 287,000; 13% of overall workforce (2024 CompTIA survey)
  • Major Tech Employers: Amazon, Microsoft, Meta, Google
  • Key Industries: Artificial intelligence, cloud computing, software, biotechnology, game development
  • Funding Landscape: $3.1 billion in venture capital funding in 2024 (PitchBook)
  • Notable Investors: Madrona, Fuse, Tola, Maveron
  • Research Centers and Universities: University of Washington, Seattle University, Seattle Pacific University, Allen Institute for Brain Science, Bill & Melinda Gates Foundation, Seattle Children’s Research Institute