Solutions Architect, Generative AI Inference and Deployment

Sorry, this job was removed at 08:08 p.m. (PST) on Thursday, Jul 24, 2025

In-Office or Remote

6 Locations

Similar Jobs

Headway

Counsel

Yesterday

Easy Apply

Remote

USA

173K-254K

Senior level

Consumer Web • Healthtech • Professional Services • Social Impact • Software

As Counsel for Regulatory and Privacy, you will guide product and operations teams on healthcare regulations and privacy laws, ensuring compliance while enabling innovation.

Top Skills: Artificial Intelligence, Health Technology, HIPAA

CDW

Security Engineer

Yesterday

Remote or Hybrid

126K-185K Annually

Senior level

Artificial Intelligence • eCommerce • Information Technology • Internet of Things • Automation

Design, implement, and manage IAM and IGA solutions. Collaborate with teams for compliance and support identity management processes, ensuring security and optimization.

Top Skills: Azure Active Directory, Entra ID Governance, ForgeRock, HRM Systems (Workday, PeopleSoft), Identity and Access Management (IAM), Identity Governance and Administration (IGA), MIM, OAuth, OIDC, Okta, SailPoint, SAML, SCIM

Hiya Inc.

Back-end Engineer

Yesterday

Remote or Hybrid

USA

Junior

Artificial Intelligence • Cloud • Mobile • Security • Software

Design, code, and operate high performance backend services and data processing systems primarily using Scala and AWS technologies.

Top Skills: Spark, AWS, DynamoDB, Kafka, Postgres, Redshift, Scala

NVIDIA is seeking outstanding AI Solutions Architects to assist and support customers who are building solutions with our newest AI technology. At NVIDIA, our solutions architects work across different teams and enjoy helping customers with the latest Accelerated Computing and Deep Learning software and hardware platforms. We're looking to grow our company and build our teams with the smartest people in the world. Would you like to join us at the forefront of technological advancement? You will become a trusted technical advisor to our customers and work on exciting projects and proof-of-concepts focused on inference for Generative AI and Large Language Models (LLMs). You will also collaborate with a diverse set of internal teams on performance analysis and modeling of inference software. You should be comfortable working in a dynamic environment and have experience with Generative AI, LLMs, and GPU technologies. This role is an excellent opportunity to work in an interdisciplinary team at NVIDIA!

What You Will Be Doing:

Partnering with other solution architects, engineering, product and business teams. Understanding their strategies and technical needs and helping define high-value solutions
Dynamically engaging with developers, scientific researchers, and data scientists, gaining experience across a range of technical areas
Strategically partnering with lighthouse customers and industry-specific solution partners targeting our computing platform
Working closely with customers to help them adopt and build creative solutions using NVIDIA technology and MLOps solutions
Analyzing performance and power efficiency of AI inference workloads on Kubernetes
Some travel to conferences and customers may be required

What We Need To See:

BS, MS, or PhD in Computer Science, Electrical/Computer Engineering, Physics, Mathematics, other Engineering or related fields (or equivalent experience)
5+ years of hands-on experience with Deep Learning frameworks such as PyTorch and TensorFlow
Strong fundamentals in programming, optimizations, and software design, especially in Python
Proficiency in problem-solving and debugging, particularly for GPU orchestration and Multi-Instance GPU (MIG) management within Kubernetes environments
Experience with containerization and orchestration technologies, monitoring, and observability solutions for AI deployments
Strong knowledge of the theory and practice of LLM and DL inference
Excellent presentation, communication and collaboration skills

Ways To Stand Out From The Crowd:

Prior experience with DL training at scale, deploying or optimizing DL inference in production
Experience with NVIDIA GPUs and software libraries such as NVIDIA NIM, Dynamo, TensorRT, TensorRT-LLM
Excellent C/C++ programming skills, including debugging, profiling, code optimization, performance analysis, and test design
Familiarity with parallel programming and distributed computing platforms

The base salary range is 148,000 USD - 235,750 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

4545 Roosevelt Way NE 6th Floor, Seattle, Washington, United States, 98105

What you need to know about the Seattle Tech Scene

Home to tech titans like Microsoft and Amazon, Seattle punches far above its weight in innovation. But its surrounding mountains, sprinkled with world-famous hiking trails and climbing routes, make the city a destination for outdoorsy types as well. Established as a logging town before shifting to shipbuilding and logistics, the Emerald City is now known for its contributions to aerospace, software, biotech and cloud computing. And its status as a thriving tech ecosystem is attracting out-of-town companies looking to establish new tech and engineering hubs.

Key Facts About Seattle Tech

Number of Tech Workers: 287,000; 13% of overall workforce (2024 CompTIA survey)
Major Tech Employers: Amazon, Microsoft, Meta, Google
Key Industries: Artificial intelligence, cloud computing, software, biotechnology, game development
Funding Landscape: $3.1 billion in venture capital funding in 2024 (Pitchbook)
Notable Investors: Madrona, Fuse, Tola, Maveron
Research Centers and Universities: University of Washington, Seattle University, Seattle Pacific University, Allen Institute for Brain Science, Bill & Melinda Gates Foundation, Seattle Children’s Research Institute
