Atlan

Staff Software Engineer (Platform Infra)

Posted 9 Days Ago
Remote
Hiring Remotely in India
Senior level
You will lead efforts to manage Kubernetes infrastructure, automate deployment processes, enhance observability, and support tenant lifecycle management.
The summary above was generated by AI
Who We Are

Atlan is building the missing context layer for data and AI, helping enterprises close the AI value chasm. Today, 95% of AI pilots fail because AI systems don’t understand the context behind data: what it means, how it’s governed, and how it should be used.

Atlan connects to every part of the modern data and AI stack to unify this context into a single, shared layer that both humans and AI agents can rely on.
With Atlan, teams can discover, understand, and trust their data; build and collaborate on a shared body of knowledge; and activate that context across analytics, operations, and AI workflows.

Trusted by global enterprises like Mastercard, Workday, General Motors, Unilever, Ralph Lauren, FOX, Nasdaq, and Medtronic, we’re backed by world-class investors including GIC, Insight Partners, Meritech, Peak XV, and Salesforce Ventures.

What You’ll Own

As an engineer on the Foundation Platform team, you will be responsible for the production infrastructure that powers Atlan’s context layer for AI across AWS, Azure, and GCP, using AI-assisted development tools (such as Claude Code and Cursor) as a natural part of your daily workflow.

You will:

  • Own and evolve our multi-tenant infrastructure on Kubernetes, including dedicated clusters per customer and the full tenant lifecycle (provisioning, scaling, migration, and offboarding).

  • Make our GitOps deployments faster and safer by improving the ArgoCD- and Helm-based pipeline that deploys 1,000+ applications across hundreds of tenants.

  • Replace manual infrastructure runbooks (for example, Kubernetes upgrades, Private Link setups, DR drills, cluster onboarding) with reliable automation using Infrastructure-as-Code and workflow engines.

  • Strengthen observability and efficiency by improving our logging, metrics, and alerting stack and using it to drive better reliability, visibility, and meaningful cloud cost reduction.

  • Lead customer-facing infrastructure work and incidents end to end, and turn what you learn into clear runbooks, dashboards, and Claude Skills that help both humans and AI agents operate the platform.

What You Bring

  • 10+ years in platform engineering, infrastructure, SRE, or backend systems at a SaaS company, with high ownership, strong written/async communication, and enthusiasm for AI-native development tools.

  • Deep hands-on experience operating Kubernetes in production: managing clusters, upgrades, networking and RBAC, and multi-tenant concerns, not just deploying apps.

  • Strong GitOps and Helm experience (for example ArgoCD or similar) at meaningful scale, including dealing with sync failures, drift, chart complexity, and improving deployment safety and speed.

  • Production-quality infrastructure automation skills in Go or Python; familiarity with TypeScript is a plus.

  • Solid cloud and Infrastructure-as-Code foundation: deep experience with at least one major cloud (AWS, GCP, or Azure), and experience designing, writing, and reviewing substantial Terraform or Crossplane modules.

  • Comfort debugging end to end across GitOps pipelines, Kubernetes, and cloud provider layers when deployments or tenants are stuck.

  • Experience with multi-tenant SaaS infrastructure, observability stacks (logs, metrics, traces, dashboards), and practical cloud cost optimisation (for example autoscaling, instance strategy, or savings mechanisms), ideally with exposure to workflow engines such as Temporal or internal self-service / developer platform tooling.

What Great Looks Like
The ideal candidate:

  • Has led at least one major, cross-cutting infrastructure initiative end to end, such as a deployment pipeline overhaul, Kubernetes upgrade framework, multi-tenant migration, DR architecture, or cost programme, and can explain the system-level impact across reliability, cost, and developer experience.

  • Can walk through a Kubernetes and GitOps platform they have built or significantly evolved, and how they improved deployment safety, speed, and operability for other teams.

  • Has clear examples of turning manual runbooks into automation for workflows like upgrades, DR, or networking, and making these safe, repeatable, and well-documented.

  • Uses observability and cost signals together to drive better reliability and meaningful cloud savings, and is confident owning customer-facing infrastructure and incidents end to end.

  • Acts as a technical multiplier through thoughtful design docs, reviews, documentation, and tools or Claude Skills that reduce “how do I do X?” questions for the whole team.

Why Atlan?

Joining Atlan means being part of a global movement to help data teams do their life’s best work. Here’s what you can expect:

  • Competitive Compensation: We benchmark at the top of the market and keep compensation simple: strong base salary, performance‑based variable pay, and impact‑driven equity (for most roles), so your total rewards grow in step with the value you create over time.

  • AI Native Culture: Atlan is where AI-native builders come to build the systems the future of work will run on. AI isn’t an add-on; it’s woven into how we build, think, and work every day, empowering every Atlanian to move faster and create a bigger impact.

  • Health & Wellness: From Day‑1 health, dental, vision, and mental health coverage to flexible health stipends, we design benefits that lead in each country we're in.

  • Flexible Time Off & Leave Policies: We trust you to own your energy: flexible time off and modern leave policies so you can unplug properly, support yourself and your loved ones, and come back ready to make an impact.

  • Accelerated Growth & Learning: Develop at an uncommon velocity through cutting-edge tech, complex implementations, and an experienced team that values mastery.

  • Global, Remote-First, High-Trust: Work from anywhere with a diverse team across 15+ countries, in a trust-first, async environment that gives you true flexibility and ownership over how you work.

More About Us

Atlan is building the shared context layer that enterprises need so AI can operate on trusted, governed context. The conversation has moved from data leaders asking: “Can we trust the data in our stack?” to businesses asking: “Can we trust AI inside the business?”

We are the missing infrastructure for businesses becoming AI-forward: the connective tissue between their data stack, operational systems, and AI agents.
To learn more, visit www.atlan.com and follow us on LinkedIn.

Equal Opportunity Employer

Atlan is committed to building an inclusive, diverse, and authentic workplace. We do not discriminate based on race, color, religion, national origin, age, disability, sex, gender identity or expression, sexual orientation, marital status, military or veteran status, or any other legally protected characteristic.

Recruitment Fraud Alert
Atlan only posts job openings through our official Careers page at atlan.com/careers. Any other listings or communications claiming to represent Atlan may be fraudulent. We never ask for payment during hiring. Please report suspicious activity to [email protected].
