Sayari is the judgment infrastructure for trustworthy AI in economic security and commercial risk. The Sayari Commercial World Model resolves 11.7B+ primary-source records from 250+ jurisdictions forming the ground truth of global commerce. A Judgment Ontology, encoding over a decade of investigative tradecraft, and Superconductor, an agentic orchestration platform, deliver AI that reasons like an expert analyst, shows its work, and traces every finding to its source. Trusted by U.S. Customs and Border Protection, HM Revenue & Customs, and Fortune 500 enterprises, Sayari is used by thousands of professionals across 35+ countries to secure supply chains and dismantle illicit networks. Headquartered in Washington, D.C., with offices in London, Singapore, Tokyo, and Tel Aviv.
As a Data Engineer at Sayari, you will be the engine behind the world’s most comprehensive commercial world model. You will join a high-autonomy team responsible for building and scaling the complex orchestration systems that transform billions of primary-source records into actionable intelligence. This is a role for a "builder" who respects the complexity of large-scale ETL and graph databases and is "PhD-curious" about the future of AI-native data products and modern orchestration.
JOB RESPONSIBILITIES
- Design, build, and maintain scalable data pipelines using Python, Spark, and Airflow to support our core data acquisition and entity resolution engines.
- Collaborate cross-functionally with AI/ML and Product teams to implement new features and AI-native products.
- Proactively identify and resolve bottlenecks in our complex ETL processes, bringing a fresh perspective to refine and optimize our existing codebase.
- Contribute to a robust engineering culture through rigorous code reviews, unit testing, and clear communication of design decisions.
- Own the end-to-end delivery of roadmap tasks within two-week sprints, ensuring work meets high standards for quality, documentation, and performance.
- Participate in roadmap planning and story refinement, eventually taking ownership of major epics that drive our long-term product defensibility.
Required
- Professional proficiency in Python and experience contributing to shared codebases using Git (branching, PRs, code reviews).
- Demonstrated experience working with relational databases (PostgreSQL/BigQuery) and an interest in or familiarity with graph databases.
- Familiarity with distributed computing (Spark) or a strong desire to master it.
- Strong collaborative skills and the ability to work effectively in an Agile, sprint-based environment.
- A "self-directed" orientation: ability to move tasks from "assigned" to "complete" with high autonomy and clear communication.
Preferred
- Experience with Django, Scala, or Scrapy.
- Hands-on experience with workflow orchestration tools like Airflow.
- Experience or strong interest in LLM tuning, deployment, and AI engineering best practices.
- Experience working with international or non-English datasets.
- Prior experience working with high-scale, complex data pipelines.
The target base salary for this position is $90,000-$120,000 plus company bonus and equity. Final offer amounts are determined by multiple factors including location, local market variances, candidate experience and expertise, internal peer equity, and may vary from the amounts listed above.
Benefits:
- 100% fully paid medical, vision, and dental for employees and their dependents
- Generous time off; we observe all US federal holidays, close our office for a winter break (12/24-12/31), in addition to granting 18 PTO days and 10 sick days
- Outstanding compensation package; competitive commissions for revenue roles and bonuses for non-revenue positions
- A strong commitment to diversity, equity, and inclusion
- Eligibility to participate in additional benefits such as 401k match up to 5%, 100% paid life insurance (up to $100,000 coverage),, and parental leave
- A collaborative and positive culture - your team will be as smart and driven as you
- Limitless growth and learning opportunities
Similar Jobs
What you need to know about the Seattle Tech Scene
Key Facts About Seattle Tech
- Number of Tech Workers: 287,000; 13% of overall workforce (2024 CompTIA survey)
- Major Tech Employers: Amazon, Microsoft, Meta, Google
- Key Industries: Artificial intelligence, cloud computing, software, biotechnology, game development
- Funding Landscape: $3.1 billion in venture capital funding in 2024 (Pitchbook)
- Notable Investors: Madrona, Fuse, Tola, Maveron
- Research Centers and Universities: University of Washington, Seattle University, Seattle Pacific University, Allen Institute for Brain Science, Bill & Melinda Gates Foundation, Seattle Children’s Research Institute



