System Engineer - Core Platform
At Qualtrics, our mission is to close experience gaps—the costly differences between what customers and employees expect, and what they’re receiving.
9,000+ organizations worldwide and more than 80% of the Fortune 100 rely on the Qualtrics Experience Management Platform™ to collect, analyze, and act on feedback—more feedback than they ever thought possible. With Qualtrics XM, organizations can manage the four core experiences of business—customer, employee, product, and brand experience. Organizations can be at every meaningful touchpoint, for every experience, and predict what will resonate most with customers and employees.
About the Core Platform Team
Our team is responsible for building critical systems and services that are used by all of Qualtrics’ product line teams and that accelerate their efforts to deliver customer value. Examples range from common libraries, to our A/B testing service, to our asynchronous job ecosystem, which includes scheduling, queueing, progress tracking, workers, and notifications. Our ambition for 2018 is to provide a unified messaging platform based on Kafka that makes it easy for teams to take advantage of async pub/sub architectures. The ideal candidate will have experience running a Kafka cluster and related dependencies such as ZooKeeper.
As a System Engineer on the Core Platform team, you will have a significant impact on our operational success as we strive to be a “gold standard” for Qualtrics Engineering teams. You will wear a variety of hats: deploying code, creating and enhancing deployment pipelines, applying security updates, adding and tuning alerts, and helping improve our metrics collection and reporting to ensure the team’s success going forward. There’s plenty of opportunity for creativity: we are constantly looking for ways to improve, and you will be encouraged to play to your strengths, whether through improvements driven by load or performance testing, moving to better engineering practices, coding prototypes or bug fixes, or any other contribution that helps the team reach our goals with high quality.
Responsibilities
- You will build systems to measure reliability of services and actively discover trends needing attention
- You will fine-tune services to reduce latency, conduct operational readiness reviews, and automate continuous delivery of software changes
- You will maintain service level agreements and build systems to support them
- You will manage the health of distributed specialized server fleets and the software running on them
- You will execute regular maintenance activities for services including outage handling, security enhancements, and root cause fixes
- You will assist the team in our goal to release a unified messaging platform
- You will enhance team runbooks and wikis to make everyone better
Qualifications
- Bachelor's degree, preferably in Computer Science, or in a hard science or Information Systems
- 2+ years of software development or operations experience
- A high degree of organization and attention to detail
- Excellent leadership, verbal, and written communication skills
- Demonstrated skill and passion for operational excellence
- Experience running Kafka and ZooKeeper clusters
- Experience with AWS technologies, Docker, and Jenkins
- Experience with shell scripts and/or other scripting languages
- Experience with Unix/Linux platforms
- Proficiency in solving problems and identifying the root cause of issues
- Experience running and maintaining highly available distributed systems
- Capability to retain composure and communicate effectively during operational incidents
- Ability to understand large systems, drilling down to code level