Staff Hardware Reliability Engineer at ServiceNow
Work matters. It’s where we spend a third of our lives. And the workplace of the future is going to be a great place. We’re dedicated to bringing that to life for people everywhere. That’s why we put people at the heart of everything we do.
People matter. Our people have a passion for learning, building, and innovating. Whether you’re an engineer, a sales professional, a finance professional, or anything in between, our roles aim to provide each person with meaningful impact and plenty of space to grow.
The Hardware Reliability Engineering team is hyper-focused on the reliability and maintenance of the hardware powering the ServiceNow Enterprise Cloud. Our team works hard in a fast-paced and ever-growing environment, but we have fun doing it.
This role has a direct impact on the reliability of the hardware powering our Cloud. In this role you’ll be responsible for delivering root cause analysis for hardware failures, seeking out ways to improve reliability of hardware components, and developing methods to test and deploy reliability updates.
What you get to do in this role:
- Perform root cause analysis for hardware failures, track failure trends, and create action items for follow-up
- Serve as an escalation point for the hardware break-fix team and less senior engineers, assisting with rare or unusual hardware faults
- Develop and maintain documentation required for the team including repair guides, training programs, and process documentation
- Troubleshoot firmware and configuration issues
- Recommend design and test procedures for achieving required levels of product reliability
- Develop scripts to support overall team needs, including methods to deploy configuration changes and firmware updates
- Coordinate with outside teams to develop methods for hardware monitoring and script development
- Work directly with hardware vendors for troubleshooting and RCAs
In order to be successful in this role, we need someone who has:
- Bachelor's degree in Computer Science (or related field) or equivalent work experience
- 7+ years supporting enterprise hardware at scale, including troubleshooting and diagnosis of rare or first-occurrence hardware faults
- 7+ years working in Linux in large-scale environments, including kernel crash dump analysis and use of other OS diagnostics for troubleshooting
- 7+ years working with enterprise-grade server hardware, including troubleshooting and maintenance
- Strong working knowledge of trend analysis tools such as Splunk
- Strong Linux scripting experience, including development of automation tools, information-gathering scripts, and monitoring scripts (preferably Python or Bash)
ServiceNow is an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, national origin, age, disability, gender identity, or veteran status. If you are an individual with a disability and require a reasonable accommodation to complete any part of the application process, or are limited in the ability or unable to access or use this online application process and need an alternative method for applying, you may contact us at [email protected] for assistance.