Senior Site Reliability Engineer
Our mission is to protect life.
We’re out to make the world a safer place by solving big problems and taking on the public safety challenges of our time. From our company's inception building the TASER to a full suite of hardware and software solutions, we are focused on providing police agencies with the state-of-the-art devices and services they need to successfully serve and protect us. In the next few years, we're going to eliminate the burden of paperwork in policing, so officers can increase the time they spend building relationships and serving in their communities. We’ll put video at the heart of the police record so our justice system can get to the truth faster. And we won't stop innovating until the bullet is rendered obsolete.
It’s a big mission, but it’s one we’ll pursue relentlessly every single day.
As an engineer in the SRE organization, you'll help ensure that our services are reliable, maintainable and supportable, working both within the SRE organization and with our friends and colleagues in the wider engineering & customer support organizations. Our platform handles evidence from all over the world, uploaded by our devices & software and accessed by members of the law enforcement community worldwide, so the security & availability of this platform are paramount. Your role as a site reliability engineer is to ensure that our platform is highly available and performant, even during application upgrades, security patching and cloud infrastructure outages: evidence.com is a 24/7, zero-downtime system.
We're evolving our stack from a cloud-hosted system to a truly cloud-native one. Some of the components & technologies we work with are:
Puppet, Terraform, Packer, TeamCity
Docker, Kubernetes, Prometheus, DataDog
Elasticsearch, Solr, Cassandra, Kafka, ActiveMQ, Redis, MySQL, MS SQL
Scala, Golang, .NET, Java
Your Day to Day
- Perform design, code, and process reviews to improve individual systems as well as engineering-wide practices
- Help make our platform better by contributing to design and launch reviews for new services
- Advocate for and apply best practices when it comes to availability, scalability, operational excellence and efficiency
- Mentor colleagues in areas where you have experience & knowledge to share
- Engage with engineering team on-calls, helping to debug, improve, and optimize critical backend services
- Improve & maintain our current platform, and design & deliver the next generation of our platform
Requirements
- Experience working successfully with distributed, multi-timezone teams
- A passion for removing manual toil from the system
- Experience of operating either a public or large internal service with demanding customers
- Strong working knowledge of Linux based systems
- Experience of building & operating cloud-hosted services (e.g. built on Azure, AWS or OpenStack, using tools like Terraform, CloudFormation, Packer, Puppet, Chef, SaltStack, Ansible or Spinnaker)
- Experience with container-based platforms such as Docker, Kubernetes, OpenShift or Cloud Foundry
- Experience using metrics tools (such as DataDog, Prometheus or Graphite/Grafana) to triage and investigate uncommon, customer-impacting failure scenarios; not every problem can be resolved by simply restarting a VM
- Production DBA experience in SQL Server and/or MySQL
- Production-scale cluster administration of Cassandra, ZooKeeper, Solr, Kafka or Elasticsearch
Compensation and Benefits
- Competitive salary and 401K with employer match
- Discretionary paid time off
- Robust parental leave policy
- An award-winning office/working environment
- Ride-alongs with real police officers in real-life situations: see them use technology and get inspired
- And more...
We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.