Senior SRE Engineer
Description
The team at Switch, Inc. is looking for a talented Senior SRE Engineer with extensive experience designing and running services in regulated industries to be a part of challenging initiatives targeted at solving problems that many consumers encounter every day. This position provides a great opportunity for the right individual to drive the future of SRE within Switch.
About You
You are a leader. You enjoy engineering better production systems by optimizing existing systems, building infrastructure and eliminating manual work through automation. You have extensive experience building and running an SRE team in regulated industries. And you are entrepreneurial at heart. You want to build and own services for a FinTech company building new products for consumers. You are proficient with systems management, CI/CD, database administration, network and systems design, and security and compliance. You live for automation and have experience automating build, verification and deployment processes at scale. At the end of the day, you are passionate about delivering business value to our customers. You take ownership.
What you’ll be doing
Own end-to-end availability and performance of key services and build automation to prevent problem recurrence. Automate response to all non-exceptional service conditions.
Lead by example, mentor the team and establish credibility through quality technical execution.
Create and manage an SRE team to support Switch services.
Architect, design, implement, and operate service infrastructure at scale.
Work cross-functionally with engineering teams to generate scalable solutions while operating within customer and regulatory expectations
Work with engineering teams to better address needs and enable more effective and efficient developer throughput.
Partner with QA organization to deliver exceptional services for our customers and drive improvements back into engineering teams
Assist the release manager in delivery of new features and services
Develop metrics and performance indicators to track operational, security, and infrastructure effectiveness and drive improvements
Participate in roadmap and sprint planning, execution, and retrospectives.
Ensure Switch meets compliance and security controls
Managing risk and change management within the engineering organization
Working with technology teams to ensure deployable features and services
Implementing appropriate environments and processes in support of feature development and testing
Operating services within defined Service Level Agreements
Developing and enabling continuous integration / continuous deployment (CI/CD) for engineering teams
Implementing and operating security and compliance controls
Qualifications
5+ years of experience as an SRE
2+ years of experience mentoring and leading others in an SRE capacity
Experience/knowledge administering application servers, web servers (nginx), and databases (MySQL)
Experience deploying and supporting Internet products or services
Experience architecting highly scalable, highly available systems and running them at scale within AWS
Possess excellent analytical skills, technical aptitude and a proven ability to consistently solve complex problems
Experience running or participating in change and risk management functions
Have an ability to engage effectively with business and technical teams
Ability to thrive in a dynamic, collaborative and fast-paced environment
Strong interpersonal skills as well as strong problem-solving and analytical skills
A highly collaborative participant on teams using iterative development methodology
Have an ability to work independently as well as in a team to offer and drive solutions
Strive to stay current on existing and emerging development tools, platforms, and delivery models
Maintain code quality across multiple environments
Degree in Computer Science or equivalent experience required
Skills
Strong experience with AWS technologies such as EC2, ELB, RDS, S3/EBS/Glacier and VPC
Proficiency with Docker containerization and administration.
Proficiency with Linux systems and administration
Strong understanding of networking, routing, and security concepts
Proficiency with one or more scripting languages such as JavaScript, Python, Ruby or Perl
Experience developing and deploying REST services
Experience deploying and administering integrated relational and non-relational database components within a production ecosystem
Proficiency in writing build and deployment automations
Proficient with monitoring tools and concepts
Experience with development tools such as Git
Experience with continuous integration frameworks such as Jenkins
Knowledge of Node.js, RabbitMQ, Nginx, Memcached or Redis
Nice-to-haves
Knowledge of Node.js Express, Apache Spark, Apache Hadoop or Scala
Demonstrable "data science" skills and knowledge of learning algorithms
Experience with PCI DSS or similar regulatory compliance