Wells Fargo

Senior Manager of Kubernetes Observability

Posted 10 Hours Ago
Hybrid
Irving, TX
6-6 Annually
Senior level
About the Role
We are seeking a Senior Manager of Kubernetes Observability to provide strategic leadership for the design, standardization, and scaled execution of our enterprise observability ecosystem across Kubernetes and OpenShift platforms, including Azure Kubernetes Service (AKS) and Google Kubernetes Engine (GKE). This role is responsible for ensuring a robust, unified, and automated observability platform that enables reliability, performance, and operational excellence across all clusters and workloads in hybrid and multi-cloud environments.
As a senior technology leader, you will define the long-term vision and operating model for metrics, logging, tracing, eventing, and monitoring standards across on-prem, cloud-managed, and hosted Kubernetes platforms. You will guide multiple engineering teams to execute consistently against this strategy, ensuring full instrumentation, proactive issue detection, reduced MTTR, and improved platform stability. Through strong architectural direction, organizational alignment, and focused mentorship, you will elevate engineering maturity and ensure developers and SREs have actionable insights that accelerate innovation and support enterprise growth at scale.
Key Responsibilities
Kubernetes Observability Strategy & Operating Model
  • Define the target-state vision and multi-year roadmap for observability across Kubernetes, OpenShift, AKS, and GKE, including metrics, logging, tracing, eventing, and alerting standards.
  • Establish a unified observability operating model that ensures consistency, scalability, and reuse across on-prem, cloud-managed, and multi-cloud Kubernetes environments.
  • Define success metrics and outcomes that measure observability effectiveness, reliability improvements, and reductions in MTTR across all platforms.

Platform Architecture, Standardization & Instrumentation
  • Set architectural direction for enterprise observability platforms, tooling, and telemetry pipelines across Kubernetes, OpenShift, AKS, and GKE.
  • Establish standardized instrumentation patterns for clusters, workloads, control planes, and platform services, ensuring complete and consistent telemetry coverage regardless of Kubernetes distribution or cloud provider.
  • Drive convergence toward unified observability frameworks that abstract provider-specific differences while preserving deep platform insight.

Automation, Telemetry Workflows & Adoption
  • Drive automation of observability onboarding and telemetry workflows across Kubernetes, AKS, and GKE to reduce manual effort and accelerate adoption.
  • Enable self-service observability capabilities that allow developers and SREs to easily instrument, monitor, and troubleshoot workloads across cloud and on-prem clusters.
  • Ensure observability is embedded by default into platform, infrastructure-as-code, and application delivery pipelines.

Reliability, Monitoring & Operational Excellence
  • Enable proactive issue detection through scalable alerting frameworks, actionable dashboards, and standardized monitoring practices across all Kubernetes platforms.
  • Improve reliability and performance visibility for workloads running on OpenShift, AKS, and GKE, reducing reliance on reactive troubleshooting.
  • Partner with SRE and operations teams to continuously improve incident response, post-incident learning, and preventative engineering across hybrid and multi-cloud environments.

Leadership, Organization & Cross-Team Alignment
  • Lead, mentor, and develop engineering leaders and teams responsible for observability platform components and services.
  • Align platform, SRE, cloud, and application teams around shared observability standards and operational goals across Kubernetes, AKS, and GKE.
  • Strengthen cross-team collaboration and engineering rigor to raise overall organizational maturity in observability and operations.

Required Qualifications
  • 6+ years of software engineering experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, or education.
  • 3+ years of management or leadership experience.
  • 5+ years of experience in platform engineering, reliability engineering, or observability-focused technical leadership roles, or equivalent demonstrated experience.
  • 6+ years of experience with Grafana and Splunk.
  • 5+ years of experience with Kubernetes observability concepts, including metrics, logging, tracing, eventing, and monitoring platforms, across OpenShift, AKS, and GKE.

Desired Qualifications
  • 6+ years of people management or senior technical leadership experience guiding multiple engineering teams.
  • Demonstrated success defining and scaling enterprise observability platforms across large, multi-cloud Kubernetes environments.
  • Strong understanding of SRE, operational excellence, and reliability engineering practices.
  • Experience driving automation and standardization to reduce MTTR and operational toil.
  • Proven ability to influence across platform, infrastructure, cloud, and application teams.
  • Strong executive communication skills, including the ability to articulate strategy, tradeoffs, and outcomes to senior stakeholders.

Job Expectations
  • There is no visa sponsorship available for this position.
  • There is no relocation allowance available for this position.
  • This position requires working in one of the posted locations in a hybrid environment.

Top Skills

Azure Kubernetes Service
Eventing
Google Kubernetes Engine
Grafana
Kubernetes
Logging
Metrics
Monitoring
OpenShift
Splunk
Tracing

