Senior Software Engineering Manager - Alerting and Monitoring

El Segundo, CA 90245
About the Company

At AT&T, we're connecting the world through the latest tech, top-of-the-line communications and the best in entertainment. Our groundbreaking digital solutions provide intuitive and integrated experiences for millions of customers across online, retail and care channels. Join our mission to deliver compelling communication and entertainment experiences to customers around the world as we continue to evolve as a technology-powered, human-centered organization. As part of our team, you'll transform the way we deliver a seamless customer experience with digital at the center of all you do. In our world, digital is much larger than just an eCommerce channel; we are transforming all channels to perform digitally as one team to create a better customer experience. As we move through 2021, the digital transformation will revolutionize the digital space and you can build a career that will propel your future.

About the Team

Our SPT Operations Alerting and Monitoring Enablement team is looking for an experienced Senior Software Engineering Manager to help us deliver best-in-class monitoring and alerting capabilities across our online digital ecosystem. As a Senior Software Engineering Manager, you will be responsible for overseeing the daily operations and execution of alerting and monitoring enablement. You will also act as a product owner, helping to manage, prioritize and execute the backlog of new feature enhancement requests using Agile/Scrum methodologies.

About the Job

This is a fast-paced, critical position; the conversion is being done very quickly and will require strong technical, managerial, and educational experience with a solid understanding of digital ecommerce, self-service capabilities and the various tools used to manage and support our critical online customer journeys across the att.com platform and the native (iOS/Android) application.

As part of the technical team, you'll ensure alerts and monitors for our applications are effective and proactive. This will require investigating the backlog of product delivery work to identify upcoming changes required for alerts and monitors, so that those changes coincide with the release of the new features. You'll also work regularly with the SPT Operations Incident Management Tier 1 & 2 teams to look for new opportunities and to evaluate existing alerts and monitors.

The teams are moving to Site Reliability Engineering principles, so you'll be an evangelist for that shift. This also means other teams will create and maintain alerts; you'll ensure effective governance of those changes.

Another area of responsibility is synthetic monitoring. You'll help grow this practice, help it mature, and deliver value to leadership and to the Incident Management teams.

Experience with an Agile methodology, and the ability to oversee all parts of Agile delivery, will be another key to success.

Responsibilities and Day-to-Day View

Provide leadership, strategic direction and oversight for alerting and monitoring teams

Champion and drive Site Reliability Engineering (SRE) best practices

Interact with teams to identify new requirements and provide expertise across tool suites

Oversee improvements in monitoring and alerting capabilities across our digital platforms

Develop and deploy monitoring for ensuring application reliability and stability across the customer journeys

Maintain awareness of current monitoring technology, applicability and capabilities of tools

Manage internal customer relationships, and collaborate effectively across organizations

Ensure the effectiveness of alerts, dashboards, events and synthetic monitoring

Ensure reports are accurate and timely

Act as product owner liaison for tools and engagement of external vendors as required

Participate in all aspects of Agile/Scrum (stand-ups, grooming, retrospectives, etc.)

Required Qualifications


2(+) Years of experience with Site Reliability Engineering and operations for internet/eCommerce applications especially in large, multi-data center environments

2(+) Years in a lead or supervisory position, coaching and mentoring engineers

2(+) Years of experience in large scale site operations

Extensive experience with various site operations monitoring tools like Splunk, ElasticSearch, Dynatrace, Quantum Metric, Adobe and Catchpoint

Solution-oriented with proven success in a fast-paced environment

Strong organization and time management skills

Ability to communicate clearly and effectively with teammates and all levels of management

Excellent troubleshooting, analytical and problem-solving skills with demonstrated initiative of going the extra mile

Experience training customers on how to leverage tools to drive business value

Strong understanding and proven knowledge of Scrum/Agile methodologies

Preferred Qualifications

A Bachelor's degree in Computer Science, Information Systems, or related field from an accredited College or University

5(+) Years in a lead or supervisory position, coaching and mentoring engineers

5(+) Years of experience in large scale site operations

5(+) Years of experience with various site operations monitoring tools like Splunk, ElasticSearch, Dynatrace, Quantum Metric, Adobe and Catchpoint

1(+) Years of experience in architecture and design of systems using Microservices architecture

2(+) Years of experience in cloud technologies: AWS, Azure, OpenStack, Docker, Kubernetes etc.

Excellent written and verbal communication skills with demonstrated ability to present complex technical information in a clear manner to peers, developers, and senior leaders

Experience with Continuous Integration and Continuous Delivery concepts and tools

AT&T is leading the way to the future for customers, businesses and the industry. We're developing new technologies to make it easier for our customers to stay connected to their world. Together, we've built a premier integrated communications and entertainment company and an amazing place to work and grow. Team up with industry innovators every time you walk into work, creating the world you always imagined. Ready to #transformdigital with us? Apply now!

Click here to view this job description in Career Intelligence. (https://att.empath.net/my/jobs/40491310)

Job Code - 40491310
We expect employees to be honest, trustworthy, and operate with integrity. Discrimination and all unlawful harassment (including sexual harassment) in employment is not tolerated. We encourage success based on our individual merits and abilities without regard to race, color, religion, national origin, gender, sexual orientation, gender identity, age, disability, marital status, citizenship status, military status, protected veteran status or employment status.

Posted: 2022-01-28 Expires: 2022-07-27