Specialist Site Reliability Engineer

El Segundo, CA 90245

Want to work on some of the most cutting-edge technology in the entertainment industry? Want to bring joy to those who want to watch whatever they want to watch, virtually wherever and whenever they want? Join us!

AT&T provides world-class digital content delivery and is continuing to expand its role as the largest digital video service provider in the world, diving into engineering challenges with OTT content delivery, mobile streaming, video on demand, interactive services, recommendations, cloud DVR and dynamic ad insertion. Our teams are challenged to produce innovative solutions more quickly. We are seeking talented, energetic engineers to develop and architect the future of cloud infrastructure and applications that scale seamlessly to meet the growth of services and exceed customer performance expectations. Our vibrant, fast-paced environment is the perfect place for an enthusiastic engineer to thrive.

As part of the Site Reliability Engineering team, you will personally make impactful decisions on how we support, monitor, and improve our products and services across the organization.

What to expect!
You'll be on a highly collaborative team that architects, designs, builds, deploys, tests, and supports internal software and systems providing monitoring, observability, and auto-remediation for all OTT platforms. Additionally, you will work across teams and organizations to provide architectural guidance on the reliability, operability, manageability, serviceability, and observability of new and existing services.

We work with cloud platforms, modern programming languages, and open-source contributions across a range of frameworks and technologies. In our org, we are moving away from narrowly defined roles to allow everyone to work on all aspects of their product and to try out new things.

We're a leader in diversity with a commitment to fostering an inclusive culture. We strive to be a great place to work, we embrace our responsibility to reduce our environmental impact on the planet, and we are committed to helping our customers use our technology for social good.

What You Will Do @ AT&T Site Reliability Engineering

Develop and support the implementation of monitoring capabilities for greater visibility into the platform's performance, including managing tools like Prometheus and Jaeger.
Partner with Product, Development, and Architecture on achieving built-in-quality and operational readiness of new services
Work alongside scrum teams to enable a successful DevOps transformation of services
Work with scrum and operations teams to drive constant improvement across the OV platform


What you bring to the team!

End-to-end ownership mindset and a passion for learning both software and systems.
Proactive problem-solving abilities with a deep desire to improve, innovate, challenge and change
Fantastic interpersonal skills and motivation to work together to solve problems
1+ years of experience with software application development; bonus points if you have experience working with Docker, Kubernetes, AWS or other cloud offerings
Knowledge of coding languages (e.g., Java, Node.js, Go), data structures, algorithms and distributed systems.
Interest in automation technologies such as Ansible or Terraform and in working on both software development and underlying system automation.
Understanding of microservice architecture and cloud platforms like AWS
BS/MS in Computer Science or equivalent work experience


Posted: 2020-02-25 Expires: 2020-04-26
