Principal Consultant - Site Reliability Engineer Lead

In this role you'll lead a team of Site Reliability Engineers who are subject matter experts across infrastructure operations, capacity management, configuration management, and chaos engineering.

Act as a servant leader: keep the team on track to complete the epics assigned for the current PI, remove blockers, resolve issues, and pre-plan for the upcoming PI.

Represent the team to leadership up to and including officer level briefings

Serve as the escalation point for T2-T3 support of on-prem and cloud infrastructure operations.

Contribute to design and implementation of on-prem and cloud solutions which are secure, scalable, resilient, monitored, auditable and cost optimized

Design and develop solutions for Azure migrations and transformation tools

Migration of existing platforms and applications to Azure

Automation of cloud-based infrastructure deployments and maintenance

Build, manage and maintain tools for deployment, monitoring and operations.

Evaluate and recommend tools, technologies, and processes to ensure the highest quality and performance is achieved. Focus on scalability, security and availability of all infrastructure and processes.

Identifying and addressing infrastructure deficiencies, availability gaps, and performance bottlenecks

Help to determine technical feasibility of solutions for business requirements

Collaborate with peer organizations, product delivery teams, and support organizations on technical issues and provide guidance.

Run root cause analysis brainstorming sessions on incident resolutions and, working with DevOps teams, provide corrective and preventative measures to avoid or mitigate future incidents.

Exercise a high degree of responsibility for the processes, systems, and tools created and managed.

Shift timing (if any):

Shifts typically fall between 6 AM and 10 PM India Standard Time. Longer hours may occasionally be required.

Overall Experience: 8+ years of experience as a DevOps Engineer, with emphasis on building the overall ecosystem and on infrastructure operations supporting large-scale applications in both on-prem and cloud environments.

Solid experience with Site Reliability Engineering.

Solid experience working in a Linux Systems Administrator role.

Solid experience with Azure core cloud technologies in a high traffic production setting.

Solid experience in application migrations to cloud using native patterns

Solid experience with and understanding of cloud security, including preventative and retrospective controls.

Extensive experience devising strategies and roadmap plans for on-premise and cloud systems solutions that meet business objectives as well as PCI/non-PCI, Security/CSO governance, and OWASP Top 10 security objectives.

Extensively involved in all phases of the SDLC with a focus on Agile (SAFe)/DevSecOps methodologies, devising cloud-agnostic solutions for both on-premise and hybrid cloud from inception through design to production rollout, leveraging Azure and AWS.

Must be very seasoned in assessing technologies, platform APIs, internal and external dependencies, Capex/Opex funding strategies, legal compliance, time-to-market, customer-driven solution evaluations, SME training, in-house resource evaluations, avoidance of vendor lock-in, KPIs, and TCO/cost-saving aspects in order to run or lead this team.

Experience devising design and architecture for CI/CD, DevSecOps, auto scaling, and Site Reliability Engineering objectives through automation (Infrastructure as Code/Platform as Code), using Ansible, Helm, or similar tools for configuration and Terraform for provisioning infrastructure.

Experience with MuleSoft or any gateway architecture solution, such as the StrongLoop IBM micro-gateway, is also desired.

Experience with streaming solutions such as Kafka, or cloud equivalents such as Event Hub, is a big plus.

Knowledge of CDN (Akamai), load balancers, firewalls, DNS, proxies, reverse proxies, VPN tunnels, etc., to meet the needs of Layer 3 and Layer 7 architectures, is a must-have requirement for both on-premise and cloud-equivalent solutions.

Experience with Cassandra or any NoSQL database is a big plus.

Proven hands-on technical, managerial/leadership expertise, leading teams of geographically dispersed employees and contractors working on analyzing, defining, proposing IT platform and Infrastructure solutions for portfolios of DOTCOM.

Mainly focused on being hands-on in leading and assisting systems architects and leads with exploring the latest technologies and rolling out platform solutions, with special focus on highly scalable, self-healing, nimble, flexible infrastructure solutions for business portfolios.

Experience in democratizing platform and systems dashboards, leveraging both real-time and synthetic monitoring solutions with a primary focus on Site Reliability Engineering, is mandatory. Examples: Dynatrace, AppDynamics, or New Relic for APM monitoring, and EFK/ELK/Splunk for log monitoring. Experience with synthetic monitoring components such as Catchpoint would be a big plus.

Extensively seasoned in managing, coaching/mentoring, budgeting, vendor management, etc.

Experience with performance tuning in on-prem & cloud environment

Experience architecting, implementing, and managing monitoring solutions for production cloud environments

Build and manage on-prem Kubernetes (K8s) services, Nginx, application gateways, load balancers, Redis, web servers, app servers, cache engines, configuration management, CI/CD, Git, Jenkins, Docker, Nexus, Maven: 4 - Advanced

Solid experience building and managing core Azure cloud technologies such as: Azure DevOps, VMSS, VNet, Azure Load Balancer, Azure Application Gateway, Azure Private Link, Cosmos DB, Azure Monitor/Application Insights, AKS, Azure Cache, Event Hub, Azure Functions: 4 - Advanced

Solid experience building cloud automation/orchestration solutions with technologies such as: Terraform, CloudFormation, Ansible, Chef, Puppet: 4 - Advanced

Experience in designing and implementing highly available cloud/hybrid cloud network solutions: 4 - Advanced

Experience with application performance management (APM), logging, tracing, and other monitoring tools like Dynatrace, Grafana, Prometheus, Nagios, ELK, Azure Insights: 3 - Advanced

Knowledge of and experience with MuleSoft architecture, development, and administration: 2 - Novice

Knowledge & demonstrated experience in Agile methodologies and practice

Ability to adapt to a rapidly changing environment and technologies

Excellent written and verbal English communication skills to work in a Global team

Secondary / Desired skills:

Experience in Agile, Lean Agile and/or Scaled Agile methodologies: 2 - Novice (limited experience)

Experience in the following technologies: Azure DevOps, VMSS, VNet, Azure Load Balancer, Azure Application Gateway, Azure Private Link, Cosmos DB, Azure Monitor/Application Insights, AKS, Azure Cache, Event Hub, Azure Functions, AWS EC2, ALB/ELB, RDS, S3, Lambda, API Gateway, CloudFront, SNS, SQS, DynamoDB, CloudWatch, ElastiCache, EKS, Ansible, Terraform, shell scripting, Kubernetes, Docker, Linux administration (RHEL/CentOS/Ubuntu), Kafka, RabbitMQ, Redis, Cassandra, MongoDB, NGINX, OpenStack, Git, Jenkins, Splunk, ELK, Dynatrace, New Relic, Grafana, Prometheus, MuleSoft

Additional information (if any): Must be willing to work shift duties. Willingness to learn is very important, as AT&T offers an excellent environment to learn digital transformation skills such as cloud, big data, AI, and full stack.

Education Qualification: Bachelor's/Master's degree in Computer Science or a related field
We expect employees to be honest, trustworthy, and operate with integrity. Discrimination and all unlawful harassment (including sexual harassment) in employment is not tolerated. We encourage success based on our individual merits and abilities without regard to race, color, religion, national origin, gender, sexual orientation, gender identity, age, disability, marital status, citizenship status, military status, protected veteran status or employment status.
Posted: 2021-10-08 Expires: 2021-12-27