Site Reliability Engineer - SRE

* Salary: R$ 2,000 to R$ 5,000 per month (estimated)

* The amount shown is an estimate calculated from public data and market benchmarks. We do not guarantee that this is the salary offered for this specific position.

Area: Other

Level: Senior

Job details

  • R$ 100 per hour
  • Posted 2 days ago

Qualifications

  • Computer Science
  • Ansible
  • DevOps
  • AWS Certification
  • Databases
  • Terraform
  • New Relic
  • Systems Engineering
  • Angular
  • Contracts
  • Software Development
  • GitHub
  • APIs
  • S3
  • Apache
  • Root Cause Analysis
  • Cybersecurity
  • Leadership
  • Jenkins
  • Python
  • Debugging

Full job description

Senior SRE

Location: Argentina, Bolivia, Mexico, Paraguay, Colombia

Are you looking for a career that makes a positive difference in your life and reimagines learners and educators across the globe? Do you want to work with fun and social people in a positive and engaged virtual office environment? We are hiring a **Senior Site Reliability Engineer** who will build and support reliable, high-capacity, and well-performing systems in support of our mission to protect and improve our customer platforms, with an ever-watchful eye on reliability, security, performance, cost, and operational excellence. As a Senior Site Reliability Engineer, you will collaborate in a DevOps model with product development teams, designing, deploying, and managing automation tools that increase predictability and accelerate time to market while reducing cost. Our cloud stack includes:

  • Cloud: AWS (CloudFront, S3, EC2, ECS, SES, SQS, SNS, Load Balancing, VPC, Config, Systems Manager, Lambda, API Gateway, DB services, and many more)
  • Cloud: OCI know-how is a plus (ExaCS, OCI Compute, Load Balancers, Networking, VCN, Object Storage)
  • Infrastructure as Code: Terraform
  • Programming: Python, Golang, Bash, Ansible
  • Containers: AWS ECS
  • Security: Rapid7, WAF
  • Web: Apache httpd, Apache Tomcat, Angular
  • Config Management and provisioning: Ansible, Packer
  • Telemetry: New Relic, CloudWatch, Datadog
  • DevSecOps: Artifactory, Jenkins, CircleCI, SonarQube, JFrog Xray, Control Tower, GitHub Enterprise, and more

Your contributions

  • Cloud Engineering
      • Collaborate with product development teams in a DevOps model, designing, deploying, and managing automation tools to enhance predictability and accelerate time to market
      • Identify the highest-impact opportunities to optimize existing systems, ensuring “right-sized” solutions in consideration of technical and business constraints
      • Drive initiatives to enhance system reliability and performance
      • Ensure repeatability, traceability, and transparency of our infrastructure automation (infrastructure-as-code, monitoring-as-code)
      • Participate in continual learning of the AWS ecosystem, game day scenarios, and professional conferences
      • Actively monitor AWS costs, using optimization tools to maximize ROI while meeting Service Level Objectives
  • Observability Engineering
      • Take ownership of reliability, uptime, system security, cost, operations, capacity, resiliency, and performance, and of the analysis thereof
      • Lead initiatives to improve the reliability and stability of applications and platforms, using data-driven analytics to improve service levels
      • Ensure that the architecture and deployment models are adequately designed to meet SLA commitments
      • Serve as the primary point of contact during major incidents for your application, and demonstrate the ability to identify and resolve issues that trigger on-call alarms
      • Maintain and enhance telemetry systems to improve visibility into application performance and business metrics, ensuring operational workloads are effectively managed
      • Develop, communicate, collaborate on, and monitor standard processes to promote the long-term health and sustainability of operational development tasks
  • DevSecOps
      • Support healthy software development practices, including complying with agile software development methodology and building standards for code reviews, work packaging, and continuous delivery
      • Partner with CyberSecurity to develop plans and automation that respond to new risks and vulnerabilities
  • Resiliency Engineering
      • Collaborate with dev teams to identify failure points and the blast radius of systems
      • Validate the effectiveness of monitoring and observability configurations
      • Coordinate failure injection testing
      • Observe and document steady-state production levels and growth patterns
      • Plan and forecast for seasonal growth, communicate trend lines with leadership, and enhance infrastructure scaling plans to accommodate 2x planned load
      • Coordinate improvements of existing software and infrastructure to meet resiliency goals
  • Mentor and nurture engineers across varying levels of experience; foster growth by setting high-reaching goals and providing support to achieve them
  • Expand and collaborate across different levels and stakeholder groups
  • Document and share knowledge within the organization via internal forums and communities of practice
  • Kubernetes experience is good to have, whether with EKS or with self-managed Kubernetes clusters
  • Must have used Terraform to create infrastructure within AWS. Must bring an automation-first mindset to the team.
  • On-call participation is required. You will lead triage bridges when necessary.
  • You will be expected to monitor customer experience, application metrics such as golden signals and KPIs, and infrastructure health
  • You will need to work proactively across team boundaries on a daily basis

Qualifications

  • Experience as a software engineer, with practical experience developing, debugging, and deploying enterprise applications
  • Experience with infrastructure automation technologies, preferably Terraform
  • Experience in container/container-fleet-orchestration technologies, preferably EKS or ECS
  • Versatility with troubleshooting diverse sets of hosting technologies: web server platforms, application platforms, operating systems, network components, virtualization technologies, storage, and database platforms.
  • Experience with continuous-deployment based software development lifecycles (e.g. CI/CD)
  • Experience with application caching strategies and high concurrency workloads
  • Strong communication, problem-solving, root cause analysis, and systems engineering skills
  • Ability to design and manage escalation response plans that span monitoring, reaction, response, remediation, and retrospectives in culturally aligned (proactive, customer-focused, collaborative, data-driven) ways
  • Demonstrated expertise building and managing highly scaled production infrastructure in the cloud
  • BS degree in Computer Science (or a related technical field and/or equivalent industry experience)

Job Type: Contract
Contract length: 12 months

Pay: R$100.00 per hour

Expected hours: 8 per week

Work Location: Remote