Lead Site Reliability Engineer

2 days ago


Colombia Minacs Full-time

# Lead Site Reliability Engineer - Bilingual

* **Location**: Colombia
* **Language**: English

## Summary

As a Lead Site Reliability Engineer, you’ll play a strategic role in shaping and scaling our DevSecOps ecosystem. You’ll lead the design and implementation of automated CI/CD pipelines, enforce enterprise-grade security and compliance standards, and drive reliability across the entire software delivery lifecycle. Partnering closely with development and operations teams, you’ll define best practices, optimize deployment workflows, and ensure our applications are resilient, observable, and continuously improving. Your expertise will be key to accelerating innovation while maintaining the highest levels of quality and performance.

## Description

**Key Responsibilities:**

* **Stakeholder Management**: Work with key technology stakeholders to deliver SRE strategies and capabilities that drive and support the digital transformation agenda.
* **Architect and Optimize CI/CD Pipelines**: Design and maintain cloud-native CI/CD workflows using tools like GitHub Actions, Jenkins, or ArgoCD. Automate build, test, and deployment processes for microservices across Kubernetes clusters and multi-cloud environments.
* **Implement DevSecOps Practices**: Integrate security into every stage of the pipeline, automating vulnerability scans, secrets management, and policy enforcement using tools like Snyk, HashiCorp Vault, and OPA.
* **Ensure High Availability and Resilience**: Build fault-tolerant systems using cloud-native patterns (e.g., self-healing, auto-scaling, blue/green deployments). Leverage Kubernetes, service meshes, and distributed tracing to maintain performance and uptime.
* **Monitor, Alert, and Respond**: Deploy observability stacks (Prometheus, Grafana, ELK, OpenTelemetry) to monitor system health. Define SLOs/SLIs, set up intelligent alerting, and lead incident response and postmortems.
* **Manage Infrastructure as Code (IaC)**: Use Terraform and cloud vendor tools to provision and manage cloud resources. Maintain version-controlled infrastructure and enforce change management practices.
* **Enforce Compliance and Governance**: Ensure systems meet regulatory and organizational standards (e.g., SOC 2, HIPAA, ISO 27001). Automate audit trails and implement continuous compliance checks.
* **Collaborate Across Engineering Teams**: Partner with developers, QA, and platform engineers to embed reliability and security into the SDLC. Advocate for cloud-native best practices and drive adoption of scalable patterns.
* **Mentor and Lead by Example**: Guide junior engineers, conduct technical reviews, and foster a culture of ownership, automation, and continuous learning.
* **Continuously Improve Systems and Processes**: Identify performance bottlenecks, reduce toil through automation, and evolve infrastructure to support rapid innovation and growth.

**Skills Required:**

* **Cloud Platforms**: Strong expertise in AWS, Azure, or GCP services (e.g., EC2, S3, IAM, Lambda, AKS, GKE)
* **Experience**: 3-5 years of related experience
* **Containers & Orchestration**: Proficiency with Docker, Kubernetes, Helm, LGTM, Harbor
* **CI/CD Tools**: Experience with GitHub Actions, GitLab CI, Azure DevOps, or similar
* **Infrastructure as Code**: Skilled in Terraform, Pulumi, or CloudFormation
* **Monitoring & Observability**: Familiarity with Prometheus, Grafana, ELK stack, OpenTelemetry
* **Security & Compliance**: Knowledge of DevSecOps tools and practices (e.g., Vault, Snyk, OPA, CIS benchmarks)
* **Programming/Scripting**: Strong skills in Python, Go, Bash, or similar languages
* **Version Control & Collaboration**: Proficient in Git, GitOps workflows, and agile development practices



  • Colombia Mas Global Consulting Llc Full-time

    Senior Site Reliability Engineer (SRE) | LATAM. At MAS Global Consulting, we are a premium digital engineering partner delivering technology solutions to some of the world’s most innovative companies, from high-growth startups to Fortune 500 enterprises. With a people-first culture and a commitment to excellence, we combine nearshore talent, agile...


  • Colombia, Huila Groupon Full-time

    Groupon is a marketplace where customers discover new experiences and services every day and local businesses thrive. To date we have worked with over a million merchant partners worldwide, connecting over 16 million customers with deals across various categories. In a world often dominated by e-commerce giants, we stand out as one of the few platforms...

  • Azure DevOps Engineer

    2 weeks ago


    Colombia Axiom Path Inc Full-time

    **Azure DevOps Engineer / Site Reliability Engineer** **Contract, 100% REMOTE** - In this role, you will leverage your DevOps expertise to design, automate, and streamline the software development lifecycle while playing a crucial role in maintaining website uptime. This role requires a strong ability to handle emergencies, troubleshoot website outages, and...


  • Colombia, Huila Datavail Full-time

    You will own reliability for core services across multiple clouds, drive automation, and mentor more junior engineers. You will partner with developer teams to embed resilience into feature delivery. **Responsibilities**: - Define and maintain SLIs/SLOs, monitor alignment and error budget usage - Lead incident response and postmortems, implement corrective...


  • Colombia Kyndryl Colombia SAS Full-time

    **Why Kyndryl** Kyndryl is a market leader that thinks and acts like a start-up. We design, build, manage, and modernize the mission-critical technology systems that the world depends on every day. So why work at Kyndryl? We are always moving forward - always pushing ourselves to go further in our efforts to build a more equitable, inclusive world for our...


  • Colombia Felix Technologies, Inc. Full-time

    About Us: At Félix, we're building the financial ecosystem for Latin immigrants in the U.S., starting with a revolution in remittances. Our core product is an AI-powered chatbot built on WhatsApp, allowing our users to send money home as easily as sending a text message. We leverage cutting-edge technology like AI, blockchain, and stablecoins to make...


  • Colombia MAS Global Consulting Full-time

    Who We Are: At MAS Global Consulting, we bring together diverse engineering talent and meaningful work opportunities with global clients who value innovation, quality, and people-first collaboration. Our mission is to help organizations build scalable, modern, and resilient platforms while enabling our consultants to grow in their careers. We are proud to...


  • Colombia, Huila Datavail Full-time

    At least 2 years of hands-on experience with AWS - We require at least one AWS associate level certification. - Able to contribute through CloudFormation / Terraform - Good knowledge of AWS core services related to Infrastructure (EC2, ECS, EKS, RDS, EBS etc.), Networking (VPC, Network Security Groups, Peering, Transit Gateway, site-to-site VPN etc.),...


  • Colombia, Huila Groupon Full-time

    Groupon is a marketplace where customers discover new experiences and services every day and local businesses thrive. To date we have worked with over a million merchant partners worldwide, connecting over 16 million customers with deals across various categories. In a world often dominated by e-commerce giants, we stand out as one of the few platforms...

  • Reliability Engineer

    2 weeks ago


    Colombia, Huila Baker Hughes Full-time

    Role Description: **Reliability Engineer**. **Summary**: Can work with limited supervision on assigned tasks with standard techniques to build on basic knowledge and develop skills in specific practice areas. Interacts with clients and client organisations and has an understanding of how maintenance management is executed. Understands project management...