Site Reliability Engineer

7 days ago


Colombia | MAS Global Consulting | Full-time

Who We Are
At MAS Global Consulting, we bring together diverse engineering talent and meaningful work opportunities with global clients who value innovation, quality, and people-first collaboration. Our mission is to help organizations build scalable, modern, and resilient platforms while enabling our consultants to grow in their careers.
We are proud to partner with Affirm, a leading fintech company known for strong engineering culture and technical excellence. In this role, you will join Affirm's Observability organization—a key team responsible for improving system visibility, debugging workflows, and operational efficiency at scale.
Who You Are
You are a Backend Software Engineer with DevOps experience who enjoys working with observability systems, log pipelines, and infrastructure automation. You are analytical, detail-oriented, and motivated by solving high-impact engineering problems. You thrive in collaborative environments and are comfortable driving projects independently from discovery through delivery.
You bring curiosity, adaptability, strong communication, and a willingness to learn new tooling quickly.
What You'll Do
  • Contribute to observability-focused initiatives such as:
    a) Analyzing log data patterns to identify major sources of log volume and associated cost drivers.
    b) Reviewing Python and Kotlin codebases to remove unused log events.
    c) Supporting developers by helping them troubleshoot issues with Affirm's observability configuration tooling.
  • Collaborate closely with product, infrastructure, data, and SRE teams to understand pain points and translate them into simple, effective solutions.
  • Document all project steps, findings, and recommendations to support transparency, maintainability, and ongoing improvements.
What You Bring
Must-Have Qualifications
  • 2–4 years of experience in backend development, infrastructure engineering, or Site Reliability Engineering.
  • Experience working with large-scale observability systems (e.g., Prometheus, OpenTelemetry, Fluent, ELK Stack, Splunk, Chronosphere).
  • Proficiency with Kubernetes, Terraform, and cloud infrastructure practices.
  • Strong scripting and data analysis skills, preferably in Python.
  • Ability to independently drive a project from requirements clarification through analysis, execution, and delivery.
  • Strong analytical and problem-solving capabilities.
Nice-to-Have Qualifications
  • Experience with observability tooling integrated into distributed systems.
  • Familiarity with Python, Kotlin, or log-processing frameworks.
  • Exposure to cost-optimization or platform modernization initiatives.
  • Experience contributing to platform-level services or shared infrastructure used by multiple teams.


  • Colombia | MAS Global Consulting LLC | Full-time

    Senior Site Reliability Engineer (SRE) | LATAM. At MAS Global Consulting, we are a premium digital engineering partner delivering technology solutions to some of the world’s most innovative companies, from high-growth startups to Fortune 500 enterprises. With a people-first culture and a commitment to excellence, we combine nearshore talent, agile...

  • Azure DevOps Engineer

    2 weeks ago


    Colombia | Axiom Path Inc | Full-time

    **Azure DevOps Engineer / Site Reliability Engineer** **Contract, 100% REMOTE** - In this role, you will leverage your DevOps expertise to design, automate, and streamline the software development lifecycle while playing a crucial role in maintaining website uptime. This role requires a strong ability to handle emergencies, troubleshoot website outages, and...


  • Colombia, Huila | Groupon | Full-time

    Groupon is a marketplace where customers discover new experiences and services every day and local businesses thrive. To date we have worked with over a million merchant partners worldwide, connecting over 16 million customers with deals across various categories. In a world often dominated by e-commerce giants, we stand out as one of the few platforms...


  • Colombia | Minacs | Full-time

    Lead Site Reliability Engineer - Bilingual. Location: Colombia. Language: English. Summary: As a Lead Site Reliability Engineer, you’ll play a strategic role in shaping and scaling our DevSecOps ecosystem. You’ll lead the design and implementation of automated CI/CD pipelines, enforce enterprise-grade security and compliance standards, and drive...


  • Colombia | Felix Technologies, Inc. | Full-time

    About Us: At Félix, we're building the financial ecosystem for Latin immigrants in the U.S., starting with a revolution in remittances. Our core product is an AI-powered chatbot built on WhatsApp, allowing our users to send money home as easily as sending a text message. We leverage cutting-edge technology like AI, blockchain, and stablecoins to make...


  • Colombia | Kyndryl Colombia SAS | Full-time

    **Why Kyndryl** Kyndryl is a market leader that thinks and acts like a start-up. We design, build, manage, and modernize the mission-critical technology systems that the world depends on every day. So why work at Kyndryl? We are always moving forward - always pushing ourselves to go further in our efforts to build a more equitable, inclusive world for our...


  • Colombia, Huila | Datavail | Full-time

    You will own reliability for core services across multiple clouds, drive automation, and mentor more junior engineers. You will partner with developer teams to embed resilience into feature delivery. **Responsibilities**: - Define and maintain SLIs/SLOs, monitor alignment and error budget usage - Lead incident response and postmortems, implement corrective...


  • Colombia, Huila | Datavail | Full-time

    At least 2 years of hands-on experience with AWS - We require at least one AWS associate level certification. - Able to contribute through CloudFormation / Terraform - Good knowledge of AWS core services related to Infrastructure (EC2, ECS, EKS, RDS, EBS etc.), Networking (VPC, Network Security Groups, Peering, Transit Gateway, site-to-site VPN etc.),...

  • Reliability Engineer

    2 weeks ago


    Colombia, Huila | Baker Hughes | Full-time

    Role Description **Reliability Engineer** **Summary** Can work with limited supervision on assigned tasks with standard techniques to build on basic knowledge and develop skills in specific practice areas. Interacts with clients and client organisations and has an understanding of how maintenance management is executed. Understands project management...

