Site Reliability Engineer

6 days ago


WorkFromHome, Colombia DCT Full-time

Overview

DCT
Bogota, D.C., Capital District, Colombia

Responsibilities

  • Service & Infrastructure Management: Oversee and manage core platform web services, including API and database servers, to ensure optimal performance and health.
  • System Monitoring & Emergency Response: Proactively monitor application and infrastructure health using tools like Grafana, ELK, and Sentry. Participate in a compensated 24/7 on-call rotation that is professionally managed, structured for fairness, and conducted virtually (no need to be on-site). You will be backed up by a senior engineer for immediate support, troubleshooting, and swift emergency resolution.
  • Automate recurring operational tasks, system deployments, backups, and maintenance procedures to improve efficiency.
  • Partner with the Software Development team to provide guidance and embed modern DevOps practices directly into their development workflows.
  • Security & Compliance: Assist the IT team in implementing security policies across the entire infrastructure.

Requirements

  • 4+ years of experience in a Site Reliability, DevOps, or Software Engineering role with a primary focus on production infrastructure.
  • Willingness and ability to participate in a compensated on-call rotation to respond to and resolve after-hours emergencies.
  • Linux Expertise: Strong practical experience with Linux system administration, including use of the command line, shell scripting (Bash), and advanced system-level troubleshooting.
  • Containerization: Good understanding of container technologies, with hands-on proficiency using Docker and Docker Compose in a production context.
  • Web Server Configuration: Experience configuring and managing web servers, specifically NGINX, for tasks like reverse proxying, load balancing, and SSL termination.
  • Strong analytical and problem-solving skills, with the ability to take ownership and drive complex technical challenges to resolution.

Nice to Have

  • Knowledge of the Amazon Web Services (AWS) cloud platform.
  • Proficiency with infrastructure and application monitoring tools (e.g., Grafana, Amplify, Sentry, ELK stack).
  • Networking Fundamentals: Solid understanding of core networking concepts and essential protocols like HTTP/HTTPS and DNS, along with basic familiarity with firewall and interface configuration.
  • Experience with database administration (experience with AWS Aurora and PostgreSQL is a strong plus).
  • Experience with Redis.
  • Experience building and maintaining CI/CD pipelines.
  • Experience with modern software development workflows based on Pull Requests, Continuous Delivery, and TDD, as well as an understanding of Agile principles.
  • Experience with container orchestration technologies (e.g., Swarm, Podman, Kubernetes/K8s).
  • Familiarity with Infrastructure as Code (IaC) principles and tools like Terraform.
  • Familiarity with project management tools such as Jira or ClickUp.

The Team You Will Join

You will join a growing Engineering team based in Bogotá, as a Software Engineer focused on Site Reliability. You will report directly to our SRE Lead, receiving technical guidance and mentorship. In addition, you will be paired with a dedicated Line Manager whose primary focus is to support your long-term career progression and professional development.

Who we are

DCT is a global leader in the fleet telematics industry, with over 25 years of software and hardware development and headquarters in Miami, FL, USA. Our platform is the backbone for hundreds of customers across diverse industries in more than 25 countries, with a significant and strategic focus on LATAM.

What we offer

  • Career Growth & Mentorship: A dedicated Line Manager and a personal training budget are provided to ensure you have the resources and guidance to advance your professional skills and career path.
  • A Generative & Collaborative Culture: Join a dynamic and innovative team that embraces a generative culture to achieve quality products. We encourage curiosity and an open, creative mindset as part of our core principles.
  • Flexible Work Environment: We offer a flexible work-from-home policy designed to support a healthy work-life balance for our team members.
  • Stability & Impactful Work: Be part of a globally recognized company with a 25-year track record of financial stability and technological innovation. Your work will have a direct and meaningful impact on a platform used by hundreds of leading businesses in the fleet telematics space, handling massive streams of real-world data.

We want to hear from you

Even if the salary or benefits aren't exactly what you're looking for, we encourage you to apply if you believe you're a great fit for the role and the team.

Seniority level: Mid-Senior level
Employment type: Full-time
Job function: Engineering and Information Technology
Industries: Technology, Information and Internet



  • WorkFromHome, Colombia Truelogic Software LLC Full-time

    A leading software development firm based in Colombia is looking for a Site Reliability Engineer to enhance the reliability of their AWS and Kubernetes systems. The engineer will focus on observability, operational improvements, and collaborate with various engineering teams. This position offers 100% remote work and a highly competitive USD salary, along...


  • WorkFromHome, Colombia Epsilon Solutions Ltd. SA de CV. Full-time

    Sr. Site Reliability Engineer. Location: Colombia (REMOTE). Employment type: Full Time Contract. Key Skills: Microsoft Technologies, IIS, Azure, AWS Kubernetes (K8), CI/CD Pipeline – Git Action, IaC – CloudFormation, Terraform, Monitoring – Grafana, Troubleshooting in SRE (Preferred engineering background). Responsibilities: 80% – Production support under...


  • WorkFromHome, Colombia N-iX Full-time

    A leading technology firm located in Bogotá, Colombia is seeking a Site Reliability Engineer to enhance the reliability and scalability of software production environments, especially in onboarding new microservices. Responsibilities include automating workflows, managing service reliability, and collaborating across teams. The ideal candidate has strong...


  • WorkFromHome, Colombia BairesDev Full-time

    Overview Site Reliability Engineer at BairesDev. We are looking for a Site Reliability Engineer to build and maintain highly reliable, scalable, and secure OpenShift/Kubernetes clusters. Approach production systems from a software engineering perspective with a focus on automation and reliability. What you will do Build and automate and maintain...


  • WorkFromHome, Colombia Blankfactor Full-time

    This is a remote position as a full-time Colombia employee paid in COP. It requires a minimum of B2 English comprehension; please be sure to apply with your English CV. We are seeking a proactive and experienced Site Reliability Engineer (SRE) to join our team, focusing on maximizing the reliability, availability, and performance of our enterprise...


  • WorkFromHome, Colombia N-iX Full-time

    N-iX Bogota, D.C., Capital District, Colombia Overview Site Reliability Engineer (SRE) to help monitor, maintain, and scale software production environments, with a primary focus on onboarding new microservices. Work closely with development and platform teams to automate and program-managed onboarding lifecycle—from requirements and environment setup...


  • WorkFromHome, Colombia AgileEngine Full-time

    A leading software development company in Colombia is seeking a Site Reliability Engineer to design and deploy scalable cloud-native systems. The ideal candidate has over 8 years of experience in SRE, is highly proficient in AWS and Terraform, and excels in CI/CD pipelines. The role involves mentoring teams, improving system reliability, and implementing...


  • WorkFromHome, Colombia Epsilon Solutions Ltd. SA de CV. Full-time

    A leading technology solutions provider is seeking a Senior Site Reliability Engineer to provide production support and drive DevOps activities. This remote position focuses on troubleshooting issues in production and maintaining CI/CD pipelines while leveraging Microsoft technologies, AWS, and Kubernetes. Ideal candidates have strong skills in production...


  • WorkFromHome, Colombia AgileEngine Full-time

    A leading software development firm in Colombia is seeking an experienced Site Reliability Engineer (SRE) to enhance cloud-native systems' reliability and efficiency. You will work closely with cross-functional teams, focusing on resilient AWS infrastructure and DevSecOps practices. Candidates should possess 8–10 years of experience in infrastructure or...


  • WorkFromHome, Colombia EPAM Systems Full-time

    EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most...