Site Reliability Engineer
4 days ago
Truelogic is a leading provider of nearshore staff augmentation services headquartered in New York. For over two decades, we've been delivering top-tier technology solutions to companies of all sizes, from innovative startups to industry leaders, helping them achieve their digital transformation goals.
Our team of 600+ highly skilled tech professionals, based in Latin America, drives digital disruption by partnering with U.S. companies on their most impactful projects. Whether collaborating with Fortune 500 giants or scaling startups, we deliver results that make a difference.
By applying for this position, you're taking the first step in joining a dynamic team that values your expertise and aspirations. We aim to align your skills with opportunities that foster exceptional career growth and success while contributing to transformative projects that shape the future.
Our Client
A data-driven technology company that partners with high-growth brands to optimize customer acquisition and retention. It specializes in delivering high-LTV audiences and enrichment data to increase repeat purchase rates. The company collaborates with major platforms and agencies such as Shopify, Experian, TransUnion, and top media partners, all focused on driving profitable revenue growth.
Job Summary
The Site Reliability Engineer plays a key role in platform enablement by building and maintaining core infrastructure tooling that enables teams to deploy and operate services reliably using AWS and Kubernetes. This position focuses on managing and evolving internal Infrastructure as Code (IaC) constructs, primarily Python-based abstractions built with AWS CDK and CDK8s. These constructs encompass networking, EKS configuration, data stores, observability, autoscaling patterns, and deployment primitives. The engineer collaborates closely with backend teams to ensure infrastructure is secure, consistent, and easy to integrate, driving platform reliability and developer productivity.
Responsibilities
Designs, implements, and evolves shared AWS CDK and CDK8s constructs used across multiple services and teams.
Maintains core infrastructure components including VPC, EKS clusters and node groups, RDS, OpenSearch, and MSK.
Operates and extends Kubernetes cluster addons such as ingress controllers, cert-manager, autoscalers, and monitoring/logging stacks.
Ensures high reliability through structured alerting systems (Prometheus, CloudWatch), autoscaling strategies, and recovery mechanisms.
Manages and publishes baseline templates, configuration schemas, and comprehensive documentation for infrastructure usage.
Owns the CI/CD pipelines for Infrastructure as Code (IaC) codebases and platform component releases.
Collaborates with engineering teams to troubleshoot infrastructure-related issues and deliver scalable, reliable solutions.
Applies Site Reliability Engineering (SRE) principles—including SLIs, SLOs, observability, and fault tolerance—to all shared platform services.
Supports IAM roles, secrets management, and tenant isolation best practices.
Qualifications
Has 5+ years of experience in infrastructure or Site Reliability Engineering (SRE), including hands-on work with AWS services such as VPC, IAM, RDS, MSK, and S3, as well as Kubernetes components like Helm, RBAC, and ServiceAccounts.
Demonstrates fluency in Python and has practical experience with Infrastructure-as-Code using AWS CDK, CDK8s, or equivalent frameworks such as Pulumi.
Possesses a strong understanding of Prometheus, Grafana, and effective alert routing practices.
Has experience designing reusable infrastructure patterns or building internal developer platforms.
Shows a proven track record of improving system reliability through automation, monitoring, and operational best practices.
Has experience supporting Spark on Kubernetes, Argo, or Kafka-based batch pipelines.
Benefits
100% Remote Work: Enjoy the freedom to work from the location that helps you thrive. All it takes is a laptop and a reliable internet connection.
Highly Competitive USD Pay: Earn excellent, market-leading compensation in USD that goes beyond typical market offerings.
Paid Time Off: We value your well-being. Our paid time off policies ensure you have the chance to unwind and recharge when needed.
Work with Autonomy: Enjoy the freedom to manage your time as long as the work gets done. Focus on results, not the clock.
Work with Top American Companies: Grow your expertise by working on innovative, high-impact projects with industry-leading U.S. companies.
A Culture That Values You: We prioritize well-being and work-life balance, offering engagement activities and fostering dynamic teams to ensure you thrive both personally and professionally.
Diverse, Global Network: Connect with over 600 professionals in 25+ countries, expand your network, and collaborate with a multicultural team from Latin America.
Team Up with Skilled Professionals: Join forces with senior talent. All of our team members are seasoned experts, ensuring you're working with the best in your field.