Lead Site Reliability Engineer

2 days ago


Bogotá, Bogotá D.E., Colombia Coupa Software, Inc. Full-time US$120,000 - US$180,000 per year

Coupa makes margins multiply through its community-generated AI and industry-leading total spend management platform for businesses large and small. Coupa AI is informed by trillions of dollars of direct and indirect spend data across a global network of 10M+ buyers and suppliers. We empower you with the ability to predict, prescribe, and automate smarter, more profitable business decisions to improve operating margins.

Why join Coupa?

Pioneering Technology: At Coupa, we're at the forefront of innovation, leveraging the latest technology to empower our customers with greater efficiency and visibility in their spend.

Collaborative Culture: We value collaboration and teamwork, and our culture is driven by transparency, openness, and a shared commitment to excellence.

Global Impact: Join a company where your work has a global, measurable impact on our clients, the business, and each other. 

Learn more on the Life at Coupa blog and hear from our employees about their experiences working at Coupa.

The Impact of a Lead Site Reliability Engineer at Coupa:

If you are passionate about new technologies, have a strong technical background, and are looking for an environment where you can continuously expand your knowledge, you are the right fit for this role. At Coupa, the "Cloud team" is looking for a Lead Engineer who is ready to constantly question the status quo with a mix of system design, code development, deployment, automation, and networking skills, along with experience managing Machine Learning, GenAI, and Agentic AI platforms.

What You'll Do:
  • Build, deploy, and troubleshoot microservices in Kubernetes and Amazon EKS, ensuring scalability and reliability.
  • Design secure, highly available web applications with a focus on capacity planning and performance optimization.
  • Deploy and manage the lifecycle of LLMs and embedding models, defining KPIs to measure and improve AI application performance.
  • Evaluate and integrate emerging technologies such as RAG systems, MCP servers, AI Agents, and agentic workflows into our platform.
  • Manage AWS core and GenAI services (S3, IAM, EKS, Bedrock, etc.) using infrastructure-as-code tools like Terraform and Chef, while maintaining observability through tools like New Relic or PagerDuty.
  • Collaborate across product, platform, and engineering teams on architecture design, security patching, incident response, and release management to ensure the reliability of our ML and GenAI infrastructure.

What You Will Bring to Coupa:
  • Bachelor's degree and 10+ years of experience managing large-scale cloud applications with a strong background in Linux administration and troubleshooting.
  • Excellent communication skills, a collaborative mindset, and the confidence to take ownership, drive solutions, and deliver results independently while thinking globally.
  • Over 8 years of hands-on experience managing cloud infrastructure across AWS, GCP, and Azure environments.
  • A solid understanding of today's generative AI ecosystem, with practical experience using LLMs and embedding models (OpenAI, AWS Bedrock, SageMaker); familiarity with vector databases like LanceDB is a plus.
  • Strong scripting skills in Bash or Python, and experience with container orchestration platforms like Amazon EKS or Azure AKS.
  • Proficiency with DevOps and automation tools such as Chef, GitHub Actions, Rundeck, and IaC frameworks like Terraform, Spacelift, and Helm.
  • Working knowledge of DNS, load balancers, and MySQL, along with a good grasp of source control and branching strategies in Git.

Coupa complies with relevant laws and regulations regarding equal opportunity and offers a welcoming and inclusive work environment. Decisions related to hiring, compensation, training, or evaluating performance are made fairly, and we provide equal employment opportunities to all qualified candidates and employees. 

Please be advised that inquiries or resumes from recruiters will not be accepted.

By submitting your application, you acknowledge that you have read Coupa's Privacy Policy and understand that Coupa receives/collects your application, including your personal data, for the purposes of managing Coupa's ongoing recruitment and placement activities, including for employment purposes in the event of a successful application and for notification of future job opportunities if you did not succeed the first time. You will find more details about how your application is processed, the purposes of processing, and how long we retain your application in our Privacy Policy.


