Site Reliability Engineer

4 days ago

Norte, Colombia Félix Full-time

About Us

At Félix, we're building the financial ecosystem for Latin immigrants in the U.S., starting with a revolution in remittances. Our core product is an AI-powered chatbot built on WhatsApp, allowing our users to send money home as easily as sending a text message. We leverage cutting-edge technology like AI, blockchain, and stablecoins to make cross-border payments faster, more affordable, and more accessible than ever before.

We are a hyper-growth Series B company, backed by over $100 million in funding from top-tier global investors, including QED, Castle Island, Switch Ventures, HTwenty, Monashees, and General Catalyst Customer Value Fund. This isn't just about the numbers; it's a testament to the trust our investors have in our vision and our team. Additionally, Félix was selected as an "Endeavor Entrepreneur" and was a recipient of the CrossTech Fintech Startups Award.

We are a group of extremely talented and dedicated high-performers, united by our shared obsession with a single goal: empowering our customers. We are all owners of Félix, driven by a bias for action and a true experimentation spirit to get shit done with urgency and focus.

Joining Félix means you will be part of a team building a legacy, a company that will outlive us all. This is a rare opportunity to apply your skills to a deeply meaningful mission: serving a community that has been underserved for too long. We are a team that is fiercely loyal to each other, where radical transparency and constructive feedback are how we grow and push for excellence. We are bold; we care less about what others are doing and more about creating sustainable value and a product that truly makes our users' lives better. We are building the future, today.

About the Role

We're looking for a Site Reliability Engineer (SRE) to join our Engineering Operations team, reporting directly to Damian Finol, Head of EngOps.

This is a new role focused on strengthening the reliability, scalability, and security of the infrastructure that powers our fintech platform. You'll work closely with Engineering and SecOps to ensure our systems are highly available, observable, and cost-efficient. The role blends software engineering, systems operations, and security practices, with a strong emphasis on automation, proactive monitoring, and continuous improvement.

Responsibilities

Manage and optimize our infrastructure on Google Cloud Platform (GCP) and Google Kubernetes Engine (GKE).
Automate provisioning and configuration using Terraform, Helm, and scripting languages such as Go, Python, and Bash.
Build, maintain, and improve monitoring and alerting systems using Prometheus, Grafana, and centralized logging tools (e.g., ELK or Loki).
Participate in on-call rotations, incident response, and post-mortem analyses, ensuring rapid recovery and continuous learning from failures.
Define and track SLOs/SLIs and error budgets to monitor service health and performance.
Implement cloud security best practices to protect sensitive data and maintain the integrity of our systems.
Collaborate across Engineering, Security, and Product teams to embed reliability and automation in every phase of development and deployment.
Contribute to GKE cost optimization and resource management strategies to enhance efficiency and control operational spend.

Requirements

4+ years of experience as an SRE, DevOps, Infrastructure, or Platform Engineer.
Strong hands-on experience with GCP and GKE.
Proficiency in Kubernetes (architecture, deployments, networking, and troubleshooting).
Solid programming or scripting skills in Go, Python, or Bash.
Experience with Terraform and Helm for Infrastructure as Code.
Strong understanding of monitoring and observability using Prometheus, Grafana, and logging frameworks.
Familiarity with incident management, on-call operations, and post-mortem processes.
Knowledge of network fundamentals (TCP/IP, DNS, load balancing).
Experience with PostgreSQL or distributed databases.
Awareness of FinOps and cloud cost management principles.
Excellent problem-solving, communication, and collaboration skills, with a proactive mindset.
Certified Kubernetes Administrator (CKA).
Experience in FinOps, cloud security, or regulated industries.
Familiarity with PagerDuty or similar incident management tools.
Background implementing SLOs/SLIs and error budgets in production environments.

These are the applicable requirements, although equivalent competencies in any of the above will also be considered.

What We Offer

Competitive salary
Initial stock options grant
Annual performance bonus
Health, dental, and vision plans
Remote work environment, although we have offices in Miami and Mexico City and would love to work in a hybrid model if you are up for it
Continuous learning opportunities
Unlimited PTO
Paid parental leave
Empowering opportunities for growth in a dynamic entrepreneurial environment

Equal Opportunity Employer

At Félix, we are committed to providing equal employment opportunities to all qualified employees and applicants without regard to race, religion, nationality, sex, sexual orientation, gender identity, age, or disability. This policy applies to all terms and conditions of employment, including recruitment, hiring, placement, promotion, training, compensation, benefits, and termination.

Want to learn more about our privacy practices? Check out our Privacy Policy.

Senior Site Reliability Engineer/DevOps

6 days ago

Norte, Colombia Wizeline Full-time

The Company Wizeline is a global digital services company helping mid-size to Fortune 500 companies build, scale, and deliver high-quality digital products and services. We thrive in solving our customers' challenges through human-centered experiences, digital core modernization, and intelligence everywhere (AI/ML and data). We help them succeed in building...

Senior AWS SRE

6 days ago

Norte, Colombia Wizeline Full-time

A global digital services company is seeking a Site Reliability Engineer to enable quick releases of high-quality products. The ideal candidate will have solid experience in AWS Cloud Services and strong skills in Terraform and Datadog. Excellent communication skills and a passion for mentoring are essential. This role combines software development with...

Observability & Reliability Engineer – Proactive SRE for 100+ Apps

6 days ago

Norte, Colombia CI&T Full-time

A tech transformation company seeks an Observability & Monitoring Specialist in Colombia to enhance visibility and performance across 100+ applications. The role involves managing tools like Splunk, LogicMonitor, and AppDynamics, ensuring proactive issue detection, and performing root cause analysis during incidents. Candidates should demonstrate strong...

Senior Data Engineer

2 weeks ago

Norte, Colombia Lean Tech Full-time

Lean Tech is a technology consultancy committed to excellence, innovation, and the delivery of high-impact engineering solutions. We empower a global clientele by providing skilled nearshore talent distinguished by strong ownership, technical depth, and a collaborative mindset. Our culture is founded on the principles of continuous improvement, autonomy, and...

Marketing Science Product Engineer

2 weeks ago

Norte, Colombia Power Digital Marketing Full-time

Who We Are: We are a tech-enabled growth firm at the intersection of marketing, consulting & data intelligence, igniting revenue and brand recognition for leading and emerging companies around the world. As a people-first firm, we value diversity in backgrounds and experiences. We strongly believe our people and culture are key to our success. Our vision...

Lead Software Development Engineer

4 days ago

Norte, Colombia Wizeline Full-time

The Company Wizeline is a global digital services company helping mid-size to Fortune 500 companies build, scale, and deliver high-quality digital products and services. We thrive in solving our customers' challenges through human-centered experiences, digital core modernization, and intelligence everywhere (AI/ML and data). We help them succeed in...

Asset Digitization Assets

20 hours ago

Norte, Colombia A.P. Moller - Maersk Full-time

APM Terminals At APM Terminals, a global leader in port and terminal operations, we enable global trade and drive sustainable growth. As part of the A.P. Moller-Maersk Group, we connect economies and communities worldwide. Our success is driven by a strong commitment to LEAN methodologies, embedding continuous improvement into every aspect of our...

SecOps Engineer

2 weeks ago

Norte, Colombia Addi Full-time

About Addi We are a leading financial platform, building the future of payments, shopping, and banking: a world where consumers and merchants can transact effortlessly, grow together and where we create abundance and generate pride in them. Today, we serve over 2 million customers and partner with more than 20,000 merchants, making Addi Colombia's...

Senior Python Software Engineer

20 hours ago

Norte, Colombia Wizeline Full-time

We are: Wizeline, a global AI-native technology solutions provider, develops cutting-edge, AI-powered digital products and platforms. We partner with clients to leverage data and AI, accelerating market entry and driving business transformation. As a global community of innovators, we foster a culture of growth, collaboration, and impact. With the right...

Data Centre Team Lead

7 days ago

Norte, Colombia Ilkari Full-time

Who we are Ilkari is a privately-held start-up based in Dublin, Ireland. We deliver hyper-private scale innovation and technology to safeguard and secure data, enabling true data sovereignty even as the pace of change accelerates. Our best-in-breed sovereign technology delivers privacy and control over where companies’ data resides, where it flows, and how...
