MLOps Support Engineer
3 days ago
At CloudFactory, we are a mission-driven team passionate about unlocking the potential of AI to transform the world. By combining advanced technology with a global network of talented people, we make unusable data usable, driving real-world impact at scale. We are a global community founded on strong relationships and the belief that meaningful work transforms lives.

Our Culture
Mission-Driven: We focus on creating economic and social impact
People-Centric: We care deeply about our team's growth, well-being, and sense of belonging
Innovative: We embrace change and find better ways to do things together
Globally Connected: We foster collaboration between diverse cultures and perspectives

Role Summary
The MLOps Support Engineer is an operations-first role, focused on ensuring AI/ML systems remain stable, observable, and supportable in production environments. This is not a data science or feature development role. The primary objective is to maintain continuous performance of ML models and associated pipelines with minimal disruption to both internal and client-facing services. You will provide Tier 1 and Tier 2 support, escalating to Tier 3 Engineering as needed.

What you'll do
Provide Tier 1 / Tier 2 operational support for AI/ML solutions
Identify failed jobs, degraded pipelines, or performance anomalies
Triage incidents, investigate issues, and coordinate escalation to Tier 3 Engineering
Participate in on-call rotas once established
Validate that pipelines and jobs complete successfully
Monitor data pipeline health, model execution, and basic performance metrics
Identify operational issues before they impact customers
Respond to or alert customers when there has been an outage or issue with one of their models
Support incident management, rollback, and recovery activities
Use and maintain runbooks and operational documentation
Work with Engineering to improve supportability and observability
Contribute to knowledge sharing to reduce single points of failure
Work within defined SLAs and support processes as the service matures
Build quarterly business reviews to provide updates on the health of the ML models
Evaluate champion/challenger models to see if a new model should be promoted
Monitor for model drift and performance degradation, while validating that updates (new champion models or added data) do not introduce bias

Requirements

Essential
Experience in operations, DevOps, SRE, or platform support roles
Strong troubleshooting skills in production environments
Proficiency in SQL and scripting (Python, Bash) for developing and automating ML workflows
Familiarity with cloud-hosted systems (AWS, GCP, Azure) for cloud-based ML services
Git: Solid understanding of version control, particularly in collaborative development environments
Comfortable working from runbooks and structured processes

Desirable
Exposure to AI/ML systems in production
Familiarity with monitoring and observability tools (Grafana, Power BI, New Relic)
Knowledge of MLOps tooling and data platforms (MLflow, Databricks)
Experience supporting customer-facing platforms
Knowledge of containerization (Kubernetes) is a plus
Experience with LLM prompt engineering and troubleshooting
Early career in MLOps or ML Engineering
Someone who is eager to learn about complex predictive models
Background in computer science, informatics, or related fields
Passion for Machine Learning and AI: An eager learner who is excited about working with cutting-edge ML technologies and is passionate about optimizing and maintaining ML models in production environments
Early Career in MLOps or ML Engineering: Ideally, a Junior ML Engineer with a strong desire to grow in the field of MLOps and AI operations
A Collaborative Mindset: You thrive in a team setting and are ready to contribute to model improvement, A/B testing, and iterative development
Attention to Detail: A focus on model performance, bias prevention, and ensuring optimal model behavior as new data and models are introduced

Additional information

Nepal
This role provides MLOps coverage from 07:45pm - 15:45am Nepal time for US-based customers. You will be required to work during these hours and potentially outside of them if a model has issues. Rotational on-call work will also be required.

Colombia
This role provides MLOps coverage from 11am to 9pm Colombia time for a US-based customer. You will be required to work on a shift rota to cover 8-hour time blocks during this time period and potentially outside of them if a model has issues. Rotational on-call work will also be required.

Note that these hours are subject to change upon review.
-
MLOps Support Engineer
2 weeks ago
Medellín, Antioquia, Colombia CloudFactory Full-time
About the role: The MLOps Support Engineer is an operations-first role, focused on ensuring AI/ML systems remain stable, observable, and supportable in production environments. This is not a data science or feature development role. The primary objective is to maintain continuous performance of ML models and associated pipelines with minimal disruption to both...
-
MLOps Support Engineer
2 weeks ago
Perímetro Urbano Medellín, Colombia CloudFactory Full-time
About the role: The MLOps Support Engineer is an operations-first role, focused on ensuring AI/ML systems remain stable, observable, and supportable in production environments. This is not a data science or feature development role. The primary objective is to maintain continuous performance of ML models and associated pipelines with minimal disruption to both...
-
MLOps Support Engineer
1 week ago
Medellín, Antioquia, Colombia CloudFactory Full-time
At CloudFactory, we are a mission-driven team passionate about unlocking the potential of AI to transform the world. By combining advanced technology with a global network of talented people, we make unusable data usable, driving real-world impact at scale. More than just a workplace, we're a global community founded on strong relationships and the belief...
-
MLOps Support Engineer
3 days ago
Medellín, Colombia CloudFactory Full-time
A technology firm focused on AI is seeking an MLOps Support Engineer in Medellín, Colombia. This role involves providing Tier 1 and Tier 2 operational support for AI/ML systems, ensuring their stability and performance. Candidates should have a strong background in operations, troubleshooting skills, and proficiency in SQL and scripting. The role requires...
-
Production ML Ops
2 weeks ago
Medellín, Colombia CloudFactory Full-time
A leading technology company is seeking an MLOps Support Engineer in Medellín, Colombia. This role demands strong troubleshooting skills and experience in operations or platform support, focusing on maintaining AI/ML systems. Responsibilities include providing Tier 1/Tier 2 support, monitoring data pipelines, and collaborating with engineering teams....
-
Support Engineer
1 week ago
Medellín, Colombia LeoVegas Group Full-time
ABOUT THE ROLE: We are looking for a proactive and customer-focused IT Support Engineer to join our Global IT team. In this role, you will provide technical support, troubleshoot IT issues, and assist with system and network administration for our employees. You'll work across macOS, Windows, enterprise networks, and collaboration tools to ensure...
-
Customer Support Engineer
2 days ago
Medellín, Colombia SD Solutions Full-time
SD Solutions is looking for a talented Customer Support Engineer to step onto a fintech unicorn rocketship! The company offers a hybrid work format (3 days in the office, 2 remote). The office is located in Edificio Select, Cra. 29c, El Poblado, Medellín, Antioquia, Colombia. The Customer Support Engineer works directly with customers...
-
Technical Support Engineer
1 week ago
Medellín, Colombia SD Solutions Full-time
SD Solutions is looking for a talented Technical Support Engineer to step onto a fintech unicorn rocketship! The company offers a hybrid work format (3 days in the office, 2 remote). The office is located in Edificio Select, Cra. 29c, El Poblado, Medellín, Antioquia, Colombia. As a Technical Support Engineer, you'll take ownership...
-
Information Technology Support Engineer
2 weeks ago
Medellín, Antioquia, Colombia Buwelo Corporate Full-time
Company Description: Buwelo was founded in 2017 by a team of senior contact center veterans who have been designing and managing high-performance customer care operations for Fortune 100 companies. They knew that the world's best-performing contact centers all have one thing in common: They have an exceptional culture that results in ultra-low...
-
Customer Support Engineer
1 week ago
Medellín, Colombia Tipalti Full-time
Customer Support Engineer - We are looking for a talented Customer Support Engineer to step onto a fintech unicorn rocketship. The Customer Support Engineer works directly with customers in identifying and resolving basic customer issues and needs. Why join Tipalti? Tipalti is the AI-powered platform for finance automation, elevating how...