MLOps Support Engineer
3 days ago
The MLOps Support Engineer is an operations-first role, focused on ensuring AI/ML systems remain stable, observable, and supportable in production environments. This is not a data science or feature development role.
The primary objective is to maintain continuous performance of ML models and associated pipelines with minimal disruption to both internal and client-facing services. You will provide Tier 1 and Tier 2 support, escalating to Tier 3 Engineering as needed.
What you'll do:
- Provide Tier 1 / Tier 2 operational support for AI/ML solutions.
- Identify failed jobs, degraded pipelines, or performance anomalies.
- Triage incidents, investigate issues, and coordinate escalation to Tier 3 Engineering.
- Participate in on-call rotas once established.
- Validate that pipelines and jobs complete successfully.
- Monitor data pipeline health, model execution, and basic performance metrics.
- Identify operational issues before they impact customers.
- Respond to or alert customers when there has been an outage or issue with one of their models.
- Support incident management, rollback, and recovery activities.
- Use and maintain runbooks and operational documentation.
- Work with Engineering to improve supportability and observability.
- Contribute to knowledge sharing to reduce single points of failure.
- Work within defined SLAs and support processes as the service matures.
- Build quarterly business reviews to provide updates on the health of the ML Models.
- Evaluate champion/challenger models to see if a new model should be promoted.
- Monitor for model drift and performance degradation, while validating that updates (new champion models or added data) do not introduce bias.
Essential
- Experience in operations, DevOps, SRE, or platform support roles.
- Strong troubleshooting skills in production environments.
- Proficiency in SQL and scripting (Python, Bash) for developing and automating ML workflows.
- Familiarity with Cloud-hosted systems (AWS, GCP, Azure) for cloud-based ML services.
- Git: Solid understanding of version control, particularly in collaborative development environments.
- Comfortable working from runbooks and structured processes.
- Exposure to AI/ML systems in production.
- Familiarity with monitoring and observability tools (Grafana, Power BI, New Relic).
- Knowledge of MLOps tooling and data platforms (MLflow, Databricks).
- Experience supporting customer-facing platforms.
- Knowledge of containerization and orchestration (Kubernetes) is a plus.
- Experience with LLM prompt engineering and troubleshooting.
- Early career in MLOps or ML Engineering.
- Someone who is eager to learn about complex predictive models.
- Background in computer science, informatics, or related fields.
- Passion for Machine Learning and AI: An eager learner who is excited about working with cutting-edge ML technologies and is passionate about optimizing and maintaining ML models in production environments.
- Early Career in MLOps or ML Engineering: Ideally a Junior ML Engineer with a strong desire to grow in the field of MLOps and AI operations.
- A Collaborative Mindset: You thrive in a team setting and are ready to contribute to model improvement, A/B testing, and iterative development.
- Attention to Detail: A focus on model performance, bias prevention, and ensuring optimal model behavior as new data and models are introduced.
Nepal
- This role provides MLOps coverage from 07:45 – 15:45* NPT for US-based customers. You will be required to work during these hours and potentially outside of them if a model has issues.
- Rotational On-Call work will also be required.
Colombia
- This role provides MLOps coverage from 11am to 9pm* Colombia time for a US-based customer. You will be required to work on a shift rota covering 8-hour blocks within this window, and potentially outside of it if a model has issues.
- Rotational On-Call work will also be required.
*Note that these hours are subject to change upon review.