C++ Engineer, AI Runtime

1 week ago


Medellín, Colombia · Baasi · Full-time

**About Us**:
We are a **stealth-mode startup** building next-generation infrastructure for the AI industry. Our team has decades of experience in software, systems, and deep tech. We are working on a new kind of AI runtime that pushes the boundaries of performance and flexibility, making advanced models portable, efficient, and customizable for real-world deployment.

If you want to be part of a small, fast-moving team shaping the **future of applied AI systems**, this is your opportunity.

**Role**:
We are looking for a **C++ Engineer** with a strong background in systems and GPU programming to help extend and optimize an open-source AI inference runtime. You will work on the low-level internals of large language model serving, focusing on:

- Dynamic adapter integration (e.g., LoRA/QLoRA)
- Incremental model update mechanisms
- Multi-session inference caching and scheduling
- GPU performance improvements (Tensor Cores, CUDA/ROCm)

This is a **hands-on role**: you will be designing, coding, profiling, and iterating on high-performance inference code that runs directly on CPUs and GPUs.

**Responsibilities**:

- Implement support for **runtime adapter loading (LoRA)**, enabling models to be customized on the fly without retraining or model merges.
- Design and implement mechanisms for **incremental model deltas**, allowing models to be extended and updated efficiently.
- Extend the runtime to handle **multi-session execution**, with isolation and caching strategies for concurrent users.
- Optimize core math kernels and memory layouts to improve inference performance on **CPU and GPU backends**.
- Collaborate with backend and infrastructure engineers to integrate your work into APIs and orchestration layers.
- Write benchmarks, unit tests, and profiling tools to ensure correctness and measure performance gains.
- Contribute to system architecture discussions and help define the roadmap for future runtime features.

**Requirements**:

- Strong proficiency in **modern C++ (C++14/17/20)** and systems programming.
- Solid understanding of **low-level performance optimization**: memory management, multithreading, SIMD, cache efficiency.
- Experience with **CUDA** and/or **ROCm/HIP** GPU programming.
- Familiarity with **linear algebra kernels** (matrix multiply, attention) and how they map to hardware acceleration (Tensor Cores, BLAS libraries, etc.).
- Exposure to **machine learning inference frameworks** (e.g., llama.cpp, TensorRT, ONNX Runtime, TVM, PyTorch internals) is a plus.
- Comfort working in a **Unix/Linux** environment; experience with build systems (CMake, Bazel) and CI pipelines.
- Strong problem-solving and debugging skills; ability to dive deep into both code and performance traces.
- Self-motivated and able to thrive in a **fast-moving startup** environment.

**Nice to Have**:

- Experience implementing **LoRA or adapter-based fine-tuning** in inference runtimes.
- Knowledge of **quantization methods** and deploying quantized models efficiently.
- Background in distributed systems or multi-GPU orchestration.
- Contributions to **open-source ML/AI systems**.

**Why Join**:

- Build core IP at the intersection of **AI and systems engineering**.
- Work with a highly technical founding team on problems that are both intellectually challenging and commercially impactful.
- Opportunity to shape the direction of a new AI platform from the ground up.
- Competitive compensation (contract or full-time), equity potential, and flexible remote work.

