Senior Observability Specialist
Posted 2 weeks ago
**Responsibilities**:
- Design and implement comprehensive observability strategies and architectures for AWS cloud environments, including metrics, logs, and distributed tracing.
- Develop custom dashboards and alerts to monitor key performance indicators (KPIs) and overall system health.
- Automate the deployment and management of observability infrastructure using Infrastructure as Code (IaC) tools.
- Work closely with development, operations, and engineering teams to understand their observability needs and provide effective solutions.
- Participate in incident resolution, providing observability data and analysis to identify root causes and facilitate recovery.
- Implement and manage observability solutions specifically for containerized environments and orchestration with Elastic Kubernetes Service (EKS).
- Evaluate and recommend new observability tools and technologies to enhance our capabilities.
- Document observability configurations, processes, and best practices.
- Train and support other teams in the use of observability tools and techniques.
- Stay up-to-date on the latest trends and best practices in observability and cloud technologies.
**Requirements**:
- Cloud Knowledge and Experience (AWS):
  - At least 5 years of proven experience working with the Amazon Web Services (AWS) cloud platform.
  - In-depth knowledge of AWS services relevant to observability, such as CloudWatch (Logs, Metrics, Alarms), X-Ray, and potentially others like AWS Observability Service.
- Infrastructure as Code (IaC):
  - Practical experience deploying and managing infrastructure using Infrastructure as Code (IaC) tools such as Terraform or similar.
  - Ability to write, maintain, and improve IaC code to automate the creation and configuration of observability infrastructure.
- Elastic Kubernetes Service (EKS):
  - Deep understanding of Kubernetes concepts and its interaction with AWS.
  - Hands-on experience configuring observability tools such as Prometheus, Grafana, the ELK Stack (Elasticsearch, Logstash, Kibana), and Jaeger for Kubernetes environments within EKS.
- General Observability Experience:
  - Solid understanding of observability principles and best practices (metrics, logs, distributed tracing).
  - Experience with various observability and monitoring tools.
  - Ability to develop effective dashboards and alerts based on observability data.
  - Capacity to analyze observability data to identify performance and availability issues.
**Additional Technical Skills**:
- Ability to develop scripts and automate tasks using languages such as Python or Bash.
- Knowledge of Linux operating systems.
- Familiarity with Agile and DevOps methodologies.
**Interpersonal Skills**:
- Strong problem-solving skills and the ability to analyze complex data.
- Excellent communication and collaboration skills.
- Ability to work independently and as part of a team.
**Nice to have**:
- Relevant AWS certifications (e.g., AWS Certified DevOps Engineer - Professional).
- Experience with other container orchestration platforms (e.g., vanilla Kubernetes).
- Knowledge of Site Reliability Engineering (SRE) principles.
- Experience in implementing Service Level Objectives (SLOs) and Service Level Indicators (SLIs).