Middle Site Reliability Engineer

2 days ago


Work from home, Colombia · EPAM Systems · Full-time

EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multinational teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.

We are searching for a Middle Site Reliability Engineer proficient in Azure, eager to join our remote team.

The central part of your role will involve studying and understanding how different parts of a distributed system interact, using a wide variety of skills and tools. You will need a keen problem-solving mindset and the ability to work independently and make critical decisions that maintain the reliability and availability of our systems. Knowledge of and working experience with Microsoft Azure are crucial, with an emphasis on understanding the various Azure tools and services.

RESPONSIBILITIES
- Keeping a close watch on our system's performance and meeting our Service Level Objectives
- Configuring and deploying various Azure cloud resources, such as AKS, CosmosDB, Key Vault, Redis Cache, Storage, ServiceBus, App Gateway
- Developing and applying effective monitoring and alerting strategies using tools like Azure Monitor
- Cooperating with cross-functional teams to identify and fix any problems affecting system reliability or availability
- Designing and incorporating solutions to enhance system reliability, availability, and scalability
- Being a part of the on-call rotation, offering support beyond regular business hours
- Participating in activities focused on mentorship and knowledge-sharing to further develop your technical and soft skills

REQUIREMENTS
- A minimum of two years of experience as a Site Reliability Engineer, preferably in a Microsoft Azure setting
- In-depth knowledge of various Azure services, including Azure Application Insights and Azure Monitor
- Familiarity with setting up and deploying Azure cloud resources like AKS, CosmosDB, Key Vault, Redis Cache, Storage, ServiceBus, App Gateway
- Proficiency in Bash and PowerShell scripting
- A good understanding of DevOps best practices, including setting up and managing CI/CD implementations using Azure DevOps or an equivalent
- The ability to trace and troubleshoot issues in a distributed system
- Being an excellent team player with strong communication skills across multiple teams
- Fluency in both spoken and written English, at an upper-intermediate level of proficiency

NICE TO HAVE
- Ability to work on-call during weekends

WE OFFER
- Learning Culture - We want you to be the best version of yourself, which is why we offer unlimited access to learning platforms, a wide range of internal courses, and all the knowledge you need to grow professionally
- Health Coverage - Health and wellness are important, which is why we cover you and up to four family members under a premier health plan. We have a couple of options, so you can choose what is best for you and your family
- Visual Benefit - Seeing your work for us would be a sight for sore eyes. We want your vision to always be at 100%, which is why we offer up to $200,000 COP for any visual health expenses
- Life Insurance Plan - We have partnered with MetLife to offer a full-coverage life insurance plan, so your family is covered even if you are gone
- Medical Leave Coverage - We are one of the few companies that cover 100% of your medical leave, for up to 90 days. Your health is the most important thing to us
- Professional Growth Opportunities - We have designed a highly competitive and complete development process, where you will have all the tools to get where you have always wanted to be, personally and professionally
- Stock Option Purchase Plan - As an EPAMer, you can be more than just an employee: you will also have the opportunity to purchase stock at a reduced price and become a part owner of our organization
- Additional Income - Besides your regular salary, you will also have the chance to earn extra income by referring talent, serving as a technical interviewer, and in many other ways
- Community Benefit - You will be part of a worldwide community of over 50,000 employees, where you can learn, challenge yourself, stand out, and share your knowledge and experience with multicultural teams

