CLOSED FOR APPLICATIONS

Site Reliability Engineer

Closing: Apr 6, 2024

This position has expired

Published: Mar 25, 2024

The Site Reliability Engineer is a Product Team member in the Infrastructure team. You will engineer, manage, and maintain our hosting platform and infrastructure, allowing secure and scalable hosting. You’ll also be central to the future development of our infrastructure services and support both internal teams and external partners. You will work with sys admins, DevOps engineers, and users of the CHT directly – this includes building new features, ensuring support, fixing bugs, testing applications, and ensuring we're working on the most impactful things. You will work with a distributed team based around the world, and you will report to an Engineering Manager.

Product Team’s Core Competencies

As a team, we have adopted a set of “core competencies” that describe how we show up at work to be great teammates for each other.

  • Reliable - Sets and communicates clear expectations about when something needs to/will be done and does it without prompting.
  • Team Player - Acts in the team's best interest and actively looks for ways to help their colleagues. Makes time to support teammates to be successful.
  • Growth Mindset - Always seeking to improve. Open-minded, teachable, and coachable.
  • Proactive - Sees things that need doing and takes action to keep things moving and make the team successful.
  • Effective Communicator - Communicates regularly, openly, and effectively using the appropriate channels.

Skills, Knowledge, and Expertise

Required Skills and Qualifications

  • Good understanding of DevOps concepts and best practices
  • 3+ years of experience with Kubernetes, with concrete results
  • Experience in one or more programming languages, preferably JavaScript
  • Fluent in English and experience using it in a remote work environment, e.g., over video and text chats
  • Ability to work in a remote and culturally diverse team
  • Detective Skills: Terrific at troubleshooting and debugging.
  • Problem-solving skills
  • Linux system administration, monitoring, security best practices, networking, and logging.
  • You must have valid authorization to work in the country where you are based without requiring sponsorship.
  • Travel Requirement: Candidates should be aware that this role may entail up to 25% travel, including both domestic and international travel to various locations. Most of these locations are in East Africa, West Africa, or Nepal.


Responsibilities

  • Proactive Monitoring and Team Support
  • Proactively monitor performance and reliability of production Medic systems
  • Produce status pages consumable by non-technical users
  • Consult on technical needs for larger-scale deployments, including local hosting, scalability, etc.
  • Provide remote troubleshooting support to active deployments as needed
  • Prioritize urgent troubleshooting problems in live instances
  • Identify possible production problems by checking through or reviewing the issues that have been reported
  • Follow up and investigate questions asked on Slack channels and the CHT forum
  • Keep in contact with Core Devs and QA teams
  • Provide technical information, explain processes, clarify interactions when requested, and ensure proper documentation.
  • Manage upgrades and upgrade processes on production instances.
  • Automate deployments to increase testability and reliability.
  • Automate deployment monitoring and alerting
  • Support scaling - Proactively seek new technologies or implementations that solve current problems better or more efficiently.
  • Troubleshooting - Prioritize and provide remote troubleshooting support to active deployments as needed.
  • Documentation - Write technical information, explain processes, clarify interactions when requested, and ensure proper documentation.
  • Support shifts - Work dedicated support tasks (not on-call) once every three weeks, primarily assisting other internal teams or external partners.


Don’t miss your chance to work at Medic Mobile. Enter your email to start your application now.