Jobs and Vacancies in Pumwani, Kenya

388 jobs found

Senior Site Reliability Engineer

Nairobi • Kenya

Closed for applications

Senior Platform Engineer

Nairobi • Kenya

Closed for applications

Tezza Business Solutions

Manager - Architecture (MA)

Nairobi • Kenya

Closed for applications

Kenya Wine Agencies Ltd.

DDE Project Manager

Nairobi • Kenya

Closed for applications

Alliance For a Green Revolution Africa (AGRA)

Associate Grants Officer – Kenya (Temporary Role)

Nairobi • Kenya

Closed for applications

DevOps Engineer

Nairobi • Kenya

Closed for applications

Food For Education

Occupational Safety & Health (OSH) Administrator

Nairobi • Kenya

Closed for applications

Owner’s Engineer – Large-Scale Solar PV & BESS

Nairobi • Kenya

Closed for applications

Senior Project Manager – Utility-Scale Solar PV & BESS

Nairobi • Kenya

Closed for applications

Senior Business Development - Mining

Nairobi • Kenya

Closed for applications

© Fuzu Ltd

Senior Site Reliability Engineer

Closed for applications

Nairobi • Kenya

Job details

Location

Nairobi • Kenya

Contract Type

Description

Requirements

Bachelor's degree in Computer Science, Information Technology, or a related field.
5+ years of experience in Software Engineering, SRE, DevOps, or Platform Engineering, with demonstrable ownership of reliability standards at a team or company level.
Strong coding fluency: Proficiency in Python (or similar) with the ability to read, understand, reason about, and write production-grade automation code.
Cloud & IaC: Hands-on experience with AWS, and a solid understanding of Infrastructure as Code (Terraform or CloudFormation).
Deep Observability Knowledge: Demonstrable experience with monitoring tools (Datadog, Prometheus, ELK stack). Strong understanding of SRE concepts including Golden Signals, high-cardinality data handling, and error budget mathematics.
Systems Thinking: Strong grasp of designing for scale and resilience, including graceful failure, circuit breaking, connection pooling, and multi-AZ deployments.
Proven ability to define and drive reliability standards across multiple teams and to foster a blameless post-mortem culture.
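
As an illustration of the "error budget mathematics" named in the requirements, the sketch below shows the two most common calculations: the downtime an availability SLO allows over a window, and the share of a request-based budget still unspent. The function names, the 30-day window, and the example figures are assumptions for illustration, not part of the role's actual tooling.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Allowed downtime, in minutes, for an availability SLO over the window."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    total_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * total_minutes


def budget_remaining(slo_target: float, total_requests: int, failed_requests: int) -> float:
    """Fraction of the request-based error budget still unspent.

    The result is negative once more requests have failed than the SLO allows.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    allowed_failures = (1.0 - slo_target) * total_requests
    return 1.0 - failed_requests / allowed_failures


if __name__ == "__main__":
    # A 99.9% SLO over 30 days allows roughly 43.2 minutes of downtime.
    print(round(error_budget_minutes(0.999), 1))
    # 250 failures out of 1,000,000 requests spends about a quarter of the budget.
    print(round(budget_remaining(0.999, 1_000_000, 250), 2))
```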

Responsibilities

Enablement & RelOps Culture
- Implement the Observability Ladder: Guide teams from basic monitoring to high-signal metric tracking. Work with product teams to define SLAs, SLIs, and SLOs, and build dashboards that track specific error budgets.
- Empower Product Teams: Build frameworks and deployment tooling (e.g., CI/CD, internal tooling integrations) that allow teams to make data-driven decisions on deployment safety and automate rollbacks when error budgets are depleted.
- Champion Reliability: Drive a blameless post-mortem culture focused on actionable takeaways, system improvements, and measurable metrics (MTBF, MTTR).
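
The MTBF and MTTR metrics mentioned above can be derived from a plain list of incident start and resolution times. The sketch below is one common way to compute them, assuming non-overlapping incidents inside a fixed observation window; the `Incident` type and function names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

# Hypothetical incident record: (time the outage started, time it was resolved).
Incident = Tuple[datetime, datetime]


def mttr(incidents: List[Incident]) -> timedelta:
    """Mean time to recovery: average outage duration."""
    if not incidents:
        raise ValueError("at least one incident is required")
    downtime = sum((end - start for start, end in incidents), timedelta())
    return downtime / len(incidents)


def mtbf(incidents: List[Incident], window_start: datetime, window_end: datetime) -> timedelta:
    """Mean time between failures: uptime in the window divided by failure count.

    Assumes incidents do not overlap and fall entirely inside the window.
    """
    if not incidents:
        raise ValueError("at least one incident is required")
    downtime = sum((end - start for start, end in incidents), timedelta())
    uptime = (window_end - window_start) - downtime
    return uptime / len(incidents)


if __name__ == "__main__":
    start = datetime(2024, 1, 1)
    incidents = [
        (start + timedelta(days=2), start + timedelta(days=2, hours=1)),
        (start + timedelta(days=6), start + timedelta(days=6, hours=3)),
    ]
    print(mttr(incidents))
    print(mtbf(incidents, start, start + timedelta(days=10)))
```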
Frameworks & Automation
- Standardised Alerting & On-Call: Continuously improve company-wide alerting and on-call frameworks to reduce alert fatigue, ensuring alerts are highly actionable and symptom-based.
- Disaster Recovery: Drive evolution of DR strategies from manual processes into fully automated runbooks-as-code, allowing teams to prove and improve service recoverability through autonomous, evidence-based testing.
- Eliminate Toil: Develop systems, automations, and tooling in Python (or similar) for pre- and post-deployment verification, so that our hands-off reliability vision becomes a production reality.
- Reliability-as-Code: Lead the drive to manage our entire reliability suite through IaC. Use Terraform to architect, deploy, and configure our observability stack including ELK, Grafana, Loki, Prometheus, and Tracing.
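
To make the idea of post-deployment verification with automated rollback concrete, here is a minimal sketch of a rollback decision based on the error budget. It assumes the request counts come from an existing metrics system; the function name, the burn-rate threshold of 2.0, and the minimum sample size are illustrative assumptions rather than the company's actual policy.

```python
def should_roll_back(
    total_requests: int,
    failed_requests: int,
    slo_target: float,
    budget_remaining_fraction: float,
    max_burn_rate: float = 2.0,   # assumed threshold, tune per service
    min_requests: int = 100,      # assumed minimum sample before judging a deploy
) -> bool:
    """Decide whether a fresh deployment should be rolled back.

    Rolls back when the error budget is already exhausted, or when the
    post-deployment error rate burns budget faster than max_burn_rate
    times the rate the SLO allows.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if budget_remaining_fraction <= 0.0:
        return True
    if total_requests < min_requests:
        # Too little traffic to draw a conclusion; keep observing.
        return False
    observed_error_rate = failed_requests / total_requests
    allowed_error_rate = 1.0 - slo_target
    burn_rate = observed_error_rate / allowed_error_rate
    return burn_rate > max_burn_rate


if __name__ == "__main__":
    # Healthy deploy: error rate close to what a 99.9% SLO allows.
    print(should_roll_back(1000, 1, 0.999, 0.5))
    # Unhealthy deploy: 1% errors burns the budget about ten times too fast.
    print(should_roll_back(1000, 10, 0.999, 0.5))
```

In a CI/CD pipeline, a check like this would typically run a few minutes after release and trigger the pipeline's existing rollback step when it returns `True`.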

Tags

Information technology, software development, data; Computers, software development and services; Mid-level; Kenya
