
Equity Bank Kenya
SRE Engineer
Nairobi • Kenya
Closed for applications

Description
Qualifications
KEY TECHNICAL SKILLS & COMPETENCIES
- Elasticsearch, Logstash, Kibana (ELK Stack)
- Microsoft Azure
- Unix / Linux and Shell Scripting
- SQL and database concepts
- Monitoring and observability tools
- Strong analytical, problem‑solving, and documentation skills
EXPERIENCE REQUIREMENTS
- Minimum 2 years’ experience in a Site Reliability Engineering, DevOps, or Production Support role
- Mandatory hands‑on experience with ELK Stack
- Experience supporting banking or enterprise‑scale applications
ACADEMIC QUALIFICATIONS & CERTIFICATIONS
- Bachelor’s degree in Science, Engineering, Information Technology, or a related field
- Nice to have: ELK, Azure, or other relevant cloud/observability certifications
Responsibilities
1. ELK Engineering and Log Analytics
- Install, configure, and maintain ELK stack components (Elasticsearch, Logstash, Kibana, Beats) across environments.
- Design efficient dashboards, graphs, and visualizations that translate application logs into business‑readable insights.
- Analyze application logs to identify trends, risks, and incidents affecting system performance and availability.
- Develop customized reports, bar charts, and pie charts to support operational and business decision‑making.
- Implement ELK‑triggered auto‑healing and remediation scripts to detect and resolve incidents proactively.
2. Toil Reduction and Automation
- Identify repetitive, manual, and reactive operational tasks and eliminate them through automation.
- Develop scripts and tools using languages such as Python, Bash, or Go to automate system maintenance and operational workflows.
- Implement Infrastructure as Code (IaC) using tools such as Terraform or Ansible to ensure consistent, repeatable infrastructure provisioning.
- Design and implement self‑healing systems capable of automatic recovery from common failures without human intervention.
3. Monitoring, Alerting, and Observability
- Define and implement Service Level Indicators (SLIs), Service Level Objectives (SLOs), and Service Level Agreements (SLAs) in collaboration with business and development teams.
- Build and maintain robust monitoring, logging, and observability solutions using tools such as ELK, Prometheus, Grafana, or equivalent platforms.
- Configure intelligent, actionable alerts that minimize noise and false positives while ensuring rapid incident detection.
- Continuously improve monitoring coverage and system visibility to support proactive operations.
4. Incident Response and Management
- Participate in on‑call rotations to respond to critical system alerts and production incidents.
- Diagnose, mitigate, and resolve incidents to restore services within agreed SLAs.
- Conduct blameless post‑incident reviews to identify root causes and define preventative actions.
- Develop and maintain runbooks and playbooks for common incident scenarios to improve response time and consistency.
5. Capacity Planning and Performance Optimization
- Analyze historical system usage and trends to forecast future capacity requirements.
- Perform system and database performance tuning in collaboration with development teams.
- Conduct load and stress testing to identify bottlenecks before they impact production systems.
- Ensure systems are cost‑efficient, scalable, and capable of supporting business growth.
6. Cross‑Functional Collaboration
- Work closely with software development teams during solution design to ensure reliability, scalability, and operational readiness.
- Promote a DevOps and SRE culture through shared ownership of system reliability (“You Build It, You Run It”).
- Share knowledge, best practices, and documentation to uplift operational maturity across teams.