
Strathmore University
Description
Minimum Academic Qualifications:
- Bachelor’s degree in Computer Science, Software Engineering, Information Systems, or a closely related technical field.
Experience:
- At least 5 years of professional experience in data engineering, with demonstrated responsibility for designing and operating complex data pipelines and data platforms.
- Strong experience designing and implementing data ingestion, transformation, and processing pipelines (ETL/ELT) for large and heterogeneous datasets.
- Proficiency in Python and SQL, and experience with data processing frameworks and tools commonly used in modern data engineering environments.
Responsibilities
Data Pipeline Design and Implementation
- Design, implement, and maintain robust data ingestion and processing pipelines for heterogeneous data sources, including soil, weather, agronomic, geospatial, and related contextual datasets.
- Develop scalable ETL/ELT workflows to transform raw data into structured, validated, and analytics-ready formats.
- Ensure pipelines support both batch and, where required, near-real-time data processing.
- Implement data versioning and lineage tracking to support reproducibility and auditability.
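To illustrate the kind of pipeline work this role involves, here is a minimal batch ETL sketch in Python. All field names (station_id, soil_ph, recorded_at) and the source URI are hypothetical examples, not part of the actual platform; lineage is represented as a lightweight content hash plus metadata.

```python
import hashlib
import json
from datetime import datetime, timezone

def transform(raw_records):
    """Normalize raw sensor rows into a validated, analytics-ready shape.

    The schema here (station_id, soil_ph, recorded_at) is illustrative only.
    """
    clean = []
    for row in raw_records:
        # Validation: drop rows missing required keys or with out-of-range pH.
        if not {"station_id", "soil_ph", "recorded_at"} <= row.keys():
            continue
        ph = float(row["soil_ph"])
        if not 0.0 <= ph <= 14.0:
            continue
        clean.append({
            "station_id": str(row["station_id"]),
            "soil_ph": round(ph, 2),
            "recorded_at": row["recorded_at"],
        })
    return clean

def run_batch(raw_records, source_uri):
    """One ETL batch: transform, then attach lineage metadata for auditability."""
    clean = transform(raw_records)
    payload = json.dumps(clean, sort_keys=True).encode()
    return {
        "records": clean,
        "lineage": {
            "source": source_uri,
            "row_count": len(clean),
            # A content hash serves as a simple dataset version identifier,
            # so reruns over identical input are detectably reproducible.
            "version": hashlib.sha256(payload).hexdigest()[:12],
            "processed_at": datetime.now(timezone.utc).isoformat(),
        },
    }

raw = [
    {"station_id": 7, "soil_ph": "6.8", "recorded_at": "2024-05-01T06:00:00Z"},
    {"station_id": 7, "soil_ph": "99", "recorded_at": "2024-05-01T07:00:00Z"},  # invalid pH
    {"soil_ph": "6.5"},  # missing required keys
]
batch = run_batch(raw, "s3://example-bucket/soil/2024-05-01.csv")
```

A production pipeline would swap the in-memory lists for a warehouse or lake table and record lineage in a catalog, but the shape of the work is the same: validate, transform, version.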
Cloud-Based Data Infrastructure
- Design and manage cloud-native data architectures, including data lakes, data warehouses, and analytical storage solutions.
- Optimize data storage and processing for performance, cost efficiency, and scalability.
- Support deployment of data pipelines across development, testing, and pilot environments.
- Collaborate with platform teams to ensure infrastructure aligns with DPI principles and interoperability standards.
Data Quality, Governance, and Reliability
- Implement automated data quality checks, validation rules, and monitoring to ensure accuracy, completeness, and consistency.
- Support enforcement of data governance requirements, including access controls, permissions, and audit logging.
- Work with policy and governance partners to ensure technical implementations align with data protection and consent frameworks.
- Proactively identify and remediate data reliability risks or bottlenecks.
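The automated quality checks described above can be sketched as declarative rules evaluated over each batch, with failure counts exported to monitoring. The rule names and the rainfall_mm field are illustrative assumptions, not the project's actual schema.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Check:
    name: str
    # A rule returns True when a record passes; field names are illustrative.
    rule: Callable[[dict], bool]

CHECKS = [
    Check("completeness: rainfall_mm present",
          lambda r: r.get("rainfall_mm") is not None),
    Check("accuracy: rainfall_mm non-negative",
          lambda r: r.get("rainfall_mm") is None or r["rainfall_mm"] >= 0),
]

def run_checks(records):
    """Return per-check failure counts, suitable for a monitoring dashboard."""
    report = {c.name: 0 for c in CHECKS}
    for r in records:
        for c in CHECKS:
            if not c.rule(r):
                report[c.name] += 1
    return report

rows = [{"rainfall_mm": 12.5}, {"rainfall_mm": -3.0}, {}]
report = run_checks(rows)
```

Keeping rules as data rather than inline code makes it straightforward to add new checks, alert on threshold breaches, and log results for audit purposes.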
Enablement of AI and LLM-Based Systems
- Prepare and serve data in formats optimized for AI and LLM-based advisory systems, including retrieval-augmented generation (RAG) pipelines and structured knowledge services.
- Support model evaluation, benchmarking, and experimentation workflows.
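One recurring data-preparation task for RAG pipelines is splitting source documents into overlapping chunks for indexing and retrieval. A minimal word-window sketch follows; real pipelines typically chunk by tokens and attach source metadata to each chunk, and the window sizes here are arbitrary examples.

```python
def chunk_document(text, max_words=50, overlap=10):
    """Split a document into overlapping word-window chunks for retrieval.

    max_words and overlap are illustrative defaults, not tuned values.
    """
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        window = words[start:start + max_words]
        chunks.append(" ".join(window))
        # Stop once the window reaches the end of the document.
        if start + max_words >= len(words):
            break
    return chunks

doc = " ".join(str(i) for i in range(12))
chunks = chunk_document(doc, max_words=5, overlap=2)
```

The overlap preserves context that would otherwise be cut at chunk boundaries, which generally improves retrieval quality at the cost of some index redundancy.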
MLOps Support and Operational Readiness
- Contribute to MLOps workflows by supporting data versioning, pipeline automation, and integration with model deployment and evaluation processes.
- Implement monitoring and logging for data pipelines to support observability and issue diagnosis.
- Support reproducible experimentation through consistent data environments and pipeline automation.
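Pipeline observability of the kind listed above often starts with structured logging around each stage. The decorator below is a minimal sketch using the standard library; the stage name and dedupe example are hypothetical, and a production setup would emit these events to a metrics backend rather than stdlib logging.

```python
import functools
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline")

def observed(stage_name):
    """Wrap a pipeline stage with logging of duration and row counts."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(records):
            start = time.perf_counter()
            out = fn(records)
            # Structured key=value fields make the log easy to parse downstream.
            log.info("stage=%s rows_in=%d rows_out=%d seconds=%.3f",
                     stage_name, len(records), len(out),
                     time.perf_counter() - start)
            return out
        return wrapper
    return decorator

@observed("dedupe")
def dedupe(records):
    # Example stage: remove duplicate rows while preserving order.
    seen, out = set(), []
    for r in records:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            out.append(r)
    return out

result = dedupe([{"id": 1}, {"id": 1}, {"id": 2}])
```

Row counts in and out of each stage are often enough to diagnose where records are silently dropped, which is the most common data reliability issue in practice.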
Documentation, Collaboration, and Delivery
- Produce clear technical documentation covering data architectures, pipeline logic, and operational procedures.