
Strathmore University
Description
Minimum Academic Qualifications:
- Bachelor’s degree in Computer Science, Software Engineering, Information Systems, or a closely related technical field.
Experience:
- At least 5 years of professional experience in data engineering, with demonstrated responsibility for designing and operating complex data pipelines and data platforms.
- Strong experience designing and implementing data ingestion, transformation, and processing pipelines (ETL/ELT) for large and heterogeneous datasets.
- Proficiency in Python and SQL, and experience with data processing frameworks and tools commonly used in modern data engineering environments.
Responsibilities
Data Pipeline Design and Implementation
- Design, implement, and maintain robust data ingestion and processing pipelines for heterogeneous data sources, including soil, weather, agronomic, geospatial, and related contextual datasets.
- Develop scalable ETL/ELT workflows to transform raw data into structured, validated, and analytics-ready formats.
- Ensure pipelines support both batch and, where required, near-real-time data processing.
- Implement data versioning and lineage tracking to support reproducibility and auditability.
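To illustrate the kind of pipeline work this role involves, here is a minimal batch ETL sketch in Python. All field names (station_id, soil_ph, recorded_at) and the source URI are hypothetical examples, not part of the actual platform; lineage is represented as a lightweight content hash plus metadata.

```python
import hashlib
import json
from datetime import datetime, timezone

def transform(raw_records):
    """Normalize raw sensor rows into a validated, analytics-ready shape.

    The schema here (station_id, soil_ph, recorded_at) is illustrative only.
    """
    clean = []
    for row in raw_records:
        # Validation: drop rows missing required keys or with out-of-range pH.
        if not {"station_id", "soil_ph", "recorded_at"} <= row.keys():
            continue
        ph = float(row["soil_ph"])
        if not 0.0 <= ph <= 14.0:
            continue
        clean.append({
            "station_id": str(row["station_id"]),
            "soil_ph": round(ph, 2),
            "recorded_at": row["recorded_at"],
        })
    return clean

def run_batch(raw_records, source_uri):
    """One ETL batch: transform, then attach lineage metadata for auditability."""
    clean = transform(raw_records)
    payload = json.dumps(clean, sort_keys=True).encode()
    return {
        "records": clean,
        "lineage": {
            "source": source_uri,
            "row_count": len(clean),
            # A content hash serves as a simple dataset version identifier,
            # so reruns over identical input are detectably reproducible.
            "version": hashlib.sha256(payload).hexdigest()[:12],
            "processed_at": datetime.now(timezone.utc).isoformat(),
        },
    }

raw = [
    {"station_id": 7, "soil_ph": "6.8", "recorded_at": "2024-05-01T06:00:00Z"},
    {"station_id": 7, "soil_ph": "99", "recorded_at": "2024-05-01T07:00:00Z"},  # invalid pH
    {"soil_ph": "6.5"},  # missing required keys
]
batch = run_batch(raw, "s3://example-bucket/soil/2024-05-01.csv")
```

A production pipeline would swap the in-memory lists for a warehouse or lake table and record lineage in a catalog, but the shape of the work is the same: validate, transform, version.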
Cloud-Based Data Infrastructure
- Design and manage cloud-native data architectures, including data lakes, data warehouses, and analytical storage solutions.
- Optimize data storage and processing for performance, cost efficiency, and scalability.
- Support deployment of data pipelines across development, testing, and pilot environments.
- Collaborate with platform teams to ensure infrastructure aligns with DPI principles and interoperability standards.
Data Quality, Governance, and Reliability
- Implement automated data quality checks, validation rules, and monitoring to ensure accuracy, completeness, and consistency.
- Support enforcement of data governance requirements, including access controls, permissions, and audit logging.
- Work with policy and governance partners to ensure technical implementations align with data protection and consent frameworks.
- Proactively identify and remediate data reliability risks or bottlenecks.
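The automated quality checks described above can be sketched as declarative rules evaluated over each batch, with failure counts exported to monitoring. The rule names and the rainfall_mm field are illustrative assumptions, not the project's actual schema.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Check:
    name: str
    # A rule returns True when a record passes; field names are illustrative.
    rule: Callable[[dict], bool]

CHECKS = [
    Check("completeness: rainfall_mm present",
          lambda r: r.get("rainfall_mm") is not None),
    Check("accuracy: rainfall_mm non-negative",
          lambda r: r.get("rainfall_mm") is None or r["rainfall_mm"] >= 0),
]

def run_checks(records):
    """Return per-check failure counts, suitable for a monitoring dashboard."""
    report = {c.name: 0 for c in CHECKS}
    for r in records:
        for c in CHECKS:
            if not c.rule(r):
                report[c.name] += 1
    return report

rows = [{"rainfall_mm": 12.5}, {"rainfall_mm": -3.0}, {}]
report = run_checks(rows)
```

Keeping rules as data rather than inline code makes it straightforward to add new checks, alert on threshold breaches, and log results for audit purposes.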
Enablement of AI and LLM-Based Systems
- Prepare and serve data in formats optimized for AI and LLM-based advisory systems, including retrieval-augmented generation (RAG) pipelines and structured knowledge services.
- Support model evaluation, benchmarking, and experimentation workflows.
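One recurring data-preparation task for RAG pipelines is splitting source documents into overlapping chunks for indexing and retrieval. A minimal word-window sketch follows; real pipelines typically chunk by tokens and attach source metadata to each chunk, and the window sizes here are arbitrary examples.

```python
def chunk_document(text, max_words=50, overlap=10):
    """Split a document into overlapping word-window chunks for retrieval.

    max_words and overlap are illustrative defaults, not tuned values.
    """
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        window = words[start:start + max_words]
        chunks.append(" ".join(window))
        # Stop once the window reaches the end of the document.
        if start + max_words >= len(words):
            break
    return chunks

doc = " ".join(str(i) for i in range(12))
chunks = chunk_document(doc, max_words=5, overlap=2)
```

The overlap preserves context that would otherwise be cut at chunk boundaries, which generally improves retrieval quality at the cost of some index redundancy.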
MLOps Support and Operational Readiness
- Contribute to MLOps workflows by supporting data versioning, pipeline automation, and integration with model deployment and evaluation processes.
- Implement monitoring and logging for data pipelines to support observability and issue diagnosis.
- Support reproducible experimentation through consistent data environments and pipeline automation.
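Pipeline observability of the kind listed above often starts with structured logging around each stage. The decorator below is a minimal sketch using the standard library; the stage name and dedupe example are hypothetical, and a production setup would emit these events to a metrics backend rather than stdlib logging.

```python
import functools
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline")

def observed(stage_name):
    """Wrap a pipeline stage with logging of duration and row counts."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(records):
            start = time.perf_counter()
            out = fn(records)
            # Structured key=value fields make the log easy to parse downstream.
            log.info("stage=%s rows_in=%d rows_out=%d seconds=%.3f",
                     stage_name, len(records), len(out),
                     time.perf_counter() - start)
            return out
        return wrapper
    return decorator

@observed("dedupe")
def dedupe(records):
    # Example stage: remove duplicate rows while preserving order.
    seen, out = set(), []
    for r in records:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            out.append(r)
    return out

result = dedupe([{"id": 1}, {"id": 1}, {"id": 2}])
```

Row counts in and out of each stage are often enough to diagnose where records are silently dropped, which is the most common data reliability issue in practice.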
Documentation, Collaboration, and Delivery
- Produce clear technical documentation covering data architectures, pipeline logic, and operational procedures.