

Freelance Software Developer/DevOps Engineer
Nairobi, Kenya

Description
Fuzu Global Workforce is a service line of Fuzu Oy, headquartered in Helsinki, Finland. We deliver global out-staffing services to forward-thinking businesses, connecting them with top-tier talent across data sciences, languages, technology, data annotation, and creative sectors. With strong expertise in global talent and robust support systems, Fuzu ensures seamless remote collaboration, compliance, and scalable delivery. Our fully vetted consultants are managed by Fuzu, providing high-quality work and the flexibility of global remote teams. Join us in reshaping the future of work—where borders no longer limit opportunity or excellence.
We are seeking experienced Software Developers or DevOps Engineers to support an AI evaluation project.
In this role, you will assess AI-generated outputs, validate tool calls, review reasoning quality, and help improve AI models used in developer-focused environments.
This position is ideal for technically strong engineers who are comfortable navigating unfamiliar codebases, analyzing stack traces, and evaluating the correctness of software-related tasks.
Requirements
2+ years of experience in software development or DevOps engineering
Strong proficiency in Python, including experience with frameworks such as Django
Hands-on experience with the Linux terminal
Ability to read and interpret stack traces, debugging logs, and error messages
Proven ability to work with large, unfamiliar, or messy codebases
Strong analytical skills and attention to detail
Responsibilities
Evaluate Terminal-Bench and SWE-bench outputs generated by AI models
Review and validate tool calls, trajectories, and argumentation used by the AI
Assess the quality and technical soundness of LLM (Large Language Model) responses
Verify whether tool calls were necessary, correct, and efficiently used
Analyze stack traces, error logs, and code behavior to determine correctness
Provide structured feedback to improve AI model performance
Work independently and collaborate with the project team as needed
Additional requirements
Exceptional accuracy and attention to detail
Ability to meet deadlines in a fast-paced environment
Commitment to maintaining high work standards
Professionalism in handling confidential materials
Location: Remote
Work Hours: 8 hours per day