
This job posting is no longer available


Senior Software Engineer & LLM Code Trainer - Remote - LatinAmerica

Kake
  • US
    New York, New York, United States

About

We are looking for a Senior Software Engineer to contribute to the development and evaluation of AI training data for a leading expert human data platform for AI agents and LLMs. In this role, you will work at the intersection of software engineering and artificial intelligence, helping AI labs and companies build better, safer, and more capable models. You will leverage your deep technical expertise to write prompts, produce reference-quality code solutions, evaluate model outputs, and provide the structured human signal that makes AI systems smarter. This is not a traditional engineering role; it is a unique opportunity for senior engineers who want to shape how the next generation of AI understands, generates, and reasons about code.

Key Responsibilities

  • Create and review coding tasks based on real-world software engineering scenarios, including debugging, refactoring, code generation, API usage, automated tests, performance, security, and edge cases.
  • Write high-quality reference solutions that are correct, clear, testable, and aligned with task requirements.
  • Evaluate AI-generated code and responses using structured rubrics, assessing correctness, clarity, security, performance, maintainability, and instruction-following.
  • Compare multiple model responses, select the strongest answer, and justify your decision with clear technical reasoning.
  • Identify bugs, hallucinated APIs, missing edge cases, weak explanations, and poor engineering decisions in AI-generated outputs.
  • Work with terminal-based development workflows when needed, including running tests, debugging issues, managing dependencies, and navigating repositories.
  • Follow detailed guidelines consistently and participate in calibration activities to ensure high-quality, reliable evaluations.

Core Requirements

  • 5+ years of professional software engineering experience in a backend, fullstack, or systems role.
  • Hands-on experience with Terminal-Bench, with the ability to evaluate AI agent performance on terminal-based tasks including compiling code, running tests, managing environments, and completing multi-step software engineering workflows.
  • Comfortable working with Git, command line/terminal, and common development workflows.
  • Ability to evaluate code critically: not only whether it works, but whether it is well-designed, secure, and maintainable.
  • Prior experience in AI data production, RLHF, data annotation, or LLM evaluation projects.
  • Excellent written and verbal communication skills in English.
  • Ability to work independently in a remote, asynchronous, fast-paced environment.
  • High attention to detail and the ability to follow complex, rubric-based guidelines consistently.

Nice-to-Have

  • Experience with Python-heavy workflows, automated testing frameworks, Docker, Linux, bash, or containerized environments.
  • Experience with repo-level code reasoning, large codebases, or open-source contributions.
  • Background in backend systems, data engineering, DevOps, infrastructure, or security.

Additional

  • US Timezone Overlap: PST (GMT -8)

Please Note: Due to the high volume of applications, only shortlisted candidates will be contacted.

Language skills

  • English

Note for users

This job posting was published by one of our partners. You can view the original posting here.