Software Engineering Manager, Evaluation Platform
ProCore CPA
- Austin, Texas, United States
About
Lead and grow a team of engineers focused on evaluation infrastructure, quality measurement, and developer tooling for AI agents.
Define the technical vision and roadmap for the Evaluation Platform - covering offline evaluations (batch benchmarks, regression suites) and online evaluations (live traffic quality monitoring, A/B testing).
Partner with AI/ML, Product, and Agent teams to define quality metrics for agents (relevance, accuracy, latency, safety, user satisfaction, token usage) and build automated pipelines to compute them at scale.
Design and deliver user-facing evaluation tools that allow customers and internal teams to assess agent output quality, compare model versions, and identify regressions.
Build frameworks for human-in-the-loop evaluation - annotation workflows, rating interfaces, and inter-rater reliability measurement.
Establish CI/CD quality gates so that new agent versions cannot ship without passing evaluation thresholds.
Drive engineering excellence: code quality, system reliability, test coverage, on-call health, and technical debt management.
Recruit, mentor, and develop engineers - fostering a culture of ownership, curiosity, and rigorous experimentation.
What we're looking for:
5+ years managing engineering teams or technical leads, with 7+ years total in software engineering.
Experience building evaluation, quality measurement, or observability platforms for LLM-based or agentic systems (RAG pipelines, multi-step agents, tool-use agents).
Strong understanding of evaluation methodologies: precision/recall, LLM-as-judge, human annotation, A/B testing, and statistical significance frameworks.
Proven ability to translate ambiguous problem spaces into clear technical strategies and executable roadmaps.
Hands-on technical depth in backend systems, data pipelines, or distributed infrastructure (Python, Go, or similar).
Familiarity with evaluation frameworks such as RAGAS, DeepEval, LangFuse, or custom eval harnesses.
Background in search relevance (NDCG, MRR) or information retrieval quality systems.
Experience with construction-tech, procurement, or enterprise B2B SaaS domains.
Additional Information
Base Pay Range: 168,560.00 - 231,770.00 USD Annual
This role may also be eligible for Equity Compensation and/or Bonus Incentive Compensation. Procore is committed to offering competitive, fair, and commensurate compensation. Actual compensation will be based on a candidate's job-related skills, experience, education or training, and location.
For Los Angeles County (unincorporated) Candidates: Procore will consider for employment all qualified applicants, including those with arrest or conviction records, in accordance with the requirements of applicable federal, state, and local laws, including the City of Los Angeles' Fair Chance Initiative for Hiring Ordinance, the Los Angeles County Fair Chance Ordinance for Employers, and the California Fair Chance Act.
A criminal history may have a direct, adverse, and negative relationship on the following job duties, potentially resulting in the withdrawal of the conditional offer of employment:
1. appropriately managing, accessing, and handling confidential information including proprietary and trade secret information, as well as accessing Procore's information technology systems and platforms;
2. interacting with and occasionally having unsupervised contact with internal/external customers, stakeholders, and/or colleagues; and
3. exercising sound judgment.
Languages
- English