- United States
About
Software Engineer, Data Infrastructure at LMArena
Location: SF Bay Area/Remote
Type: Full-Time

About the Role:
LMArena is seeking a Software Engineer to join our team and build the data infrastructure that powers real-world AI evaluation. You'll play a crucial role in designing and building the data pipelines that process and analyze more than 3 million user votes, directly impacting how we understand and evaluate AI model performance. This role is ideal for someone who thrives in fast-moving environments and is interested in building products that ensure accurate and fair evaluation of human preferences across different models, which will shape the direction of future AI development.

As an early member of our data engineering team, you'll partner closely with researchers, engineers, and product leadership to extract valuable data and insights from human votes and feedback. You'll help us move fast while staying rigorous: improving data quality, scaling our infrastructure to new levels, and deepening our ability to compare frontier models and predict human preferences.

Responsibilities:
- Design and build robust data pipelines to ingest, process, and transform user vote data into features essential for model performance evaluation.
- Collaborate with researchers and product leadership to understand product goals and the data they require.
- Design and implement solutions to generate result dashboards and reports, providing useful information for the public, model providers, and researchers.
- Ensure the integrity, data quality, and reliability of the pipelines.
- Scale our data infrastructure to accommodate increasing data volumes and evolving analytical needs.

Who is LMArena?
Created by researchers from UC Berkeley’s SkyLab, LMArena is an open platform where everyone can easily access, explore, and interact with the world’s leading AI models. By comparing them side by side and casting votes for the better response, the community helps shape a public leaderboard, making AI progress more transparent and grounded in real-world usage.

Why Join Us?
Trusted by organizations like Google, OpenAI, Meta, xAI, and more, LMArena is rapidly becoming essential infrastructure for transparent, human-centered AI evaluation at scale. With over one million monthly users and growing developer adoption, our impact is helping guide the next generation of safe, aligned AI systems, grounded in open access and collective feedback.

Our work is regularly referenced by industry leaders pushing the frontier of safe and reliable AI, including Sundar Pichai, Jeff Dean, Elon Musk, and Sam Altman.

- High Impact: Your work will be used daily by the world’s most advanced AI labs.
- Global Reach: Develop data infrastructure powering millions of real-world evaluations, influencing AI reliability across industries at the top tier.
- Exceptional Team: We are a small team of top talent from Google, DeepMind, Discord, Vercel, UC Berkeley, and Stanford.

Requirements:
- Strong software engineering background with a dedicated focus on data engineering and big data technologies.
- Proficiency in SQL and at least one programming language commonly used for data analysis (Python (preferred), Scala, R).
- Hands-on experience with data processing and pipeline frameworks (Apache Spark, Ray Data, etc.) and at least one popular big data analytics platform (Databricks, Snowflake).
- Demonstrated experience in designing, implementing, optimizing, and debugging production data pipelines.

Preferred Qualifications:
- Prior work on data analytics or data lake platforms.
- Experience with advanced data analysis tools, such as Delta Lake and streaming tables.
- Exposure to machine learning is a plus.

What we offer:
210k - 250k + equity. Actual compensation will depend on job-related knowledge, skills, experience, and candidate location.
- Competitive salary and meaningful equity
- Comprehensive healthcare coverage (medical, dental, vision)
- The opportunity to work on cutting-edge AI with a small, mission-driven team
- A culture that values transparency, trust, and community impact

Come help build the space where anyone can explore and help shape the future of AI.
Desired Skills
- SQL
- Python
- Scala
- R
Work Experience
- Data Infrastructure
- Data Engineer
Language Skills
- English