How many jobs are available for IT & Digital Marketing in Europe?

There are 2,751 job openings for IT & Digital Marketing in Europe.

Job Opportunities

Find jobs near you, whether onsite, hybrid, or remote.

Similar Jobs to: Remote Evaluation Lead

Applied Data Scientist, LLM Evaluation United States (Remote)
Driverai
Austin
Full-Time in Austin, TX Remote (any location) - Senior - Product & Engineering - $175k - $275k Applied Data Scientist, LLM Evaluation Introduction At Driver, we’re building systems that turn source cod
Lead AI Data Scientist (Remote)
USAA
Phoenix
USAA is looking for a Lead AI Individual Contributor Data Scientist to drive AI implementations across USAA Federal Savings Bank. This remote-eligible role focuses on leveraging generative AI and adva
Cash Applications Lead — Remote
SSM Health
Kansas City
SSM Health is seeking a leader for cash applications activities. This remote role requires candidates to reside in MO, IL, OK, or WI. You will lead a team, ensure compliance, and support operational t
Creative Content Lead Remote, US
Bobbie Baby, Inc.
New York
Bobbie makes European-style organic infant formula manufactured end-to-end in the U.S., because we believe every parent deserves to feed their baby with confidence, without judgment and without compro
Remote Mortgage Servicing Analytics Lead
The Money Store
Florham Park
The Money Store seeks a Mortgage Servicing Data Analyst to enhance the servicing portfolio through data-driven insights. Located in Florham Park, NJ, this full-time role commands expertise in mortgage
Remote BI & Data Analytics Lead
Program Productions LLC
Scottsdale
Program Productions LLC is seeking a full-time Business Intelligence & Data Analyst to leverage labor and payroll data for operational improvements. This remote role involves developing executive dash
Remote Data Engineering & Analytics Lead
Hobbsnews
Charlotte
Bank of America is seeking a data engineer in Charlotte, NC to develop and deliver complex data solutions. You will work with stakeholders and software engineering teams to implement data requirements
Remote Working Student: Digital Marketing & Lead Management
SITS Group
New Bremen
SITS Group is seeking a working student (m/f/d) in digital marketing and lead management. You will join a creative team that runs exciting campaigns in the cybersecurity field and
Remote-Eligible Paid Digital Marketing Lead
Stripe
San Francisco
Stripe is looking for a Paid Digital Marketing Manager to drive paid digital marketing efforts supporting inbound funnel and demand generation. You will implement and optimize scaled digital campaigns
Remote Architect - Design, Lead Projects & Coordination
AECOM
Vernon
AECOM is seeking an Architect to join its team in California with remote/hybrid options available. The role includes performing architectural calculations, preparing specifications, and coordinating d
Remote User Acquisition Lead for Mobile Apps
IDT Corporation
New York
IDT Corporation is seeking an experienced User Acquisition Manager to enhance user growth for their apps, Boss Money and Boss Revolution. This role involves executing and optimizing paid campaigns acr
Strategic PV Sales Lead – Key Accounts (Remote)
IBC SOLAR
New Bremen
IBC SOLAR is seeking a sales representative for photovoltaic systems with a clear focus on customer relationships and strategic sales. The candidate should have in-depth knowledge in the field of
Senior QuickBase Developer & Data Analytics Lead - Remote
CVS Health
Wausau
CVS Health Corporation is looking for a Senior Analyst, QuickBase Developer to work from home. This role focuses on supporting Aetna's growth through data analytics, file preparation, and provider ros
Remote Branch Manager: Lead Sales & Digital Growth
Cobalt-Credit-Union
Papillion
Cobalt-Credit-Union is looking for a Virtual Branch Manager in Papillion, Nebraska. This role requires overseeing a sales team and ensuring high-quality digital member experiences. The ideal candidate
Remote Inside Sales Associate - Lead Gen & Growth
Amadeus Hospitality
Dallas
Amadeus Hospitality is seeking an Associate Inside Sales Representative in Dallas, Texas. The role involves generating and qualifying leads to ensure a strong sales pipeline while collaborating with M
Remote CPT Content Lead - Workout Video Creator
Fitness Blender
Indiana
Fitness Blender is seeking a part-time Content Creator – Certified Personal Trainer to create engaging instructional exercise material. Candidates should hold a NCCA Accredited Personal Trainer Certif
Senior Content Ops Lead - Financial Markets (Remote)
Binance
New York
Binance is seeking a Senior Content Operations Manager in New York, NY, to lead financial content strategy. The role involves managing end-to-end content operations for market-related content, analyzi
Senior Azure DevOps & Terraform Lead (Remote)
GlobalLogic
Fort Worth
GlobalLogic is seeking a cloud infrastructure professional in Fort Worth, TX. The role requires extensive experience with Azure and Terraform, including managing complex deployments and migrations. Wi
Remote QA Lead for Web, Mobile & API (EST/CST)
Laotop
New York
Laotop is seeking an experienced QA Lead for a remote project with a US client, working on web, mobile, and API applications. The candidate must have at least 10 years of experience in
Remote Medicine QA Lead for AI Training
YO IT Consulting
New York
YO IT Consulting is seeking a Medicine Quality Assurance Lead for a remote position. This crucial role focuses on overseeing quality and trainer performance in medical AI projects by reviewing generat
Lead Security Architect Remote (EMEA) with Equity
Framework Ventures
New York
Framework Ventures is seeking a Lead Security Architect responsible for defining security strategy and practices. This senior leadership role will strengthen security across applications, infrastructu
Remote Neuroscience QA Lead - AI Data Quality
YO IT Consulting
Raleigh
YO IT Consulting is looking for a Neuroscience Quality Assurance Lead to operate remotely in a contract role. You'll oversee quality and trainer performance across neuroscience and cognitive science A
Regional Sales Lead, Western US (Remote) - Utility Software
Uneek Global
Denver
Uneek Global is seeking a Regional Sales Manager for the Western US to drive enterprise software sales in the energy technology sector. This remote role focuses on acquiring new business, maintaining
Remote Python ML QA Lead: Quality & Training Oversight
YO IT Consulting
Florida
YO IT Consulting is looking for a Python(Machine Learning) Quality Assurance Lead to ensure quality and consistency across AI training projects. This remote role involves reviewing AI-generated Python
Remote Regional Sales Lead - Diabetes Care (OK)
Abbott Laboratories
Oklahoma City
Abbott Laboratories is looking for a Regional Sales Manager for the Oklahoma Region. In this remote position, you will be responsible for leading a team of sales representatives, exceeding sales goals

Applied Data Scientist, LLM Evaluation United States (Remote)

Driverai

Austin, Texas, United States

About

Full-Time in Austin, TX Remote (any location) - Senior - Product & Engineering - $175k - $275k
Applied Data Scientist, LLM Evaluation
Introduction
At Driver, we’re building systems that turn source code into human language. The tech stack includes a core compiler-like engine, a heavily asynchronous/distributed backend server, and a frontend web application that provides a rich user experience.
About Driver
We’re an early-stage startup backed by Y Combinator and Google Ventures that combines first principles technical approaches and applied LLM expertise to tackle context engineering at scale. Driver builds the context layer for employees and AI agents alike to use in developing software.
Working at Driver
Driver is an early-stage but fast-growing startup. As such, we take advantage of what startups excel at: delivery speed, flexibility, and the enjoyment of working with a small, close-knit team.
Organizational and engineering values at Driver include first-principles thinking, correct by construction, writing things down, experimentation and iteration, pragmatism, commitment to effective communication and transparency, autonomy, and ambition.
Job Overview
Title: Applied Data Scientist, LLM Evaluation
Location: Remote or Austin, TX
Our value is directly tied to the quality of our content at scale. The platform generates technical documentation across a complex, multi-stage pipeline — producing multiple content types at different levels of abstraction, from individual code elements up to high-level summaries. Today, changes to models, context strategies, or pipeline architecture are evaluated largely through manual review and intuition. There is no systematic way to answer: “Did this change make our output better, worse, or the same — and for which languages, repo sizes, and content types?”
This is a hard problem. LLM outputs are non-deterministic — identical inputs produce different outputs across runs, and small variations at early pipeline stages compound into meaningfully different end-user content downstream. Evaluating quality requires methodology that accounts for this: statistical reasoning over multiple runs, understanding of cascade effects through the pipeline, and rubrics that balance human judgment with automated signals.
This role builds the evaluation function from scratch. You’ll define what “good” means for our generated content, build the infrastructure to measure it, and create the experimental framework that lets the team ship changes with confidence.
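To make the "statistical reasoning over multiple runs" idea concrete, here is a minimal sketch of comparing two pipeline variants by a percentile bootstrap over repeated runs. The function name, the scores, and the 0 to 1 rubric scale are all hypothetical and not taken from Driver's pipeline; it is one possible approach, not the method the team uses.

```python
import random
import statistics


def bootstrap_mean_diff_ci(scores_a, scores_b, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean(scores_b) - mean(scores_a)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        # Resample each variant's runs with replacement.
        resample_a = rng.choices(scores_a, k=len(scores_a))
        resample_b = rng.choices(scores_b, k=len(scores_b))
        diffs.append(statistics.fmean(resample_b) - statistics.fmean(resample_a))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_resamples)]
    upper = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical rubric scores from repeated runs on the same input.
baseline_runs = [0.71, 0.68, 0.74, 0.70, 0.69, 0.73, 0.72, 0.67]
candidate_runs = [0.78, 0.75, 0.80, 0.74, 0.79, 0.77, 0.76, 0.81]

low, high = bootstrap_mean_diff_ci(baseline_runs, candidate_runs)
# An interval that excludes zero suggests the change is more than run-to-run noise.
verdict = "better" if low > 0 else "worse" if high < 0 else "inconclusive"
print(f"95% CI for score difference: [{low:.3f}, {high:.3f}] -> {verdict}")
```

A single run per variant would not support this kind of statement; the interval only becomes meaningful once each variant has been run several times on the same inputs.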
What You’ll Do
You’ll own the LLM evaluation strategy at Driver, from first principles to production infrastructure. This is a foundational role: you’re not joining an existing eval team, you’re building it. As the function matures, you’ll seed and grow a team around it.
Define quality metrics and build evaluation datasets.
Establish what “good” looks like for each content type across the pipeline. Build and curate gold-standard evaluation datasets across languages and repo archetypes (monorepos, microservices, libraries, applications). Design rubrics that capture accuracy, completeness, usefulness, and readability.
Build benchmarking and experimentation infrastructure.
Create automated evaluation pipelines that score output against reference datasets. Instrument the content generation pipeline to support A/B comparisons — run the same codebase through two strategies and compare results. Build tooling for LLM-as-judge evaluation and regression detection. Integrate evaluation into CI so pipeline changes come with quality evidence.
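As an illustration of an A/B comparison with regression detection, the sketch below scores the same set of repos under two strategies and runs a paired sign-flip permutation test on the per-repo differences. The function names, the scores, and the 0.05 threshold are assumptions made for the example, not part of any existing Driver tooling.

```python
import random
import statistics


def paired_permutation_pvalue(scores_a, scores_b, n_permutations=5000, seed=0):
    """Two-sided sign-flip permutation test on paired per-repo score differences."""
    if len(scores_a) != len(scores_b):
        raise ValueError("paired test needs one score per repo for each strategy")
    rng = random.Random(seed)
    diffs = [b - a for a, b in zip(scores_a, scores_b)]
    observed = abs(statistics.fmean(diffs))
    extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the sign of each difference is arbitrary.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(statistics.fmean(flipped)) >= observed:
            extreme += 1
    # Add-one smoothing keeps the estimate away from an impossible p = 0.
    return (extreme + 1) / (n_permutations + 1)


def is_regression(baseline, candidate, alpha=0.05):
    """Flag a candidate whose mean score is lower by more than chance would explain."""
    mean_diff = statistics.fmean(c - b for b, c in zip(baseline, candidate))
    return mean_diff < 0 and paired_permutation_pvalue(baseline, candidate) < alpha


# Hypothetical per-repo quality scores; index i is the same repo in both lists.
baseline = [0.62, 0.70, 0.55, 0.81, 0.66, 0.73, 0.59, 0.68, 0.77, 0.64]
candidate = [0.66, 0.74, 0.61, 0.83, 0.71, 0.75, 0.65, 0.72, 0.80, 0.69]

p_value = paired_permutation_pvalue(baseline, candidate)
print(f"p = {p_value:.4f}, regression: {is_regression(baseline, candidate)}")
```

Pairing by repo removes between-repo variation from the comparison, which matters when repos differ far more from each other than the two strategies do. A check like `is_regression` could serve as a CI gate, with the threshold tuned to the team's tolerance for false alarms.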
Develop automated quality signals at scale.
Build quality checks that flag degraded output without requiring human review of every document. Monitor content quality trends over time. Design sampling strategies for human review that maximize signal with minimal annotation effort.
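One simple sampling strategy for human review is stratified sampling: draw a fixed number of documents from each (language, content type) group so that rare groups are not drowned out by common ones. The sketch below is an assumption about how this might look; the document fields and the `stratified_sample` helper are hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(documents, strata_key, per_stratum, seed=0):
    """Pick up to `per_stratum` documents at random from every stratum."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for doc in documents:
        groups[strata_key(doc)].append(doc)
    sample = []
    for key in sorted(groups):  # sorted for a reproducible order
        members = groups[key]
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample


# Hypothetical generated documents, heavily skewed toward one stratum.
documents = [
    {"id": i, "language": lang, "content_type": ctype}
    for i, (lang, ctype) in enumerate(
        [("python", "symbol")] * 50
        + [("python", "summary")] * 10
        + [("go", "symbol")] * 5
        + [("rust", "summary")] * 1
    )
]

review_queue = stratified_sample(
    documents, lambda d: (d["language"], d["content_type"]), per_stratum=3
)
print(len(review_queue), "documents queued for human review")
```

A uniform random sample of the same size would mostly contain Python symbol documents; stratifying spends the same annotation budget across every group.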
Quantify tradeoffs and inform decisions.
Run experiments on model selection, context strategies, and pipeline architecture changes. Quantify cost/quality/latency tradeoffs. Partner with the engineering team to turn evaluation insights into shipped improvements.
Qualifications
Education:
Bachelor’s, Master’s, or PhD in Statistics, Machine Learning, Data Science, Computational Linguistics, or a related quantitative field.
Experience:
Minimum of 3 to 5 years in applied science, ML engineering, or data science roles with a focus on evaluation, NLP, or generative AI. 7+ years of experience preferred.
Required Technical Skills
Strong statistical foundations: experimental design, hypothesis testing, confidence intervals, effect sizes, power analysis.
Experience designing and running evaluations for LLM or NLP systems — you’ve thought carefully about what “better” means when outputs are open-ended text.
Proficient in Python and the scientific/data stack (pandas, NumPy, scipy, sklearn).
Comfortable working in Jupyter notebooks for exploration and prototyping, and turning that work into automated pipelines.
Experience with LLM-as-judge approaches, inter-annotator agreement, and rubric design for subjective quality assessment.
Familiarity with the practical challenges of non-deterministic systems: variance decomposition, multi-run methodology, distinguishing signal from noise at scale.
Strong data storytelling — you can turn experiment results into clear recommendations that drive engineering and product decisions.
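For readers unfamiliar with inter-annotator agreement, a common statistic is Cohen's kappa, which corrects raw agreement between two raters for the agreement expected by chance. The sketch below computes it for a human reviewer and an LLM judge; the labels are made up for illustration, and in practice a library implementation such as scikit-learn's `cohen_kappa_score` would likely be used instead.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(labels_a)
    # Fraction of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    # Agreement expected if each rater labelled at random with their own label rates.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels for ten generated documents.
human = ["good", "good", "bad", "good", "bad", "good", "bad", "bad", "good", "good"]
judge = ["good", "good", "bad", "bad", "bad", "good", "bad", "good", "good", "good"]

kappa = cohens_kappa(human, judge)
print(f"raw agreement 0.80, kappa {kappa:.2f}")
```

Here the raters agree on 8 of 10 items, but because both label "good" 60% of the time, chance alone would yield 52% agreement, so kappa comes out near 0.58 rather than 0.80.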
Preferred and Nice-to-Have Technical Skills
Experience with LLM APIs and prompt engineering across multiple providers.
Familiarity with evaluation frameworks (e.g., RAGAS, DeepEval, custom harnesses).
Experience building data pipelines or ETL workflows (Airflow, Dagster, or similar).
Comfort with SQL and working directly against production data stores.
Experience with visualization tools (Matplotlib, Plotly, Streamlit) for building internal dashboards and reports.
Background in code understanding, developer tools, or technical documentation.
Experience building or managing annotation pipelines and human evaluation workflows.
Benefits
Competitive Compensation Packages - Cash & Equity
Flexible Work Culture
Unlimited Time Off + 12 Paid Company Holidays
Life Insurance & FSA Accounts
401(k) Retirement Accounts - Traditional, Roth, or Both
Quarterly Team Offsites
Driver is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

Languages

English

Notice for Users

This job comes from a TieTalent partner platform. Click "Apply Now" to submit your application directly on their site.
