Data Engineer - AI

Impact Bridge Consulting

Dallas, Texas, United States

About

Data Engineer - AI
Dallas, TX (preferred) | Hybrid (Bishop Arts preferred) | Full-time
Reports to the Founding AI / Engineering Team
Why this role exists
Our Client is an AI-powered contract intelligence platform that validates purchased-services invoices against contract terms before payment, turning contracts into enforceable controls within healthcare procure-to-pay workflows.
The platform processes massive volumes of contracts, invoices, vendor records, and transactional data. A single enterprise customer may generate over 30,000 invoice and contract-related documents monthly, all requiring ingestion, extraction, normalization, validation, monitoring, and analytics.
The company’s founding engineering team is currently focused on building higher-level AI systems, semantic layers, ontology frameworks, and enterprise-scale platform architecture. This role exists to own the implementation and operationalization layer underneath that vision, building and maintaining the pipelines, reporting systems, integrations, and scalable data infrastructure that allow the platform to operate reliably at enterprise scale.
This is not a pure analytics role and not a pure research role. It is a hands‑on engineering role for someone who can build production‑grade data pipelines while also understanding how modern AI, ML, LLM, and knowledge graph systems operate.
If you enjoy building scalable data systems, handling messy enterprise data, operationalizing AI pipelines, and creating infrastructure that powers enterprise SaaS products, this role will feel like a strong fit.
What you’ll own
Enterprise Data Pipeline Engineering
Build, maintain, and optimize large‑scale ETL/ELT pipelines for contracts, invoices, logs, traces, events, and operational data
Support enterprise‑scale ingestion and processing workflows for healthcare procurement and AP data
Design resilient streaming and batch processing systems
Help operationalize the platform for enterprise‑grade customer workloads
Improve pipeline reliability, observability, scalability, and monitoring
Support distributed data processing workflows across large document and transactional datasets
Reporting + Operational Analytics
Build internal and customer‑facing reporting systems showing document processing status, validation outcomes, exceptions, and operational insights
Create dashboards and analytics layers that provide actionable insights from invoice and contract data
Develop ad hoc reporting capabilities for founders, GTM teams, customers, and investors
Help identify trends, gaps, anomalies, and operational patterns across purchased services spend
Translate raw platform data into usable operational intelligence
AI + ML Data Infrastructure
Support AI and ML pipelines powering contract intelligence and invoice validation workflows
Build infrastructure supporting LLM, ML, and semantic data workflows
Work alongside engineers building ontology layers, semantic layers, and knowledge graph systems
Help structure and operationalize datasets for AI‑driven applications
Support vector database, semantic retrieval, and modern AI architecture workflows
Understand how data flows through MLOps and LLMOps environments
Platform Data Foundations
Help maintain and improve the company’s core data architecture
Support enterprise‑grade logging, tracing, monitoring, and event management systems
Build scalable data lake and storage workflows
Improve system reliability and operational visibility as customer scale increases
Collaborate closely with AI engineers and platform leadership on implementation and execution
What Success Looks Like (First 90 Days)
First 45 Days
Ramp quickly on the AI platform, pipeline architecture, and customer workflows
Understand how contracts, invoices, validation systems, and analytics pipelines interact
Identify gaps in pipeline reliability, reporting, and data quality
Begin contributing production‑ready improvements to core pipelines and operational systems
By 90 Days
Core reporting and analytics workflows are operational and scalable
Enterprise pipeline reliability and monitoring improve measurably
Data quality and processing visibility improve across customer workflows
Internal teams can access cleaner operational reporting and analytics
Founders and customer‑facing teams can generate custom reporting more efficiently
AI and semantic systems receive more reliable and structured downstream data
You are independently building and maintaining production data workflows with minimal oversight
The profile that tends to win here
You are first and foremost a strong engineer who can build and maintain production systems
You have experience working with enterprise‑scale or mid‑market data environments, not only early‑stage startups
You’ve worked with large‑scale transactional, operational, or machine‑generated datasets
You understand modern AI/ML ecosystems well enough to support them operationally
You are comfortable dealing with ambiguity and evolving infrastructure
You think systematically about scalability, reliability, and maintainability
You can move comfortably between infrastructure, pipelines, analytics, and operational engineering
You are highly analytical and naturally curious about patterns, anomalies, and data quality
You move quickly, fail fast, and care deeply about accuracy and operational quality
Qualifications
4–8+ years of experience in Data Engineering, Platform Engineering, or Backend/Data Infrastructure roles
Strong experience building ETL/ELT pipelines in production environments
Experience with distributed data processing systems
Experience handling streaming and batch data workflows
Strong SQL and Python skills
Experience with modern cloud infrastructure (AWS, GCP, or Azure)
Experience working with data lakes and large‑scale operational datasets
Experience handling logs, traces, events, and telemetry‑style data
Familiarity with ML pipelines, vector databases, or modern AI data architectures
Understanding of MLOps and/or LLMOps concepts
Experience building reporting systems, dashboards, and operational analytics workflows
Comfortable working in fast‑moving startup environments with evolving requirements
Strongly Preferred:
Experience supporting AI/LLM‑driven products
Exposure to knowledge graphs, semantic layers, or ontology systems
Experience in enterprise SaaS environments
Experience with observability and monitoring tooling
Familiarity with healthcare, procurement, AP automation, or invoice processing systems
Experience building customer‑facing analytics systems
Experience supporting high‑volume document processing systems
Compensation + benefits
Competitive base salary + variable comp + potential for future equity
Opportunity to help build foundational infrastructure at an early‑stage AI company
High ownership and direct technical impact
Flexible and remote‑friendly environment
Opportunity to work on cutting‑edge AI + enterprise data infrastructure problems

Language skills

English
