This job posting is no longer available
Senior Lead Data Engineer
Toyota Deutschland GmbH
- Plano, Texas, United States
About
An important part of the Toyota family is Toyota Financial Services (TFS), the finance and insurance brand for Toyota and Lexus in North America. While TFS is a separate business entity, it is an essential part of this world-changing company—delivering on Toyota's vision to move people beyond what's possible. At TFS, you will help create best-in-class customer experience in an innovative, collaborative environment.
Who we’re looking for
At TFS, we’re building next-generation products that redefine mobility for millions of customers worldwide. We’re looking for a Sr Lead Data Engineer, an individual contributor at the principal level, who brings deep expertise in data engineering, streaming architectures, and analytics platforms, combined with technical leadership to make data a reliable, scalable foundation for the entire engineering organization.
This isn’t a management role. It’s for the engineer who thinks in pipelines and data contracts: the one who can design a Lakehouse architecture, build a real-time streaming platform, ensure data quality at scale, and make it all self-service for the teams that depend on it. You’ll work at the intersection of backend engineering, ML/AI, and analytics—making sure the data that powers our products, models, and decisions is trustworthy, timely, and accessible. If you want to build the data backbone of a modern engineering org — not just move files around — this is the role.
This position is based in Plano, TX. The selected candidate will be expected to reside within a commutable distance of this location.
What you’ll be doing
Serve as the technical authority for data architecture across the organization, making high-impact decisions on data lake design, streaming topologies, storage formats, partitioning strategies, and data modeling patterns.
Design, build, and maintain production-grade data pipelines—batch and real-time—from ingestion and transformation to serving and consumption.
Own the data platform: build and evolve the foundational infrastructure that engineering, ML/AI, and analytics teams depend on for reliable, governed, and performant data access.
Partner closely with ML/AI engineers to ensure training data, feature pipelines, and model serving data are accurate, fresh, and efficiently delivered; you are the upstream enabler for every model in production.
Collaborate with backend and full-stack engineers to design event-driven architectures, define data contracts, and ensure application data flows cleanly into the data platform.
Lead technical design reviews, architecture discussions, and RFC processes for data initiatives—driving alignment across engineering teams.
Identify and resolve systemic data issues: pipeline failures, data quality degradation, schema drift, latency in streaming systems, cost inefficiencies in storage and computing, and gaps in data observability.
Define and champion data engineering best practices: data modeling, schema evolution, data contracts, testing strategies, lineage tracking, cataloging, and governance.
Design and implement data quality frameworks—validation rules, anomaly detection, freshness checks, and alerting—so downstream consumers can trust the data without asking.
Collaborate closely with Engineering Managers, Product, Data Science, and Analytics to shape data roadmaps and ensure the platform evolves with business needs.
Mentor and grow engineers at all levels through code reviews, pairing, design feedback, and technical guidance on data engineering topics.
Contribute to hiring by conducting technical interviews and helping define what great looks like for data engineering at TFS.
Proactively communicate technical risks, tradeoffs, and recommendations to both engineering and non-technical stakeholders.
What you bring
Bachelor’s degree in Computer Science, Data Engineering, Information Systems, or related field, or equivalent practical experience.
7+ years of software or data engineering experience, including 3–5 years focused specifically on data platform and pipeline engineering at scale, with a track record of operating at a principal or staff engineer level.
Deep expertise in designing and building data lake and Lakehouse architectures on AWS, including:
- S3 as the foundation for data lake storage, with strong opinions on partitioning, file formats (Parquet, Avro, ORC), and lifecycle management.
- AWS Glue for ETL/ELT jobs, crawlers, and the Data Catalog.
- Amazon Athena for serverless SQL analytics over the data lake.
- Lake Formation for fine-grained access control, governance, and cross-account data sharing.
- Amazon Redshift or Redshift Serverless for data warehousing and high-performance analytical queries.
- Amazon EMR or EMR Serverless for large-scale Spark, Hive, or Presto workloads.
Production experience with real-time and streaming data architectures, including:
- Amazon Kinesis (Data Streams, Data Firehose) for real-time ingestion and delivery.
- Amazon MSK (Managed Kafka) or self-managed Kafka for event streaming at scale.
- EventBridge, SQS, or SNS for event-driven integration with application services.
- Lambda for lightweight stream processing and event transformation.
- Apache Flink (via Amazon Managed Service for Apache Flink) or Spark Structured Streaming for stateful stream processing.
Strong proficiency in Python and SQL: you write production-quality pipeline code, not just ad-hoc scripts, and you can optimize a complex query as fluently as you can design a DAG.
Experience with workflow orchestration tools: Step Functions, Apache Airflow (via Amazon MWAA), or similar. You know how to build reliable, observable, and recoverable pipeline DAGs.
Solid understanding of data modeling for both analytical and operational use cases: star schemas, slowly changing dimensions, wide tables, event sourcing, and CDC (change data capture) patterns.
Experience with data quality and governance tooling and practices: Great Expectations, Deequ, or custom validation frameworks, plus data cataloging, lineage tracking, and access control.
Strong understanding of Infrastructure as Code using AWS CDK, CloudFormation, or Terraform for data infrastructure.
Experience with observability and monitoring for data systems: pipeline health dashboards, data freshness tracking, SLA monitoring, and alerting on failures or anomalies (CloudWatch, Datadog, or similar).
Strong understanding of security best practices for data: IAM policies, Lake Formation permissions, encryption at rest and in transit, data masking, and PII handling.
Deep experience debugging complex issues across data systems—pipeline failures, data skew, schema mismatches, streaming lag, and storage cost runaway.
Experience with testing strategies for data pipelines: data validation, schema contract testing, integration testing, and pipeline idempotency.
Strong written and verbal communication: you can write a clear RFC, lead a design review, and explain a data architecture tradeoff to a non-technical stakeholder.
Added bonus if you have
Master’s degree in Computer Science, Data Engineering, or related field.
Experience in the financial services, banking, or insurance industry.
Experience with open table formats: Apache Iceberg, Delta Lake, or Apache Hudi for ACID transactions, time travel, and schema evolution on the data lake.
Experience with feature store design and implementation for ML/AI use cases (SageMaker Feature Store, Feast, or custom).
Familiarity with dbt or similar transformation frameworks for analytics engineering and data modeling.
Experience with real-time analytics serving layers: Amazon OpenSearch, DynamoDB, or ElastiCache for low-latency data access.
Experience designing multi-account AWS data architectures with proper governance and guardrails (AWS Organizations, Control Tower, cross-account data sharing via Lake Formation).
Hands-on experience with data mesh or data product patterns: decentralized ownership with centralized governance.
Experience with CDC (change data capture) tools: AWS DMS, Debezium, or similar for streaming database changes into the data lake.
Experience with cost optimization for data workloads: storage tiering, compute right-sizing, spot instances for Spark, and query optimization.
Experience with GenAI data pipelines: preparing training datasets, building RAG knowledge bases, embedding generation, and vector store population.
AWS certifications (Data Analytics Specialty, Solutions Architect, Database Specialty).
Experience with CI/CD pipelines for data infrastructure and pipeline deployment (CodePipeline, GitHub Actions, or similar).
Experience contributing to or maintaining open-source data engineering projects.
Experience defining engineering standards, writing ADRs, or leading org-wide technical initiatives.
What we’ll bring
A work environment built on teamwork, flexibility, and respect.
Professional growth and development programs to help advance your career, as well as tuition reimbursement.
Team Member Vehicle Purchase Discount.
Toyota Team Member Lease Vehicle Program (if applicable).
Comprehensive health care and wellness plans for your entire family.
Toyota 401(k) Savings Plan featuring a company match, as well as an annual retirement contribution from Toyota, regardless of whether you contribute.
Paid holidays and paid time off.
Referral services related to prenatal services, adoption, childcare, schools, and more.
Tax-Advantaged Accounts (Health Savings Account, Health Care FSA, Dependent Care FSA).
Relocation Assistance (if applicable).
Belonging at Toyota
Our success begins and ends with our people. We embrace all perspectives and value unique human experiences. Respect for all is our North Star. Toyota is proud to have 10+ different Business Partnering Groups across 100 different North American chapter locations that support team members’ efforts to dream, do and grow without questioning that they belong.
Applicants for our positions are considered without regard to race, ethnicity, national origin, sex, sexual orientation, gender identity or expression, age, disability, religion, military or veteran status, or any other characteristics protected by law.
Language skills
- English