
This job offer is no longer available

Senior ML Data Infrastructure Engineer

Redolent
  • United States

About

Title: ML Data Infrastructure Engineer
Location: Sunnyvale, CA or Remote
Duration: 12+ months
Rate: DOE
Key skills: GCP ML infrastructure, BigQuery, Dataflow, Airflow (Cloud Composer), Vertex AI, data pipelines, ML training
Role Overview:
We're seeking an experienced engineer to build our ML data infrastructure platform. You'll create the systems and tools that enable efficient data preparation, feature engineering, and dataset management for machine learning. This role focuses on the data foundation that powers our ML capabilities.
Key Responsibilities:
  • Design and implement scalable data processing pipelines for ML training and validation (see the sketch after this list)
  • Build and maintain feature stores with support for both batch and real-time features
  • Develop data quality monitoring, validation, and testing frameworks
  • Create systems for dataset versioning, lineage tracking, and reproducibility
  • Implement automated data documentation and discovery tools
  • Design efficient data storage and access patterns for ML workloads
  • Partner with data scientists to optimize data preparation workflows
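As a rough illustration of the pipeline work described above, here is a minimal Apache Beam sketch of the kind of batch feature-engineering step such a platform might run on Dataflow. The sample records, the to_feature_row transform, and the local print sink are hypothetical placeholders, not part of the posting.

```python
# Minimal Apache Beam sketch of a batch feature-engineering step (hypothetical example).
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def to_feature_row(record):
    """Derive a simple click-through-rate feature from a (clicks, impressions) pair."""
    clicks, impressions = record
    return {"ctr": clicks / impressions if impressions else 0.0}


if __name__ == "__main__":
    # Pass --runner=DataflowRunner, --project, --region, etc. to execute on Dataflow;
    # with no options the pipeline runs locally on the DirectRunner.
    options = PipelineOptions()
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "CreateSample" >> beam.Create([(3, 100), (7, 250), (0, 50)])
            | "EngineerFeatures" >> beam.Map(to_feature_row)
            | "Print" >> beam.Map(print)  # swap for WriteToBigQuery in a real job
        )
```

The same Beam code can move from a local test run to Dataflow by changing only the pipeline options, which is why the posting pairs Beam with Dataflow.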
Technical Requirements:
  • 7+ years of software engineering experience, with 3+ years in data infrastructure
  • Strong expertise in GCP's data and ML infrastructure:
      o BigQuery for data warehousing
      o Dataflow for data processing
      o Cloud Storage for data lakes
      o Vertex AI Feature Store
      o Cloud Composer (managed Airflow)
      o Dataproc for Spark workloads
  • Deep expertise in data processing frameworks (Spark, Beam, Flink)
  • Experience with feature stores (Feast, Tecton) and data versioning tools
  • Proficiency in Python and SQL
  • Experience with data quality and testing frameworks
  • Knowledge of data pipeline orchestration (Airflow, Dagster); see the DAG sketch after this list
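As a sketch of the orchestration requirement, the following DAG uses Airflow's TaskFlow API, assuming Airflow 2.4 or later (where the schedule argument replaced schedule_interval); Cloud Composer runs managed Airflow, so a file like this would live in a Composer environment's dags/ folder. The DAG id, schedule, and task bodies are hypothetical placeholders.

```python
# Hypothetical Airflow 2.x DAG sketch for a daily training-dataset refresh.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False, tags=["ml-data"])
def training_dataset_refresh():
    @task
    def extract():
        # In Composer this step would typically query BigQuery; stubbed here.
        return [{"user_id": 1, "label": 0}]

    @task
    def validate(rows):
        # Simple data quality gate: fail the run if the extract comes back empty.
        if not rows:
            raise ValueError("no rows extracted")
        return rows

    validate(extract())


training_dataset_refresh()
```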
Nice to Have:
  • Experience with streaming systems (Kafka, Kinesis, Pub/Sub, Dataflow); see the sketch after this list
  • Experience with GCP-specific security and IAM best practices
  • Knowledge of Cloud Logging and Cloud Monitoring for data pipelines
  • Familiarity with Cloud Build and Cloud Deploy for CI/CD
  • Knowledge of ML metadata management systems
  • Familiarity with data governance and security requirements
  • Experience with dbt or similar data transformation tools
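For the streaming item above, a minimal Pub/Sub producer sketch using the google-cloud-pubsub client is shown below; the project name, topic name, and event payload are hypothetical.

```python
# Minimal Google Cloud Pub/Sub publisher sketch; project, topic, and payload are hypothetical.
import json

from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("my-project", "feature-events")  # placeholder names

event = {"user_id": 1, "clicks": 3, "impressions": 100}
future = publisher.publish(topic_path, json.dumps(event).encode("utf-8"))
print("published message id:", future.result())
```

A Dataflow streaming pipeline can then consume the same topic via Beam's ReadFromPubSub source to keep real-time features fresh.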

Languages

  • English