About
- Design and build scalable ETL/ELT pipelines using Apache Airflow, Apache Spark, and GCP Dataflow.
- Develop and maintain BigQuery data models, schemas, and performance-optimized SQL queries.
- Build and maintain data pipelines feeding AI/ML feature stores and forecasting models.
- Collaborate with AI Developers to ensure high-quality, low-latency data access for model training.
- Manage and optimize Cloud Composer DAGs and pipeline orchestration.
- Implement data quality monitoring, alerting, and lineage tracking.
- Participate in data platform architecture decisions and documentation.

Required Qualifications:
- 3+ years (Intermediate) or 5+ years (Specialist) of data engineering experience.
- Hands-on experience with Apache Airflow for pipeline orchestration.
- Proficiency in Apache Spark for large-scale data processing.
- Strong SQL skills, including complex query optimization and BigQuery-specific capabilities.
- Experience with GCP data services: BigQuery, Cloud Storage, Pub/Sub, Dataflow.
- Solid understanding of ETL/ELT patterns and data warehousing principles.

Preferred Qualifications:
- GCP Professional Data Engineer certification.
- Experience supporting ML/AI data infrastructure (feature engineering, training datasets).
- Familiarity with real-time streaming (Kafka, Dataflow/Flink).
- Retail or large-scale consumer data experience.
Language skills
- English
Notice to users
This job posting comes from a TieTalent partner platform. Click "Apply now" to submit your application directly on their site.