About
Key Responsibilities
- Design, develop, and maintain ETL/ELT data pipelines for batch and real-time data ingestion, transformation, and loading using Spark (PySpark/Scala) and streaming technologies (Kafka, Flink).
- Build and optimize scalable data architectures, including data lakes, data warehouses (BigQuery), and streaming platforms.
- Optimize Spark jobs, SQL queries, and data processing workflows for speed, efficiency, and cost-effectiveness.
- Implement data quality checks, monitoring, and alerting systems to ensure data accuracy and consistency.
Required Qualifications
- Total IT Experience: Minimum 8 years.
- Scala: Minimum 2 years of experience.
- GCP: 4+ years of recent GCP experience.
- Programming: Strong proficiency in Python and SQL.
- Big Data: Expertise in Apache Spark (Spark SQL, DataFrames, Streaming).
- Streaming: Experience with messaging systems such as Apache Kafka or Pub/Sub.
- Cloud: Familiarity with GCP and Azure data services.
- Databases: Knowledge of data warehousing (Snowflake, Redshift) and NoSQL databases.
Preferred Qualifications
- Programming: Proficiency in Scala/Java.
- Tools: Experience with Airflow, Databricks, Docker, Kubernetes.
Certifications
Language Skills
- English
Notice to Users
This offer comes from a TieTalent partner platform. Click "Apply now" to submit your application directly on their site.