This job posting is no longer available
About
Locations: Dallas, TX / Irving, TX / Basking Ridge, NJ
Employment Type: Contract / C2C
Experience Required: 12+ years
Project Duration: 12 months
Work Arrangement: Work from client office, 3 days a week
Interview: Three rounds of video interviews

Telecom domain experience is a plus.

Must have:
- Minimum 2 years of GraphQL schema work experience
- Minimum 2 years of GCP Spanner, Python, and BigQuery work experience

Job Description:

Key Responsibilities:
- Design and manage the GraphQL schema.
- Build highly optimized resolver functions that bridge the GraphQL schema directly to data warehouses.
- Implement GraphQL Subscriptions to stream live data, event changes, or real-time metrics using message brokers like Apache Kafka.
- Design and implement scalable data pipelines using GCP-native services (Dataflow, Dataproc, Pub/Sub, Cloud Composer/Airflow).
- Architect and optimize BigQuery datasets, tables, and queries for analytical workloads at scale.
- Design and manage Cloud Spanner schemas for globally distributed, strongly consistent transactional data.
- Build and maintain data models, transformations, and orchestration workflows using Cloud Workflows and related tools.
- Develop backend data services and ETL/ELT scripts in Python.
- Integrate and manage Firestore for real-time, document-oriented data use cases.
- Implement data governance, lineage, and quality frameworks using tools like Dataplex or Data Catalog.
- Collaborate on infrastructure-as-code using Terraform for GCP resource provisioning.
- Monitor pipeline health, optimize costs, and troubleshoot production issues.
- Redesign and optimize existing data pipelines and architectures as needed to ensure high performance and scalability.
- Oversee the end-to-end data delivery process, from ingestion to transformation and reporting.
- Perform code reviews, enforce best practices, and ensure the quality and consistency of the codebase.
- Manage workflows and scheduling with tools like Apache Airflow to ensure smooth execution of pipelines.
- Use GCP technologies like Dataflow, Apache Beam, BigQuery, Dataproc, and other services for data transformation, storage, and analysis.
- Develop scalable solutions with Apache Spark, Hadoop, and other distributed systems on GCP.
- Monitor and optimize the performance of data pipelines and processes.
- Collaborate with DevOps and Cloud teams for CI/CD integration and infrastructure optimization.
- Provide mentorship to junior engineers and manage the team's performance effectively.

Required Skills and Experience:
- 12+ years of experience in data engineering, with at least 5 years on Google Cloud Platform (GCP).
- Minimum 2 years of GraphQL schema work experience.
- Minimum 2 years of GCP Spanner, Python, and BigQuery work experience.
- Solid understanding of network/telecom domains, including relevant data types and use cases.
- Expertise in GCP tools, including:
  - BigQuery for schema design, partitioning, clustering, query optimization, cost governance, data warehousing, and analytics
  - Dataproc for managing Apache Spark and Hadoop clusters
  - Airflow for orchestration of workflows and pipelines
  - Data streaming with Python
- Strong experience in Apache Spark and Hadoop ecosystems.
- Production experience with Cloud Spanner: schema design, interleaving, transaction patterns, and performance tuning.
- Solid understanding of GCP data services: Dataflow, Pub/Sub, Cloud Storage, Dataproc, Cloud Composer.
- Experience with Cloud Workflows for serverless orchestration.
- Hands-on experience with Firestore (Native mode preferred) for NoSQL/document storage patterns.
- Strong SQL skills and understanding of data warehousing concepts.
- Experience with CI/CD pipelines (Cloud Build, GitHub Actions) and version control (Git).
- Ability to design, develop, and optimize ETL/ELT pipelines for large-scale data.
- Hands-on experience in data modeling and data architecture design.
- Strong problem-solving skills and ability to redesign existing solutions if needed.
- Excellent communication and interpersonal skills to collaborate with both technical and business teams.
- Bachelor's degree in Computer Science, Engineering, or a related field, or an equivalent combination of education and relevant experience.

Preferred Skills:
- Familiarity with real-time data streaming technologies
- Experience with dbt for the transformation layer on BigQuery
- Familiarity with streaming architectures (exactly-once semantics, late data handling)
- Knowledge of data mesh or data lakehouse patterns
- Exposure to Vertex AI or ML pipelines for MLOps workflows
- GCP Professional Data Engineer certification
Languages
- English
Note for users
This job posting was published by one of our partners. You can view the original posting here.