
GCP Data Engineer

Compunnel
  • United States
Apply Now

About

The GCP Data Engineer is responsible for designing and developing large-scale data processing systems within Google Cloud Platform (GCP). This role involves curating comprehensive datasets about users, groups, and permissions, as well as implementing a scalable data pipeline to ensure timely updates and transparency in data access.

Key Responsibilities

  • Design, develop, and implement scalable, high-performance data solutions on GCP.
  • Curate and manage datasets detailing user permissions and group memberships (see the sketch after this list).
  • Redesign data pipelines to improve scalability and reduce processing time.
  • Ensure data access permission changes are reflected in Tableau dashboards within 24 hours.
  • Collaborate with technical and business users to manage data sharing across multiple projects.
  • Utilize GCP tools and technologies to optimize data processing and storage.
  • Re-architect data pipelines for BigQuery datasets used in GCP IAM dashboards to enhance scalability.
  • Run and customize Data Loss Prevention (DLP) scans.
  • Build bidirectional integrations between GCP and Collibra.
  • Explore and potentially implement Dataplex and custom format-preserving encryption for de-identifying data in lower environments.
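For illustration only, a minimal sketch of the kind of permission-dataset curation described above: reading a project's IAM policy and appending one row per (role, member) binding to a BigQuery table. It assumes the google-cloud-resource-manager and google-cloud-bigquery Python client libraries; the project, dataset, and table names are hypothetical placeholders, not part of the role.

```python
# Illustrative only -- a hypothetical snapshot job, not the employer's actual pipeline.
# Assumes google-cloud-resource-manager and google-cloud-bigquery are installed;
# project/table names below are placeholders.
from google.cloud import bigquery, resourcemanager_v3


def snapshot_project_iam(project_id: str, table_id: str) -> None:
    """Read a project's IAM policy and append one row per (role, member) binding."""
    projects = resourcemanager_v3.ProjectsClient()
    policy = projects.get_iam_policy(resource=f"projects/{project_id}")

    rows = [
        {"project": project_id, "role": binding.role, "member": member}
        for binding in policy.bindings
        for member in binding.members
    ]

    bq = bigquery.Client()
    errors = bq.insert_rows_json(table_id, rows)  # streaming insert into BigQuery
    if errors:
        raise RuntimeError(f"BigQuery insert errors: {errors}")


if __name__ == "__main__":
    snapshot_project_iam("example-project", "example-project.iam_audit.project_bindings")
```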
Required Qualifications
  • Bachelor's degree in Computer Engineering or a related field.
  • 5+ years of experience as a Data Engineer on GCP, including Python, Java, Spark, and SQL.
  • Expertise in Google's Identity and Access Management (IAM) API.
  • Proficiency in Linux/Unix with strong scripting skills (Shell, Bash).
  • Experience with big data technologies such as HDFS, Spark, Impala, and Hive.
  • Familiarity with version control platforms like GitHub and CI/CD tools such as Jenkins and Terraform.
  • Proficiency with Airflow for workflow orchestration (illustrated below).
  • Strong knowledge of GCP platform tools, including Pub/Sub, Cloud Storage, Bigtable, BigQuery, Dataflow, Dataproc, and Composer.
  • Experience with web services and APIs (RESTful and SOAP).
  • Hands-on experience with real-time streaming and batch processing tools like Kafka, Flume, Pub/Sub, and Spark.
  • Ability to work with different file formats such as Avro, Parquet, and JSON.
  • Expertise in pipeline creation, automation for data acquisition, and metadata extraction.
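As a hedged illustration of the Airflow and BigQuery orchestration skills listed above (not a prescribed solution): a minimal daily DAG that rewrites a permissions snapshot table, assuming Apache Airflow 2.x with the Google provider package; the DAG id, SQL, and table names are hypothetical.

```python
# Illustrative only -- a hypothetical daily refresh DAG; names and SQL are placeholders.
# Assumes Apache Airflow 2.x with the apache-airflow-providers-google package installed.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator

with DAG(
    dag_id="iam_dashboard_refresh",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",  # keeps permission data no more than ~24 hours stale
    catchup=False,
) as dag:
    refresh_permissions = BigQueryInsertJobOperator(
        task_id="refresh_permissions_snapshot",
        configuration={
            "query": {
                "query": "SELECT * FROM `example-project.iam_audit.project_bindings`",
                "destinationTable": {
                    "projectId": "example-project",
                    "datasetId": "iam_audit",
                    "tableId": "permissions_snapshot",
                },
                "writeDisposition": "WRITE_TRUNCATE",
                "useLegacySql": False,
            }
        },
    )
```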
Preferred Qualifications
  • Coding skills in Scala.
  • Knowledge of Apache packages and hybrid cloud architectures.
  • Strong experience in API orchestration and choreography for consumer apps.
  • Proven ability to collaborate with scrum teams and contribute to Agile processes using Jira and Confluence.
  • Familiarity with Hadoop ecosystems and cloud platforms.
  • Experience in managing and scheduling batch jobs and data quality control metrics.

Languages

  • English
Notice for Users

This job comes from a TieTalent partner platform. Click "Apply Now" to submit your application directly on their site.