
GCP Data Engineer - Remote

Computing Concepts, Inc.
  • United States

About

Summary: Strong experience architecting enterprise data platforms on Google Cloud (GCP). The architect will work as a strategic technical partner to design and build a GCP BigQuery-based Data Lake & Data Warehouse ecosystem. The role requires deep hands-on expertise in data ingestion, transformation, modeling, enrichment, and governance, combined with a strong understanding of clinical healthcare data standards, interoperability, and cloud architecture best practices.

Key Responsibilities:

Data Lake & Data Platform Architecture (GCP)
* Architect and design an enterprise-grade GCP-based data lakehouse leveraging BigQuery, GCS, Dataproc, Dataflow, Pub/Sub, Cloud Composer, and BigQuery Omni.
* Define data ingestion, hydration, curation, processing, and enrichment strategies for large-scale structured, semi-structured, and unstructured datasets.
* Create data domain models, canonical models, and consumption-ready datasets for analytics, AI/ML, and operational data products.
* Design federated data layers and self-service data products for downstream consumers.

Data Ingestion & Pipelines
* Architect batch, near-real-time, and streaming ingestion pipelines using GCP Cloud Dataflow, Pub/Sub, and Dataproc.
* Set up data ingestion for clinical (EHR/EMR, LIS, RIS/PACS) datasets, including HL7, FHIR, CCD, and DICOM formats.
* Build ingestion pipelines for non-clinical systems (ERP, HR, payroll, supply chain, finance).
* Architect ingestion from medical devices, IoT, remote patient monitoring, and wearables leveraging IoMT patterns.
* Manage on-prem → cloud migration pipelines, hybrid cloud data movement, VPN/Interconnect connectivity, and data transfer strategies.

Data Transformation, Hydration & Enrichment
* Build transformation frameworks using BigQuery SQL, Dataflow, Dataproc, or dbt.
* Define curation patterns including bronze/silver/gold layers, canonical healthcare entities, and data marts.
* Implement data enrichment using external social determinants, device signals, clinical event logs, or operational datasets.
* Enable metadata-driven pipelines for scalable transformations.

Data Governance & Quality
* Establish and operationalize a data governance framework encompassing data stewardship, ownership, classification, and lifecycle policies.
* Implement data lineage, data cataloging, and metadata management using tools such as Dataplex, Data Catalog, Collibra, or Informatica.
* Set up data quality frameworks for validation, profiling, anomaly detection, and SLA monitoring.
* Ensure HIPAA compliance, PHI protection, IAM/RBAC, VPC SC, DLP, encryption, retention, and auditing.

Cloud Infrastructure & Networking
* Work with cloud infrastructure teams to architect VPC networks, subnetting, ingress/egress, firewall policies, VPN/IPSec, Interconnect, and hybrid connectivity.
* Define storage layers, partitioning/clustering design, cost optimization, performance tuning, and capacity planning for BigQuery.
* Understand containerized processing (Cloud Run, GKE) for data services.

Stakeholder Collaboration
* Work closely with clinical, operational, research, and IT stakeholders to define data use cases, schema, and consumption models.
* Partner with enterprise architects, security teams, and platform engineering teams on cross-functional initiatives.
* Guide data engineers and provide architectural oversight on pipeline implementation.

Hands-on Leadership
* Be actively hands-on in building pipelines, writing transformations, building POCs, and validating architectural patterns.
* Mentor data engineers on best practices, coding standards, and cloud-native development.

Required Skills & Qualifications

Technical Skills (Must-Have)
* 10+ years in data architecture, engineering, or data platform roles.
* Strong expertise in GCP data stack (BigQuery, Dataflow, Composer, GCS, Pub/Sub, Dataproc, Dataplex).
* Hands-on experience with data ingestion, pipeline orchestration, and transformations.
* Deep understanding of clinical data standards:
  * HL7 v2.x, FHIR, CCD/C-CDA
  * DICOM (for scans and imaging)
  * LIS/RIS/PACS data structures
* Experience with device and IoT data ingestion (wearables, remote patient monitoring, clinical devices).
* Experience with ERP datasets (Workday, Oracle, Lawson, PeopleSoft).
* Strong SQL and data modeling skills (3NF, star/snowflake, canonical and logical models).
* Experience with metadata management, lineage, and governance frameworks.
* Solid understanding of HIPAA, PHI/PII handling, DLP, IAM, VPC security.

Cloud & Infrastructure
* Solid understanding of cloud networking, hybrid connectivity, VPC design, firewalling, DNS, service accounts, IAM, and security models.
* Cloud-native data movement services.
* Experience with on-prem to cloud migrations.

Languages

  • English