
Data Engineer - Azure & Microsoft Fabric Platform

PGW AUTO GLASS, LLC
  • US
    United States

About

PGW Auto Glass (PGWAG) is seeking a highly motivated Data Engineer to help modernize and scale our enterprise analytics platform on Microsoft Azure and Microsoft Fabric. This role will focus on designing, developing, and maintaining cloud-native data engineering pipelines that support enterprise reporting, real-time analytics, AI/ML initiatives, pricing optimization, and operational intelligence across our network of branches and distribution centers throughout the United States and Canada.

The ideal candidate will possess strong experience in Azure-based data engineering, real-time event streaming, batch processing, Lakehouse architectures, and modern analytics platforms. This role sits at the intersection of Data Engineering, Cloud Architecture, Real-Time Analytics, and AI enablement. The candidate will work closely with Pricing, Supply Chain, Operations, IT Infrastructure, and Executive Leadership teams to build scalable and resilient analytics solutions using Microsoft Fabric, Azure Databricks, Event Streaming technologies, and Power BI.

We are seeking a mid-level Data Engineer to design, scale, and maintain our dual-engine enterprise data platform on Microsoft Azure and Microsoft Fabric. This role balances both batch and real-time processing architectures, ensuring seamless data flow from transactional systems into analytics-ready storage and semantic models. This position is critical to PGWAG's Cloud Modernization, Data Warehouse Modernization, and AI enablement initiatives.

Key Responsibilities & Duties:

  • Design, develop, and maintain scalable enterprise data pipelines using: Microsoft Fabric, Azure Data Factory, Fabric Data Factory, Azure Databricks, Azure Event Hubs, OneLake, Fabric Lakehouse, Fabric Data Warehouse.
  • Build analytics-ready datasets supporting: Pricing Analytics, Supply Chain Analytics, POS Sales Analytics, Customer Behavior Analytics, Executive Dashboards, AI/ML workloads.

Dual-Engine Data Pipelines:

  • Build and manage parallel processing architectures using: Azure Data Factory for structured batch processing, Azure Event Hubs / Kafka for real-time event ingestion.
  • Support ingestion patterns including: Batch ETL/ELT, Change Data Capture (CDC) / Database mirroring, Streaming ingestion, API-based integrations, SaaS integrations.
  • Develop near real-time analytics solutions using Eventstream and Real-Time Intelligence capabilities in Microsoft Fabric.

Stream & Batch Processing:

  • Develop and optimize PySpark workloads using: Azure Databricks, Fabric Spark, Spark Structured Streaming.
  • Process: High-volume historical datasets, XML/JSON log files, Streaming transactional events, Operational telemetry data.
  • Build scalable transformation logic for both streaming and batch architecture.

Data Modeling & Transformation:

  • Model and transform enterprise data using: ANSI SQL, T-SQL, dbt (Data Build Tool), Lakehouse design principles.
  • Design: Star schemas, Snowflake schemas, Semantic models, Curated analytical datasets.
  • Support enterprise-wide self-service analytics initiatives using governed semantic layers.

Storage & Lakehouse Architecture:

  • Maintain scalable Azure Data Lake Storage (ADLS Gen2) environments.
  • Implement and optimize: Delta Lake table formats, ACID-compliant storage patterns, Schema evolution and enforcement, Partitioning and performance tuning.
  • Support enterprise Lakehouse architecture using Microsoft Fabric OneLake.

Power BI & Analytics Enablement:

  • Partner with Analytics and Business teams to deliver: Power BI dashboards, Executive scorecards, KPI reporting, Self-service analytics solutions.
  • Build and maintain: Semantic models, Direct Lake datasets, Row-level security, Data governance standards.
  • Support Copilot-enabled analytics and AI-assisted reporting capabilities.

Infrastructure, Automation & DevOps:

  • Deploy and maintain cloud infrastructure using: Terraform, Azure Resource Manager (ARM), Infrastructure-as-Code principles.
  • Automate CI/CD workflows using: Azure DevOps, Git, Docker.
  • Author and orchestrate enterprise workflows using: Azure Data Factory, Fabric Pipelines, Managed Apache Airflow, Control-M integrations where applicable.

Data Observability & Reliability:

  • Implement automated monitoring and alerting for: Batch failures, Streaming interruptions, Data quality issues, Schema drift, Pipeline latency.
  • Build checksum and reconciliation frameworks between source systems and analytics platforms.
  • Support enterprise data governance and operational resiliency initiatives.

Qualifications & Skills:

Required Technical Skills:

  • Cloud & Data Platforms: Microsoft Azure, Microsoft Fabric, Azure Data Lake Storage Gen2 (ADLS Gen2), Azure Databricks, Azure Data Factory, Azure Event Hubs, Azure Synapse Analytics / Fabric Warehouse.
  • Programming & Query Languages: Python, PySpark, ANSI SQL, T-SQL.
  • Streaming & Batch Technologies: Apache Spark Structured Streaming, Apache Kafka, Azure Stream Analytics, Event-driven architectures.
  • Data Transformation & Storage: dbt (Data Build Tool), Delta Lake, Lakehouse architecture, Data warehousing concepts.
  • Data Modeling: Star Schema, Snowflake Schema, Semantic Layer Design, Enterprise Data Modeling.

Preferred Qualifications:

  • Bachelor's or Master's degree in: Data Science, Computer

Language skills

  • English

Notice to users

This job posting comes from a TieTalent partner platform. Click "Apply now" to submit your application directly on their site.