This job offer is no longer available
Job Description
The enriched datasets are consumed by multiple downstream systems and teams, including the Customer Data Platform (CDP) and other analytics/research stakeholders.
The platform is Azure-native and built primarily on Databricks (processing + some ML workloads) and Snowflake (analytics/warehouse).
A major focus is building reliable, governed, vendor-agnostic datasets while ensuring privacy/compliance, data governance, and cost efficiency.
Key Responsibilities
As a Data Engineer, you will:
Data Ingestion & Pipeline Development: Build and enhance ingestion pipelines for large batch and event-driven paths.
Integrate data from third-party enrichment vendors.
Integrate data from digital platforms via Conversion API integrations.
Integrate data from Rewards/Promotions systems for offer issuance/redemption/consumption data.
Data Quality, Reliability & Operations: Implement strong data validation, idempotency, replay/backfill strategies, and deduplication to prevent quality drift.
Own monitoring, alerting, dashboarding, and operational readiness.
Troubleshoot failures through root-cause analysis rather than reruns alone: interpret Spark logs and diagnose performance issues.
Improve stability and SLA adherence.
Governance & Compliance: Apply privacy, compliance, and governance requirements across pipelines and datasets. Support governance standards such as Unity Catalog, lineage, and access controls, including managing PII vs. non-PII access.
Document tables, schemas, catalogs, and cluster usage.
Cost Governance & Performance Optimization: Design pipelines with cost awareness from day one: cluster sizing, workload tuning, and efficient compute/storage usage. Make trade-off decisions that balance cost, quality, and SLA.
Collaboration & Ownership: Work in a small, fast-moving team; be self-driven and ownership-oriented. Raise and manage data quality escalations when issues are detected.
Contribute to evolving architecture (product is early-stage; first live month was recent).
Must-Have Skills
Recruiters should prioritize candidates with hands-on experience in the following:
Databricks: notebooks/jobs, performance tuning fundamentals, medallion patterns.
Spark fundamentals: partitioning, skew/shuffle optimization, understanding failures via logs.
Snowflake: data modeling/usage for analytics/warehousing workloads.
Azure ecosystem: Azure Data Factory, Azure-native integrations, and exposure to Azure services.
Data engineering reliability patterns: validation, idempotency, replay/backfills, deduplication, auditability.
Data governance: Unity Catalog lineage, access control patterns.
TekWissen Group is an equal opportunity employer supporting workforce diversity.
Languages
- English