About
Key Responsibilities
• Architect, design, and build large-scale ETL/ELT data pipelines using Databricks (PySpark/Spark SQL).
• Optimize cluster usage, Spark code, and Delta Lake pipelines for high performance and cost efficiency.
• Implement Delta Lake features including ACID transactions, schema evolution, and time travel (see the sketch after this list).
• Lead migration of on-premises or legacy workloads to Azure Databricks.
• Design data models, ensure data integrity, and implement governance using Unity Catalog.
• Build and orchestrate data workflows via ADF, Databricks Workflows, or other orchestration tools.
• Integrate Databricks with Azure services such as ADLS, Event Hubs, Synapse, Azure SQL, Key Vault, and Function Apps.
• Conduct code reviews, mentor junior engineers, and enforce engineering best practices.
• Troubleshoot and resolve production issues in Spark jobs, workflows, and pipelines.
• Work closely with software engineering teams; contribute to APIs or services built using C#/ASP.NET when required.
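A minimal PySpark sketch of the Delta Lake features named above (schema evolution via mergeSchema, time travel, and an ACID upsert with MERGE); it assumes a Databricks runtime with Delta Lake, and the table names (orders, orders_updates) and landing path are hypothetical:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Schema evolution: allow new source columns to be added to the target Delta table.
(spark.read.format("json").load("/mnt/raw/orders")   # hypothetical landing path
     .write.format("delta")
     .mode("append")
     .option("mergeSchema", "true")
     .saveAsTable("orders"))

# Time travel: read the table as it looked at an earlier version.
orders_v0 = spark.read.format("delta").option("versionAsOf", 0).table("orders")

# ACID upsert: Delta applies the MERGE as a single atomic transaction.
spark.sql("""
    MERGE INTO orders AS t
    USING orders_updates AS s
    ON t.order_id = s.order_id
    WHEN MATCHED THEN UPDATE SET *
    WHEN NOT MATCHED THEN INSERT *
""")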
Required Skills
• 6+ years of experience in data engineering, with at least 3 years on Azure Databricks.
• Strong hands-on expertise with PySpark, Spark SQL, Apache Spark internals, and performance tuning.
• Deep understanding of Delta Lake, data versioning, and Lakehouse architecture.
• Strong SQL skills: complex queries, performance tuning, and analytical functions (a short window-function example follows this list).
• Experience integrating Databricks with ADF, ADLS Gen2, Azure SQL, Event Hubs, and other Azure components.
• Hands-on experience with CI/CD pipelines using Git, GitHub Actions, Azure DevOps, or similar tools.
• Solid understanding of data modeling, distributed computing, and data security best practices.
• Ability to design scalable, maintainable, and reusable data frameworks.
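To illustrate the analytical-function skill above, a short Spark SQL sketch using window functions; the sales table and its columns are hypothetical:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Running total and per-region rank computed with window (analytical) functions.
spark.sql("""
    SELECT
        region,
        order_date,
        amount,
        SUM(amount) OVER (PARTITION BY region ORDER BY order_date) AS running_total,
        RANK()      OVER (PARTITION BY region ORDER BY amount DESC) AS amount_rank
    FROM sales
""").show()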
Languages
- English