About
Responsibilities and essential job functions include but are not limited to the following:
Demonstrate deep knowledge of the data engineering domain to build and support batch (non-interactive, distributed) and real-time, highly available data pipelines and related technology capabilities
Build fault-tolerant, self-healing, adaptive, and highly accurate data computational pipelines
Provide consultation and lead the implementation of complex programs
Develop and maintain documentation relating to all assigned systems and projects
Tune queries running over billions of rows of data running in a distributed query engine
Perform root cause analysis to identify permanent resolutions to software or business process issues
Bachelor's degree in computer science, management information systems, or related discipline, or equivalent work experience
Strong/expert Spark (PySpark) skills using Jupyter Notebooks, Colab, or Databricks (preferred)
Hands-on data pipeline development and ingestion patterns in Azure
Orchestration tools such as ADF (Azure Data Factory) or Airflow
SQL
Denormalized data modeling for big-data systems
Collaborative; able to work remotely while remaining an engaging team member.
Strong analytical and design skills.
Skills:
Spark, PySpark, Jupyter
Language skills
- English
Note for users
This job offer comes from a partner platform of TieTalent. Click "Apply Now" to submit your application directly on their website.