This job posting is no longer available
About
Responsible for developing, expanding, and optimizing our data and data pipeline architecture.

Responsibilities:
- Support software developers, database architects, data analysts, and data scientists on data initiatives, and ensure that the data delivery architecture remains optimal and consistent across ongoing projects
- Create new pipelines, maintain existing pipelines, and update Extract, Transform, Load (ETL) processes
- Implement large-dataset engineering: data augmentation, data quality analysis, data analytics, data profiling, and development of data strategy recommendations
- Operate large-scale data processing pipelines and resolve business and technical issues related to processing and data quality
- Assemble large, complex data sets that meet functional and non-functional business requirements
- Identify, design, and implement internal process improvements, including re-designing data infrastructure for greater scalability, optimizing data delivery, and automating manual processes
- Build analytical tools that use the data pipeline to provide actionable insight into key business performance metrics
- Work with stakeholders, including data, design, product, and government stakeholders, and assist them with data-related technical issues
- Write unit and integration tests for all data processing code
- Work with DevOps engineers on CI, CD, and IaC
- Read specs and translate them into code and design documents
- Perform code reviews and develop processes for improving code quality

Requirements:
- Minimum of 8 years of experience as a Data Engineer or in hands-on software development
- At least 4 years using Python, Java, and cloud technologies to build and maintain data pipelines
- Bachelor's degree in Computer Science, Information Systems, Engineering, Business, or a related scientific or technical discipline; in lieu of a degree, candidates may qualify with 10 years of general information technology experience, including at least 8 years of specialized experience
- Expert data pipeline builder and data wrangler
- Self-sufficient and comfortable supporting the data needs of multiple teams, systems, and products
- Experience designing data architecture for shared services, scalability, and performance
- Experience designing data services, including APIs, metadata, and data catalogs
- Ability to build and optimize data sets, 'big data' pipelines, and architectures
- Ability to perform root cause analysis on external and internal processes and data to identify opportunities for improvement
- Excellent analytic skills for working with unstructured datasets
- Ability to build processes that support data transformation, workload management, data structures, dependency management, and metadata
- Demonstrated understanding of and experience with software and tools including: big data tools such as Spark and Hadoop; relational databases including MySQL and Postgres; workflow management and pipeline tools such as Apache Airflow and AWS Step Functions; AWS cloud services including Redshift, RDS, EMR, and EC2; stream-processing systems such as Spark Streaming and Storm; and object-oriented and functional scripting languages including Scala, Java, and Python
- Experience with Agile methodology and test-driven development
- Experience with GitHub and Atlassian Jira/Confluence
- Excellent command of written and spoken English

Benefits:
- Highly competitive salaries
- Full healthcare benefits
Language skills
- English