Data Engineer - Lead

Houston Staffing

United States

Apply Now

About

Lead Data Engineer
We are looking for an experienced lead data engineer to oversee the design, implementation, and management of advanced data infrastructure in Houston, Texas. This role requires expertise in architecting scalable solutions, optimizing data pipelines, and ensuring data quality to support analytics, machine learning, and real-time processing. The ideal candidate will have a deep understanding of lakehouse architecture and medallion design principles to deliver robust and governed data solutions.

Responsibilities:
- Develop and implement scalable data pipelines to ingest, process, and store large datasets using tools such as Apache Spark, Hadoop, and Kafka.
- Utilize cloud platforms like AWS or Azure to manage data storage and processing, leveraging services such as S3, Lambda, and Azure Data Lake.
- Design and operationalize data architecture following medallion patterns to ensure data usability and quality across bronze, silver, and gold layers.
- Build and optimize data models and storage solutions, including Databricks lakehouses, to support analytical and operational needs.
- Automate data workflows using tools like Apache Airflow and Fivetran to streamline integration and improve efficiency.
- Lead initiatives to establish best practices in data management, facilitating knowledge sharing and collaboration across technical and business teams.
- Collaborate with data scientists to provide infrastructure and tools for complex analytical models, using programming languages like Python or R.
- Implement and enforce data governance policies, including encryption, masking, and access controls, within cloud environments.
- Monitor and troubleshoot data pipelines for performance issues, applying tuning techniques to enhance throughput and reliability.
- Stay updated with emerging technologies in data engineering and advocate for improvements to the organization's data systems.

Requirements:
- Bachelor's degree in computer science, engineering, or a related field with 10+ years of experience in data engineering, or a master's degree with 5+ years of relevant experience.
- Proven expertise in designing and implementing medallion architecture within a Databricks lakehouse environment.
- Proficiency in big data technologies such as Apache Spark, Hadoop, and Kafka.
- Extensive experience with cloud platforms like AWS and Azure, including integration of storage and compute services.
- Strong programming skills in Python, Java, or Scala, with hands-on experience in data modeling and stored procedures.
- Knowledge of tools and platforms like Apache Airflow, Databricks, and Dataiku.
- Familiarity with ETL processes and machine learning model deployment.
- Excellent problem-solving skills and ability to optimize data systems for performance and scalability.

Language skills

English

Notice to users

This job listing comes from a TieTalent partner platform. Click "Apply Now" to submit your application directly on their site.