
This job posting is no longer available


Data Engineer

Kasmo Global
  • United States

About

Data Engineer
Type: Onsite (hybrid, 3 to 4 days in the office)
Locations: McLean VA, Richmond VA, Dallas TX

A Data Engineer with Python, PySpark, and AWS expertise is responsible for designing, building, and maintaining scalable and efficient data pipelines in a cloud environment.

Responsibilities
  • Design, develop, and maintain robust ETL/ELT pipelines using Python and PySpark for data ingestion, transformation, and processing.
  • Work extensively with AWS cloud services such as S3, Glue, EMR, Lambda, Redshift, Athena, and DynamoDB for data storage, processing, and warehousing.
  • Build and optimize data ingestion and processing frameworks for large-scale data sets, ensuring data quality, consistency, and accuracy.
  • Collaborate with data architects, data scientists, and business intelligence teams to understand data requirements and deliver effective data solutions.
  • Implement data governance, lineage, and security best practices within data pipelines and infrastructure.
  • Automate data workflows and improve data pipeline performance through optimization and tuning.
  • Develop and maintain documentation for data solutions, including data dictionaries, lineage, and technical specifications.
  • Participate in code reviews, contribute to continuous improvement initiatives, and troubleshoot complex data and pipeline issues.

Required Skills
  • Strong programming proficiency in Python, including libraries like Pandas, and extensive experience with PySpark for distributed data processing.
  • Solid understanding of and practical experience with Apache Spark/PySpark for large-scale data transformations.
  • Demonstrated experience with AWS data services, including S3, Glue, EMR, Lambda, Redshift, and Athena.
  • Proficiency in SQL and a strong understanding of data modeling, schema design, and data warehousing concepts.
  • Experience with workflow orchestration tools such as Apache Airflow or AWS Step Functions.
  • Familiarity with CI/CD pipelines and version control systems (e.g., Git).
  • Excellent problem-solving, analytical, and communication skills, with the ability to work effectively in a team environment.

Preferred Skills
  • Experience with streaming frameworks like Kafka or Kinesis.
  • Knowledge of other data warehousing solutions like Snowflake.

Language skills

  • English

Note for users

This job posting was published by one of our partners. You can view the original posting here.