This job offer is no longer available
Lead AWS Data Engineer
FUSTIS LLC
- Sacramento, California, United States
About
- PySpark ramp-up
- Glue job hands-on proof
- Dimensional modeling
Core Responsibilities
- Develop and maintain PySpark-based ETL pipelines for batch and incremental data processing
- Build and operate AWS Glue Spark jobs (batch and event-driven), including:
  - Job configuration, scaling, retries, and cost optimization
  - Glue Catalog and schema management
- Design and maintain event-driven data workflows triggered by S3, EventBridge, or streaming sources
- Load and transform data into Amazon Redshift, optimizing for:
  - Distribution and sort keys
  - Incremental loads and upserts
  - Query performance and concurrency
- Design and implement dimensional data models (star/snowflake schemas), including:
  - Fact and dimension tables
  - Slowly Changing Dimensions (SCDs)
  - Grain definition and data quality controls
- Collaborate with analytics and reporting teams to ensure the warehouse is BI-ready
- Monitor, troubleshoot, and optimize data pipelines for reliability and performance
Required Technical Experience
- Strong PySpark experience (Spark SQL, DataFrames, performance tuning)
- Hands-on experience with AWS Glue (Spark jobs, not just crawlers)
- Experience loading and optimizing data in Amazon Redshift
- Proven experience designing dimensional data warehouse schemas
- Familiarity with AWS-native data services (S3, IAM, CloudWatch)
- Production ownership mindset (debugging, failures, reprocessing)
Skills
SPARK SQL DATA SERVICES MODELING PYSPARK AWS
Languages
- English