About
Job Responsibilities
Core Responsibilities
· Maintain and monitor existing production data pipelines (little to no new pipeline build)
· Manage and troubleshoot AWS Glue jobs
· Validate and verify data outputs (including SageMaker checks)
· Perform cost optimization across AWS workloads and SQL queries
· Consolidate data from 20+ sandbox tables into a single scalable "gold" table
· Build datasets supporting daily snapshots, monthly snapshots, and client- and crew-level metrics
· Translate business metrics into reliable transformation logic
Tech Stack (Must-Have)
· PySpark
· Python
· SQL
· AWS, with emphasis on AWS Glue and cloud cost awareness
· Experience supporting production ETL pipelines
· Strong query optimization for performance and cost efficiency
Nice-to-Have / Preferred
· Experience with Amazon SageMaker (monitoring / validation)
· Prior experience in the Vanguard environment
· Hybrid background as Data Analyst → Data Engineer
· Exposure to cloud cost-optimization initiatives
Experience Level
· 5–8 years overall IT experience
· Strong hands-on data engineering background
· Experience working with analyst-driven or business-led data use cases
Non-Technical Traits (Important)
· Strong business understanding of data, metrics, and tables
· Comfortable with ambiguous or evolving requirements
· Able to collaborate closely with analysts during discovery/research
· Works well in Kanban / Agile environments
· Strong communicator and team collaborator
Ideal Candidate Profile (Quick Check)
· Can hit the ground running
· More focused on pipeline maintenance and optimization than greenfield builds
· Strong AWS + PySpark engineer with business-facing experience
· Comfortable delivering value quickly in a short-term assignment
· Fits a highly collaborative, delivery-focused team
Language Skills
- English