
Senior Data Engineer

STScI
  • US
    United States

About

Senior Data Engineer
The Space Telescope Science Institute (STScI), operated by the Association of Universities for Research in Astronomy (AURA), is NASA's science operations center for missions including the Hubble and James Webb Space Telescopes. We are seeking a Senior Data Engineer to join our Data Management Division. We are looking for a talented and experienced professional to help manage the backend data pipelines and MPP database system, and to ensure high-performance, reliable data access for the Mikulski Archive for Space Telescopes (MAST), one of the world's most advanced astronomical public data archives, serving missions such as HST, JWST, Roman, and TESS.

This position can support hybrid work (in the office around twice a quarter). Candidates must reside in or be willing to relocate to our local market (MD, DE, VA, PA, DC, and WV). This position requires US citizenship or permanent residence to meet ITAR requirements.

Responsibilities

  • PostgreSQL/MPP Platform Performance & Operations: Own performance tuning, operations, and reliability of PostgreSQL and Greenplum MPP databases; design and implement schema architecture, advanced indexing, partitioning strategies, query optimization, vacuum/analyze processes, and large-scale performance troubleshooting to ensure sub-second to low-latency queries on massive datasets.
  • Airflow Pipelines: Design, develop, deploy, monitor, and troubleshoot complex data pipelines using Apache Airflow to process, transform, and load large-scale datasets efficiently and reliably.
  • Kubernetes/Infra with Platform Team Support: Collaborate with the platform/infra team to deploy, scale, and manage containerized workloads on Kubernetes, support IaC with Terraform, and contribute to platform reliability and CI/CD practices.
  • Build, maintain, and continuously improve data systems supporting scientific research, including relational databases, cloud-based Lakehouse architecture, and Parquet-based storage for efficient columnar analytics.
  • Ensure data accuracy, accessibility, observability, and reliability through proactive monitoring, alerting, and incident response.
  • Work with scientists, data engineers, and cross-functional teams to translate requirements into robust, scalable platform solutions.
  • Establish and evangelize best practices in data architecture, pipeline design, MPP/Lakehouse optimization, platform reliability, and infrastructure as code.

Required Technical Skills

  • Advanced expertise in PostgreSQL and Greenplum MPP, including deep schema design, indexing strategies, query optimization, partitioning, performance tuning, and operational management at scale.
  • Strong proficiency with Apache Airflow for complex workflow orchestration, DAG development, scheduling, monitoring, error handling, and operational troubleshooting.
  • Hands-on experience with AWS cloud services (e.g., S3, EC2, EKS/ECS, IAM, VPC, and related data ecosystem services).
  • Excellent Python programming skills for automation, scripting, tool development, and systems integration; proficiency in SQL and SQL performance tuning.
  • Demonstrated analytical and problem-solving abilities, with strong communication skills to explain complex technical concepts to non-technical stakeholders.

Nice to Have

  • Experience with Lakehouse technologies: Trino and Apache Iceberg.
  • Infrastructure as Code (IaC) using Terraform for provisioning and managing cloud resources.
  • Hands-on CI/CD pipeline setup and management for data tools and platform components.
  • Kubernetes operational experience beyond the basics.

Required Qualifications

  • Bachelor's or master's degree in computer science, information technology, or a related discipline.
  • 8+ years of professional experience in Linux-based environments, with deep expertise in data engineering, data management, and scalable distributed data architectures.
  • United States

Language skills

  • English
Note for users

This job posting comes from a TieTalent partner platform. Click "Apply Now" to submit your application directly on their website.