AWS Data Engineer
Tata Consultancy Services
- Irving, Texas, United States
About
Job Title: AWS Certified Data Engineer
Responsibilities
Design, develop, and deploy scalable and cost‑effective data solutions on AWS, leveraging services such as S3 (for data lakes), EC2, EMR, Glue, Athena, Lambda, Redshift, and Kinesis.
Build and maintain robust ETL/ELT data pipelines using PySpark for data ingestion, transformation, and loading into various data stores, including those utilizing open table formats like Iceberg.
Develop and optimize big data processing jobs using PySpark on AWS EMR or AWS Glue, handling large datasets efficiently and integrating with Iceberg table formats.
Design, implement, and manage data warehousing solutions, including schema design, data modeling, and query optimization, focusing on Hive and modern data lake table formats like Iceberg for historical data and analytical queries.
Implement secure and robust cloud infrastructure components, including VPCs, subnets, routing, and security groups, to ensure proper connectivity and isolation for data solutions.
Design, deploy, and manage containerized data processing applications on Amazon Elastic Kubernetes Service (EKS).
Optimize AWS resources and big data applications (Spark, Hive, Iceberg) for performance, cost, and efficiency.
Implement best practices for data security, access control, and compliance within AWS, including IAM policies, S3 bucket policies, and encryption.
Set up monitoring, alerting, and logging for data pipelines and AWS infrastructure; troubleshoot and resolve issues promptly.
Develop and maintain automation scripts using Python and shell scripting for infrastructure provisioning, deployment, and operational tasks.
Work closely with data scientists, analysts, and other engineering teams to understand data requirements and deliver reliable data solutions.
Qualifications
Hold at least one AWS certification (e.g., AWS Certified Solutions Architect – Associate, AWS Certified Data Analytics – Specialty, AWS Certified Developer – Associate).
Hands‑on experience with key AWS services for data processing and storage, including S3, EC2, EMR, Glue, Athena, Lambda, VPC, subnets, routing, security groups, and EKS.
Strong proficiency in PySpark for developing complex data transformations and analytics.
Practical experience with Apache Iceberg for managing and querying data lakes.
In‑depth knowledge and practical experience with Apache Hive for data storage, querying, and schema management.
Expert‑level proficiency in Python (Boto3, scripting, data manipulation).
Proficient in shell scripting for automation and operational tasks.
Strong SQL skills for data querying and manipulation.
Solid understanding of ETL/ELT processes, data modeling, distributed computing, and data governance.
Bachelor's degree in Computer Science.
Good to Have Skills
Experience with Kubernetes for deploying and managing containerized applications.
Experience with CI/CD tools and practices (e.g., AWS CodePipeline, GitHub Actions, GitLab CI).
Experience with workflow orchestration tools like Apache Airflow.
Proficiency with Git for source code management.
Exposure to other big data technologies such as Apache Kafka, Flink, or Presto.
Certifications
AWS Certified Solutions Architect – Associate/Professional
AWS Certified Data Analytics – Specialty
AWS Certified Developer – Associate
Salary Range: $125,000 to $140,000 per year
Language Skills
- English