Data Engineer

Texas State Library and Archives Commission

United States

About

The Enterprise Data Management Services team is looking for a strong, self-motivated Technology Developer to become an integral part of the Data Resiliency program.
Responsibilities include but are not limited to:
Developing and enhancing application components that support ML/AI models and data ingestion processes, with a focus on code resiliency and stability
Interacting with and leading a team of developers, and working with business partners to develop processes that ensure ML/AI models are production-ready
Developing, enhancing, modifying, and/or maintaining applications
Working in a fast-paced agile environment under minimal supervision, with guidance from senior team members
Participating in analysis of operational issues
Participating in peer reviews of designs, code, and other work products
Overview:
Assess requirements and evaluate existing solutions
Build processes to interact with HDFS and Oracle using Python/PySpark and Oracle PL/SQL
Create workflows and jobs and schedule them using Autosys
Work across development teams to contribute to story refinement and the delivery of data requirements throughout the delivery life cycle
Leverage architecture components in solution development; code solutions to integrate, clean, transform, and control data as per acceptance criteria
Develop and execute test plans to produce quantitative results, identify test issues and errors, and triage underlying causes
Drive complex information technology projects to ensure on-time delivery, and adhere to team delivery and release processes
Identify, define, and document data engineering requirements, communicating required information for deployment, maintenance, support, and business functionality
Ability to work independently with solid analytical skills
Ability to work with the team; excellent team player with a great attitude
Data Resiliency Capabilities
Top 3 skills: 1. Oracle & PL/SQL Knowledge (Expert level) 2. Hadoop ecosystem, Hive Tables 3. Python/PySpark
Preferred Skills: 1. Autosys 2. Agile
Other Required Skills:
Strong knowledge of Oracle, SQL, and RDBMS, along with Python, Hadoop, Hive, and Spark
Experience in developing Hive and DBMS-based applications
Python programming background (scripting and object-oriented design)
Coding experience with "big data" (Spark/PySpark, SQL, Hadoop, ETL development)
Experience implementing statistical models in Python (Jupyter notebooks, SciPy, NumPy, pandas, scikit-learn)
Machine learning experience or knowledge

Language skills

English
