Data Engineer (AI) - Cincinnati
Medpace
- Cincinnati, Ohio, United States
About
Responsibilities
Ingest unconventional data, such as unstructured content from websites and varied content types (documents, images, etc.), into data lakes using tools such as Snowflake, Azure, SQL, and Python.
Operate and, where needed, train Large Language Models (LLMs) as part of the Extract, Transform, and Load (ETL) of large data corpora into a data lake.
Participate in NLP-based extraction of structured metadata from unstructured data, applying semantic analysis techniques with Python and REST APIs.
Help ensure that the data flow of any external content conforms to the latest US and EU AI regulations, including requirements for the security, confidentiality, and privacy of protected health information (PHI).
Collect, analyze, and document user requirements while working with AI engineers to align data sources to downstream integration within systems.
Create software applications that support the understanding and visualization of data flows from inception to derivation, maintaining version control by following the software development lifecycle process (requirements gathering, design, development, testing, release, and maintenance).
Participate in software validation through development, review, and/or execution of test plans, cases, or scripts.
Communicate with team members regarding projects, development, tools, and procedures.
Provide end‑user support, including setup, installation, and maintenance for applications.
Qualifications
Bachelor's Degree in Computer Science, Data Science, or a related field.
1‑3+ years of experience in Data Engineering.
Background with AI tools that support data extraction and natural language processing, converting varied unstructured content into structured metadata.
Knowledge of developing dimensional data models from unstructured content and awareness of the advantages and limitations of star schema and snowflake schema designs.
Solid ETL development and reporting knowledge based on a deep understanding of business processes and measures.
Knowledge of Snowflake cloud data warehouse and Azure cloud is preferred.
Knowledge of REST APIs.
Good knowledge of SQL Server databases and the Python programming language is required.
Knowledge of C# is a bonus, as is experience with Azure Data Fabric.
Excellent analytical, written, and oral communication skills.
Medpace Perks
Flexible work environment
Competitive compensation and benefits package
Competitive PTO packages
Structured career paths with opportunities for professional growth
Company‑sponsored employee appreciation events
Employee health and wellness initiatives
Languages
- English