About
DS Technologies Inc is looking for a Data Engineer for one of our premier clients.
Job Title: Data Engineer
Location: San Francisco Bay Area, CA or New York City, NY (Onsite)
Position Type: Full-Time
Experience: 4-7 Years
Note: All visa types are accepted
Primary focus: Model reproduction, feature engineering logic, performance validation, and ensuring alignment with the client's established modelling frameworks.
• Rebuild and port the client's existing Python-based models to the customer's Databricks platform.
• Develop, train, and validate predictive models using Python, PySpark, and ML frameworks such as scikit-learn, XGBoost, and Spark MLlib.
• Develop, validate, and reproduce feature engineering logic and ensure parity with the client's models.
• Train, retrain, validate, and benchmark model performance using customer-provided datasets while maintaining performance parity with baseline models.
• Work with data engineers to define feature requirements and ensure datasets support model needs.
• Perform model diagnostics, bias checks, stability checks, and accuracy assessments.
• Prepare model documentation, validation summaries, and stakeholder-ready insights.
• Support scoring pipeline design and ensure reproducibility across Dev/QA/Prod.
• Collaborate with compliance and platform teams to ensure adherence to governance.
• Perform model diagnostics, hyperparameter tuning, and stability analysis.
• Evaluate model performance across population segments and time periods.
• Work with platform and engineering teams to support scoring pipeline deployment across Dev/QA/Prod.
Qualifications:
• 4–6 years of experience in applied machine learning or data science.
• Strong hands-on experience with Python, scikit-learn, XGBoost, LightGBM, CatBoost, or similar libraries.
• Experience developing ML models in Databricks with Python or PySpark.
• Strong knowledge of feature engineering, model training workflows, and evaluation techniques.
• Experience working with large structured datasets (financial or transactional data preferred).
• Ability to write clear documentation and communicate technical results to non-technical stakeholders.
• 4+ years of hands-on experience developing, deploying, and maintaining machine-learning models.
• Advanced proficiency in Python (NumPy, pandas, scikit-learn, PyTorch or TensorFlow).
• Strong statistical and mathematical foundation, including regression, classification, probability, optimization, etc.
• Experience building end-to-end ML pipelines: data ingestion, cleaning, feature engineering, modeling, evaluation, deployment.
• Experience working within client environments, including adapting to unfamiliar infrastructure, constraints, and security requirements.
• Experience with cloud platforms (AWS, Azure, or GCP) and on-prem environments.
• Advanced SQL ability and experience with big-data tools (Spark, Databricks, Hadoop).
Language skills
- English
Note for users
This job posting comes from a TieTalent partner platform. Click "Apply Now" to submit your application directly on their website.