This job posting is no longer available

Data Scientist - INDIA

Vytwo

United States

About

Job Description
Role: Data Scientist - INDIA
Location: Hyderabad / Noida, INDIA
*Consultants local to INDIA are eligible.
Category: Data Science - Structured Data / Text Data (NLP & GenAI)
About the Role
We are seeking a highly skilled Data Scientist (3-7 years of experience) to join our team and work across two major data science domains:
Structured Data (80-90%): Predictive analytics, forecasting, cost estimation, likelihood modeling, and batch-oriented machine learning pipelines.
Text / Unstructured Data (NLP & GenAI): Building low-latency, real-time systems using deep learning, LLMs, prompt engineering, and agentic AI frameworks.
This role requires strong expertise in Big Data processing, modern ML tools, and the ability to build scalable, production-ready data science solutions.
Key Responsibilities
Structured Data Machine Learning & Analytics
Build, deploy, and optimize ML models for predictive analytics, forecasting, classification, and regression.
Perform large-scale feature engineering using PySpark and Big Data tools.
Work on batch pipelines, model versioning, and experiment tracking.
Develop cost estimation and risk/likelihood models using statistical and ML techniques.
Text Data / NLP / GenAI
Build NLP pipelines using deep learning frameworks such as PyTorch, TensorFlow, or similar.
Develop real-time, low-latency inference systems for text classification, embeddings, semantic search, summarization, and retrieval.
Create prompts, context graphs, and agentic workflows for LLM-based systems.
Apply knowledge of prompt engineering, context engineering, and autonomous agent frameworks to production systems.
Core Data Science Engineering & MLOps
Work in Databricks for ETL, feature engineering, ML training, and orchestration.
Use Azure services for model deployment, data pipelines, and infrastructure.
Collaborate using Git-based workflows; leverage tools like GitHub Copilot, Claude Code, etc.
Implement model monitoring, observability, drift detection, and performance tracking.
Required Skills & Experience
Core Skills
Strong hands-on experience with Databricks (Delta Lake, MLflow, Job Orchestration).
Excellent PySpark skills for large-scale distributed data processing.
Proficiency in Azure cloud services (ADF, Azure ML, AKS, Databricks on Azure).
Strong understanding of ML algorithms, statistical methods, and data analysis.
Experience with deep learning frameworks: PyTorch, TensorFlow, Transformers (Hugging Face).
Experience with model monitoring and ML observability.
Ability to write clean, optimized code and leverage AI code assistants.
NLP / GenAI Specific Skills
Prompt engineering (task prompts, chain of thought, tool calling, retrieval prompts).
Context engineering (retrieval pipelines, RAG, memory management, context structuring).
Knowledge of LLM-based agentic frameworks (LangChain, Semantic Kernel, CrewAI, AutoGen, etc.).
Experience with vector databases and embedding models is a plus.
Good to Have Skills
Experience with containerization (Docker, Kubernetes, AKS).
Experience deploying models to production (REST APIs, real-time endpoints).
Knowledge of streaming technologies (Kafka, EventHub, Spark Streaming).
Understanding of CI/CD for ML (Azure DevOps / GitHub Actions).
Who You Are
A problem solver who is comfortable working with both structured and unstructured data.
Someone who enjoys using modern AI tools to accelerate development.
A data scientist who writes clean, production-grade code.
A collaborator who thrives in cross-functional teams and fast-paced environments.
Flexible work from home options available.

Language skills

English

Note for users

This job posting was published by one of our partners. You can view the original posting here.