Data Scientist - Machine Learning Engineering

Kalamata Capital LLC • United States

About

About Us

Kalamata Capital Group is an innovative financial technology company dedicated to using data-driven insights to enhance small business growth. We are looking for a talented Data Scientist to create predictive models, conduct thorough exploratory data analyses, and build robust data pipelines that support pivotal business decisions throughout our organization.

Summary

The ideal candidate will be an experienced data scientist with significant technical expertise in machine learning, data engineering workflows, and statistical modeling. In this role, you will collaborate closely with engineering, product, and analytics teams to design, validate, and deploy machine learning solutions that enhance decision-making efficiency. The successful applicant will possess strong skills in Pandas, PySpark, and MongoDB and demonstrate the capability to write clean, reproducible, production-ready code. Effective communication of complex analytical insights to non-technical stakeholders is essential.

Key Responsibilities

Exploratory Analysis & Data Profiling: Perform exploratory data analysis on large, complex datasets using Pandas and PySpark; evaluate data quality and structure.
Model Development: Create, optimize, and assess supervised and unsupervised machine learning models (e.g., tree-based methods, regression, boosting algorithms).
Pipeline Engineering: Design and establish dependable, maintainable machine learning pipelines and preprocessing workflows tailored for production settings.
Data Management: Query and integrate MongoDB datasets; develop efficient schemas and aggregation pipelines supporting analytical and operational tasks.
Visualization: Generate insightful visualizations with seaborn, plotly, and matplotlib to aid in model diagnostics and business storytelling.
Reproducible Code: Develop clean, modular, and well-documented Python code (PEP8 compliant) and manage version control through Git.
Model Explainability: Use model interpretation tools like SHAP and LIME to assess feature impact and enhance transparency.
Cross-Functional Collaboration: Collaborate with engineering, analytics, and product teams to transform business needs into actionable model-driven solutions.
Documentation: Create comprehensive technical documents, reports, and model documentation for internal stakeholders.

Required Skills & Qualifications

Education & Experience:
M.S. in Computer Science, Machine Learning, Computational Biology, or a related quantitative field along with 3+ years of relevant experience, or a similar combination of education and applied work.
Solid understanding of Linear Algebra, Probability, and Statistics.

Technical Expertise:
Advanced proficiency in Pandas and PySpark for data cleaning, reshaping, merging, feature engineering, and workflow optimization.
Extensive experience with MongoDB, including querying, indexing, and aggregation pipelines.
Deep understanding of supervised/unsupervised machine learning methodologies and tools (scikit-learn, XGBoost).
Strong grasp of optimization, regularization, loss functions, and evaluation metrics (AUC, precision, recall, RMSE).

Core Skills:
Demonstrated experience delivering end-to-end machine learning projects (data ingestion, modeling, evaluation, and optional deployment).
Capability to write clean, reproducible code and maintain organized notebooks/scripts.
Excellent communication abilities to translate analyses into business insights.
Willingness to relocate to the New York metro area.

Preferred (Bonus) Skills

Familiarity with AWS tools (Glue, S3, DMS).
Experience with deep learning frameworks (PyTorch, TensorFlow).
Experience in deploying models using FastAPI, Flask, AWS, or GCP.
Knowledge of SQL, data warehousing, or data versioning.
Understanding of software engineering best practices (testing, CI/CD, code review).

Provide a link to GitHub, GitLab, or a portfolio showcasing analytical/ML code.

Flexible work from home options available.

Language Skills

English

Note for Users

This job posting comes from a TieTalent partner platform. Click "Apply Now" to submit your application directly on their website.
