
This job posting is no longer available


Data Scientist 2 - Computer Analytics & Modeling

USA Jobs
  • US
    United States

About

Data Scientist 2
The Computing, Analytics, and Modeling (CAM) Group within the Environmental Molecular Sciences Division at PNNL is seeking a motivated Data Scientist 2 to contribute to cutting-edge AI solutions for computational and modeling research across the BER mission space. The role requires experience in designing and implementing AI-based agents and agentic workflows, along with a solid understanding of key tools such as LangChain, LangGraph, and Model Context Protocol (MCP). Candidates should have a proven ability to leverage AI to accelerate the software lifecycle and improve data exploration and retrieval, as well as experience supporting various stages of the data lifecycle, including data modeling, harmonizing data models, managing distributed or federated data, and organizational data governance. Additional expertise in metabolic modeling techniques such as flux balance analysis and metabolic control analysis, and familiarity with structural biology data, particularly cryo-electron tomography, are highly desirable. Knowledge of causal inference methods and their application to complex biological systems will further strengthen the candidate's profile.

Responsibilities include designing, developing, documenting, testing, and debugging new and existing software systems, hardware/software interfaces, and/or applications according to industry-established software engineering principles and best practices. The Data Scientist 2 will work collaboratively within a team to execute the full system development lifecycle, including analyzing user needs to determine technical requirements; developing technical specifications based on conceptual design and requirements; developing well-crafted and documented source code; integrating hardware using software; automating manual tasks; and consulting with the end user to prototype, configure, refine, test, and debug programs or systems to meet needs.
The Data Scientist 2 will identify and evaluate new technologies or methods for implementation and continuous improvement. The Data Scientist 2 will drive the design and implementation of agentic workflows for scientific automation, leveraging frameworks such as LangChain, LangGraph, and the Model Context Protocol (MCP). They will develop, maintain, and support open-source scientific software, using CI/CD practices and containerized, reproducible workflows on HPC systems. The role also involves advancing cryo-ET data analysis capabilities through the integration of AI methods, physics-based simulations, and structural biology toolkits, as well as expanding systems biology modeling capabilities, including metabolic modeling, whole-cell modeling, and causal-reasoning approaches. The Data Scientist 2 will communicate technical findings by contributing to and leading the preparation of scientific outputs, including reports, manuscripts, visualizations, stakeholder presentations, and software.

Minimum Qualifications:
  • BS/BA and 2 years of relevant experience -OR- MS/MA -OR- PhD

Preferred Qualifications:
  • Degree in Computer Science, Electrical and Computer Engineering, Bioinformatics, Statistics, Physics, Mathematics, or a related field.
  • Agentic AI & Tools: Design and implementation of single- and multi-agent AI workflows for scientific automation; proficiency with LangChain, LangGraph, and Model Context Protocol (MCP); experience with LLM reasoning frameworks (e.g., ReAct) and orchestration for data analysis and metabolic engineering.
  • Cryo-ET 3D Vision & Intelligent Retrieval: Expertise in 3D computer vision for cryo-electron tomography, topologically aware protein classification, sim-to-real transfer, and reconstruction-level signal characterization, paired with development of advanced search and retrieval systems for complex biological datasets and specialized workflows (e.g., post-translational modification discovery).
  • AI & Autonomous Science: Interest in foundation models, multi-agent systems, and autonomous science frameworks; experience applying computational and modeling approaches across the BER mission space.
  • Structural & Multiomic Data: Skilled in structural biology data analysis (especially cryo-electron tomography and protein classification workflows) and multiomic approaches for metabolic and circadian regulation.
  • Programming & Frameworks: Proficiency in Python, PyTorch, TensorFlow, and OpenCV; experience with version control (e.g., Git) and collaborative development practices.
  • Data & Model Development: Skilled in preparing data for machine learning, including signal processing and feature extraction for high-dimensional datasets; familiarity with HPC environments and distributed ML training.
  • Data Lifecycle Management & Engineering: Experience developing and maintaining open-source scientific software with CI/CD and containerized, reproducible HPC workflows; expertise in data modeling, harmonization, and management of distributed and federated data systems.
  • Domain-Specific AI Applications: Experience creating agentic workflows for scientific discovery and integrating AI into biological data analysis pipelines.
  • Biological Modeling & Analysis: Expertise in metabolic modeling (flux balance analysis, metabolic control analysis) and whole-cell modeling for spatio-temporal energy metabolism in microbial systems.
  • Advanced Causal Methods: Experience with causal inference and advanced causal analysis techniques for biosystems design (e.g., Causal Component Analysis, Y0-based identification).
  • Problem-Solving & Adaptability: Strong ability to tackle complex scientific and data challenges with attention to detail; adaptable to emerging technologies and innovative AI-driven approaches.
  • Collaboration & Communication: Effective in interdisciplinary environments, with proven skills in written and verbal communication across computational, biological, and data science domains.
  • Leadership & Initiative: Demonstrated ability to identify opportunities, advocate for them, and integrate agentic AI with deterministic workflows (e.g., ADEPT-Bio framework).
  • Remote & Distributed Work: Proven success working in highly distributed teams and fostering collaboration in virtual settings.
  • Intellectual Curiosity: Enthusiasm for interdisciplinary research and continuous learning in advanced computational and biological sciences.

  • United States

Language skills

  • English
Note for users

This job posting was published by one of our partners. You can view the original posting here.