Archer Data Scientist

emergemarket.com

Livermore, California, United States

About

Remote - California 123 Main St Livermore, CA 94551, USA
Archer is a leading provider of integrated risk management (IRM) solutions that enable customers to improve strategic decision-making and operational resilience with a modern technology platform that supports qualitative and quantitative analysis driven by both business and IT impacts. As true pioneers in GRC software, Archer remains solely dedicated to helping customers manage risk and compliance domains, from traditional operational risk to emerging issues such as ESG. With over 20 years in the risk management industry, the Archer customer base represents one of the largest pure risk management communities globally, with more than 1,200 customers including more than 50% of the Fortune 500. Learn more at www.ArcherIRM.com.
Data Scientist – LLM & Data Pipeline Engineering (LegalTech / RegTech AI)
Overview
We are seeking an experienced Data Scientist with a strong background in AI model integration, data pipeline development, and knowledge base (KB) engineering to support our next-generation LegalTech / RegTech AI platform.
This role blends applied machine learning, data engineering, and software development, focusing on building scalable pipelines that connect large language models (LLMs) to structured and unstructured data through retrieval-augmented generation (RAG) and vector database architectures.
The ideal candidate is passionate about operationalizing AI, from training and fine-tuning models to deploying intelligent retrieval systems in AWS cloud environments.
Key Responsibilities
Design, train, and evaluate LLM-based pipelines for document understanding, obligation extraction, and regulatory reasoning.
Implement and optimize RAG architectures, combining LLMs with vector databases for semantic retrieval.
Develop and maintain model fine-tuning workflows, embedding generation, and knowledge distillation.
Collaborate with ML Ops teams to integrate AI models into production-ready APIs and services on AWS.
Measure and improve model precision, recall, latency, and interpretability.
1.5 Agentic and MCP Knowledge Integration
Design and maintain agentic multi-component processes (MCPs) that enable context-aware reasoning across multiple data sources and agents.
Implement AI agents capable of dynamic tool use, autonomous task decomposition, and multi-context knowledge retrieval.
Develop pipelines that support agent memory, self-reflection, and knowledge synthesis across distributed systems and knowledge bases.
Collaborate with engineering teams to integrate MCP-driven agents with retrieval, analytics, and workflow orchestration layers, ensuring compliance with regulatory reasoning frameworks.
Build and manage end-to-end data pipelines for ingestion, transformation, embedding, and indexing of legal and compliance data.
Orchestrate data workflows leveraging AWS services (e.g., S3, Lambda, Glue, SageMaker, Step Functions, RDS).
Develop scalable ETL/ELT processes to feed both relational (PostgreSQL) and vector databases (e.g., Pinecone, FAISS, Weaviate, Elastic Vector Search).
Ensure data lineage, reproducibility, and version control across AI and analytics pipelines.
Automate retraining and evaluation pipelines for continuous learning from user feedback.
3. Knowledge Base & Information Retrieval
Architect and maintain intelligent Knowledge Bases (KBs) to support AI-driven search, summarization, and compliance reasoning.
Implement advanced retrieval techniques using ElasticSearch / Elastic Vector Search and embedding-based retrieval.
Align KB structures with business ontologies and regulatory taxonomies to support explainable AI outputs.
Collaborate with domain experts and PMs to enrich KB metadata and enhance model context relevance.
4. AWS & Deployment
Deploy and scale AI pipelines using AWS services such as SageMaker, Lambda, ECS/EKS, API Gateway, and CloudFormation/Terraform.
Implement model and data monitoring solutions for drift detection, latency management, and cost optimization.
Collaborate with DevOps to maintain secure, reliable, and compliant cloud environments.
5. Cross-Functional Collaboration
Partner with engineering, product, and compliance teams to align AI models with regulatory and data governance requirements.
Work closely with QA and Professional Services teams to validate AI outputs and improve client-facing performance.
Document architectures, experiment results, and data flows to ensure transparency and reproducibility.
Preferred Experience
Experience building AI products for LegalTech, RegTech, or compliance automation.
Familiarity with agentic AI frameworks (e.g., OpenAI MCP, CrewAI, LangGraph, or AutoGen).
Background in document intelligence systems, multi-agent orchestration, or knowledge graph integration.
Experience with LangChain, LlamaIndex, or similar frameworks for RAG orchestration.
Hands-on knowledge of MLOps tools and data versioning (DVC, MLflow, Weights & Biases).
Understanding of governance, interpretability, and ethical AI.
Qualifications
5+ years of experience in data science, ML engineering, or AI-driven software development.
Strong programming skills in Python (NumPy, Pandas, PyTorch/TensorFlow, LangChain, or equivalent).
Experience with vector databases and retrieval systems (Pinecone, FAISS, Weaviate, Qdrant, or Elastic Vector Search).
Hands-on experience with RAG pipelines, embedding models, and LLM orchestration (OpenAI, Bedrock, Hugging Face, etc.).
Solid understanding of data pipelines, ETL frameworks, and cloud-native deployment on AWS.
Familiarity with Elasticsearch, PostgreSQL, and API integration patterns.
Knowledge of ML lifecycle management, including model training, evaluation, and monitoring.
Soft Skills
Strong problem-solving and system design capabilities.
Excellent communication skills for cross-disciplinary collaboration.
Passion for structured documentation, reproducibility, and experimentation.
Adaptable mindset with focus on performance, scalability, and reliability.
Success Indicators
Scalable and well-documented RAG pipelines supporting production AI workloads.
High model accuracy, retrievability, and latency efficiency.
Reliable data flow from ingestion to inference with minimal manual intervention.
Increased explainability and compliance assurance across AI outputs.
Additional Information About Archer's Culture and Work Environment: Our people, team collaboration, and dynamic leadership are the centerpiece of our great culture and the reason for Archer's 25 years of success. Over the years, many companies and global organizations have faced tough decisions: layoffs, reorganizations, acquisitions, and mergers. Yet throughout these challenging times, Archer has exemplified strong innovation and growth and a commitment to our employees. Why is this possible? Collaboration is the key to our success. It inspires great innovation and new ideas. It is why Archer is a household name in the GRC space. Companies from the F500 to the F1000 come to Archer first for our thought leadership and for our ability to meet customers where they are. As we continue to grow and evolve, our focus will remain the same: continue innovating, support our customers and employees, and continue driving the risk management industry to new levels.
Please note this job description is not designed to cover or contain a comprehensive listing of activities, duties or responsibilities that are required of the employee for this job. Duties, responsibilities and activities may change at any time with or without notice at management discretion based on business need.
Archer is committed to the principle of equal employment opportunity for all employees and applicants for employment and to providing employees with a work environment free of discrimination and harassment. All employment decisions at Archer are based on business needs, job requirements and individual qualifications, without regard to race, color, religion, national origin, sex (including pregnancy), age, disability, sexual orientation, gender identity and/or expression, marital, civil union or domestic partnership status, protected veteran status, genetic information, or any other characteristics protected by federal, state or local laws. Archer will not tolerate discrimination or harassment based on any of these characteristics. This policy applies to all terms and conditions of employment, including recruiting, hiring, placement, promotion, termination, layoff, recall, transfer, leaves of absence, compensation, and training. All Archer employees are expected to support this policy and contribute to an environment of equal opportunity.
If you need a reasonable accommodation during the application process, please contact talent-acquisition@archerirm.com. All employees must be legally authorized to work in the country they are applying for. Archer and its approved consultants will never ask you for a fee to process or consider your application for a career with Archer. Archer reserves the right to amend or withdraw any job posting at any time, including prior to the advertised closing date.
Pay Transparency Notice: We’re committed to fair and transparent pay practices. In line with state pay transparency laws, the salary range for this role is available upon request. Please contact our Talent Acquisition team at Talent-Acquisition@archerirm.com for the range and related compensation details. Actual pay may vary based on location, experience, skills, and internal equity.
Equal Opportunity Employer
This employer is required to notify all applicants of their rights pursuant to federal employment laws. For further information, please review the Know Your Rights notice from the Department of Labor.

Language Skills

English
