Machine Learning Engineer, LLM Fine-Tuning
FIRST SOFTSOLUTIONS INC • United States

About

We are actively hiring for a Machine Learning Engineer.
Location: San Jose, CA (Onsite)
Skills: LLM Fine-Tuning (Verilog/RTL applications); AWS (primary; Bedrock + SageMaker)

Own the technical roadmap for Verilog/RTL-focused LLM capabilities, from model selection and adaptation to evaluation, deployment, and continuous improvement.
Lead a hands-on team of applied scientists/engineers: set direction, unblock technically, review designs/code, and raise the bar on experimentation velocity and reliability.
Fine-tune and customize models using state-of-the-art techniques (LoRA/QLoRA, PEFT, instruction tuning, preference optimization/RLAIF) with robust HDL-specific evals: compile-/lint-/simulate-based pass rates, pass@k for code generation, constrained decoding to enforce syntax, and "does-it-synthesize" checks.
Design privacy-first ML pipelines on AWS:
  Training/customization and hosting using Amazon Bedrock (including Anthropic models) where appropriate; SageMaker (or EKS + KServe/Triton/DJL) for bespoke training needs.
  Artifacts in S3 with KMS CMKs; isolated VPC subnets and PrivateLink (including Bedrock VPC endpoints), IAM least-privilege, CloudTrail auditing, and Secrets Manager for credentials.
  Enforce encryption in transit/at rest, data minimization, and no public egress for customer/RTL corpora.
Stand up dependable model serving: Bedrock model invocation where it fits, and/or low-latency self-hosted inference (vLLM/TensorRT-LLM), autoscaling, and canary/blue-green rollouts.
Build an evaluation culture: automatic regression suites that run HDL compilers/simulators, measure behavioral fidelity, and detect hallucinations/constraint violations; model cards and experiment tracking (MLflow/Weights & Biases).
Partner deeply with hardware design, CAD/EDA, Security, and Legal to source/prepare datasets (anonymization, redaction, licensing), define acceptance gates, and meet compliance requirements.
Drive productization: integrate LLMs with internal developer tools (IDEs/plug-ins, code review bots, CI), retrieval (RAG) over internal HDL repos/specs, and safe tool-use/function-calling.
Mentor & uplevel: coach ICs on LLM best practices, reproducible training, critical paper reading, and building secure-by-default systems.

10+ years total engineering experience with 5+ years in ML/AI or large-scale distributed systems; 3+ years working directly with transformers/LLMs.
Proven track record shipping LLM-powered features in production and leading ambiguous, cross-functional initiatives at Staff level.
Deep hands-on skill with PyTorch, Hugging Face Transformers/PEFT/TRL, distributed training (DeepSpeed/FSDP), quantization-aware fine-tuning (LoRA/QLoRA), and constrained/grammar-guided decoding.
AWS expertise to design and defend secure enterprise deployments, including:
  Amazon Bedrock (model selection, Anthropic model usage, model customization, Guardrails, Knowledge Bases, Bedrock runtime APIs, VPC endpoints)
  SageMaker (Training, Inference, Pipelines), S3, EC2/EKS/ECR, VPC/Subnets/Security Groups, IAM, KMS, PrivateLink, CloudWatch/CloudTrail, Step Functions, Batch, Secrets Manager.
Strong software engineering fundamentals: testing, CI/CD, observability, performance tuning; Python a must (bonus for Go/Java/C++).
Demonstrated ability to set technical vision and influence across teams; excellent written and verbal communication for execs and engineers.


Language Skills

English

Note for Users

This job posting comes from a TieTalent partner platform. Click "Apply Now" to submit your application directly on their website.
