
Machine Learning Engineer, LLM Fine-Tuning

FIRST SOFTSOLUTIONS INC
  • US
    United States

About

We are actively hiring for a Machine Learning Engineer.
Location: San Jose, CA (Onsite)
Skills: LLM Fine-Tuning (Verilog/RTL Applications), AWS (primary; Bedrock + SageMaker)
  • Own the technical roadmap for Verilog/RTL-focused LLM capabilities, from model selection and adaptation to evaluation, deployment, and continuous improvement.
  • Lead a hands-on team of applied scientists/engineers: set direction, unblock technically, review designs/code, and raise the bar on experimentation velocity and reliability.
  • Fine-tune and customize models using state-of-the-art techniques (LoRA/QLoRA, PEFT, instruction tuning, preference optimization/RLAIF) with robust HDL-specific evals: compile-/lint-/simulate-based pass rates, pass@k for code generation, constrained decoding to enforce syntax, and "does-it-synthesize" checks.
  • Design privacy-first ML pipelines on AWS:
    • Training/customization and hosting using Amazon Bedrock (including Anthropic models) where appropriate; SageMaker (or EKS + KServe/Triton/DJL) for bespoke training needs.
    • Artifacts in S3 with KMS CMKs; isolated VPC subnets and PrivateLink (including Bedrock VPC endpoints), IAM least-privilege, CloudTrail auditing, and Secrets Manager for credentials.
    • Enforce encryption in transit/at rest, data minimization, and no public egress for customer/RTL corpora.
  • Stand up dependable model serving: Bedrock model invocation where it fits, and/or low-latency self-hosted inference (vLLM/TensorRT-LLM), autoscaling, and canary/blue-green rollouts.
  • Build an evaluation culture: automatic regression suites that run HDL compilers/simulators, measure behavioral fidelity, and detect hallucinations/constraint violations; model cards and experiment tracking (MLflow/Weights & Biases).
  • Partner deeply with hardware design, CAD/EDA, Security, and Legal to source/prepare datasets (anonymization, redaction, licensing), define acceptance gates, and meet compliance requirements.
  • Drive productization: integrate LLMs with internal developer tools (IDEs/plug-ins, code review bots, CI), retrieval (RAG) over internal HDL repos/specs, and safe tool-use/function-calling.
  • Mentor & uplevel: coach ICs on LLM best practices, reproducible training, critical paper reading, and building secure-by-default systems.

  • 10+ years total engineering experience with 5+ years in ML/AI or large-scale distributed systems; 3+ years working directly with transformers/LLMs.
  • Proven track record shipping LLM-powered features in production and leading ambiguous, cross-functional initiatives at Staff level.
  • Deep hands-on skill with PyTorch, Hugging Face Transformers/PEFT/TRL, distributed training (DeepSpeed/FSDP), quantization-aware fine-tuning (LoRA/QLoRA), and constrained/grammar-guided decoding.
  • AWS expertise to design and defend secure enterprise deployments, including:
    • Amazon Bedrock (model selection, Anthropic model usage, model customization, Guardrails, Knowledge Bases, Bedrock runtime APIs, VPC endpoints)
    • SageMaker (Training, Inference, Pipelines)
    • S3, EC2/EKS/ECR, VPC/Subnets/Security Groups, IAM, KMS, PrivateLink, CloudWatch/CloudTrail, Step Functions, Batch, Secrets Manager
  • Strong software engineering fundamentals: testing, CI/CD, observability, performance tuning; Python a must (bonus for Go/Java/C++).
  • Demonstrated ability to set technical vision and influence across teams; excellent written and verbal communication for execs and engineers.

Language skills

  • English
Note for users

This job posting comes from a TieTalent partner platform. Click "Apply Now" to submit your application directly on their website.