Sr. Technical Solutions Architect

Softchoice
  • US
    United States

About

We are seeking a Senior Technical Solutions Architect, AI, to serve as a hands-on, platform-agnostic technical architect for our strategic AI engagements. This person sits at the intersection of customer strategy, applied AI engineering, and modern software delivery. They translate ambiguous business problems into working prototypes, scalable reference architectures, and production-grade solutions across public-cloud hyperscaler AI platforms and sovereign (on-premise / private) AI environments.

The ideal candidate is equally comfortable whiteboarding an agentic architecture with a CIO, writing the proof-of-concept code that proves it works, and guiding a client engineering team through the secure path to production. They are vendor-fluent but vendor-neutral, recommending the right tool for the workload, the data, the risk profile, and the budget.

Solutioning & Architecture
  • Design end-to-end AI solutions spanning Generative AI (RAG, CAG, GraphRAG, fine-tuning, model distillation) and agentic AI (tool-using agents, multi-agent orchestration, MCP-based integrations).
  • Architect across all major hyperscaler AI stacks: AWS (Bedrock, SageMaker, Q), Microsoft Azure (Azure AI Foundry, Azure OpenAI), and Google Cloud (Vertex AI, Gemini). Recommend the right platform per workload rather than defaulting to a single provider.
  • Architect sovereign / on-premise AI solutions for clients with data residency, regulatory, IP, or air-gapped requirements, using stacks such as NVIDIA AI Enterprise (NIM, NeMo, Blueprints), Dell AI Factory, HPE Private Cloud AI, Red Hat OpenShift AI, Run:ai, and open-source model serving (vLLM, TGI, Ollama).
  • Develop reusable reference architectures, decision frameworks, and trade-off analyses (cost, latency, accuracy, governance, sovereignty) that scale across the practice.

Rapid Prototyping
  • Build working prototypes, not just slides. Translate client problem statements into functional demos and pilots in days, not months.
  • Stand up RAG, CAG, and agentic workflows quickly using frameworks such as LangChain / LangGraph, LlamaIndex, CrewAI, AutoGen, Semantic Kernel, and MCP-compliant agent toolchains.
  • Integrate vector stores (Pinecone, Weaviate, Milvus, Chroma, pgvector, OpenSearch), graph stores (Neo4j, Neptune), and hybrid retrieval pipelines as the use case demands.
  • Run rigorous, repeatable evals on prototypes (groundedness, faithfulness, latency, cost-per-task, tool-use accuracy) so recommendations are evidence-based.

AI-Native Engineering & Modernization
  • Lead solutioning for AI-native software engineering engagements: AI-assisted development, code refactoring at scale, tech debt burndown, legacy modernization, test generation, and documentation regeneration.
  • Architect Secure SDLC (SSDLC) practices into every AI-built or AI-assisted codebase: threat modeling, SAST/DAST integration, SBOM generation, dependency hygiene, secrets management, and supply-chain security.
  • Advise clients on integrating AI coding agents (Claude Code, Cursor, GitHub Copilot Workspace, Devin, and others) into their existing SDLC and DevSecOps toolchains without compromising guardrails.
  • Define MLOps / LLMOps / AgentOps patterns: model and prompt versioning, evaluation pipelines, observability (traces, token usage, drift), guardrails, and human-in-the-loop review.

AI Security
  • Conduct AI-specific threat modeling for every solution, covering adversarial inputs, prompt injection, jailbreaking, model inversion, training data extraction, and indirect injection via tool outputs or retrieved documents, and translate findings into concrete mitigations in the architecture.
  • Design multi-layer guardrail architectures for both hosted API models and self-hosted open-weight deployments: input sanitization and intent classification, output filtering (PII redaction, toxicity screening, factual grounding checks), content safety policies, and fallback / refusal handling.
  • Enforce least-privilege access control for agentic systems: scope tool permissions, define agent authorization boundaries, audit and log all tool invocations, and ensure agents cannot escalate privileges or exfiltrate data outside approved boundaries.
  • Maintain end-to-end AI supply chain security: vet third-party model weights and datasets for backdoors or poisoning, validate fine-tuned model integrity, enforce cryptographic signing of model artifacts, and apply model cards and datasheets as governance artifacts.
  • Align AI solutions to applicable compliance frameworks (NIST AI RMF, OWASP LLM Top 10, ISO/IEC 42001, EU AI Act, and relevant sector-specific regulations) and produce the risk documentation, impact assessments, and audit trails clients need to satisfy internal governance and external regulators.

Client Engagement & Enablement
  • Serve as the senior technical voice in client conversations, from executive briefings through deep technical design sessions.
  • Partner with sales, delivery, and practice leadership to scope statements of work, estimate effort, and de-risk delivery.
  • Mentor architects, engineers, and consultants across the broader AI practice; raise the technical bar through code reviews, internal enablement, and reusable assets.
  • Stay ahead of the field: evaluate emerging models, frameworks, and protocols (e.g., MCP, A2A, ACP, new agent frameworks, new sovereign AI stacks) and bring well-reasoned points of view back to the practice.

What you'll bring to the table:
  • 8+ years of progressive experience in software engineering, solutions / enterprise architecture, or applied AI/ML, with at least 2 years in a hands-on Generative AI or agentic AI role.
  • Demonstrated ability to rapidly prototype AI solutions and ship working code, not just designs or documents.
  • Deep, hands-on experience with at least one of the three major hyperscaler AI platforms (AWS, Azure, GCP) and a working understanding of the second and third.
  • Production experience designing and shipping RAG and/or agentic systems, including practical familiarity with chunking strategies, embedding model selection, retrieval evaluation, and orchestration patterns.
  • Working knowledge of MCP (Model Context Protocol) and modern agent-tool integration patterns; ability to design MCP servers and clients, and to reason about when MCP is the right abstraction versus alternatives.
  • Strong understanding of CAG (Cache-Augmented Generation), RAG variants (naive, hybrid, GraphRAG, agentic RAG), and the trade-offs between each.
  • Proficiency in Python; comfort in at least one additional language (TypeScript/JavaScript, Go, Java, or C#).
  • Experience integrating with enterprise systems: REST/GraphQL APIs, event streams (Kafka, EventBridge), identity (OIDC, SAML, OAuth2), and enterprise data platforms (Snowflake, Databricks, Fabric, BigQuery).
  • Excellent written and verbal communication; able to move fluidly between executive narrative and engineering whiteboard.
  • Foundational fluency in AI security concepts: able to identify and articulate risks such as prompt injection, data poisoning, model extraction, and inference-time attacks, and to reason about appropriate mitigations for each in the context of a given architecture and risk tolerance.

Strongly Preferred
  • Software development background with real production experience across the SDLC and Secure SDLC (SSDLC), including CI/CD, infrastructure as code (Terraform, Pulumi, Bicep), containers and Kubernetes, and DevSecOps tooling.
  • Experience leading code refactoring, technical debt remediation, and legacy modernization programs, ideally with AI-assisted approaches.
  • Experience designing sovereign / on-premise AI deployments: NVIDIA NIM / NeMo, OpenShift AI, Run:ai, vLLM at scale, GPU capacity planning, and on-prem vector / graph stores.
  • Background in security and governance: prompt injection defense, output filtering, data loss prevention, model risk management, NIST AI RMF, ISO/IEC 42001, and EU AI Act readiness; familiarity with the OWASP LLM Top 10, adversarial ML attack taxonomies (MITRE ATLAS), and red-teaming / evaluation techniques for LLMs; experience translating these frameworks into practical control designs rather than checkbox compliance.
  • Experience fine-tuning, distilling, or post-training open-weight models (Llama, Mistral, Qwen, Gemma) for enterprise use cases.
  • Industry experience in regulated verticals (financial services, healthcare, public sector, defense) where sovereignty and compliance are non-negotiable.
  • Relevant certifications (AWS / Azure / GCP AI specialty, CKA/CKAD, CISSP, NVIDIA-certified) are useful, but capability is weighted more heavily than credentials.

Education
  • Bachelor's degree in Computer Science, Engineering, Mathematics, or a related technical field, or equivalent demonstrable experience. Advanced degree is welcomed but not required.

What Sets a Great Candidate Apart
  • A pragmatic, opinionated point of view on when not to use GenAI or agents, and the judgment to steer clients toward the right answer even when it isn't the flashy one.
  • Curiosity that runs ahead of the market: already experimenting with the next protocol, the next model, the next orchestration pattern before clients ask.
  • Comfort with ambiguity: the ability to walk into a half-formed problem, frame it, prototype against it, and leave the client with a clearer path forward than they had that morning.

Location & Travel
  • Remote-friendly with periodic travel to client sites and internal events (estimated 0–15%).

Compensation
  • United States

Language skills

  • English
Note for users

This job posting comes from a TieTalent partner platform. Click "Apply Now" to submit your application directly on their website.