
This job posting is no longer available


Senior Cloud Optimization Engineer

StratusGrid
  • US
    Saint Paul, Illinois, United States

About

StratusGrid delivers a high-trust, high-competence customer experience around cloud optimization, and we’re productizing that expertise into our agentic platform, Stratusphere. This role is for an engineer who can manually assess, design, and execute cloud optimization work in real customer environments today, while also helping us validate, calibrate, and improve the agent-driven workflows that will increasingly do that work tomorrow.
You will operate as a solutions-driven technical partner to customer stakeholders, navigate objections and organizational dynamics, and translate technical choices into win-win business outcomes. You’ll use AI daily to move faster and deliver better results, and you’ll partner closely with Product and Engineering to evaluate Stratusphere-generated savings opportunities and execution plans, passing or failing them against real-world constraints, improving accuracy and safety, and accelerating the roadmap based on what you learn in the field.
Responsibilities
Customer Outcomes & Optimization Delivery:
Manually assess customer AWS/Azure environments (and eventually GCP), identify optimization opportunities, quantify impact, propose solutions, and execute approved changes safely and efficiently, delivering measurable savings and operational improvements.
Stratusphere Output Review & Calibration:
Review cost-savings opportunities, recommendations, and execution plans generated by Stratusphere. Validate assumptions, safety, feasibility, and expected impact; approve or reject proposed work; and provide structured feedback that improves agent accuracy, reliability, and customer readiness over time.
Build Trust & Navigate Stakeholders:
Build and nurture strong customer relationships through clear communication, genuine care, and consistent follow-through. Confidently navigate technical discussions with both technical and non-technical stakeholders, overcome objections by bringing options and recommendations, and make approvals easy.
Big-Picture Decision Support:
Connect day-to-day optimization work to customer goals and StratusGrid’s strategy. Explain the business and technical implications of decisions (cost, risk, reliability, performance, operational overhead), and guide stakeholders toward win-win outcomes.
Agent Improvement Feedback Loop:
Capture patterns in agent errors and blind spots (e.g., missing context, risky sequencing, unclear rollback, incomplete stakeholder info). Propose rubric changes, training examples, and workflow improvements; partner with Product/Engineering to measurably improve quality and safety.
Pull Request–Quality Change Proposals:
Produce clear, decision-ready deliverables that quantify business value up front, document risks and mitigations, and demonstrate safety/rollback ability, so customers are making decisions, not providing direction. Where Stratusphere drafts artifacts, you will refine and elevate them to StratusGrid standards.
Execution with Reliability & Urgency:
Own work end-to-end with a strong sense of responsibility. Capture commitments, hit deadlines, communicate status proactively, escalate early, and close the loop visibly, with no surprises.
AI-Enabled Delivery:
Use AI tools daily to accelerate investigation, documentation, analysis, and implementation planning, raising quality, reducing cycle time, and improving customer outcomes while maintaining sound engineering judgment.
Agent Work Product Evaluation & Feedback:
Evaluate, score, and provide actionable feedback on agent-driven outputs (findings, plans, execution steps, customer comms) to continuously improve Stratusphere’s reliability, safety, and usefulness.
Product Partnership & Roadmap Input:
Partner with Product to convert customer problems and recurring friction into clear problem statements, capability gaps, and roadmap recommendations, bringing evidence and pattern recognition from the field.
Cross-Functional Collaboration:
Work in high-visibility channels with Engineering, Product, and Customer teams; share context broadly; ask questions early; and support a safe culture where we learn fast without introducing risk through isolation.
Operational Excellence & Standards:
Follow StratusGrid customer experience standards, change control processes, documentation expectations, and work-system hygiene to ensure consistency, traceability, and scalability.
Requirements
Cloud Platform Expertise (AWS + Azure):
Proven ability to operate in production AWS and Azure environments, including multi-account/subscription structures, governance constraints, and enterprise-grade patterns. (GCP familiarity is a plus; willingness to ramp is required.)
Hands-On Optimization Execution:
  • Compute: rightsizing, scheduling, autoscaling, instance family shifts, Spot strategies where appropriate
  • Storage: lifecycle/tiering, orphaned volumes/snapshots, retention optimization
  • Networking: egress/data transfer analysis, NAT/GW cost drivers, topology-aware recommendations
  • Commitments: Understanding of Savings Plans/RIs/Reservations strategy, coverage, and utilization improvements
Infrastructure-as-Code & Change Safety:
Strong IaC skills (Terraform preferred; CloudFormation and/or Bicep/ARM valued), with disciplined Git workflows, PR-based delivery, and an instinct for rollback plans, validation steps, and minimizing blast radius.
Automation & Scripting:
Proficiency in automation using a modern programming language (Python, TypeScript, Go, etc.); comfort with AWS/Azure CLIs and SDKs to enumerate resources, collect metadata/metrics, and operationalize remediation at scale.
Identity, Access, and Guardrails:
Ability to work effectively within constrained access and compliance requirements.
Observability-Driven Validation:
Ability to use metrics/logs (CloudWatch, Azure Monitor/Log Analytics; Prometheus/Grafana a plus) to assess risk, validate performance impact, and confirm outcomes post-change.
Cloud Networking & Architecture Literacy:
Practical understanding of VPC/VNet constructs, routing, DNS, load balancing, private connectivity patterns, and how architecture decisions affect cost, reliability, and security.
Customer-Facing Communication:
Exceptional written and verbal communication, able to translate complex technical topics into decision-ready narratives for technical and business stakeholders, with clarity, completeness, and empathy. Ability to listen to customer concerns and turn them into actionable solutions.
Solution Mindset & Organizational Navigation:
Proven ability to work through ambiguity, navigate formal and informal org dynamics, align stakeholders, and drive work to resolution without offloading effort to customers.
Strong Ownership & Urgency:
Track record of meeting commitments, proactively communicating status/risks, escalating early, and maintaining high standards of reliability and follow-through.
AI as a Daily Tool:
Demonstrated habit of using AI tools to accelerate analysis and improve quality (while maintaining rigorous verification, security awareness, and sound judgment).
Travel:
Willingness and ability to travel periodically (as needed) for customer engagements, team planning sessions, or onsite work.
Remote-Work-Ready:
Equipped to work effectively in a distributed team environment, including a reliable high-speed internet connection, a professional and distraction-limited workspace, and the ability to consistently communicate, collaborate, and execute independently.
Nice-to-have / Differentiators
  • Experience rightsizing EKS/AKS/GKE clusters with a focus on compute strategy and cost optimization. Ability to implement horizontal and vertical pod autoscalers (HPA & VPA) without sacrificing system stability.
  • CI/CD pipeline experience (GitHub Actions, GitLab CI, Azure DevOps) and policy-as-code exposure (OPA/Sentinel/Azure Policy).
  • Experience with enterprise landing zones (AWS Control Tower / Azure Landing Zones).
  • Experience querying large billing datasets (Athena/BigQuery/ADX/Power BI). Proven ability to translate raw billing data into actionable recommendations to balance business needs with cloud costs.
  • A deep understanding of how cost-saving measures may affect the security of cloud environments. Must be proficient in maintaining least privilege and data protection standards when recommending and executing cost optimization opportunities.
  • Strong grasp of FinOps fundamentals and cost drivers; expert use of native cost tooling and reporting to build credible baselines, forecasts, and realized-savings narratives (e.g., AWS Cost Explorer/CUR, Azure Cost Management exports, budgets, alerts).
About StratusGrid
StratusGrid is building Stratusphere™, a multi-agent platform that turns cloud complexity into measurable outcomes. Stratusphere coordinates specialized infrastructure agents to observe AWS and Azure environments, simulate policy-safe plans, and execute approved changes with auditability and rollback-ready safety, so savings compound over time, security improves, delivery accelerates, and teams spend less time on toil and fire drills.
We’re a team of builders and operators who care about trustworthy automation and real business impact. Our work sits at the intersection of cloud engineering, product, and customer outcomes, helping customers make infrastructure changes they can measure, explain, and stand behind.
At StratusGrid, we recognize the importance of inclusion and the value of diverse perspectives. We are committed to equal employment opportunity regardless of race, color, national or ethnic origin, age, religion, disability, sexual orientation, gender, or any other characteristic.

Language skills

  • English