QA Engineer - Load Testing Specialist (2-month contract)
Monolithai
London, England, United Kingdom
About
release focused on improving concurrency and high request load handling.
This fast-paced, short-term engagement requires someone who can quickly understand complex distributed systems, design comprehensive load tests, and work collaboratively with a rapidly growing engineering team to ensure our new environment meets performance requirements.
Primary Responsibilities
Design and Implement Automated Load Testing Framework
Develop comprehensive load tests for FastAPI endpoints, Temporal workflows/activities, and AWS service interactions
Create realistic test scenarios simulating concurrent workflow execution patterns, including graph-based workflow orchestration
Build automated test suites that measure system behavior under varying concurrency levels and request loads
Performance Analysis and Bottleneck Identification
Monitor and analyze system performance across the entire stack (API layer, Temporal workers, AWS services)
Identify concurrency limitations in Temporal workflow execution, AWS service limits (Athena, ECS), and inter‑component communication
Document performance characteristics including response times, throughput limits, and failure modes under load
Collaborate on Non‑Functional Requirements (NFR) Definition
Work with Customer Success and Product teams to understand business requirements and translate them into measurable performance criteria
Iterate on acceptable concurrency thresholds, latency targets, and throughput requirements
Validate that proposed NFRs are realistic and achievable given architectural constraints
System Documentation and Knowledge Extraction
Build an understanding of the existing system through code review, discussions with the development team, and exploratory testing
Create clear documentation of test methodologies, results, and recommendations for future testing
Recommendation and Optimization Guidance
Provide actionable recommendations for removing identified bottlenecks
Suggest configuration optimizations for Temporal (worker pools, task queues) and AWS services (Athena concurrency, ECS capacity)
Rapid Communication and Status Reporting
Maintain daily/frequent communication with the Tech Lead regarding project progress, blockers, and findings
Quickly escalate issues that could impact the aggressive timeline
Present findings and recommendations to technical and non‑technical stakeholders
Cross‑Component Integration Testing
Test complex scenarios involving graph execution triggering node workflows across multiple system boundaries
Validate S3 read/write operations under concurrent load
Ensure inter‑component communication (API → Temporal, Temporal Activity → API triggers) performs reliably at scale
Key Performance Indicators
Test Coverage and Execution
Complete automated load test suite covering all critical components within first 3 weeks
Execute baseline and progressive load tests identifying maximum sustainable concurrency levels
Bottleneck Identification and Impact
Identify and document top 5‑7 performance bottlenecks with clear impact analysis
Provide actionable remediation recommendations with estimated effort and impact for each bottleneck
NFR Definition and Validation
Collaborate with stakeholders to define measurable NFRs within first 2 weeks
Validate that the system meets the agreed NFR criteria, or document the gaps against them, by project end
Documentation and Knowledge Transfer
Deliver comprehensive test documentation, results analysis, and system performance characteristics
Conduct knowledge transfer sessions ensuring the team can maintain and extend the testing framework
Project Velocity and Communication
Meet weekly milestone targets in this fast‑paced 2‑month engagement
Maintain proactive communication rhythm (daily stand‑ups, weekly detailed reports to Tech Lead)
Required Qualifications
Experience:
4+ years of experience in QA/performance testing roles
2+ years of hands‑on experience with load testing distributed systems and microservices architectures
Proven experience with load testing tools (e.g., k6, JMeter, Locust, Gatling, Artillery)
Experience testing workflow orchestration systems (Temporal, Airflow, Prefect, or similar)
Demonstrated ability to test systems integrating with AWS services (particularly Athena, ECS, S3)
Technical Skills:
Strong proficiency in Python (required for test automation and working with FastAPI, Temporal)
Experience with REST API testing and performance validation
Understanding of distributed systems concepts: concurrency, queueing, backpressure, rate limiting
Familiarity with AWS infrastructure and service limits
Experience with monitoring and observability tools (Prometheus, Grafana, Datadog, or similar)
Proficiency with Git and CI/CD pipelines
Ability to read and understand code in order to design effective tests
Immediate Availability:
Ability to start in early January 2025 and commit to a focused 2‑month engagement
Availability for full‑time contract work during project duration
Preferred Qualifications
Direct experience with Temporal (workflows, activities, workers)
Experience with containerized workloads and Docker/ECS
Prior work in fast‑paced startup or scale‑up environments
Experience with infrastructure‑as‑code (Terraform, CloudFormation)
Background in Site Reliability Engineering (SRE) or DevOps practices
Previous contract/consulting experience with rapid knowledge acquisition
Experience with graph‑based workflow systems or DAG execution engines
Knowledge of AWS service limits and optimization strategies
Essential Soft Skills
Self‑Direction and Initiative – Ability to operate independently in an ambiguous, fast‑moving environment with minimal documentation; Proactive problem‑solving mindset; Comfortable making pragmatic decisions quickly in a time‑constrained project
Communication and Collaboration – Exceptional communication skills for extracting knowledge through conversations with existing team members; Ability to translate technical findings into clear, actionable recommendations for diverse audiences; Comfortable asking clarifying questions and challenging assumptions respectfully; Strong written communication for documentation and status updates
Adaptability and Learning Agility – Quick learner who can rapidly understand complex, poorly documented systems; Flexible and comfortable with changing priorities in a 15‑person team that is doubling in size; Thrives in fast‑paced environments with aggressive timelines; Comfortable with "good enough" when perfection isn’t achievable under constraints
Pragmatism and Results Orientation – Focused on delivering practical, actionable outcomes within tight timeframes; Understands balance between thoroughness and speed in a 2‑month engagement; Comfortable with "good enough" when perfect isn’t achievable within constraints
Stakeholder Management – Skilled at managing expectations with technical leadership about realistic timelines and trade‑offs; Diplomatic when delivering difficult news about performance limitations or bottlenecks; Collaborative approach when working with CS and Product on NFR definition
Key Challenges in This Role
Rapid Knowledge Acquisition with Limited Documentation
The existing system lacks comprehensive documentation; requires quickly building an understanding through code review, system exploration, and frequent discussions with the development team
Success requires comfort with ambiguity and strong investigative skills
Aggressive Timeline with High Impact
A 2‑month timeline to design tests, execute comprehensive load testing, identify bottlenecks, and deliver actionable recommendations is extremely tight
Must balance thoroughness with pragmatism; prioritize ruthlessly to ensure critical areas are covered
Complex Distributed System with Multiple Integration Points
The system involves multiple layers (FastAPI, Temporal, AWS services) with complex inter‑component communication patterns (graph → node workflows)
Must understand the entire stack to design realistic, comprehensive load tests that expose real‑world bottlenecks
Language Skills
- English