Senior Performance & Reliability QAE
Rewst
Location
Remote, Tampa Office
Employment Type
Full time
Location Type
Remote
Department
R&D
Compensation
$130K – $150K • Offers Equity
Company Description:
Rewst is a tool designed for Managed Service Providers (MSPs) to streamline and automate their processes, saving valuable time and effort. Our platform helps MSPs achieve big automation wins, resulting in increased productivity and efficiency. We value the flexibility of remote work and enjoy opportunities to collaborate in person on occasion.
As a Performance & Reliability Quality Assurance Engineer at Rewst, you’ll serve as the primary QA resource for our Platform Squad. In this role, you will own performance, load, resilience, and scalability testing strategies for the core systems that power our automation platform. You will design, execute, and automate performance tests for distributed cloud services while ensuring observability, reliability, and production readiness. This role requires strong technical expertise, collaboration across engineering teams, and a passion for preventing issues before they reach customers.
You will play a key role in improving platform stability, accelerating CI/CD maturity, and ensuring that high-quality, resilient services are delivered with every release.
Responsibilities:
Performance & Load Testing Ownership: Design, build, and execute scalable performance and load testing strategies to validate infrastructure changes, multi-tenant behavior, database performance, and resource utilization under realistic and extreme conditions.
Reliability & Scalability Validation: Ensure systems scale efficiently by testing Kubernetes autoscaling, connection pooling, queue saturation, and regional workloads.
Automation Integration: Develop and automate performance regression tests, smoke tests, and reliability checks integrated into CI/CD pipelines so that the platform is ready to deploy at any time.
Observability Testing: Validate monitoring, alerting, and dashboard accuracy for production readiness. Ensure early-warning signals exist for performance degradation and capacity risks.
Cross-Functional Collaboration: Partner closely with Platform, SRE, and DevOps engineers to ensure testing aligns with architectural decisions and operational requirements. Support incident postmortems and translate learnings into preventative test coverage.
Tooling & Framework Improvements: Evaluate and implement performance tools, metrics generation, and automation frameworks that improve testing efficiency and coverage.
Environment & Chaos Testing: Perform stress, failover, and chaos testing to assess system resilience and validate reliability under failure scenarios.
Continuous Improvement: Drive enhancements in testing processes, quality metrics, and automation capabilities to increase confidence in production deployments and reduce incident frequency.
Required Skills and Qualifications:
Experience: 7–10+ years in QA engineering, SRE, or software development with a focus on performance and reliability testing
Performance Tools: Expertise with tools such as LoadRunner, k6, JMeter, or similar frameworks
Programming: Proficiency in JavaScript/TypeScript for scripting and automation; experience with Bash and CI workflows
Distributed Systems Understanding: Hands-on experience testing containerized applications and cloud-native architectures
Cloud Proficiency: Strong working knowledge of AWS services (EKS, RDS, S3, networking fundamentals)
CI/CD Integration: Experience integrating tests into pipelines (GitHub Actions, CircleCI, Jenkins, etc.)
Database Testing: Familiarity with PostgreSQL performance metrics, query optimization, and connection pooling, or comparable experience with another database
Communication Skills: Ability to clearly articulate findings through documentation and collaboration with engineering teams
Analytical Approach: Strong root cause investigation skills and a proactive mindset toward issue prevention
Above & Beyond:
Familiarity with Kubernetes autoscaling (KEDA), Helm, and container orchestration best practices
Experience validating observability tooling such as Prometheus, Grafana, Loki, Mimir, SigNoz, or Sentry
Exposure to RabbitMQ, distributed messaging, and multi-tenant workload simulations
Experience with high-availability systems and disaster recovery testing
Previous work in a startup or high-growth SaaS environment
Robotic Process Automation (RPA) experience
Why Join Us?
Play a crucial role in increasing platform reliability while reducing major incidents
Opportunity to make a significant impact in a fast-growing startup environment
Collaborative and inclusive culture that values creativity, diversity, and innovation
Competitive compensation package including benefits and equity opportunities
Opportunity to build new testing capabilities from the ground up and make a lasting impact
Flexible work arrangements and a supportive work-life balance