About the role
You'll own Qase's entire infrastructure end-to-end, working directly with our VP of Engineering. We're a 20-person engineering team building a test management platform used by engineering teams worldwide. We currently have one infra engineer; you'll take the lead, set direction, and pragmatically evolve the platform for both internal development and external customers.
What you'll do
- Own and evolve our core stack: Kubernetes (EKS), AWS, Aurora PostgreSQL, ClickHouse
- Drive infrastructure decisions pragmatically — understanding trade-offs between cost, reliability, and speed
- Build and improve our internal platform so engineers can ship faster with full observability
- Design multi-tenant architecture: dedicated clusters, account isolation, cost optimization across AWS accounts
- Improve observability across the stack (monitoring, logging, tracing)
- Support our growing investment in AI agents — infrastructure for agentic systems is becoming a key priority
- Manage the full cycle: CI/CD, infrastructure as code, reliability, security hardening, cost management
What we're looking for
- Strong hands-on experience with AWS (especially EKS, RDS/Aurora), Kubernetes, and Terraform
- Experience with observability tools (Datadog, Grafana, or similar)
- Solid understanding of Linux operations and debugging
- Experience managing multi-cluster or multi-tenant Kubernetes environments
- Track record of making pragmatic infrastructure decisions in startups or scale-ups — not just executing, but designing solutions
- Strong communication skills in English: you'll work directly with the VP of Engineering and across engineering teams
Nice to have
- Experience with ClickHouse or similar OLAP databases
- Go development experience
- Background in B2B SaaS platforms
- Experience with AI/agent infrastructure
- Team lead experience or mentoring junior engineers
What makes this interesting
- Full ownership of infrastructure, with no layers between you and decisions
- Direct line to the VP of Engineering
- Real architectural challenges: multi-tenant isolation, geo-distribution, cost optimization
- Greenfield opportunity to shape how the platform scales
- Growing AI/agents investment — you'll build the infrastructure that powers it