Keep Quanloop’s platforms fast, reliable and secure. You will own monitoring and stability, automate repetitive work and help teams ship safely with solid CI/CD. This is a hands‑on role in our CORE environment, partnering with engineers and architects to turn operational excellence into everyday practice.
Job Responsibilities
- Operate and harden Linux (and some Windows) servers; manage users, permissions and patching
- Build and maintain containerised workloads with Docker and Kubernetes
- Implement and improve CI/CD pipelines for safe, frequent releases
- Configure monitoring, logging and alerting with actionable, low‑noise signals
- Troubleshoot performance and availability issues across app, OS, network and database layers
- Automate routine tasks with scripts and Infrastructure as Code; document runbooks
- Manage backups, restore tests and disaster‑recovery procedures
- Apply “security first” principles: least‑privilege IAM, secrets management, TLS and baseline configs
- Support capacity planning, scalability testing and cost‑efficient architecture choices
- Collaborate with Product, Engineering and Security on incident reviews and continuous improvement
Key Technologies
- Linux administration (Windows familiarity welcome)
- Docker, Kubernetes
- CI/CD (Git‑based workflows; Jenkins/GitLab/GitHub Actions)
- Monitoring and logging (e.g., Prometheus/Grafana, ELK/EFK, OpenTelemetry)
- Networking fundamentals, DNS, load balancers, firewalls
- Infrastructure as Code (e.g., Terraform/Ansible)
- Cloud (GCP preferred; AWS acceptable)
Qualifications
- 3+ years of systems administration or SRE experience running production services
- Strong Linux skills, plus hands‑on experience with containerisation (Docker) and orchestration (Kubernetes)
- Practical CI/CD experience and sound Git workflows
- Solid grasp of observability (metrics, logs, traces) and alert design
- Competence in scripting/automation (Bash/Python or similar) and IaC
- Understanding of security hardening, IAM and secrets management
- Clear communicator who documents well and works effectively with developers
We encourage applications from all qualified candidates and provide reasonable accommodations on request (email [email protected]).
Other Skills
- Calm, methodical troubleshooting under time pressure
- Ownership mindset with tidy runbooks and post‑incident follow‑through
- Performance tuning and cost‑aware decision‑making
- Familiarity with database operations (backups, basic tuning) is a plus
- Google Professional Cloud DevOps Engineer certification (or willingness to obtain it) is a plus