Job description:
For our client, we are looking for an Operations Engineer (DevOps / SRE) (m/f/d):
Project Description:
Our client is building an internal, service-oriented, cloud-native platform that enables product development teams with self-service capabilities to develop, run, and operate their software products. The platform covers core capabilities such as application infrastructure, service lifecycle management, build & delivery, data services, and operational services, running in a hybrid cloud setup (private cloud plus selected public clouds).
In this role, you will contribute to the stability, reliability, and continuous improvement of a Kubernetes-based platform environment and its DevOps toolchain, covering incident support, automation, and operational excellence.
Project Parameters:
Start: 20.04.2026
Duration: 20.04.2026 – 31.12.2026 (option to extend)
Workload: Full-time
Location: Remote and on-site in Berlin (up to 50% on-site during ramp-up)
Tasks:
- Support incident management, including critical incident consulting, root cause analysis, and advanced troubleshooting across production and staging environments
- Operate and improve Kubernetes-based deployments, including Helm, Infrastructure as Code, and GitOps-driven operations
- Maintain and enhance CI/CD pipelines and ensure reliable operation of artifact/image registries, security policies, and configuration management practices
- Drive observability and automation; consult stakeholders to strengthen operational stability
- Execute change management and coordinate software version updates in a controlled manner
- Identify and resolve performance bottlenecks, improve rollout/deployment processes, and establish/improve operational processes
- Ensure availability and integrity of critical services and support stakeholders on complex operational challenges
Your Experience & Skills:
- Proven experience in DevOps Engineering / Operations / SRE, including process optimization and change management
- Strong hands-on expertise in Kubernetes and cloud-native ecosystems; deep knowledge of Helm, Infrastructure as Code, and GitOps principles
- Experience operating deployment tools (preferred: Harbor, OpenTofu Controller, Kyverno-Operator, ArgoCD)
- Hands-on experience with Git and CI tools (preferred: GitLab, Harness)
- Solid Linux administration plus scripting skills (e.g., Bash, Python, Go) and experience with observability tooling
- Experience with enterprise-scale and/or multi-cluster environments is a plus
- Experience with documentation/management tooling (preferred: Jira Service Management, Jira, Confluence)
- Language: Fluent English (at least C1, written and spoken)
Nice to have:
- FluxCD, Velero, JFrog Artifactory, Backstage