Job description:
For our client we are looking for an Operations Engineer (f/m/d) Private Cloud with Kubernetes.
Start: 12.01.2026
Duration: 6 months, long-term extension desired
Capacity: 80-100%
Location: 75% remote, 25% Berlin or Frankfurt (1 week Berlin/Frankfurt, 3 weeks remote, in rotation), up to 50% onsite during peak times
Language: English is a must, German is a plus
Team:
The Developer 4 Platform (D4P) team focuses on providing products and services that empower Engineering and Operations teams under the program, supporting the needs of Product Engineering. The team is committed to delivering, managing, and optimizing essential DevOps tools that facilitate seamless continuous integration (CI) and continuous delivery (CD) across the platform and its services.
Objectives / Tasks:
- Consult on CI/CD pipelines and ensure operational readiness for deployments
- Ensure operational stability and responsiveness for D4P
- Reduce operational toil and improve service reliability
- Ensure platform operations adhere to security and compliance requirements
- Ensure documentation is accurate and up to date
Skills (must-have):
- At least 5 years of operational experience with self-managed Kubernetes clusters, self-managed services providing Kubernetes clusters, and production applications or systems in on-premises environments
- Deep understanding and expertise in networking concepts, including protocols, load balancing, and security
- In-depth knowledge of and implementation experience with CI/CD processes, tooling (e.g., GitLab, Jenkins, Tekton, Argo Workflows, Argo CD), and concepts, including the associated quality and security assurance for software delivery
- Fundamental understanding of core operations processes (incident management, change management, problem management, IT Service Management) as well as SRE concepts
- Experience in gathering operational insights from monitoring or observability, including SLI/SLA/SLO management and tracking
- Hands-on experience in documenting procedures properly and enforcing clear runbooks or playbooks
- Hands-on experience with monitoring and logging tools (e.g., Prometheus, Grafana, Datadog)
Skills (should-have):
- Project experience in software engineering (in Go, C/C++, or Python) with significant experience in building RESTful services in distributed environments