Remote: Monitoring and Observability Expert with Prometheus, Grafana etc.

Job description:

For our client we are looking for a Monitoring Expert (f/m/d).
Key details:
Start: asap
Duration: 31.03.2025++
Capacity: full-time preferred, min. 50% (20 hrs/week)
Location: remote
Tasks:
We are looking for an expert in Monitoring and Observability who will be involved in designing, implementing, and managing comprehensive monitoring solutions using Prometheus, Grafana, SNMP Exporter, Streaming Telemetry, OpenTelemetry, and other related technologies. A background in time-series databases, network monitoring, and dashboards is required.
Responsibilities:
- Design, implement, and manage Prometheus-based monitoring solutions, including configurations and alert rules.
- Develop and maintain interactive and visually appealing Grafana dashboards.
- Configure SNMP Exporter modules and scrape jobs to collect SNMP metrics efficiently across different network technologies.
- Use Git confidently: clone working branches, develop, and commit to the main branch, or follow another workflow while demonstrating a strong command of Git.
- Identify and onboard new metrics from various systems and applications, developing data pipelines for metrics collection and storage.
- Optimize and scale monitoring environments to handle large volumes of metrics and ensure comprehensive monitoring coverage.
- Implement and manage Streaming Telemetry solutions for real-time data collection and monitoring.
- Integrate and manage OpenTelemetry for comprehensive tracing and observability across services.
- Troubleshoot and resolve issues related to data collection, monitoring configurations, and dashboard performance.
- Collaborate with DevOps, development, and operations teams to ensure proper instrumentation of applications and infrastructure.
- Document configurations and procedures, and provide training to team members and stakeholders.
Skills:
- Familiarity with network monitoring tools and practices.
- Extensive experience with Prometheus and related technologies (Alertmanager, Pushgateway, etc.).
- Strong knowledge of time-series databases and monitoring concepts.
- Proficiency in writing Prometheus queries (PromQL).
- Strong experience with Grafana and its ecosystem.
- Proficiency in creating and managing Grafana dashboards and panels.
- Knowledge of data visualization principles and best practices.
- Familiarity with monitoring and observability tools and practices.
- Strong knowledge of SNMP protocols and network device management.
- Experience with SNMP Exporter and its integration with Prometheus.
- Strong skills in creating SNMP modules and scrape configs for various network technologies.
- Strong Git experience.
- Strong understanding of metrics and monitoring concepts.
- Experience with metrics collection tools (Prometheus, Telegraf, Collectd, etc.).
- Experience with Streaming Telemetry solutions for real-time monitoring.
- Experience with OpenTelemetry for tracing and observability.
- Familiarity with Linux/Unix systems and scripting languages (Bash, Python).
- Experience with containerization and orchestration tools (Docker, Kubernetes).
Qualifications:
- Bachelor’s degree in Computer Science, Engineering, or related field.
- 5 years of experience in monitoring and observability roles.
- Proficiency in tools like Prometheus, Grafana, PromQL, Alertmanager, Alert Framework, GitHub, SNMP Exporter, Streaming Telemetry, and OpenTelemetry (OTel).
- Strong coding and scripting skills.
- Excellent problem-solving abilities and attention to detail.
- Strong communication and teamwork skills.