
Job Summary

Omegahires

🌍 North America 🏠 Remote ⏱ Part-time 💼 Mid-level 🗓 7 weeks ago

Job Title: Grafana Cloud Engineer
Location: Dallas, TX (Hybrid)
Duration: 12 Months

Job Summary

We are seeking an experienced Grafana Cloud Engineer with strong expertise in designing, implementing, and optimizing enterprise observability solutions using Grafana Cloud. The ideal candidate will have hands-on experience with metrics, logs, traces, dashboards, alerting, and cloud-native monitoring architectures.

Key Responsibilities

Grafana Cloud Implementation & Administration
- Design, deploy, and manage Grafana Cloud environments for enterprise monitoring solutions.
- Configure and manage data sources such as Prometheus, Loki, Tempo, InfluxDB, Elasticsearch, and cloud monitoring tools.
- Integrate monitoring platforms with AWS CloudWatch, Azure Monitor, and GCP Operations Suite.
- Implement secure authentication and authorization using SSO, OAuth, LDAP, or Azure AD.
- Optimize Grafana architecture for scalability, performance, and cost efficiency.

Observability & Monitoring Architecture
- Design and implement end-to-end observability stacks covering metrics, logs, and distributed tracing.
- Develop monitoring strategies aligned with SRE and DevOps practices (SLIs, SLOs, SLAs).
- Configure alerting systems using Grafana Alerting, Alertmanager, and Loki alerts with escalation workflows.

Dashboarding & Visualization
- Develop custom Grafana dashboards for application and infrastructure monitoring.
- Translate business and technical monitoring requirements into actionable visualizations.
- Standardize dashboard templates and monitoring frameworks across teams.

Integration & Automation
- Integrate Grafana Cloud with Kubernetes clusters, CI/CD pipelines, and cloud platforms.
- Automate observability deployments using Terraform, Helm, Ansible, or GitOps workflows.
- Implement OpenTelemetry instrumentation for distributed tracing.

Troubleshooting & Optimization
- Diagnose and resolve issues related to metrics, logs, traces, and data source configurations.
- Optimize queries using PromQL, LogQL, and SQL for performance and cost efficiency.
- Ensure high availability and reliability of monitoring platforms.

Documentation & Knowledge Sharing
- Develop architecture documentation, runbooks, and best practice guides.
- Train internal teams on Grafana dashboards, alerting, and observability practices.
- Act as a Subject Matter Expert (SME) for Grafana Cloud initiatives.

Required Qualifications
- 5+ years of experience implementing and managing Grafana or Grafana Cloud.
- Strong hands-on experience with Grafana Mimir, Loki, Tempo, Prometheus, and Alertmanager.
- Experience with PromQL, LogQL, and SQL query optimization.
- Hands-on experience implementing OpenTelemetry instrumentation.
- Experience with cloud platforms such as AWS, Azure, or GCP.
- Strong knowledge of Kubernetes, microservices architecture, and cloud-native observability.
- Experience with DevOps automation tools including Terraform, Helm, Git, and CI/CD pipelines.
- Strong understanding of SRE principles, monitoring design, and performance engineering.
- Excellent collaboration skills for working with cross-functional teams.

Preferred Qualifications
- Grafana Cloud or Prometheus certifications.
- Experience designing multi-tenant monitoring solutions.
- Integration experience with ServiceNow, PagerDuty, Opsgenie, or Jira.
- Knowledge of RBAC, security frameworks, and compliance standards.
- Experience with other observability tools such as Splunk or New Relic.
- Ability to work independently in fast-paced environments and support hands-on implementation.
