We’ve implemented a Prometheus/Grafana/Thanos monitoring setup here and are scraping metrics from our SingleStore clusters using the integrated exporter and Prometheus. I’ve been looking at the documentation and found some dashboards, but these seem to require that Grafana be running on the memsql cluster and use memsql as a datasource, which is not an option for us. I tried importing the dashboards to see if I could modify them to use the metrics coming in from Prometheus, but doing so locks up my browser tab.
We’d additionally like to get some insight or advice on metrics coming from the exporter which would be good for alerts. Do you have any documentation?
Any information would be appreciated!
Thanks,
Garret
Can you please clarify what your roadblock is? Is it that you can’t run Grafana on the SingleStore cluster because you already have a Grafana instance? Or is the issue that you can’t use SingleStore as the data source for the metrics?
If it is the former and you already have a Grafana instance ready that can connect to the metrics SingleStore cluster, you can skip the step to install Grafana on the master aggregator and follow the instructions found here to add the SingleStore metrics data source once you have it set up.
Then you can download the dashboards from here and import them into Grafana.
Hi @garret.coffman. Thank you for trying out SingleStore monitoring.
Our Grafana dashboards do indeed use a SingleStore cluster to monitor another cluster (or itself). I am not sure why they lock up your browser, but the dashboards use a MySQL datasource named ‘monitoring’, and (just a wild guess) perhaps configuring that will unblock you.
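If it helps to see what the dashboards expect before importing them, here is a minimal sketch that lists every datasource a dashboard JSON file refers to. It assumes the usual Grafana export layout, where a `datasource` field is either a plain name or an object with `uid`/`type` keys. The `sample` fragment is hand-written for illustration and is not taken from the real SingleStore dashboards.

```python
import json


def collect_datasources(node, found=None):
    """Recursively collect every 'datasource' value in a dashboard JSON tree."""
    if found is None:
        found = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "datasource" and value is not None:
                if isinstance(value, dict):
                    # Newer exports use an object such as {"type": ..., "uid": ...}
                    found.add(value.get("uid") or value.get("type") or "")
                else:
                    # Older exports store the datasource name as a plain string
                    found.add(str(value))
            else:
                collect_datasources(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_datasources(item, found)
    return found


# Hypothetical dashboard fragment, for illustration only
sample = json.loads("""
{
  "panels": [
    {
      "title": "example panel",
      "datasource": "monitoring",
      "targets": [{"datasource": {"type": "mysql", "uid": "monitoring"}}]
    },
    {"title": "no datasource set", "datasource": null}
  ]
}
""")

print(sorted(collect_datasources(sample)))
```

Running the same function on `json.load(open(path))` for a downloaded dashboard should show whether it only ever points at the ‘monitoring’ datasource, which is what would need retargeting to use Prometheus instead.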
Monitoring SingleStore is highly dependent on workload, but there are some high-level metrics that could make sense to watch out of the box for sudden or gradual changes in traffic or errors.
Likewise, if you would like to track host metrics the memsql_sysinfo_* subsystem will be useful.
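To see which of those series your Prometheus is actually receiving, a rough sketch like the following may help. It uses the standard Prometheus HTTP API endpoint `/api/v1/label/__name__/values` to list metric names and filters them by prefix. The base URL is an assumption about your environment, and the helper names are made up for this example.

```python
import json
import urllib.request


def filter_by_prefix(names, prefix):
    """Return the metric names that start with the given prefix, sorted."""
    return sorted(n for n in names if n.startswith(prefix))


def fetch_metric_names(base_url):
    """Fetch all metric names known to a Prometheus-compatible HTTP API."""
    url = base_url.rstrip("/") + "/api/v1/label/__name__/values"
    with urllib.request.urlopen(url, timeout=10) as resp:
        payload = json.load(resp)
    if payload.get("status") != "success":
        raise RuntimeError("unexpected response status: %r" % payload.get("status"))
    return payload["data"]
```

For example, `filter_by_prefix(fetch_metric_names("http://localhost:9090"), "memsql_sysinfo_")` would list the host-level series, assuming Prometheus is reachable at that (hypothetical) address. The resulting names are a reasonable starting point for picking alert candidates.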
One of our grafana dashboards is all about memory allocations, which can reveal some causes of max memory errors. We currently don’t have a more detailed document, but it is on our roadmap.
I installed Grafana via the operator on OpenShift (Grafana v7.5.17).
I downloaded the 4 dashboards for SingleStore. The memory dashboard did not display any charts, while the other dashboards displayed enough information.
The dashboards were built on Grafana 9, so I wonder whether the errors are due to version compatibility. Can you please share the error message so we can debug further?