Monitoring a GrapheneDB instance

GrapheneDB provides an endpoint for the open-source monitoring tool Prometheus, which allows you to monitor the underlying server of your deployment. The provided metrics (CPU, memory, disk, etc.) will help you understand how load affects the available resources and decide which plan fits your use case.

With Prometheus, you can set up charts to visualize metrics, configure alerts, and connect other tools like Grafana or services like Datadog or New Relic.

This article details a basic Prometheus and Grafana setup, including some example dashboards that you can import.

Finding the Prometheus endpoint

You can find the Prometheus endpoint on the admin console, in the Insights tab of every database. Please note that you will need to whitelist your IP or establish a peering connection if the deployment's network is set to private.
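Once the endpoint is reachable, you can do a quick sanity check from your machine with curl (the hostname below is the example used later in this article; replace it with the URL shown in your Insights tab):

# Fetch the raw metrics; a successful call prints plain-text Prometheus metrics
curl https://db-aqocpxyfkjc1loorgcuq.graphenedb.com:24780/metrics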

Metrics reference

The available metrics are described in the following tables.

JVM metrics | Description
jvm_classes_loaded | The number of classes that are currently loaded in the JVM
jvm_classes_loaded_total | The total number of classes that have been loaded since the JVM started execution
jvm_memory_pool_bytes_used | Used bytes of a given JVM memory pool
jvm_memory_bytes_init | Initial bytes of a given JVM memory area
jvm_memory_bytes_used | Used bytes of a given JVM memory area
jvm_memory_bytes_max | Max bytes of a given JVM memory area
jvm_gc_collection_seconds_count | The number of collection cycles of a given JVM garbage collector
Station metrics | Description
station_spec_memory_limit_bytes | Memory limit for the station in bytes
station_cpu_usage_seconds_total | Total amount of CPU time in seconds
station_memory_usage_bytes | Current memory usage in bytes
station_memory_failures_total | Cumulative count of memory allocation failures
station_cache_bytes | Amount of cache in bytes
station_cpu_load_average_delta | CPU load average percentage
station_last_seen | Last time the station was seen by the exporter
station_network_transmit_bytes_total | Cumulative count of bytes transmitted
station_network_receive_bytes_total | Cumulative count of bytes received
station_uptime_seconds | Station uptime in seconds
station_scrape_error | 1 if there was an error while getting metrics, 0 otherwise
station_fs_reads_bytes_total | Total amount of bytes read from the filesystem
station_fs_writes_bytes_total | Total amount of bytes written to the filesystem
Storage metrics | Description
database_relationships_size | Relationships size on disk in bytes
database_properties_size | Properties size on disk in bytes
database_device_used_size | Used disk size in bytes
database_graphs_size | Graph databases size on disk in bytes
database_nodes_size | Nodes size on disk in bytes
database_plugins_size | Plugins size on disk
database_indexes_size | Indexes size on disk
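
As a quick example of how these metrics can be combined, the following PromQL expression (a sketch, assuming both station memory metrics are reported for your deployment) returns current memory usage as a percentage of the station's limit:

# Memory usage as a percentage of the configured limit
station_memory_usage_bytes / station_spec_memory_limit_bytes * 100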

On Cluster databases, ONgDB and Neo4j Enterprise metrics will also be available. Read more in the Cluster monitoring section below.

Setting up Prometheus and Grafana with Docker

Prerequisites

Docker and docker-compose are needed to run the following steps. You can read here how to download and install them.
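
Before continuing, you can check that both tools are available on your PATH:

# Each command prints its version if the tool is installed
docker --version
docker-compose --version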

Configuration files

You will need to create the following configuration files before running the Prometheus and Grafana containers. First, create the prometheus.yml file:

global:
  scrape_interval:     15s  # By default, scrape targets every 15 seconds.
  evaluation_interval: 15s
 
scrape_configs:
  - job_name: 'graphenedb-prometheus'
    scheme: 'https'
    static_configs:
    - targets: ['db-aqocpxyfkjc1loorgcuq.graphenedb.com:24780']

📘

Please note that we have removed the /metrics part from the given URL: Prometheus expects metrics to be available on targets at the /metrics path by default.

In case you want to monitor all the cluster nodes, read more in the Cluster monitoring section below.

Next, create the datasource.yml file, which provisions Prometheus as a data source in Grafana:

apiVersion: 1
 
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090

Finally, to run Grafana and Prometheus, you will need to create the following docker-compose file. The volume paths should be set to the location of your prometheus.yml and datasource.yml files.

version: "2"
services:
 
  prometheus:
    image: prom/prometheus
    container_name: prometheus
    volumes:
      - /path_to/prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
      - "9090:9090"
 
  grafana:
    image: grafana/grafana:master
    container_name: grafana
    ports:
      - "3000:3000"
    volumes:
      - /path_to/datasource.yml:/etc/grafana/provisioning/datasources/datasource.yml
    links:
      - prometheus

Grafana and Prometheus

Once the configuration files have been created, you just need to run the following docker-compose command:

docker-compose up -d

By running this command, Grafana should be ready at http://localhost:3000/login. The default user/pass is admin/admin. You can also check that both containers are up and running with the docker ps command.
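
You can also confirm that Prometheus is scraping the GrapheneDB endpoint by querying its targets API (assuming the default port mapping from the docker-compose file above); the target's health should be reported as up:

# Lists all configured scrape targets and their health
curl http://localhost:9090/api/v1/targets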

After logging into Grafana, you should see the first step, Add your first data source, marked as complete on the Grafana Home Dashboard. At http://localhost:3000/datasources, the Prometheus data source should be listed.

To test that the Prometheus endpoint is working as expected, click on the Prometheus data source in the list, then click on the green Test button at the bottom of the page.

Creating a Grafana Dashboard

To create a new Dashboard, visit http://localhost:3000/dashboard/new. Then click on the top right Dashboard settings button to configure it.

Apart from the name (e.g., GrapheneDB instance), description, and general settings, we can tune our Dashboard further by adding variables.

From the Grafana documentation: "A variable is a placeholder for a value. You can use variables in metric queries and in panel titles. So when you change the value, using the dropdown at the top of the dashboard, your panel's metric queries will change to reflect the new value."

Read more about Dashboard and panel variables here.
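
For example, a query variable that lists the scraped GrapheneDB endpoints could use Grafana's label_values() templating function against the Prometheus data source (a sketch; the metric is one of the station metrics listed above):

label_values(station_cpu_load_average_delta, instance)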

Everything is now set up to start visualizing the metrics provided by Prometheus. The following sections use PromQL (Prometheus Query Language), a functional query language for selecting and aggregating time series.

It's recommended to take a look at the PromQL documentation for a deeper understanding.

The sections below explain how to add some charts to the recently created dashboard. You will find a dashboard example and how to use it in the Import/Load a dashboard section.

CPU usage

To add the first chart to our Grafana dashboard, click on the Add panel button (top right menu) and click on Add Query.

The CPU load information is given by the station_cpu_load_average_delta metric. To represent it in the expected percentage range, we need to divide the value by 100.
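
A minimal PromQL expression for this panel could look like this (a sketch; the exact panel query may vary with your setup):

station_cpu_load_average_delta / 100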

Network I/O

Add a new panel and click on Add Query. We are going to use the station_network_transmit_bytes_total and station_network_receive_bytes_total counters with the Graph visualization.

To make these counters more useful for diagnosing the database state, we are going to use the irate function. The resulting visualization represents the per-second instantaneous rate of increase, computed from the most recent data points within a 5-minute window.
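
The two panel queries might look like this (a sketch; one query per counter, both using a 5-minute range):

irate(station_network_transmit_bytes_total[5m])
irate(station_network_receive_bytes_total[5m])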

In the panel edit view, open the Field tab on the right and select the proper Unit: bytes/sec.

Disk usage

Add a new panel and click on Add Query. The metric we are looking for this time is database_device_used_size, which we will use to compute the percentage of disk usage.

The maximum database size is given by your plan. In this example we are using a DS1, with 1GB of disk (1073741824 bytes).
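
A query along these lines (a sketch, hardcoding the 1GB DS1 limit; adjust the divisor to your plan) returns disk usage as a percentage:

database_device_used_size / 1073741824 * 100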

On the Panel tab on the right, under the Visualization section, select Gauge. Then, on the Field tab, select the proper unit, percent (0-100), and set the Max value to 100.

The next step is to define the threshold values. Thresholds set the color of either the value text or the background depending on conditions that you define. Set one at 80% and the other at 95%.

Unfortunately, although this visualization is a good way to check disk space, no alerts are available for the Gauge visualization.

Let’s prepare a Graph visualization to be able to set up an alert.

Setting up alerts with Grafana

First of all, you will need to define how Grafana alerts should notify you. You can set up different notification methods on the Alerting > Notification channels tab.

Please keep in mind that every notification channel must be configured for the Grafana alerting feature to work. Once you have added the required information on the Edit notification channel form, click on the Test button to ensure everything is working as expected.

Find extended information about notification channels on Grafana docs.

Create an alert rule

Only the Graph visualization includes the alert feature. Let's add an alert for the disk usage metric.

Given the following visualization of database_device_used_size:

Click on the Alert tab to add an alert to send a notification when the database is at 80% of its capacity.

In this example, a DS1 with 1GB of disk, we want to be notified when disk usage goes above 800MB (838860800 bytes).

The rule will be evaluated every minute, and the alert notification will be sent if the rule has been firing for more than 5 minutes.
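
Put together, the classic Grafana alert rule fields look roughly like this (a sketch of the form settings, reusing the 800MB threshold from above):

Evaluate every: 1m    For: 5m
Condition: WHEN last() OF query(A, 5m, now) IS ABOVE 838860800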

At the bottom of the form, you can add the notification channels you want to include in this alert and customize the message that will be sent.

Don’t forget to save your changes at the top right corner of the panel edition page.

Cluster monitoring

On Cluster databases, ONgDB and Neo4j Enterprise metrics will also be available. Please check the Neo4j Operations Manual for a list of the available metrics.

In the following section, you will find a cluster dashboard example and how to use it.

Prometheus configuration file

Find the Prometheus target URL on your GrapheneDB database's Insights tab. Please note that we have removed the /metrics part from the given URL: Prometheus expects metrics to be available on targets at the /metrics path by default.

In case you want to monitor all the cluster nodes, you will need to modify the given endpoint URL as in the following example:

global:
  scrape_interval:     15s  # By default, scrape targets every 15 seconds.
  evaluation_interval: 15s
 
scrape_configs:
  - job_name: 'graphenedb-prometheus'
    scheme: 'https'
    static_configs:
    - targets: ['db-1-aqocpxyfkjc1loorgcuq.graphenedb.com:24780', 'db-2-aqocpxyfkjc1loorgcuq.graphenedb.com:24780', 'db-3-aqocpxyfkjc1loorgcuq.graphenedb.com:24780']
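
With several targets configured, every metric carries an instance label identifying the cluster member, so a single query can cover the whole cluster. For example, this sketch graphs disk usage percentage per node (again assuming a 1GB disk limit):

max by (instance) (database_device_used_size / 1073741824 * 100)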

Import/Load a dashboard

We've prepared some example dashboards that will help you start quickly with a working setup covering the most important metrics. You can reach out to our support team, and we'll be happy to send the JSON files over. The available options are a Single database dashboard, a Cluster database dashboard for ONgDB, and a Cluster database dashboard for Neo4j Enterprise.

Below you can see a screenshot of the Single database dashboard, as an example:

Single database dashboard example

To load a dashboard JSON, click on the + icon on the left menu bar and then on the Import link. You can also just visit http://localhost:3000/dashboard/import.

Finally, click on the Upload JSON file button, select the downloaded file, and select your data source. Once you click on the Import button, the example dashboard will be loaded.

In case you want to export your dashboard to a JSON file, you can do so by clicking on the Share link near your dashboard name. Then visit the Export tab and click on the Save to file button.