# VM Monitoring

### Quickstart

#### VM Monitoring using Prometheus + Otel Collector

RevDeBug collects metrics from virtual machines (VMs) using the Prometheus node exporter. An OpenTelemetry Collector gathers these metrics and sends them to the OpenTelemetry receiver, which passes them on to the Meter System for analysis. To learn more, see the [node exporter repository](https://github.com/prometheus/node_exporter).

In this system, each VM is defined as a service in OAP (the Observability Analysis Platform backend) and is identified by the prefix `vm::`. This makes it easier to track and manage the metrics of each individual VM.

To use VM monitoring, set the `vm::` prefix for the service.

To connect a virtual machine to monitoring, install a component on it that exposes its metrics. The default setup assumes node exporter, but any other metrics source can be used instead. To run node exporter with Docker, use the following configuration:

{% code title="docker-compose.yml" %}

```yaml
version: '3.8'
services:
  node_exporter:
    image: docker.revdebug.com/node-exporter:latest
    container_name: node_exporter
    command:
      - '--path.rootfs=/host'
    network_mode: host
    pid: host
    restart: unless-stopped
    volumes:
      - '/:/host:ro,rslave'
```

{% endcode %}

{% hint style="info" %}
Remember to open port 9100 on the VM. The collector scrapes node exporter on this port.
{% endhint %}

Then start the container:

```bash
docker compose -p revdebug up -d
```
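Once the container is running, it can help to confirm that node exporter is serving metrics before moving on. The following is a minimal sketch and not part of the RevDeBug tooling: `count_node_samples` is a hypothetical helper that counts the `node_*` samples in Prometheus text-format output read from standard input.

```bash
# Hypothetical helper (not shipped with RevDeBug).
# Counts samples whose metric name starts with "node_" in Prometheus
# text-format output read from stdin. "# HELP" and "# TYPE" comment
# lines are ignored because they do not start with "node_".
count_node_samples() {
  # grep -c exits non-zero when nothing matches; "|| true" keeps the
  # function's exit status at 0 while still printing the count (0).
  grep -c '^node_' || true
}
```

For example, assuming `curl` is installed on the VM, `curl -s http://localhost:9100/metrics | count_node_samples` should print a number greater than zero when the exporter is up.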

Next, configure and run an additional Docker container. In the directory where the RevDeBug docker compose setup is installed, open `data/otel-collector/otel-collector-config-template.yaml`.

{% hint style="info" %}
To monitor your virtual machines, rename `otel-collector-config-template.yaml` to `otel-collector-config.yaml`.
{% endhint %}
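The renaming step can be scripted so that it is safe to repeat. The sketch below is one possible approach and makes two assumptions: `activate_otel_config` is a hypothetical helper name, and it copies the template instead of renaming it, so the original template stays available for reference.

```bash
# Hypothetical helper (not shipped with RevDeBug).
# $1: directory holding the template (data/otel-collector in this guide).
# Copies the template to the active file name unless an active config
# already exists, so re-running it never overwrites your edits.
activate_otel_config() {
  dir=$1
  template="$dir/otel-collector-config-template.yaml"
  target="$dir/otel-collector-config.yaml"
  if [ ! -f "$template" ]; then
    echo "template not found: $template" >&2
    return 1
  fi
  if [ -f "$target" ]; then
    echo "kept existing $target"
    return 0
  fi
  cp "$template" "$target" && echo "created $target"
}
```

Called as `activate_otel_config data/otel-collector` from the installation directory, it prints which of the two cases applied.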

In the `static_configs` section, set the addresses of the virtual machines:

{% code title="otel-collector-config.yaml" %}

```yaml
extensions:
  health_check:
# A receiver is how data gets into the OpenTelemetry Collector
receivers:
  # The Prometheus receiver collects metrics from the listed targets.
  # It supports the full set of Prometheus configuration options.
  prometheus:
    config:
      scrape_configs:
        - job_name: 'otel-collector'
          scrape_interval: 30s
          static_configs:
            # Replace each address with that of a VM running node exporter (one entry per VM)
            - targets: [ 'vm.address:9100' ]
            - targets: [ 'vm.address:9100' ]
processors:
  batch:
# An exporter is how data gets sent to different systems/back-ends
exporters:
  # Exports metrics via gRPC using OpenCensus format
  opencensus:
    endpoint: "apm-oap:11800" # The OAP Server address
    insecure: true
  logging:
    logLevel: info
service:
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: [batch]
      exporters: [logging, opencensus]

  extensions: [health_check]
```

{% endcode %}

With the collector configured (the VMs to collect metrics from and the OAP address to send them to), what remains is to configure OAP.

Run OAP with the following environment variables set:

```
SW_OTEL_RECEIVER=default
SW_OTEL_RECEIVER_ENABLED_HANDLERS="oc"
SW_OTEL_RECEIVER_ENABLED_OC_RULES=vm,oap
```

Start the RevDeBug server with the command:

```bash
docker compose -p revdebug -f docker-compose.vm-monitoring.yml -f docker-compose.yml up -d
```

{% hint style="info" %}
In `SW_OTEL_RECEIVER_ENABLED_OC_RULES` you can list additional rules, separated by commas, for example: `SW_OTEL_RECEIVER_ENABLED_OC_RULES:vm,oap,something`
{% endhint %}
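Because the rule list is a plain comma-separated string, a small check can catch a missing `vm` entry before the server is started. This is a minimal sketch; `has_rule` is a hypothetical helper and not part of RevDeBug or OAP.

```bash
# Hypothetical helper (not part of RevDeBug or OAP).
# Succeeds if the comma-separated list in $1 contains the exact rule
# name given in $2. Wrapping both in commas prevents partial matches,
# so "vmx" does not count as "vm".
has_rule() {
  case ",$1," in
    *",$2,"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

A typical use would be `has_rule "$SW_OTEL_RECEIVER_ENABLED_OC_RULES" vm || echo "vm rule is missing"`.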

#### VM Monitoring using Zabbix

To monitor a virtual machine with Zabbix, all you need to do is run the Zabbix agent in a container on the monitored VM.

Set two variables: `ZBX_HOSTNAME`, the name that will be visible in OAP, and `ZBX_SERVER_HOST`, the host address without protocol and without port.

```bash
docker run --name some-zabbix-agent -e ZBX_HOSTNAME="name" -e ZBX_SERVER_HOST=oaphost -d zabbix/zabbix-agent:latest
```
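Since `ZBX_SERVER_HOST` must not contain a protocol or port, a value copied from a URL needs trimming first. The sketch below shows one way to do that with shell parameter expansion; `bare_host` is a hypothetical helper, and this simple version does not handle bracketed IPv6 literals.

```bash
# Hypothetical helper (not part of Zabbix or RevDeBug).
# Reduces a URL-like value to a bare host name.
bare_host() {
  value=${1#*://}      # drop "scheme://" if present
  value=${value%%/*}   # drop any path
  value=${value%%:*}   # drop ":port" if present
  printf '%s\n' "$value"
}
```

For instance, `bare_host "http://oaphost:10051"` prints `oaphost`, which can then be passed as `-e ZBX_SERVER_HOST=...` in the command above.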

{% hint style="info" %}
On the OAP side, open port `10051` and make sure that the variable `SW_RECEIVER_ZABBIX` is set to `default`, as in the service definition below.
{% endhint %}

```yaml
  apm-oap:
    image: ${REVDEBUG_DOCKER_REGISTRY:-docker.revdebug.com/}apm-oap:${REVDEBUG_DOCKER_TAG:-latest}
    depends_on:
      opensearch:
        condition: service_healthy
    volumes:
        - interop:/interop
        # - ./config/vm.yaml:/skywalking/config/otel-oc-rules/vm.yaml
        - ./config/allconfigs:/skywalking/config
        # - ./config/self.yaml:/skywalking/config/fetcher-prom-rules/self.yaml
    environment:
            SW_STORAGE: elasticsearch
            SW_STORAGE_ES_CLUSTER_NODES: opensearch:9200
            SW_CORE_METRICS_DATA_TTL: ${SW_CORE_METRICS_DATA_TTL:-14}
            SW_CORE_RECORD_DATA_TTL: ${SW_CORE_RECORD_DATA_TTL:-3}
            SW_STORAGE_ES_INDEX_REPLICAS_NUMBER: ${SW_STORAGE_ES_INDEX_REPLICAS_NUMBER:-0}
            SW_OTEL_RECEIVER: default
            SW_OTEL_RECEIVER_ENABLED_OC_RULES: vm,oap,k8s-cluster,k8s-node,k8s-service,istio-controlplane
            SW_RECEIVER_ZABBIX: default
    logging:
        driver: "local"
    ports:
        - "12800:12800"
        - "10051:10051"

```

### Supported metrics in our VM monitoring

| Monitoring Panel             | Metric Name                                                                                                                                        | Description                                                                                                                                                                                                                                                                                                                                                                                                                        |
| ---------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| CPU Usage                    | cpu\_total\_percentage                                                                                                                             | The overall percentage of CPU core utilization, and when there are two cores, the highest possible usage would be 200%.                                                                                                                                                                                                                                                                                                            |
| Memory RAM Usage             | meter\_vm\_memory\_used                                                                                                                            | The amount of RAM being currently utilized.                                                                                                                                                                                                                                                                                                                                                                                        |
| Memory Swap Usage            | meter\_vm\_memory\_swap\_percentage                                                                                                                | The proportion of swap memory that is currently in use, expressed as a percentage of the total available swap memory.                                                                                                                                                                                                                                                                                                              |
| CPU Average Used             | meter\_vm\_cpu\_average\_used                                                                                                                      | The percentage of CPU core utilization in each mode, where "mode" refers to different states in which the CPU can operate, such as user mode or system mode.                                                                                                                                                                                                                                                                       |
| CPU Load                     | <p>meter\_vm\_cpu\_load1</p><p>meter\_vm\_cpu\_load5</p><p>meter\_vm\_cpu\_load15</p>                                                              | The average CPU load over the last 1, 5, and 15 minutes, indicating the amount of work that the CPU has been performing during these time periods.                                                                                                                                                                                                                                                                                 |
| Memory RAM                   | <p>meter\_vm\_memory\_total</p><p>meter\_vm\_memory\_available</p><p>meter\_vm\_memory\_used</p>                                                   | Information regarding the RAM, which includes the total amount of RAM available, the amount of RAM currently being used, and the amount of RAM that is still available for use.                                                                                                                                                                                                                                                    |
| Memory Swap                  | <p>meter\_vm\_memory\_swap\_free</p><p>meter\_vm\_memory\_swap\_total</p>                                                                          | Details regarding the swap memory, which includes the total amount of swap memory available, as well as the amount of swap memory that is currently not being used and is available for use.                                                                                                                                                                                                                                       |
| File System Mountpoint Usage | meter\_vm\_filesystem\_percentage                                                                                                                  | The proportion of the file system that is currently being used at each mount point, expressed as a percentage of the total capacity of the file system at that specific location.                                                                                                                                                                                                                                                  |
| Disk R/W                     | <p>meter\_vm\_disk\_read</p><p>meter\_vm\_disk\_written</p>                                                                                        | The amount of data that has been read from and written to the disk, indicating the amount of input/output operations that have occurred.                                                                                                                                                                                                                                                                                           |
| Network Bandwidth Usage      | <p>meter\_vm\_network\_receive</p><p>meter\_vm\_network\_transmit</p>                                                                              | The amount of data that has been received by and transmitted from the network interface, indicating the amount of incoming and outgoing network traffic.                                                                                                                                                                                                                                                                           |
| Network Status               | <p>meter\_vm\_tcp\_curr\_estab</p><p>meter\_vm\_tcp\_tw</p><p>meter\_vm\_tcp\_alloc</p><p>meter\_vm\_sockets\_used</p><p>meter\_vm\_udp\_inuse</p> | Information related to the network connections and protocols, including the total number of TCP connections that have been established, the number of TCP connections that are currently in the "time wait" state, the number of allocated TCP connections that are being used, the total number of sockets that are currently in use, and the total number of User Datagram Protocol (UDP) connections that are currently active. |
| Filefd Allocated             | meter\_vm\_filefd\_allocated                                                                                                                       | The total number of file descriptors that have been allocated, indicating the number of files that can be opened or accessed by the system at any given time.                                                                                                                                                                                                                                                                      |
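As an illustration of how a percentage metric relates to the raw totals in the table, the sketch below computes swap usage from total and free swap. It assumes the percentage is derived as (total - free) / total. The exact expression used by the OAP rules is not shown in this document, and `swap_used_percentage` is a hypothetical helper name.

```bash
# Hypothetical helper illustrating the arithmetic only.
# $1: total swap in bytes, $2: free swap in bytes.
# Prints the share of swap in use with one decimal place, or 0.0 when
# no swap is configured (avoids dividing by zero).
swap_used_percentage() {
  LC_ALL=C awk -v total="$1" -v free="$2" 'BEGIN {
    if (total + 0 <= 0) { print "0.0"; exit }
    printf "%.1f\n", (total - free) / total * 100
  }'
}
```

With 2 GiB of swap in total and 1.5 GiB free, `swap_used_percentage 2147483648 1610612736` prints `25.0`.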

### Tab with monitoring VMs in APM

<figure><img src="/files/iv2XYtaLlFhKbXSG0gZ9" alt=""><figcaption><p>VMs monitoring in APM</p></figcaption></figure>

If you want to enable the virtual machine monitoring option, please contact <sales@revdebug.com>.

