VM Monitoring

Connect virtual machine monitoring

Quickstart

VM Monitoring using Prometheus + Otel Collector

RevDeBug collects metrics from virtual machines (VMs) using the Prometheus node exporter. An OpenTelemetry Collector scrapes that data and forwards it to the OpenTelemetry receiver of the OAP server, where it enters the Meter System for analysis. To learn more about node exporter, see its documentation.
Each VM is represented as a service in the OAP (Observability Analysis Platform) server and is identified by the prefix "vm::". This makes it easier to track and manage the metrics of each individual VM.
To use VM monitoring, set the vm:: prefix for the service.
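As a small illustration of the naming convention, the shell sketch below builds and recognizes a vm::-prefixed service name. The functions vm_service_name and is_vm_service are hypothetical helpers, and the sketch assumes the service name is simply the prefix followed by the VM's host name, which this guide does not spell out.

```shell
# Hypothetical helper: builds a service name by prepending the vm:: prefix.
# Assumption: the name is "vm::" followed by the VM's host name.
vm_service_name() {
  printf 'vm::%s\n' "$1"
}

# Hypothetical helper: succeeds if the name starts with vm:: and has
# at least one character after the prefix.
is_vm_service() {
  case "$1" in
    vm::?*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, vm_service_name db-01 prints vm::db-01, and is_vm_service vm::db-01 succeeds while is_vm_service db-01 fails.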
To connect a virtual machine to monitoring, install an agent on it that exposes its metrics. By default everything is set up for node exporter, but nothing prevents you from using another system that collects the data. To run node exporter with Docker, use the following configuration:
docker-compose.yml
version: '3.8'
services:
  node_exporter:
    image: docker.revdebug.com/node-exporter:latest
    container_name: node_exporter
    command:
      - '--path.rootfs=/host'
    network_mode: host
    pid: host
    restart: unless-stopped
    volumes:
      - '/:/host:ro,rslave'
Remember to open port 9100.
Then start node exporter with the command:
docker compose -p revdebug up -d
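To check that node exporter is answering before moving on, one option is to fetch its metrics page and look for samples whose names start with node_. This is a minimal sketch: has_node_metrics is a hypothetical helper, and vm.address is a placeholder for your VM's address.

```shell
# Hypothetical helper: reads Prometheus text-format metrics on stdin and
# succeeds if at least one sample line starts with "node_".
# Comment lines (starting with "#") are not counted.
has_node_metrics() {
  grep -q '^node_'
}

# Usage (run from a machine that can reach port 9100 on the VM):
#   curl -s http://vm.address:9100/metrics | has_node_metrics \
#     && echo "node exporter is reachable"
```

If the check fails, the usual suspects are a closed port 9100 or a container that did not start.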
Next, configure and run an additional Docker container. In the directory where you installed the RevDeBug docker compose files, go to data/otel-collector/otel-collector-config-template.yaml
To monitor your virtual machines, rename otel-collector-config-template.yaml to otel-collector-config.yaml.
In the static_configs section, set the addresses of your virtual machines:
otel-collector-config.yaml
extensions:
  health_check:
# A receiver is how data gets into the OpenTelemetry Collector
receivers:
  # Set the Prometheus Receiver to collect metrics from targets
  # It supports the full set of Prometheus configuration
  prometheus:
    config:
      scrape_configs:
        - job_name: 'otel-collector'
          scrape_interval: 30s
          static_configs:
            # Replace each address with the IP of a VM that has Node Exporter installed
            - targets: [ 'vm.address:9100' ]
            - targets: [ 'vm.address:9100' ]
processors:
  batch:
# An exporter is how data gets sent to different systems/back-ends
exporters:
  # Exports metrics via gRPC using OpenCensus format
  opencensus:
    endpoint: "apm-oap:11800" # The OAP Server address
    insecure: true
  logging:
    logLevel: info
service:
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: [batch]
      exporters: [logging, opencensus]
  extensions: [health_check]
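When several VMs are monitored, the static_configs entries can be generated rather than typed by hand. The sketch below prints one entry per host in the same form as the configuration above. The function print_targets and the example addresses are hypothetical, and the port is assumed to be node exporter's 9100 as used throughout this guide.

```shell
# Hypothetical helper: prints one static_configs entry per host argument,
# using port 9100 for node exporter.
print_targets() {
  for host in "$@"; do
    printf '%s\n' "- targets: [ '${host}:9100' ]"
  done
}

# Usage (example addresses only):
#   print_targets 10.0.0.5 10.0.0.6
```

The printed lines still need to be pasted under static_configs with the indentation of the surrounding YAML.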
With the collector configured (the VMs to collect metrics from and the OAP to send them to), what remains is to configure the OAP.
The final step is to run the OAP with the following environment variables set:
SW_OTEL_RECEIVER=default
SW_OTEL_RECEIVER_ENABLED_HANDLERS="oc"
SW_OTEL_RECEIVER_ENABLED_OC_RULES=vm,oap
Start the RevDeBug server using the command:
docker compose -p revdebug -f docker-compose.vm-monitoring.yml -f docker-compose.yml up -d
In SW_OTEL_RECEIVER_ENABLED_OC_RULES you can list additional rules, separated by commas, for example: SW_OTEL_RECEIVER_ENABLED_OC_RULES=vm,oap,something
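Because the rule list is a plain comma-separated string, extending it in a script comes down to string handling. The sketch below appends a rule only if it is not already listed. The function add_oc_rule is a hypothetical helper, not part of RevDeBug or OAP.

```shell
# Hypothetical helper: appends a rule name to a comma-separated list
# unless that exact name is already present. Prints the resulting list.
add_oc_rule() {
  rules="$1"
  new_rule="$2"
  case ",${rules}," in
    *",${new_rule},"*) printf '%s\n' "$rules" ;;
    *) printf '%s\n' "${rules:+${rules},}${new_rule}" ;;
  esac
}

# Usage:
#   SW_OTEL_RECEIVER_ENABLED_OC_RULES="$(add_oc_rule "vm,oap" "k8s-node")"
```

Wrapping the list in commas before matching keeps a rule such as oap from being mistaken for part of a longer name.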

VM Monitoring using Zabbix

To monitor a virtual machine with Zabbix, all you need to do is run the Zabbix agent in a container on the monitored VM.
Variables to set: ZBX_HOSTNAME, the name that will be visible in OAP, and ZBX_SERVER_HOST, the host address without protocol and without port.
docker run --name some-zabbix-agent -e ZBX_HOSTNAME="name" -e ZBX_SERVER_HOST=oaphost -d zabbix/zabbix-agent:latest
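Since ZBX_SERVER_HOST must carry neither a protocol nor a port, a quick check before starting the agent can catch a pasted URL. This is a minimal sketch: valid_zbx_server_host is a hypothetical helper, and it rejects any value containing a colon, so it would also reject a raw IPv6 address.

```shell
# Hypothetical helper: succeeds only if the value is non-empty and contains
# neither a protocol ("://") nor a port separator (":").
valid_zbx_server_host() {
  case "$1" in
    "" | *://* | *:*) return 1 ;;
    *) return 0 ;;
  esac
}

# Usage:
#   valid_zbx_server_host "oaphost" || echo "ZBX_SERVER_HOST must be a bare host name"
```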
On the OAP side, open port 10051 and make sure that the variable SW_RECEIVER_ZABBIX is set to default:
apm-oap:
  image: ${REVDEBUG_DOCKER_REGISTRY:-docker.revdebug.com/}apm-oap:${REVDEBUG_DOCKER_TAG:-latest}
  depends_on:
    opensearch:
      condition: service_healthy
  volumes:
    - interop:/interop
    # - ./config/vm.yaml:/skywalking/config/otel-oc-rules/vm.yaml
    - ./config/allconfigs:/skywalking/config
    # - ./config/self.yaml:/skywalking/config/fetcher-prom-rules/self.yaml
  environment:
    SW_STORAGE: elasticsearch
    SW_STORAGE_ES_CLUSTER_NODES: opensearch:9200
    SW_CORE_METRICS_DATA_TTL: ${SW_CORE_METRICS_DATA_TTL:-14}
    SW_CORE_RECORD_DATA_TTL: ${SW_CORE_RECORD_DATA_TTL:-3}
    SW_STORAGE_ES_INDEX_REPLICAS_NUMBER: ${SW_STORAGE_ES_INDEX_REPLICAS_NUMBER:-0}
    SW_OTEL_RECEIVER: default
    SW_OTEL_RECEIVER_ENABLED_OC_RULES: vm,oap,k8s-cluster,k8s-node,k8s-service,istio-controlplane
    SW_RECEIVER_ZABBIX: default
  logging:
    driver: "local"
  ports:
    - "12800:12800"
    - "10051:10051"

Supported metrics in our VM monitoring

| Monitoring Panel | Metric Name | Description |
| --- | --- | --- |
| CPU Usage | cpu_total_percentage | The overall percentage of CPU core utilization, and when there are two cores, the highest possible usage would be 200%. |
| Memory RAM Usage | meter_vm_memory_used | The amount of RAM being currently utilized. |
| Memory Swap Usage | meter_vm_memory_swap_percentage | The proportion of swap memory that is currently in use, expressed as a percentage of the total available swap memory. |
| CPU Average Used | meter_vm_cpu_average_used | The percentage of CPU core utilization in each mode, where "mode" refers to different states in which the CPU can operate, such as user mode or system mode. |
| CPU Load | meter_vm_cpu_load1, meter_vm_cpu_load5, meter_vm_cpu_load15 | The average CPU load over the last 1, 5, and 15 minutes, indicating the amount of work that the CPU has been performing during these time periods. |
| Memory RAM | meter_vm_memory_total, meter_vm_memory_available, meter_vm_memory_used | Information regarding the RAM, which includes the total amount of RAM available, the amount of RAM currently being used, and the amount of RAM that is still available for use. |
| Memory Swap | meter_vm_memory_swap_free, meter_vm_memory_swap_total | Details regarding the swap memory, which includes the total amount of swap memory available, as well as the amount of swap memory that is currently not being used and is available for use. |
| File System Mountpoint Usage | meter_vm_filesystem_percentage | The proportion of the file system that is currently being used at each mount point, expressed as a percentage of the total capacity of the file system at that specific location. |
| Disk R/W | meter_vm_disk_read, meter_vm_disk_written | The amount of data that has been read from and written to the disk, indicating the amount of input/output operations that have occurred. |
| Network Bandwidth Usage | meter_vm_network_receive, meter_vm_network_transmit | The amount of data that has been received by and transmitted from the network interface, indicating the amount of incoming and outgoing network traffic. |
| Network Status | meter_vm_tcp_curr_estab, meter_vm_tcp_tw, meter_vm_tcp_alloc, meter_vm_sockets_used, meter_vm_udp_inuse | Information related to the network connections and protocols, including the total number of TCP connections that have been established, the number of TCP connections that are currently in the "time wait" state, the number of allocated TCP connections that are being used, the total number of sockets that are currently in use, and the total number of User Datagram Protocol (UDP) connections that are currently active. |
| Filefd Allocated | meter_vm_filefd_allocated | The total number of file descriptors that have been allocated, indicating the number of files that can be opened or accessed by the system at any given time. |

VM monitoring tab in APM

VMs monitoring in APM
If you want to enable the virtual machine monitoring option, please contact RevDeBug.