VM Monitoring
Connect virtual machine monitoring
RevDeBug collects metrics from virtual machines (VMs) using Prometheus node-exporter. The collected data is passed to an OpenTelemetry Collector, which sends it to the OpenTelemetry receiver and finally to the Meter System for analysis. To learn more, see the node exporter documentation.
In this system, each VM is defined as a service in the OAP (Observability Analysis Platform) server and is identified by the prefix "vm::". This makes it easier to track and manage the metrics of each individual VM.
To use VM monitoring, set the vm:: prefix for the service.
To connect a virtual machine to monitoring, install a component on it that collects its metrics. By default everything is set up for node exporter, but nothing prevents you from using another metrics collector. For node exporter running in Docker, the configuration is:
docker-compose.yml
version: '3.8'
services:
  node_exporter:
    image: docker.revdebug.com/node-exporter:latest
    container_name: node_exporter
    command:
      - '--path.rootfs=/host'
    network_mode: host
    pid: host
    restart: unless-stopped
    volumes:
      - '/:/host:ro,rslave'
Remember to open port 9100.
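Once the port is open, it can help to confirm that node exporter is actually serving metrics before wiring up the collector. The following is a minimal sketch, not part of RevDeBug: it assumes node exporter exposes the Prometheus text format at the /metrics path on port 9100, and the helper names (fetch_metrics, metric_names, looks_like_node_exporter) and the SAMPLE text are made up for illustration.

```python
from urllib.request import urlopen


def fetch_metrics(address: str, timeout: float = 5.0) -> str:
    """Download the raw metrics text from http://<address>/metrics."""
    with urlopen(f"http://{address}/metrics", timeout=timeout) as response:
        return response.read().decode("utf-8")


def metric_names(exposition: str) -> set:
    """Collect metric names from Prometheus text exposition format."""
    names = set()
    for line in exposition.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # A sample line looks like `name{labels} value` or `name value`.
        names.add(line.split("{", 1)[0].split(" ", 1)[0])
    return names


def looks_like_node_exporter(exposition: str) -> bool:
    """Return True if at least one metric name starts with `node_`."""
    return any(name.startswith("node_") for name in metric_names(exposition))


# Hand-written sample in the exposition format, not real output from a VM.
SAMPLE = """\
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
node_cpu_seconds_total{cpu="0",mode="idle"} 12345.6
"""

if __name__ == "__main__":
    print(sorted(metric_names(SAMPLE)))
    # Against a real VM (placeholder address, replace with your own):
    # print(looks_like_node_exporter(fetch_metrics("vm.address:9100")))
```

If the check against a real VM fails with a connection error, the most likely causes are a closed port 9100 or a node exporter container that is not running.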
Then start the container with the command:
docker compose -p revdebug up -d
The next step is to configure and run an additional Docker container. In the directory where the RevDeBug docker compose files are installed, go to data/otel-collector/otel-collector-config-template.yaml
To monitor your virtual machines, rename otel-collector-config-template.yaml to otel-collector-config.yaml.
In the static_configs section, set the addresses of the virtual machines:
otel-collector-config.yaml
extensions:
  health_check:
# A receiver is how data gets into the OpenTelemetry Collector
receivers:
  # Set the Prometheus Receiver to collect metrics from targets
  # It supports the full set of Prometheus configuration
  prometheus:
    config:
      scrape_configs:
        - job_name: 'otel-collector'
          scrape_interval: 30s
          static_configs:
            # Replace with the address of each VM that has Node Exporter installed
            - targets: [ 'vm.address:9100' ]
            - targets: [ 'vm.address:9100' ]
processors:
  batch:
# An exporter is how data gets sent to different systems/back-ends
exporters:
  # Exports metrics via gRPC using OpenCensus format
  opencensus:
    endpoint: "apm-oap:11800" # The OAP Server address
    insecure: true
  logging:
    logLevel: info
service:
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: [batch]
      exporters: [logging, opencensus]
  extensions: [health_check]
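When many VMs are monitored, the static_configs entries can be generated rather than typed by hand. This is a small sketch under stated assumptions: the helper name static_config_lines is hypothetical, the addresses are placeholders, port 9100 is the node exporter port used above, and the indentation width has to match the nesting in your own otel-collector-config.yaml.

```python
def static_config_lines(addresses, port=9100, indent=12):
    """Build one `- targets: [...]` line per VM address.

    `indent` is the number of leading spaces; adjust it so the lines
    align with the other entries under `static_configs` in your file.
    """
    pad = " " * indent
    lines = []
    for address in addresses:
        address = address.strip()
        if not address:
            raise ValueError("VM address must not be empty")
        lines.append(f"{pad}- targets: [ '{address}:{port}' ]")
    return lines


if __name__ == "__main__":
    # Placeholder addresses, replace with the addresses of your VMs.
    for line in static_config_lines(["10.0.0.5", "10.0.0.6"]):
        print(line)
```

The printed lines can be pasted under static_configs in place of the vm.address placeholders.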
With the collector configured (the VMs to collect metrics from and the OAP to send those metrics to), what remains is to configure the OAP.
The final step is to run OAP with the environment variables set:
SW_OTEL_RECEIVER: default
SW_OTEL_RECEIVER_ENABLED_HANDLERS: "oc"
SW_OTEL_RECEIVER_ENABLED_OC_RULES: vm,oap
Start the revdebug server using the command:
docker compose -p revdebug -f docker-compose.vm-monitoring.yml -f docker-compose.yml up -d
In SW_OTEL_RECEIVER_ENABLED_OC_RULES you can list further rules, separated by commas, for example: SW_OTEL_RECEIVER_ENABLED_OC_RULES: vm,oap,something
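The rule list is a plain comma-separated string, so extending it is a matter of appending names without creating duplicates or stray commas. A minimal sketch of that, assuming only the comma-separated format shown above; the helper name add_rules is hypothetical, and "something" stands in for a real rule name:

```python
def add_rules(current: str, *extra: str) -> str:
    """Append rule names to a comma-separated rule list, skipping duplicates."""
    rules = [rule.strip() for rule in current.split(",") if rule.strip()]
    for rule in extra:
        rule = rule.strip()
        if rule and rule not in rules:
            rules.append(rule)
    return ",".join(rules)


if __name__ == "__main__":
    # Value suitable for SW_OTEL_RECEIVER_ENABLED_OC_RULES
    print(add_rules("vm,oap", "something"))
```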
To monitor a virtual machine with Zabbix, all you need to do is run the Zabbix agent in a container on the monitored VM.
Variables to set:
ZBX_HOSTNAME - this name will be visible in OAP
ZBX_SERVER_HOST - host address without protocol and without port
docker run --name some-zabbix-agent -e ZBX_HOSTNAME="name" -e ZBX_SERVER_HOST=oaphost -d zabbix/zabbix-agent:latest
On the OAP side, open port 10051 and make sure that the variable SW_RECEIVER_ZABBIX: default is set.
apm-oap:
  image: ${REVDEBUG_DOCKER_REGISTRY:-docker.revdebug.com/}apm-oap:${REVDEBUG_DOCKER_TAG:-latest}
  depends_on:
    opensearch:
      condition: service_healthy
  volumes:
    - interop:/interop
    # - ./config/vm.yaml:/skywalking/config/otel-oc-rules/vm.yaml
    - ./config/allconfigs:/skywalking/config
    # - ./config/self.yaml:/skywalking/config/fetcher-prom-rules/self.yaml
  environment:
    SW_STORAGE: elasticsearch
    SW_STORAGE_ES_CLUSTER_NODES: opensearch:9200
    SW_CORE_METRICS_DATA_TTL: ${SW_CORE_METRICS_DATA_TTL:-14}
    SW_CORE_RECORD_DATA_TTL: ${SW_CORE_RECORD_DATA_TTL:-3}
    SW_STORAGE_ES_INDEX_REPLICAS_NUMBER: ${SW_STORAGE_ES_INDEX_REPLICAS_NUMBER:-0}
    SW_OTEL_RECEIVER: default
    SW_OTEL_RECEIVER_ENABLED_OC_RULES: vm,oap,k8s-cluster,k8s-node,k8s-service,istio-controlplane
    SW_RECEIVER_ZABBIX: default
  logging:
    driver: "local"
  ports:
    - "12800:12800"
    - "10051:10051"
Monitoring Panel | Metric Name | Description |
---|---|---|
CPU Usage | cpu_total_percentage | The total percentage of CPU core utilization, summed across cores. With two cores, for example, the maximum possible value is 200%. |
Memory RAM Usage | meter_vm_memory_used | The amount of RAM being currently utilized. |
Memory Swap Usage | meter_vm_memory_swap_percentage | The proportion of swap memory that is currently in use, expressed as a percentage of the total available swap memory. |
CPU Average Used | meter_vm_cpu_average_used | The percentage of CPU core utilization in each mode, where "mode" refers to different states in which the CPU can operate, such as user mode or system mode. |
CPU Load | meter_vm_cpu_load1 meter_vm_cpu_load5 meter_vm_cpu_load15 | The average CPU load over the last 1, 5, and 15 minutes, indicating the amount of work that the CPU has been performing during these time periods. |
Memory RAM | meter_vm_memory_total meter_vm_memory_available meter_vm_memory_used | Information regarding the RAM, which includes the total amount of RAM available, the amount of RAM currently being used, and the amount of RAM that is still available for use. |
Memory Swap | meter_vm_memory_swap_free meter_vm_memory_swap_total | Details regarding the swap memory, which includes the total amount of swap memory available, as well as the amount of swap memory that is currently not being used and is available for use. |
File System Mountpoint Usage | meter_vm_filesystem_percentage | The proportion of the file system that is currently being used at each mount point, expressed as a percentage of the total capacity of the file system at that specific location. |
Disk R/W | meter_vm_disk_read meter_vm_disk_written | The amount of data that has been read from and written to the disk, indicating the amount of input/output operations that have occurred. |
Network Bandwidth Usage | meter_vm_network_receive meter_vm_network_transmit | The amount of data that has been received by and transmitted from the network interface, indicating the amount of incoming and outgoing network traffic. |
Network Status | meter_vm_tcp_curr_estab meter_vm_tcp_tw meter_vm_tcp_alloc meter_vm_sockets_used meter_vm_udp_inuse | Network connection and socket statistics: the number of currently established TCP connections, the number of TCP connections in the "time wait" state, the number of allocated TCP sockets, the total number of sockets in use, and the number of User Datagram Protocol (UDP) sockets in use. |
Filefd Allocated | meter_vm_filefd_allocated | The number of file descriptors currently allocated by the system. |
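Two of the descriptions above come down to simple arithmetic. The sketch below only illustrates those descriptions and is not how RevDeBug computes the metrics internally: the function names are hypothetical, and it assumes meter_vm_memory_swap_total and meter_vm_memory_swap_free are reported in the same unit.

```python
def max_cpu_total_percentage(cores: int) -> float:
    """Upper bound of the CPU Usage panel: 100% per core."""
    if cores < 1:
        raise ValueError("a VM has at least one core")
    return 100.0 * cores


def swap_used_percentage(swap_total: float, swap_free: float) -> float:
    """Share of swap in use, as a percentage of total swap."""
    if swap_total <= 0:
        return 0.0  # no swap configured, so nothing can be in use
    return (swap_total - swap_free) / swap_total * 100.0


if __name__ == "__main__":
    print(max_cpu_total_percentage(2))      # two cores
    print(swap_used_percentage(2048, 512))  # illustrative values
```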

VMs monitoring in APM