VM Monitoring

Connect virtual machine monitoring

Quickstart

VM Monitoring using Prometheus + Otel Collector

RevDeBug collects metrics from virtual machines (VMs) using the Prometheus node exporter. The collected data is passed to an OpenTelemetry Collector, which sends it to the OpenTelemetry receiver and then on to the Meter System for analysis. To learn more about node exporter, see the Prometheus node exporter documentation.

In this system, each VM is defined as a service in the OAP (Observability Analysis Platform) server and is identified by the prefix "vm::". This makes it easier to track and manage metrics for each individual VM.

To use VM monitoring, set the vm:: prefix for the service.
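The naming convention can be sketched in a few lines. This is only an illustration of the "vm::" prefix described above; the helper names and the host name "web-01" are hypothetical and not part of RevDeBug.

```python
VM_PREFIX = "vm::"


def vm_service_name(hostname: str) -> str:
    """Return the service name for a VM, adding the vm:: prefix only once."""
    if hostname.startswith(VM_PREFIX):
        return hostname
    return VM_PREFIX + hostname


def is_vm_service(service_name: str) -> bool:
    """Tell VM services apart from other services by their prefix."""
    return service_name.startswith(VM_PREFIX)


# "web-01" is a made-up host name used for illustration.
print(vm_service_name("web-01"))
```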

To connect a virtual machine to monitoring, install a component on it that exposes its metrics. By default everything is set up for node exporter, but nothing prevents you from using another metrics source. For node exporter run from Docker, the configuration is:

docker-compose.yml
version: '3.8'
services:
  node_exporter:
    image: docker.revdebug.com/node-exporter:latest
    container_name: node_exporter
    command:
      - '--path.rootfs=/host'
    network_mode: host
    pid: host
    restart: unless-stopped
    volumes:
      - '/:/host:ro,rslave'

Remember to open port 9100.

The next step is to enter the command:

docker compose -p revdebug up -d
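Once the container is running, node exporter serves its metrics as plain text over HTTP on port 9100. The sketch below shows one way to inspect that output. It assumes the default /metrics path and the Prometheus text exposition format without trailing timestamps; the helper names fetch_metrics and parse_metrics are illustrative only.

```python
from urllib.request import urlopen


def fetch_metrics(host: str, port: int = 9100, timeout: float = 5.0) -> str:
    """Download the raw metrics text from a node exporter instance."""
    # Assumption: node exporter exposes metrics at the default /metrics path.
    with urlopen(f"http://{host}:{port}/metrics", timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def parse_metrics(text: str) -> dict:
    """Parse 'name value' samples into a dict; comment lines are skipped."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated field; the name
        # (including any {labels}) is everything before it.
        name, _, value = line.rpartition(" ")
        try:
            samples[name] = float(value)
        except ValueError:
            continue  # skip lines this simple parser cannot read
    return samples


# Usage (needs a reachable VM; "vm.address" is a placeholder):
#   metrics = parse_metrics(fetch_metrics("vm.address"))
#   print(metrics.get("node_load1"))
```

If the call fails with a connection error, check that port 9100 is open on the VM.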

The next step is to configure and run an additional Docker container. Go to the directory where you installed the RevDeBug docker compose files and open data/otel-collector/otel-collector-config-template.yaml.

To monitor your virtual machines, rename otel-collector-config-template.yaml to otel-collector-config.yaml.

In the static_configs section, set the addresses of your virtual machines:

otel-collector-config.yaml
extensions:
  health_check:
# A receiver is how data gets into the OpenTelemetry Collector
receivers:
  # The Prometheus receiver collects metrics from the targets below
  # It supports the full set of Prometheus configuration
  prometheus:
    config:
      scrape_configs:
        - job_name: 'otel-collector'
          scrape_interval: 30s
          static_configs:
            # Replace these with the addresses of your VMs that run Node Exporter
            - targets: [ 'vm.address:9100' ]
            - targets: [ 'vm.address:9100' ]
processors:
  batch:
# An exporter is how data gets sent to different systems/back-ends
exporters:
  # Exports metrics via gRPC using OpenCensus format
  opencensus:
    endpoint: "apm-oap:11800" # The OAP Server address
    insecure: true
  logging:
    logLevel: info
service:
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: [batch]
      exporters: [logging, opencensus]

  extensions: [health_check]
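With more than a handful of VMs, writing the static_configs entries by hand becomes error-prone. The following sketch generates one entry per VM in the same layout as the configuration above. The function names and the IP addresses are hypothetical, and the output is meant to be pasted into otel-collector-config.yaml by hand.

```python
def static_configs(hosts, port=9100):
    """Return one Prometheus static_configs entry per VM host."""
    return [{"targets": [f"{host}:{port}"]} for host in hosts]


def render_yaml(entries, indent=12):
    """Render the entries as YAML list items at the given indentation."""
    pad = " " * indent
    return "\n".join(
        f"{pad}- targets: [ {', '.join(repr(t) for t in e['targets'])} ]"
        for e in entries
    )


# Hypothetical VM addresses, for illustration only.
print(render_yaml(static_configs(["10.0.0.11", "10.0.0.12"])))
```

The default indentation of 12 spaces matches the position of the targets entries under static_configs in the listing above.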

Once the collector is configured with the VMs to scrape metrics from and the OAP to send those metrics to, what remains is to configure the OAP.

The final step is to run OAP with the environment variables set:

SW_OTEL_RECEIVER=default
SW_OTEL_RECEIVER_ENABLED_HANDLERS="oc"
SW_OTEL_RECEIVER_ENABLED_OC_RULES=vm,oap

Start the RevDeBug server using the command:

docker compose -p revdebug -f docker-compose.vm-monitoring.yml -f docker-compose.yml up -d

SW_OTEL_RECEIVER_ENABLED_OC_RULES accepts a comma-separated list, so you can append other rules after vm and oap, for example: SW_OTEL_RECEIVER_ENABLED_OC_RULES=vm,oap,something

VM Monitoring using Zabbix

To monitor a virtual machine with Zabbix, all you need to do is run the Zabbix agent in a container on the monitored VM.

Set two variables: ZBX_HOSTNAME, the name that will be visible in OAP, and ZBX_SERVER_HOST, the OAP host address without protocol and without port.

docker run --name some-zabbix-agent -e ZBX_HOSTNAME="name" -e ZBX_SERVER_HOST=oaphost -d zabbix/zabbix-agent:latest

On the OAP side, open port 10051 and make sure that the variable SW_RECEIVER_ZABBIX is set to default:

  apm-oap:
    image: ${REVDEBUG_DOCKER_REGISTRY:-docker.revdebug.com/}apm-oap:${REVDEBUG_DOCKER_TAG:-latest}
    depends_on:
      opensearch:
        condition: service_healthy
    volumes:
        - interop:/interop
        # - ./config/vm.yaml:/skywalking/config/otel-oc-rules/vm.yaml
        - ./config/allconfigs:/skywalking/config
        # - ./config/self.yaml:/skywalking/config/fetcher-prom-rules/self.yaml
    environment:
            SW_STORAGE: elasticsearch
            SW_STORAGE_ES_CLUSTER_NODES: opensearch:9200
            SW_CORE_METRICS_DATA_TTL: ${SW_CORE_METRICS_DATA_TTL:-14}
            SW_CORE_RECORD_DATA_TTL: ${SW_CORE_RECORD_DATA_TTL:-3}
            SW_STORAGE_ES_INDEX_REPLICAS_NUMBER: ${SW_STORAGE_ES_INDEX_REPLICAS_NUMBER:-0}
            SW_OTEL_RECEIVER: default
            SW_OTEL_RECEIVER_ENABLED_OC_RULES: vm,oap,k8s-cluster,k8s-node,k8s-service,istio-controlplane
            SW_RECEIVER_ZABBIX: default
    logging:
        driver: "local"
    ports:
        - "12800:12800"
        - "10051:10051"
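Before starting the Zabbix agent, it can help to confirm that port 10051 on the OAP host is reachable from the monitored VM. The sketch below uses a plain TCP connection attempt. The function name port_open is illustrative, and "oaphost" is the same placeholder used in the docker run command above.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


# Usage (replace "oaphost" with your OAP address):
#   port_open("oaphost", 10051)
```

A False result usually points to a firewall rule or a missing port mapping rather than to the agent itself. The same check works for port 9100 on a VM running node exporter.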

Supported metrics in our VM monitoring

| Monitoring Panel | Metric Name | Description |
|---|---|---|
| CPU Usage | cpu_total_percentage | The overall percentage of CPU core utilization. When there are two cores, the highest possible usage is 200%. |
| Memory RAM Usage | meter_vm_memory_used | The amount of RAM currently being utilized. |
| Memory Swap Usage | meter_vm_memory_swap_percentage | The proportion of swap memory that is currently in use, expressed as a percentage of the total available swap memory. |
| CPU Average Used | meter_vm_cpu_average_used | The percentage of CPU core utilization in each mode, where "mode" refers to different states in which the CPU can operate, such as user mode or system mode. |
| CPU Load | meter_vm_cpu_load1, meter_vm_cpu_load5, meter_vm_cpu_load15 | The average CPU load over the last 1, 5, and 15 minutes, indicating the amount of work that the CPU has been performing during these time periods. |
| Memory RAM | meter_vm_memory_total, meter_vm_memory_available, meter_vm_memory_used | Information regarding the RAM, which includes the total amount of RAM, the amount of RAM currently being used, and the amount of RAM that is still available for use. |
| Memory Swap | meter_vm_memory_swap_free, meter_vm_memory_swap_total | Details regarding the swap memory, which includes the total amount of swap memory, as well as the amount of swap memory that is currently not being used and is available for use. |
| File System Mountpoint Usage | meter_vm_filesystem_percentage | The proportion of the file system that is currently being used at each mount point, expressed as a percentage of the total capacity of the file system at that mount point. |
| Disk R/W | meter_vm_disk_read, meter_vm_disk_written | The amount of data that has been read from and written to the disk, indicating the amount of input/output operations that have occurred. |
| Network Bandwidth Usage | meter_vm_network_receive, meter_vm_network_transmit | The amount of data that has been received by and transmitted from the network interface, indicating the amount of incoming and outgoing network traffic. |
| Network Status | meter_vm_tcp_curr_estab, meter_vm_tcp_tw, meter_vm_tcp_alloc, meter_vm_sockets_used, meter_vm_udp_inuse | Information related to the network connections and protocols, including the total number of TCP connections that have been established, the number of TCP connections that are currently in the "time wait" state, the number of allocated TCP connections that are being used, the total number of sockets that are currently in use, and the total number of User Datagram Protocol (UDP) connections that are currently active. |
| Filefd Allocated | meter_vm_filefd_allocated | The total number of file descriptors that have been allocated, indicating the number of files that can be opened or accessed by the system at any given time. |
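Two of the descriptions above involve simple arithmetic: the CPU Usage scale (100% per core) and the Memory Swap Usage percentage. The sketch below only mirrors those descriptions. How the OAP rules actually compute the metrics is not shown in this document, and the function names are hypothetical.

```python
def max_cpu_total_percentage(cores: int) -> float:
    """Upper bound of the CPU Usage panel: 100% per CPU core."""
    return 100.0 * cores


def swap_used_percentage(swap_total: float, swap_free: float) -> float:
    """Share of swap in use, as a percentage of the total swap memory."""
    if swap_total <= 0:
        return 0.0  # no swap configured, so nothing can be in use
    return (swap_total - swap_free) / swap_total * 100.0


print(max_cpu_total_percentage(2))      # two cores
print(swap_used_percentage(2048, 512))  # total and free in the same unit
```

For a two-core VM the first call gives 200.0, matching the CPU Usage description; with 2048 units of swap in total and 512 free, the second gives 75.0.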

Tab with monitoring VMs in APM

If you want to enable the virtual machine monitoring option, please contact sales@revdebug.com.
