Uptime and ICMP metrics
New:
 - Include uptime metrics
 - Include hosts ICMP metrics
monofox committed Apr 14, 2024
1 parent 32cce24 commit 52cbf7e
Showing 4 changed files with 82 additions and 14 deletions.
4 changes: 3 additions & 1 deletion .github/workflows/release.yml
@@ -6,6 +6,8 @@ name: Release monit_exporter
on:
push:
tags: [ "*" ]
release:
types: [created]

jobs:

@@ -22,7 +24,7 @@ jobs:
- name: Set up Go
uses: actions/setup-go@v3
with:
go-version: 1.21.3
go-version: 1.21.9

- name: Run GoReleaser
uses: goreleaser/goreleaser-action@master
5 changes: 4 additions & 1 deletion CHANGES.md
@@ -10,7 +10,7 @@ The headers are:
- Enhancements
- Features

## 0.3.0 (2024-04-xx)
## 0.3.0 (2024-04-14)

### Bugs
- Catching scrape errors
@@ -35,12 +35,15 @@ The headers are:
- Added extraction of:
- port response times
- unix socket response times
- ICMP response times (hosts)
- CPU usage
- Memory usage
- Disk write metrics
- Disk read metrics
- I/O service times
- Network link metrics
- Uptime metrics
- Monit version information
- Added option in order to ignore TLS certificate validation (restricted and not recommended)

## 0.2.2 (2023-10-22)
7 changes: 4 additions & 3 deletions README.md
@@ -34,16 +34,17 @@ These metrics are exported by `monit_exporter`:

| name | description |
|----------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| monit_service_check | Monit service check info with following labels provided:<br><dl><dt>`check_name`</dt><dd>Name of monit check</dd><dt>`monitored`</dt><dd>Specifies, if the service is monitored or not, whereas `0` means no and `1` means yes.</dd><dt>`type`</dt><dd>Specifies the type of service.</dd></dl>
| monit_service_check | Monit service check info with following labels provided:<br><dl><dt>`check_name`</dt><dd>Name of monit check</dd><dt>`monitored`</dt><dd>Specifies, if the service is monitored or not, whereas `0` means no, `1` means yes, `2` means init and `4` means waiting. A combination is possible. E.g. `5` means, that the service is monitored, but currently waiting for e.g. the right time range.</dd><dt>`type`</dt><dd>Specifies the type of service.</dd></dl><br>Value of this metric is representing "error". Means: `0` is "service is fine" whereas `> 0` means, there is an error.
| monit_service_cpu_perc | Monit service CPU info with following labels:<br><dl><dt>`check_name`</dt><dd>Name of monit check</dd><dt>`type`</dt><dd>Specifies value type whereas value can be `percentage` or `percentage_total`</dd></dl>
| monit_service_mem_bytes | Monit service mem info with following labels:<br><dl><dt>`check_name`</dt><dd>Name of monit check</dd><dt>`type`</dt><dd>Specifies value type whereas value can be `kilobyte` or `kilobyte_total`</dd></dl>
| monit_service_network_link_state | Monit service link states<br><dl><dt>`check_name`</dt><dd>Name of monit check</dd></dl><br>Value can be either `-1` = Not available, `0` = down and `1` = up
| monit_service_network_link_statistics | Monit service link statistics<br><dl><dt>`check_name`</dt><dd>Name of monit check</dd><dt>`direction`</dt><dd>Specifies link direction (upload / download)</dd><dt>`unit`</dt><dd>Specifies unit of metrics (bytes, errors, packets)</dd><dt>`type`</dt><dd>Specifies the type with either now or total. Whereas now means "per second"</dd></dl>
| monit_service_port_response_times | Monit service port and unix socket checks response times<br><dl><dt>`check_name`</dt><dd>Name of monit check</dd><dt>`hostname`</dt><dd>Specifies hostname checked</dd><dt>`path`</dt><dd>Specifies a unix socket path</dd><dt>`port`</dt><dd>Specifies port to check</dd><dt>`protocol`</dt><dd>Specifies protocol used for checking service (e.g. POP, IMAP, REDIS, etc.). Default is a RAW check.</dd><dt>`type`</dt><dd>Specifies protocol type (e.g. TCP, UDP, UNIX)</dd><dt>`uri`</dt><dd>Gives full URI for the service check including type, host and port or path.</dd></dl>
| monit_service_port_response_times | Monit service port, unix socket and icmp checks response times<br><dl><dt>`check_name`</dt><dd>Name of monit check</dd><dt>`hostname`</dt><dd>Specifies hostname checked</dd><dt>`path`</dt><dd>Specifies a unix socket path</dd><dt>`port`</dt><dd>Specifies port to check</dd><dt>`protocol`</dt><dd>Specifies protocol used for checking service (e.g. POP, IMAP, REDIS, etc.). Default is a RAW check.</dd><dt>`type`</dt><dd>Specifies protocol type (e.g. TCP, UDP, UNIX, ICMP)</dd><dt>`uri`</dt><dd>Gives full URI for the service check including type, host and port or path.</dd></dl>
| monit_service_read_bytes | Monit service Disk Read Bytes<dl><dt>`check_name`</dt><dd>Name of monit check</dd><dt>`type`</dt><dd>Specifies type of read / write. Possible values: read_count, read_count_total. Value is given in bytes.</dd></dl>
| monit_service_uptime | Service and server uptime in seconds<br><dl><dt>`check_name`</dt><dd>Name of monit check</dd><dt>`type`</dt><dd>Type of the uptime service check (Possible values: `system` / `server`)</dd></dl>
| monit_service_write_bytes | Monit service Disk Writes Bytes<dl><dt>`check_name`</dt><dd>Name of monit check</dd><dt>`type`</dt><dd>Specifies type of read / write. Possible values: write_count, write_count_total. Value is given in bytes.</dd></dl>
| monit_up | Monit status availability. `0` = not available and `1` = available

| monit_version | Monit current version as label - the value is const 1.

#### Service types

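The README change above describes the `monitored` label as a combinable set of values (`0` = no, `1` = yes, `2` = init, `4` = waiting, so `5` = monitored and waiting). A minimal sketch of how a consumer of the metric might decode that label follows. The helper `describeMonitored` and the constant names are hypothetical and not part of `monit_exporter`; only the numeric values come from the README text.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Bit values of the `monitored` label as listed in the README table.
// The constant names are illustrative only.
const (
	monitorYes     = 1
	monitorInit    = 2
	monitorWaiting = 4
)

// describeMonitored is a hypothetical helper (not part of monit_exporter).
// It turns the numeric `monitored` label into a readable description.
func describeMonitored(label string) (string, error) {
	v, err := strconv.Atoi(strings.TrimSpace(label))
	if err != nil {
		return "", fmt.Errorf("invalid monitored label %q: %w", label, err)
	}
	if v < 0 {
		return "", fmt.Errorf("negative monitored label %d", v)
	}
	if v == 0 {
		return "not monitored", nil
	}
	var parts []string
	if v&monitorYes != 0 {
		parts = append(parts, "monitored")
	}
	if v&monitorInit != 0 {
		parts = append(parts, "init")
	}
	if v&monitorWaiting != 0 {
		parts = append(parts, "waiting")
	}
	if len(parts) == 0 {
		// Bits outside the documented set; report the raw value.
		return fmt.Sprintf("unknown (%d)", v), nil
	}
	return strings.Join(parts, "+"), nil
}

func main() {
	for _, label := range []string{"0", "1", "5"} {
		desc, err := describeMonitored(label)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("monitored=%s => %s\n", label, desc)
	}
}
```

Because the label is exported as a string, the sketch parses it first and treats anything non-numeric as an error rather than guessing.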
80 changes: 71 additions & 9 deletions monit_exporter.go
@@ -45,23 +45,32 @@ var serviceTypes = map[int]string{
}

type monitXML struct {
MonitServer monitServer `xml:"server"`
MonitServices []monitService `xml:"service"`
}

type monitServer struct {
Hostname string `xml:"localhostname"`
Uptime int64 `xml:"uptime"`
Version string `xml:"version"`
}

// Simplified structure of monit check.
type monitService struct {
Type int `xml:"type,attr"`
Name string `xml:"name"`
Status int `xml:"status"`
Monitored string `xml:"monitor"`
Memory monitServiceMem `xml:"memory"`
CPU monitServiceCPU `xml:"cpu"`
DiskWrite monitServiceDisk `xml:"write"`
DiskRead monitServiceDisk `xml:"read"`
ServiceTimes monitServiceTime `xml:"servicetime"`
DiskWrite monitServiceDisk `xml:"write"`
Link monitServiceLink `xml:"link"`
Memory monitServiceMem `xml:"memory"`
Monitored string `xml:"monitor"`
Name string `xml:"name"`
Ports []monitServicePort `xml:"port"`
ServiceTimes monitServiceTime `xml:"servicetime"`
Status int `xml:"status"`
Type int `xml:"type,attr"`
UnixSockets []monitServicePort `xml:"unix"`
Link monitServiceLink `xml:"link"`
Uptime int64 `xml:"uptime"`
Icmp monitServiceIcmp `xml:"icmp"`
}

type monitServiceMem struct {
@@ -110,6 +119,11 @@ type monitServiceLinkDirection struct {
Errors monitNetworkCount `xml:"errors"`
}

type monitServiceIcmp struct {
Type string `xml:"type"`
Responsetime float64 `xml:"responsetime"`
}

type monitBytes struct {
Count int `xml:"count"`
Total int `xml:"total"`
@@ -127,6 +141,8 @@ type Exporter struct {
client *http.Client

up prometheus.Gauge
version *prometheus.GaugeVec
checkUptime *prometheus.GaugeVec
checkStatus *prometheus.GaugeVec
checkMem *prometheus.GaugeVec
checkCPU *prometheus.GaugeVec
@@ -237,6 +253,20 @@ func NewExporter(c *Config) (*Exporter, error) {
Name: "up",
Help: "Monit status availability",
}),
version: prometheus.NewGaugeVec(prometheus.GaugeOpts{
Namespace: namespace,
Name: "version",
Help: "Monit version",
},
[]string{"version"},
),
checkUptime: prometheus.NewGaugeVec(prometheus.GaugeOpts{
Namespace: namespace,
Name: "service_uptime",
Help: "Monit service and server uptime",
},
[]string{"check_name", "type"},
),
checkStatus: prometheus.NewGaugeVec(prometheus.GaugeOpts{
Namespace: namespace,
Name: "service_check",
@@ -300,6 +330,8 @@ func NewExporter(c *Config) (*Exporter, error) {
// implements prometheus.Collector.
func (e *Exporter) Describe(ch chan<- *prometheus.Desc) {
e.up.Describe(ch)
e.version.Describe(ch)
e.checkUptime.Describe(ch)
e.checkStatus.Describe(ch)
e.checkCPU.Describe(ch)
e.checkMem.Describe(ch)
@@ -326,6 +358,15 @@ func (e *Exporter) scrape() error {
log.Errorf("Error parsing data from monit: %v\n%s", err, data)
} else {
e.up.Set(1)
e.checkUptime.With(
prometheus.Labels{
"check_name": parsedData.MonitServer.Hostname,
"type": "server",
}).Set(float64(parsedData.MonitServer.Uptime))
e.version.With(
prometheus.Labels{
"version": parsedData.MonitServer.Version,
}).Set(1)
// Constructing metrics
for _, service := range parsedData.MonitServices {
e.checkStatus.With(prometheus.Labels{"check_name": service.Name, "type": serviceTypes[service.Type], "monitored": service.Monitored}).Set(float64(service.Status))
@@ -336,7 +377,7 @@ func (e *Exporter) scrape() error {
"monitored": service.Monitored,
}).Set(float64(service.Status))

// Memory + CPU only for specifiy status types (cf. monit/xml.c)
// Memory + CPU + Uptime only for specific status types (cf. monit/xml.c)
if service.Type == SERVICE_TYPE_PROCESS || service.Type == SERVICE_TYPE_SYSTEM {
e.checkMem.With(
prometheus.Labels{
@@ -358,6 +399,11 @@ func (e *Exporter) scrape() error {
"check_name": service.Name,
"type": "percentage_total",
}).Set(float64(service.CPU.PercentTotal))
e.checkUptime.With(
prometheus.Labels{
"check_name": service.Name,
"type": serviceTypes[service.Type],
}).Set(float64(service.Uptime))
}
if service.Type == SERVICE_TYPE_PROCESS || service.Type == SERVICE_TYPE_FILESYSTEM {
e.checkDiskWrite.With(
@@ -392,6 +438,20 @@ func (e *Exporter) scrape() error {
e.addNetLinkElement(&service, "upload", &service.Link.Upload)
}

// ICMP checks
if service.Type == SERVICE_TYPE_HOST && service.Icmp.Type != "" {
e.checkPortRespTimes.With(
prometheus.Labels{
"check_name": service.Name,
"type": "ICMP",
"hostname": "",
"path": "",
"port": "",
"protocol": strings.ToUpper(service.Icmp.Type),
"uri": "",
}).Set(float64(service.Icmp.Responsetime))
}

// Port checks
for _, port := range service.Ports {
var uri = fmt.Sprintf("%s://%s:%s", strings.ToLower(port.Type), port.Hostname, port.Portnumber)
@@ -458,6 +518,8 @@ func (e *Exporter) Collect(ch chan<- prometheus.Metric) {
e.checkStatus.Reset()
if err := e.scrape(); err == nil {
e.up.Collect(ch)
e.version.Collect(ch)
e.checkUptime.Collect(ch)
e.checkStatus.Collect(ch)
e.checkMem.Collect(ch)
e.checkCPU.Collect(ch)
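The new `monitServer` and `monitServiceIcmp` structs in the diff above rely on `encoding/xml` struct tags to pick up the server uptime, the version and the ICMP response time. A minimal sketch of that decoding step is shown here, using only the fields this commit adds. The sample document, its root element name, the host name, the version string and the `type="4"` attribute are assumptions made for illustration; they are inferred from the struct tags and are not taken from real Monit output. `parseStatus` is a hypothetical helper.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Trimmed copies of the structs from the diff; only the fields
// touched by this commit are kept.
type monitXML struct {
	MonitServer   monitServer    `xml:"server"`
	MonitServices []monitService `xml:"service"`
}

type monitServer struct {
	Hostname string `xml:"localhostname"`
	Uptime   int64  `xml:"uptime"`
	Version  string `xml:"version"`
}

type monitService struct {
	Type   int              `xml:"type,attr"`
	Name   string           `xml:"name"`
	Uptime int64            `xml:"uptime"`
	Icmp   monitServiceIcmp `xml:"icmp"`
}

type monitServiceIcmp struct {
	Type         string  `xml:"type"`
	Responsetime float64 `xml:"responsetime"`
}

// sample is a hypothetical status document shaped to match the
// struct tags above. All values are illustrative.
const sample = `<monit>
  <server>
    <localhostname>example-host</localhostname>
    <uptime>3600</uptime>
    <version>5.33.0</version>
  </server>
  <service type="4">
    <name>gateway</name>
    <icmp>
      <type>Ping</type>
      <responsetime>0.000321</responsetime>
    </icmp>
  </service>
</monit>`

// parseStatus is a hypothetical helper that decodes a status document.
func parseStatus(data []byte) (monitXML, error) {
	var parsed monitXML
	err := xml.Unmarshal(data, &parsed)
	return parsed, err
}

func main() {
	parsed, err := parseStatus([]byte(sample))
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("server %s, version %s, uptime %ds\n",
		parsed.MonitServer.Hostname, parsed.MonitServer.Version, parsed.MonitServer.Uptime)
	for _, svc := range parsed.MonitServices {
		// An empty Icmp.Type means the service carried no <icmp> element,
		// which mirrors the guard used in scrape().
		if svc.Icmp.Type != "" {
			fmt.Printf("%s: ICMP %s response time %g\n",
				svc.Name, svc.Icmp.Type, svc.Icmp.Responsetime)
		}
	}
}
```

Since `monitXML` declares no `XMLName` field, `xml.Unmarshal` accepts any root element name, so the sketch does not depend on the root being called `monit`.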
