Support Eclipse MicroProfile Metrics
Overview
The goal of this feature is to increase the observability of WildFly by exposing its metrics to monitoring tools such as Prometheus and the OpenShift Console.
This feature adds support for the Eclipse MicroProfile Metrics 1.1.1 specification.
The integration uses the smallrye-metrics component to provide the MicroProfile Metrics implementation.
Issue Metadata
Issue
Related Issues
-
WFLY-10522 - Support Eclipse MicroProfile Config (MicroProfile Metrics has a dependency on MicroProfile Config)
-
CLOUD-2731 - Expose EAP Metrics so that it can be consumed by Kiali
-
EAP7-1060 - Make "Expose EAP Metrics" configurable in HAL Web Console
Dev Contacts
QE Contacts
Affected Projects or Components
The implementation of Eclipse MicroProfile Metrics 1.1.1 is provided by smallrye-metrics.
Other Interested Projects
-
OpenShift Web Console MUST be able to consume metrics exposed by this extension.
-
Prometheus MUST be able to consume metrics exposed by this extension.
-
HAL, WildFly Web Console, will expose the new extension to configure it and potentially display the metrics exposed by the extension.
-
Kiali SHOULD be able to consume metrics exposed by this extension.
-
Thorntail metrics fraction - Some WildFly-based code to activate and install metrics in WildFly has been developed for Thorntail 2.x and could be reused in the proposed extension.
Requirements
External Requirements
The external requirements correspond to the way metrics are discovered and consumed from WildFly by monitoring tools such as Prometheus and the OpenShift Web Console.
-
Provide the MicroProfile Metrics HTTP endpoints (under /metrics) on the WildFly management HTTP interface (on port 9990).
-
Support both JSON and Prometheus formats from the Metrics HTTP endpoints.
-
Access to the REST endpoint CAN be secured. By default, the user MUST be authenticated to access the Metrics HTTP endpoints. It is possible to disable authentication by setting the security-enabled attribute of /subsystem=microprofile-metrics-smallrye to false.
-
If RBAC is enabled, the user must be a Monitor to be able to get metrics from the HTTP endpoints.
Internal requirements
-
Support MicroProfile Metrics for required base metrics as listed in the chapter 4 of the MicroProfile Metrics specification.
-
Support additional vendor metrics corresponding to metrics always available from WildFly servers, such as:
-
number of loaded JBoss Modules
-
Memory used by NIO Buffer Pool
-
Current and peak usages of the memory pools
-
WildFly management model has a notion of metrics for resource attributes (as described in WildFly Admin Guide). This extension MUST be able to expose these metrics from WildFly management model, such as:
-
Transaction metrics (from the transactions subsystem)
-
HTTP usage (from the undertow subsystem)
-
JDBC Pool usage (from the datasources subsystem)
Application metrics from deployments are also supported.
Non-Requirements
-
MicroProfile Metrics 1.1.1 does not support multiple deployments. Supporting them in WildFly is out of scope for this feature; that work will have to be done in a future version of the specification. This constrains application deployments, which must coordinate to avoid exposing the same metrics (especially absolute ones whose names can clash between deployments).
Implementation Plan
-
Add dependencies to the smallrye-metrics and microprofile-metrics-api artifacts.
-
Add a new microprofile-metrics-smallrye extension to provide Metrics support (including its HTTP endpoints).
-
Expose the Metrics HTTP endpoints on the WildFly HTTP management interface under the /metrics context.
-
Add a mechanism to register metrics from the WildFly management model into the MetricRegistry for vendor metrics.
Implementation Details
The metrics will be exposed by HTTP endpoints on the WildFly HTTP management interface under the /metrics context.
HTTP Endpoint formats
The MicroProfile Metrics HTTP endpoints support two formats:
-
JSON format (as specified in chapter 3.1 of Eclipse MicroProfile Metrics 1.1)
-
Prometheus text format
In the absence of an Accept header (or if it is explicitly set to text/plain), the HTTP endpoint will return metrics in the Prometheus format:
$ curl -v http://127.0.0.1:9990/metrics/
...
< HTTP/1.1 200 OK
< Content-Type: text/plain
...
# HELP base:classloader_total_loaded_class_count Displays the total number of classes that have been loaded since the Java virtual machine has started execution.
# TYPE base:classloader_total_loaded_class_count counter
base:classloader_total_loaded_class_count 10836.0
# HELP base:cpu_system_load_average Displays the system load average for the last minute. The system load average is the sum of the number of runnable entities queued to the available processors and the number of runnable entities running on the available processors averaged over a period of time. The way in which the load average is calculated is operating system specific but is typically a damped time-dependent average. If the load average is not available, a negative value is displayed. This attribute is designed to provide a hint about the system load and may be queried frequently. The load average may be unavailable on some platform where it is expensive to implement this method.
# TYPE base:cpu_system_load_average gauge
base:cpu_system_load_average 2.3134765625
...
# HELP vendor:foo <description>
# TYPE vendor:foo <type>
vendor:foo 12345.0
To fetch metrics in the JSON format, the Accept HTTP header MUST be set to application/json:
$ curl -v -H "Accept: application/json" http://127.0.0.1:9990/metrics/
...
< HTTP/1.1 200 OK
< Content-Type: application/json
...
{
  "base" : {
    "classloader.totalLoadedClass.count" : 10911,
    "cpu.systemLoadAverage" : 2.1201171875,
    ...
  },
  "vendor" : {
    ...
    "foo": 12345.0
  }
}
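The same format selection can be sketched from a Java client. This is a minimal illustration using the JDK's java.net.http client (Java 11+); the class and method names (MetricsRequests, metricsRequest) are hypothetical, and the main method assumes a local server on port 9990 with security-enabled set to false.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

final class MetricsRequests {

    // Builds a GET request for the /metrics context.
    // The Accept header selects the format: application/json or text/plain (Prometheus).
    static HttpRequest metricsRequest(String baseUrl, boolean json) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/metrics"))
                .header("Accept", json ? "application/json" : "text/plain")
                .GET()
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Assumption: WildFly runs locally and authentication is disabled.
        HttpRequest request = metricsRequest("http://127.0.0.1:9990", true);
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```

Note that the JDK client's built-in Authenticator does not handle Digest authentication, so this sketch only applies to the unauthenticated case.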
Authentication
By default, the HTTP endpoints require authentication. This can be disabled by explicitly setting the security-enabled attribute of /subsystem=microprofile-metrics-smallrye to false.
If security is enabled, the HTTP client must be authenticated (otherwise, the server will reply with a 401 Unauthorized response):
$ curl -v http://127.0.0.1:9990/metrics/
...
< HTTP/1.1 401 Unauthorized
In that case, WildFly MUST have a management user and the HTTP client MUST pass its credentials to the Metrics HTTP endpoint:
$ curl -v --digest -u admin:adminpwd http://127.0.0.1:9990/metrics
< HTTP/1.1 200 OK
...
# HELP base:cpu_system_load_average Displays the system load average for the last minute. The system load average is the sum of the number of runnable entities queued to the available processors and the number of runnable entities running on the available processors averaged over a period of time. The way in which the load average is calculated is operating system specific but is typically a damped time-dependent average. If the load average is not available, a negative value is displayed. This attribute is designed to provide a hint about the system load and may be queried frequently. The load average may be unavailable on some platform where it is expensive to implement this method.
# TYPE base:cpu_system_load_average gauge
base:cpu_system_load_average 1.9658203125
...
If security is disabled in the subsystem configuration, the HTTP client does not require authentication (and it does not require a WildFly management user either).
Note that the standalone profiles will explicitly disable authentication in their configuration.
Subsystem description
The subsystem will be named microprofile-metrics-smallrye.
The subsystem contains no child resources.
The subsystem has 2 attributes:
-
security-enabled (true by default) - if security is enabled, the HTTP client must be authenticated to query the HTTP endpoints.
-
exposed-subsystems - a list of strings corresponding to the names of the subsystems that expose their metrics in the HTTP endpoints. This attribute affects only the WildFly metrics registered in the vendor scope (as defined below). By default, this attribute is not defined (so no metrics from subsystems are exposed). The special character * can be used to specify that all subsystems will expose their metrics.
Exposed Metrics
Base Metrics
By default, the subsystem will expose all required base metrics specified in chapter 4 of the MicroProfile Metrics specification, which covers JVM metrics:
-
memory.usedHeap
-
memory.committedHeap
-
memory.maxHeap
-
gc.%s.count - Count for the various Garbage Collectors
-
gc.%s.time - Approximate accumulated collection elapsed time for the various Garbage Collectors
-
jvm.uptime
-
thread.count
-
thread.daemon.count
-
thread.max.count
-
classloader.currentLoadedClass.count
-
classloader.totalLoadedClass.count
-
classloader.totalUnloadedClass.count
-
cpu.availableProcessors
-
cpu.systemLoadAverage
-
cpu.processCpuLoad
Implementation Details
All these base metrics are gathered from the JVM MBeans and use the smallrye-metrics JmxRegistrar to bridge from JMX to the MicroProfile Metrics API.
The required base metrics are explicitly specified in a property file (named /io/smallrye/metrics/base-metrics.properties and located in the extension Jar) using a set of properties for each metric:
<metric name>.displayName: <Human-readable name of the metric>
<metric name>.type: <Type of metric enumerated in org.eclipse.microprofile.metrics.MetricType (e.g. counter, gauge)>
<metric name>.unit: <Unit of the metric, can be none, listed in org.eclipse.microprofile.metrics.MetricUnits or other units>
<metric name>.description: <Human-readable description of the metric>
<metric name>.mbean: <ObjectName of the MBean and attribute>
For example, the properties to expose the Total Loaded Class Count of the JVM are:
classloader.totalLoadedClass.count.displayName: Total Loaded Class Count
classloader.totalLoadedClass.count.type: counter
classloader.totalLoadedClass.count.unit: none
classloader.totalLoadedClass.count.description: Displays the total number of classes that have been loaded since the Java virtual machine has started execution.
classloader.totalLoadedClass.count.mbean: java.lang:type=ClassLoading/TotalLoadedClassCount
The name of the metric itself is classloader.totalLoadedClass.count.
The list of base metrics required to pass the MicroProfile Metrics TCK is available at https://github.com/thorntail/thorntail/blob/master/fractions/microprofile/microprofile-metrics/src/main/resources/io/smallrye/metrics/base-metrics.properties
Note that this /io/smallrye/metrics/base-metrics.properties file is stored in the extension Jar file and is not meant to be configurable by the user.
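To illustrate what the .mbean property points at, the following sketch reads such a value directly through the JDK's platform MBean server. It is not the smallrye-metrics JmxRegistrar code: the class name MBeanMetricReader is hypothetical, and the sketch only handles the simple "<ObjectName>/<attribute>" form shown in the example above.

```java
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

final class MBeanMetricReader {

    // Reads an expression of the form "<ObjectName>/<attribute>",
    // e.g. "java.lang:type=ClassLoading/TotalLoadedClassCount".
    static Object read(String mbeanExpression) {
        int separator = mbeanExpression.lastIndexOf('/');
        if (separator < 0) {
            throw new IllegalArgumentException("Expected <ObjectName>/<attribute>: " + mbeanExpression);
        }
        String attribute = mbeanExpression.substring(separator + 1);
        try {
            ObjectName name = new ObjectName(mbeanExpression.substring(0, separator));
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            return server.getAttribute(name, attribute);
        } catch (JMException e) {
            // Wrap JMX failures (unknown MBean, unknown attribute, malformed name).
            throw new IllegalStateException("Cannot read " + mbeanExpression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(read("java.lang:type=ClassLoading/TotalLoadedClassCount"));
    }
}
```

The value printed depends on the running JVM, so no fixed output is shown here.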
Vendor Metrics
Vendor metrics are specific to a "vendor" (in our case WildFly) and expose metrics specific to the vendor runtime.
Examples of such metrics are:
-
Number of modules loaded by JBoss Module
-
Transaction statistics from Narayana Transaction Manager
-
Bytes throughput from Undertow
-
Etc.
Implementation Details
JMX-Based Vendor Metrics
Some of these metrics can be obtained through JMX and can rely on smallrye-metrics, which loads them from a property file named /io/smallrye/metrics/vendor-metrics.properties located in the extension Jar. This file works similarly to base-metrics.properties, as explained in the section above, and is not meant to be configurable by the user.
The list of vendor metrics exposed by this mechanism is determined during the build process of WildFly.
It will include at least:
-
loadedModules - Number of loaded JBoss Modules
-
BufferPool_used_memory_%s - Memory used by the various NIO buffer pools
-
memoryPool.%s.usage - Current usage of the various memory pools
-
memoryPool.%s.usage.max - Peak usage of the various memory pools
WildFly Vendor Metrics
However, it is expected that most vendor metrics will come from the WildFly Management Model (as described in WildScribe).
For example, the transaction metrics will be retrieved from attributes of /subsystem=transactions such as:
-
number-of-committed-transactions
-
number-of-inflight-transactions
-
number-of-transactions
-
number-of-aborted-transactions
-
etc.
When the microprofile-metrics-smallrye extension is installed, it will browse the WildFly Resource Model and find every metric registered by subsystem resources in the model.
It will only expose metrics from the subsystems specified by the exposed-subsystems attribute (or all of them if the wildcard * is used).
All those WildFly metrics will be translated to MicroProfile Metrics (with corresponding metadata) and registered in the MicroProfile Metrics Vendor registry.
Note that metrics from other parts of the WildFly resource model (e.g. below /core-service resources) are not registered.
When an HTTP client requests MicroProfile Metrics, the microprofile-metrics-smallrye extension will fetch the metric value by invoking the :read-attribute operation for the given resource and attribute.
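The filtering by exposed-subsystems described above can be sketched as a small predicate. This is only an illustration of the stated rules (undefined attribute exposes nothing, * exposes everything); the class and method names are hypothetical and this is not the extension's actual code.

```java
import java.util.List;

final class ExposedSubsystems {

    // Returns true if metrics from the given subsystem should be registered.
    // An undefined (null) or empty list exposes nothing; "*" exposes every subsystem.
    static boolean isExposed(List<String> exposedSubsystems, String subsystem) {
        if (exposedSubsystems == null || exposedSubsystems.isEmpty()) {
            return false;
        }
        return exposedSubsystems.contains("*") || exposedSubsystems.contains(subsystem);
    }

    public static void main(String[] args) {
        System.out.println(isExposed(List.of("undertow", "transactions"), "undertow")); // true
        System.out.println(isExposed(List.of("undertow", "transactions"), "jgroups"));  // false
        System.out.println(isExposed(List.of("*"), "jgroups"));                          // true
    }
}
```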
Implementation Details
I looked at the metrics registered by the subsystems in the various standalone profiles that are shipped with WildFly:
-
Standalone Profile (51 metrics):
-
batch-jberet - 7 metrics (per thread pool)
-
ejb3 - 7 metrics (per thread pool)
-
io - 5 metrics (per worker)
-
jca - 8 metrics (per workmanager)
-
request-controller - 1 metric (overall)
-
transactions - 11 metrics (overall)
-
undertow - 12 metrics (per http/https listener)
-
Standalone Full Profile (10 additional metrics):
-
messaging-activemq - 5 metrics (per destination)
-
Standalone Full HA Profile (186 additional metrics):
-
undertow - 6 metrics (for 1 additional ajp-listener)
-
jgroups - 180 metrics (for 1 ee channel)
Name of WildFly vendor metrics
The name of the MicroProfile metric is derived from WildFly’s attribute name and the resource that defines it.
The name of the metric is composed of the address of the resource using the "path style" (where each = is replaced by /) and the attribute name, separated by a /:
<resource address in "path" style>/<attribute name>
Note that the metric name has no leading /.
So for example, the metric bytes-sent on the resource /subsystem=undertow/server=default-server/http-listener=default will be named subsystem/undertow/server/default-server/http-listener/default/bytes-sent.
This will be the name of the metric registered in the MicroProfile VENDOR metric registry and can be used from the HTTP endpoints.
However, this name will then be converted when exported to the Prometheus format according to the translation rules (section 3.2.1 of the MicroProfile Metrics 1.1.1 specification):
$ curl -v http://127.0.0.1:9990/metrics/vendor/subsystem/undertow/server/default-server/http-listener/default/bytes-sent
< HTTP/1.1 200 OK
...
# HELP vendor:subsystem_undertow_server_default_server_http_listener_default_bytes_sent_bytes The number of bytes that have been sent out on this listener
# TYPE vendor:subsystem_undertow_server_default_server_http_listener_default_bytes_sent_bytes gauge
vendor:subsystem_undertow_server_default_server_http_listener_default_bytes_sent_bytes 0.0
...
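The two naming steps (resource address to metric name, then metric name to Prometheus name) can be sketched as follows. This is a simplified approximation, not the specification's full translation rules nor the smallrye-metrics implementation: it assumes camelCase is converted to snake_case, every other non-alphanumeric character becomes an underscore, repeated underscores are collapsed, and the unit is appended as a suffix. The class and method names are hypothetical; the expected values are the names shown in this document's examples.

```java
import java.util.Locale;

final class VendorMetricNames {

    // "/subsystem=undertow/server=default-server" + "bytes-sent"
    //   -> "subsystem/undertow/server/default-server/bytes-sent" (no leading slash)
    static String metricName(String resourceAddress, String attributeName) {
        String path = resourceAddress.startsWith("/") ? resourceAddress.substring(1) : resourceAddress;
        return path.replace('=', '/') + "/" + attributeName;
    }

    // Simplified Prometheus name: <scope>:<snake_case_name>[_<unit>]
    static String prometheusName(String scope, String metricName, String unit) {
        // camelCase -> snake_case
        String snake = metricName.replaceAll("([a-z0-9])([A-Z])", "$1_$2").toLowerCase(Locale.ROOT);
        // Any remaining character outside [a-z0-9_] becomes "_", then collapse repeats.
        String cleaned = snake.replaceAll("[^a-z0-9_]", "_").replaceAll("_+", "_");
        String suffix = (unit == null || unit.isEmpty() || unit.equals("none")) ? "" : "_" + unit;
        return scope + ":" + cleaned + suffix;
    }

    public static void main(String[] args) {
        String name = metricName("/subsystem=undertow/server=default-server/http-listener=default", "bytes-sent");
        System.out.println(name);
        System.out.println(prometheusName("vendor", name, "bytes"));
        System.out.println(prometheusName("base", "classloader.totalLoadedClass.count", "none"));
    }
}
```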
Implementation Issues
Invalid WildFly Metrics Registration
Some resources in WildFly and WildFly Core codebase improperly registered runtime read-only attributes as metrics. This is tracked by WFCORE-4173 and WFLY-11212 and will be resolved before this feature is integrated.
Complex WildFly Metrics Are Not Supported
Resources may return arbitrarily complex types for metrics. The most complex example I found in the codebase is http://wildscribe.github.io/WildFly/14.0/core-service/platform-mbean/type/memory/index.html#attr-heap-memory-usage.
The registration code in microprofile-metrics-smallrye will only register MicroProfile Metrics for attributes with a simple numerical ModelType.
WildFly Metrics Are Registered As Gauge
WildFly Metrics with undefined value / statistic-enabled = false
WildFly metrics are always registered and can be queried by the read-attribute operation.
However, such a metric may not actually be available in the runtime. These metrics are defined with a so-called undefined metric value. This information is not made available in the resource description, and there is no way to know whether the value returned by the metric is the actual runtime value or this "undefined metric value" placeholder.
This may lead to an incorrect representation of a metric.
A typical example is undertow's bytes-received metric for its http-listener. When this metric is queried, it might return 0 as its value.
It does not necessarily mean that there has been no network activity; it might be that undertow statistics are disabled (which is true by default).
A monitoring tool would then report no network activity even though there actually is some.
In addition, from a given metric (such as /subsystem=undertow/server/http-listener#bytes-received), there is no way to know whether this metric is actually enabled by looking at the value of another, unrelated attribute (such as /subsystem=undertow#statistics-enabled).
To solve this, the :read-attribute operation used to fetch the metric value will be enhanced with an "include-undefined-metric" flag, as tracked by WFCORE-4190.
By default, this flag is false. If it is true, the :read-attribute operation handler will not use the "undefined metric value" if the metric read handler returns an undefined value.
The microprofile-metrics-smallrye extension will set this flag to false and remove the metric from the list returned by its endpoints.
This will require additional fixes in the various metric handlers, which must not return a defined value (e.g. 0) if the metric cannot be computed.
Management Resources Added After Server Boot Will Not Expose Their Metrics
The microprofile-metrics-smallrye extension will register any valid metric from the WildFly Management Model when it is installed.
However if other management resources are added afterwards, the extension will not be aware of them and will not register their metrics.
Note that this does not apply to deployments which are handled separately in the extension Deployment Unit Processor.
Application Metrics
Application metrics are part of application deployments. They are created using the MicroProfile Metrics 1.1.1 API.
They are exposed by the HTTP endpoints in the application scope.
Implementation Issues
Multiple Deployment Is Not Supported
Support for multiple deployments is planned for MicroProfile Metrics 2.0. If two different deployments register the same non-reusable metric, smallrye-metrics will reject the second registration, thus making the second deployment fail (TO BE CONFIRMED).
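The name clash can be illustrated with a toy registry. This sketch is purely hypothetical: it is not the smallrye-metrics MetricRegistry, and the rejection behavior it models (an IllegalArgumentException on the second registration) is the unconfirmed behavior described above, not a verified fact.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy stand-in for a single application-scope registry shared by all deployments.
final class ToyApplicationRegistry {

    // Maps each metric name to the deployment that registered it first.
    private final Map<String, String> owners = new ConcurrentHashMap<>();

    // Registers a metric name for a deployment; a name owned by another deployment is rejected.
    void register(String metricName, String deployment) {
        String previous = owners.putIfAbsent(metricName, deployment);
        if (previous != null && !previous.equals(deployment)) {
            throw new IllegalArgumentException(
                    "Metric '" + metricName + "' is already registered by " + previous);
        }
    }

    public static void main(String[] args) {
        ToyApplicationRegistry registry = new ToyApplicationRegistry();
        registry.register("requests", "a.war");
        try {
            registry.register("requests", "b.war");
        } catch (IllegalArgumentException e) {
            System.out.println("Second deployment rejected: " + e.getMessage());
        }
    }
}
```

Under this model, deployments avoid the clash only by coordinating on distinct metric names, which is the constraint noted in the Non-Requirements section.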
Metrics are all unregistered during undeployment
smallrye-metrics unregisters all metrics when a deployment is undeployed from the server (as explained in smallrye-metrics #12). This is a blocker, as base and vendor metrics are removed when any deployment is undeployed (or redeployed).
Test Plan
-
The smallrye-metrics component passes the MicroProfile Metrics TCK during its release process.
-
The WildFly integration test suite will be enhanced with tests that check the metrics exposed from the REST endpoint (in both JSON and Prometheus formats).
-
Tests must include required base metrics, vendor metrics (esp. from the WildFly management model) and application metrics.
-
Tests must verify authenticated access to the HTTP endpoints.
Community Documentation
The feature will be documented in WildFly Admin Guide (in a new MicroProfile Metrics section).