-
Notifications
You must be signed in to change notification settings - Fork 2.4k
Report container FS metrics into prometheus /metrics #1642
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this seems pretty reasonable from my perspective
PerDiskStats reported from cgroups were not being surfaced into prometheus. In order to properly correlate the metrics, we need to assign a device label to each metric (which is the FS or device path). Since blkio cgroup tracks devices, we create a synthetic device `/dev/NAME` for the metric. Assign a Device label to each PerDiskStat for the handlers up front, and then surface the PerDiskStat values into the prometheus metrics. Report two new metrics - total bytes read and total bytes written.
f7c360e
to
4e25a79
Compare
@dashpole as discussed last week - we don't have a prometheus e2e test, and I wasn't sure what expectations are around new tests on something like this. |
The code changes look good, but I am still trying to wrap my head around: |
All the filesystem data is in form /dev/sda1. What we get from the machine info is "sda". To make the two roughly consistent, I turn "sda" into "/dev/sda" |
And at least on the few systems I looked at (RHEL / Mac / ubuntu) there is always a /dev/NAME for what mount returns |
Ok, so the name comes from reading the /sys/block/ directory. Would it make more sense to name these |
Interestingly enough - we depend on /dev/sda existing because we do this: https://github.com/google/cadvisor/blob/master/utils/sysfs/sysfs.go#L82 So a client that wanted to lookup the device by name can rely on this path (to then get major/minor). |
yes, so that is /sys/block/sda/dev |
My review is mostly based on the resulting metric output, and that lgtm. I only had a chance to glance over the implementation, but generally looks good as well. (coming from sig-instrumentation) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, Thanks.
Any chance to get a release with this PR included soon? |
yep, ill cut a cAdvisor release next week sometime before the kubernetes 1.7 |
Hmm, I applied this patch on 0.25.0 and with
The actual difference in reported metrics between defaults and my case:
Looks like no disk metrics were dropped. Is this expected? |
these are diskIO stats |
@dashpole, thanks, makes sense now. |
If I'm not mistake, this should be in 0.26.1, correct? But I still don't have the metrics:
|
Hi, I try this in ubuntu and centos, the container_fs_writes_total show that ubuntu is corrected and centos are not? why ? |
@ZhiqinYang Maybe differences in how the cgroups are mounted? |
Thank you for your reply, I do it with this command on centos7: |
Thank you for help, I take a mistake, The cgroup is set to error! |
Hello @ZhiqinYang , I'm not sure to understand how you managed to solve this issue with the cgroup ? |
Fixes #1403
PerDiskStats reported from cgroups were not being surfaced into
prometheus. In order to properly correlate the metrics, we need to
assign a device label to each metric (which is the FS or device path).
Since blkio cgroup tracks devices, we create a synthetic device
/dev/NAME
for the metric.Assign a Device label to each PerDiskStat for the handlers up front, and
then surface the PerDiskStat values into the prometheus metrics. Report
two new metrics - total bytes read and total bytes written.
Open questions:
_bytes_total
metrics?