
Decouple metrics from hostNetwork using proxy DaemonSet #443


Draft: wants to merge 2 commits into main

frobware (Contributor) commented May 27, 2025

The bpfman-agent DaemonSet requires hostNetwork: true for eBPF operations such as loading XDP programs and accessing host network interfaces. It also exposes Prometheus metrics via TCP port 8443 on the host network.

In #437, the metrics service was updated to use TCP port 8443 by default. This aligned with controller-runtime’s default for secure metrics endpoints and is configurable via the bpfman ConfigMap.

However, 8443 is a commonly used port and may plausibly be claimed by other host-level services or privileged containers. In clusters where hostNetwork: true is used, this increases the risk of port binding conflicts. The underlying issue is not the specific port but the need to bind to any TCP port on the host network.
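
As a minimal sketch of the conflict described above (not code from this PR), the following Go program binds an ephemeral loopback TCP port and then tries to bind the same address again. The function name `bindTwice` is hypothetical, and loopback stands in for the host network namespace that hostNetwork pods share.

```go
package main

import (
	"fmt"
	"net"
)

// bindTwice (hypothetical name) listens on an ephemeral loopback TCP port,
// then tries to bind the same address while the first listener is still
// open. It returns the error from the second attempt, or nil if the second
// bind unexpectedly succeeded.
func bindTwice() error {
	first, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return fmt.Errorf("first bind failed unexpectedly: %w", err)
	}
	defer first.Close()

	// Any second process on the same network namespace is in this position
	// when it asks for a port that is already taken.
	second, err := net.Listen("tcp", first.Addr().String())
	if err == nil {
		second.Close()
		return nil
	}
	return err
}

func main() {
	fmt.Println("second bind:", bindTwice())
}
```

The second `net.Listen` call is expected to fail with an "address already in use" error, which is the failure mode any fixed host port is exposed to.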

Additionally, in PR #437, the metrics Service was not marked clusterIP: None, so it was not headless. As a result, Prometheus would scrape the Service’s cluster IP, which performs round-robin load balancing across pods. This is not suitable for a DaemonSet, where metrics must be scraped from each pod individually (e.g., to collect per-node data). A headless Service is required to expose the full set of pod endpoints for proper per-pod scraping.

In cloud environments with restrictive security groups, hostNetwork: true adds operational complexity. Because hostNetwork pods use node IPs rather than pod IPs, Prometheus scraping across nodes fails unless explicit firewall rules or security group exceptions are configured. This couples the deployment to the cloud provider, requires coordination with infrastructure teams, and increases deployment friction, particularly in environments like AWS where inter-node traffic to arbitrary ports is not allowed by default.

This PR proposes an architectural change to eliminate the requirement to bind a host port for metrics entirely.

Options considered:

  1. Do not expose metrics from bpfman-agent.

    • Simple
    • But eliminates observability
  2. Leave metrics on hostNetwork and document the need to open firewall ports in environments that restrict inter-node traffic (e.g. cloud platforms using security groups). This is not typically required in libvirt or bare-metal environments.

    • Minimal change (would still want a headless service, for example)
    • Still subject to TCP port conflicts and blocked metrics
  3. Introduce a metrics-proxy DaemonSet (chosen).

    • Avoids host-level TCP port binding
    • Enables per-pod scraping over HTTPS
    • Works across clouds without firewall changes
    • Small resource overhead

Implementation:

  • Add a metrics-proxy DaemonSet that runs in the container network.
  • The proxy shares a volume with bpfman-agent so that it can reach the agent's Unix domain socket.
  • The proxy serves those metrics over HTTPS using controller-runtime’s metrics server.
  • bpfman-agent now serves metrics only over the socket and no longer binds a TCP port.
  • The associated Service is marked clusterIP: None to enable per-pod scraping.
  • ServiceMonitor now targets /proxy-metrics.
  • Proxy HTTP client timeout is set to 8 seconds to fail before Prometheus’s 10-second default scrape timeout.
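
The proxy path in the list above can be sketched in plain Go. This is an illustration under stated assumptions, not the PR's implementation: the PR uses controller-runtime's metrics server, while this sketch uses only the standard library. The names `newSocketClient`, `proxyHandler`, and `roundTrip`, the socket file name, the placeholder host in `http://agent/metrics`, and the `demo_metric` sample are all hypothetical. Only the `/proxy-metrics` path and the 8-second timeout come from the description.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
	"net/http/httptest"
	"os"
	"path/filepath"
	"time"
)

// newSocketClient returns an HTTP client whose connections always go to the
// Unix domain socket at socketPath, whatever host the request URL names.
func newSocketClient(socketPath string, timeout time.Duration) *http.Client {
	return &http.Client{
		Timeout: timeout, // 8s in the PR, below Prometheus's 10s default
		Transport: &http.Transport{
			DialContext: func(ctx context.Context, _, _ string) (net.Conn, error) {
				var d net.Dialer
				return d.DialContext(ctx, "unix", socketPath)
			},
		},
	}
}

// proxyHandler fetches /metrics from the agent over the socket and copies
// the response back to the scraper.
func proxyHandler(client *http.Client) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// "agent" is a placeholder host; the dialer ignores it.
		req, err := http.NewRequestWithContext(r.Context(), http.MethodGet, "http://agent/metrics", nil)
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		resp, err := client.Do(req)
		if err != nil {
			http.Error(w, "agent unreachable: "+err.Error(), http.StatusBadGateway)
			return
		}
		defer resp.Body.Close()
		if ct := resp.Header.Get("Content-Type"); ct != "" {
			w.Header().Set("Content-Type", ct)
		}
		w.WriteHeader(resp.StatusCode)
		io.Copy(w, resp.Body)
	})
}

// roundTrip starts a stand-in agent on a temporary socket, puts the proxy in
// front of it over HTTPS, scrapes once, and returns the response body.
func roundTrip() (string, error) {
	dir, err := os.MkdirTemp("", "metrics-proxy")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	sock := filepath.Join(dir, "metrics.sock")

	ln, err := net.Listen("unix", sock)
	if err != nil {
		return "", err
	}
	agentMux := http.NewServeMux()
	agentMux.HandleFunc("/metrics", func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprintln(w, "demo_metric 1")
	})
	agent := &http.Server{Handler: agentMux}
	go agent.Serve(ln)
	defer agent.Close()

	mux := http.NewServeMux()
	mux.Handle("/proxy-metrics", proxyHandler(newSocketClient(sock, 8*time.Second)))
	front := httptest.NewTLSServer(mux) // test server with its own certificate
	defer front.Close()

	resp, err := front.Client().Get(front.URL + "/proxy-metrics")
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	body, err := roundTrip()
	fmt.Printf("%q %v\n", body, err)
}
```

Because the agent side only listens on a socket file, nothing in this path binds a TCP port on the host; the only TCP listener belongs to the proxy, which in the PR runs on the container network.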

Outcome:

  • Port conflicts are eliminated.
  • Prometheus can scrape all pods without infrastructure changes.
  • No need to modify AWS security groups or cloud firewall rules.
  • Metrics are cleanly separated from core eBPF operations.

Cons:

  • Another DaemonSet, so more resource usage
  • Requires privileged: true to access the host-mounted Unix socket, but:
    • This does not widen the security surface, because the original bpfman-agent pod already ran with privileged: true
    • The metrics-proxy pod inherits only the minimum required permissions
    • The shared volume containing the Unix socket is mounted read-only
    • The proxy only reads metrics; it does not interact with eBPF or mutate state

frobware force-pushed the two-tier-metrics-agent-collection branch from 7c5612f to d8b3348 on May 27, 2025 17:38
frobware marked this pull request as draft on May 27, 2025 17:39
frobware force-pushed the two-tier-metrics-agent-collection branch 2 times, most recently from 27f95d5 to 77139a1, on May 28, 2025 10:25
The bpfman-agent DaemonSet runs with hostNetwork=true, which causes
multiple issues for metrics collection:

1. Port conflicts when pods on the same node bind to the same port
2. Cloud security groups blocking inter-node scraping
3. Non-headless Service preventing per-pod discovery

This change decouples metrics collection from hostNetwork by
introducing a dedicated metrics-proxy DaemonSet:

- Runs on the container network, avoiding host-level constraints
- Uses a headless Service for correct per-pod scraping
- Proxies bpfman-agent metrics via Unix domain socket
- Serves HTTPS on port 8443 with unified TLS handling:
  * OpenShift: uses Kubernetes service-serving certificates
  * Local/KIND: auto-generates self-signed certificates

The ServiceMonitor now targets /proxy-metrics on container-network
pods. This removes the need to open cloud firewall ports while
maintaining secure HTTPS-based scraping.

Signed-off-by: Andrew McDermott <[email protected]>
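
For the Local/KIND case mentioned in the commit message above, a self-signed certificate can be generated in memory with Go's standard library. This is a hedged sketch, not the code in this commit: the function names `selfSignedCert` and `demoNames`, the ECDSA P-256 key type, the common name, and the `localhost` DNS name are all assumptions chosen for illustration. The OpenShift service-serving certificate path is not shown.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// selfSignedCert (hypothetical helper) creates an ECDSA P-256 key and a
// certificate signed by that same key, valid for the given DNS names.
func selfSignedCert(dnsNames []string, validFor time.Duration) (tls.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return tls.Certificate{}, err
	}
	serial, err := rand.Int(rand.Reader, new(big.Int).Lsh(big.NewInt(1), 127))
	if err != nil {
		return tls.Certificate{}, err
	}
	now := time.Now()
	tmpl := &x509.Certificate{
		SerialNumber:          serial,
		Subject:               pkix.Name{CommonName: "metrics-proxy-self-signed"},
		DNSNames:              dnsNames,
		NotBefore:             now.Add(-time.Minute), // tolerate small clock skew
		NotAfter:              now.Add(validFor),
		KeyUsage:              x509.KeyUsageDigitalSignature,
		ExtKeyUsage:           []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
		BasicConstraintsValid: true,
	}
	// Template and parent are the same certificate, which makes it self-signed.
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return tls.Certificate{}, err
	}
	return tls.Certificate{Certificate: [][]byte{der}, PrivateKey: key}, nil
}

// demoNames generates a certificate for "localhost" and returns the DNS
// names recorded in it, as a quick check that the certificate parses.
func demoNames() ([]string, error) {
	cert, err := selfSignedCert([]string{"localhost"}, 24*time.Hour)
	if err != nil {
		return nil, err
	}
	leaf, err := x509.ParseCertificate(cert.Certificate[0])
	if err != nil {
		return nil, err
	}
	return leaf.DNSNames, nil
}

func main() {
	names, err := demoNames()
	fmt.Println(names, err)
}
```

The returned `tls.Certificate` can be placed in the `Certificates` field of a `tls.Config`. A scraper would still need to skip verification or trust this certificate explicitly, since no CA signs it.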
frobware force-pushed the two-tier-metrics-agent-collection branch from 77139a1 to 6d4810a on May 28, 2025 10:59
$ make bundle

Signed-off-by: Andrew McDermott <[email protected]>
frobware force-pushed the two-tier-metrics-agent-collection branch from 6d4810a to 157fc35 on May 28, 2025 11:00
frobware changed the title from "[WIP] Decouple metrics from hostNetwork via dedicated proxy" to "Decouple metrics from hostNetwork using proxy DaemonSet" on May 28, 2025