The progress of Dataupload is not updated frequently enough #8579

reasonerjt · 2025-01-06T08:03:20Z

Describe the problem/challenge you have
While testing a backup of a PV with 50Gi of data via CSI snapshot and data mover, I noticed that BYTES DONE stopped updating for several minutes at a time:

[~/velero-ws/tmp/v1.15.0]$ k get dataupload jt-nginx-bak-2-kdlbz -n velero
NAME                   STATUS       STARTED   BYTES DONE   TOTAL BYTES   STORAGE LOCATION   AGE     NODE
jt-nginx-bak-2-kdlbz   InProgress   7m53s     1576075264   52428800000   default            9m12s   ip-192-168-50-199.ec2.internal
[~/velero-ws/tmp/v1.15.0]$ k get dataupload jt-nginx-bak-2-kdlbz -n velero
NAME                   STATUS       STARTED   BYTES DONE   TOTAL BYTES   STORAGE LOCATION   AGE   NODE
jt-nginx-bak-2-kdlbz   InProgress   11m       5118558208   52428800000   default            12m   ip-192-168-50-199.ec2.internal
[~/velero-ws/tmp/v1.15.0]$ k get dataupload jt-nginx-bak-2-kdlbz -n velero
NAME                   STATUS       STARTED   BYTES DONE   TOTAL BYTES   STORAGE LOCATION   AGE   NODE
jt-nginx-bak-2-kdlbz   InProgress   11m       5118558208   52428800000   default            13m   ip-192-168-50-199.ec2.internal
[~/velero-ws/tmp/v1.15.0]$ k get dataupload jt-nginx-bak-2-kdlbz -n velero
NAME                   STATUS       STARTED   BYTES DONE   TOTAL BYTES   STORAGE LOCATION   AGE   NODE
jt-nginx-bak-2-kdlbz   InProgress   11m       5118558208   52428800000   default            13m   ip-192-168-50-199.ec2.internal
[~/velero-ws/tmp/v1.15.0]$ k get dataupload jt-nginx-bak-2-kdlbz -n velero
NAME                   STATUS       STARTED   BYTES DONE   TOTAL BYTES   STORAGE LOCATION   AGE   NODE
jt-nginx-bak-2-kdlbz   InProgress   12m       5118558208   52428800000   default            13m   ip-192-168-50-199.ec2.internal
[~/velero-ws/tmp/v1.15.0]$ k get dataupload jt-nginx-bak-2-kdlbz -n velero
NAME                   STATUS       STARTED   BYTES DONE   TOTAL BYTES   STORAGE LOCATION   AGE   NODE
jt-nginx-bak-2-kdlbz   InProgress   13m       5118558208   52428800000   default            14m   ip-192-168-50-199.ec2.internal
[~/velero-ws/tmp/v1.15.0]$ k get dataupload jt-nginx-bak-2-kdlbz -n velero
NAME                   STATUS       STARTED   BYTES DONE   TOTAL BYTES   STORAGE LOCATION   AGE   NODE
jt-nginx-bak-2-kdlbz   InProgress   14m       5118558208   52428800000   default            15m   ip-192-168-50-199.ec2.internal
[~/velero-ws/tmp/v1.15.0]$ k get dataupload jt-nginx-bak-2-kdlbz -n velero
NAME                   STATUS       STARTED   BYTES DONE   TOTAL BYTES   STORAGE LOCATION   AGE   NODE
jt-nginx-bak-2-kdlbz   InProgress   14m       5118558208   52428800000   default            15m   ip-192-168-50-199.ec2.internal
[~/velero-ws/tmp/v1.15.0]$ k get dataupload jt-nginx-bak-2-kdlbz -n velero
NAME                   STATUS       STARTED   BYTES DONE   TOTAL BYTES   STORAGE LOCATION   AGE   NODE
jt-nginx-bak-2-kdlbz   InProgress   14m       5118558208   52428800000   default            15m   ip-192-168-50-199.ec2.internal
[~/velero-ws/tmp/v1.15.0]$ k get dataupload jt-nginx-bak-2-kdlbz -n velero -w
NAME                   STATUS       STARTED   BYTES DONE   TOTAL BYTES   STORAGE LOCATION   AGE   NODE
jt-nginx-bak-2-kdlbz   InProgress   14m       5118558208   52428800000   default            15m   ip-192-168-50-199.ec2.internal

Because the uploaded bytes are reported through events on the Dataupload CR, I also checked the Events section of "kubectl describe dataupload":

$ k describe dataupload jt-nginx-bak-2-kdlbz -n velero
......
Events:
  Type    Reason              Age                   From                  Message
  ----    ------              ----                  ----                  -------
  Normal  Data-Path-Started   19m                   jt-nginx-bak-2-kdlbz  Data path for jt-nginx-bak-2-kdlbz started
  Normal  Data-Path-Progress  4m34s (x91 over 19m)  jt-nginx-bak-2-kdlbz  {"totalBytes":52428800000,"doneBytes":11760435200}
  
$ k describe dataupload jt-nginx-bak-2-kdlbz -n velero
......
Events:
  Type    Reason              Age                  From                  Message
  ----    ------              ----                 ----                  -------
  Normal  Data-Path-Started   20m                  jt-nginx-bak-2-kdlbz  Data path for jt-nginx-bak-2-kdlbz started
  Normal  Data-Path-Progress  41s (x121 over 20m)  jt-nginx-bak-2-kdlbz  {"totalBytes":52428800000,"doneBytes":15443755008}

From the output, the events appear to have been written as expected (every 10s), but the dataupload resource was not updated event by event. Instead, the events seem to have been picked up in bulk (~30 events in ~5 mins).
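For anyone who wants to track progress from the events directly while the CR lags, here is a minimal Go sketch that decodes the Data-Path-Progress message shown above. The field names come from the event output in this issue; the `Progress` type and the `ParseProgress` and `Percent` helpers are hypothetical names made up for illustration and are not Velero APIs.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Progress mirrors the JSON payload seen in the Data-Path-Progress event message.
// The type name is illustrative, not taken from Velero.
type Progress struct {
	TotalBytes int64 `json:"totalBytes"`
	DoneBytes  int64 `json:"doneBytes"`
}

// ParseProgress decodes one event message into a Progress value.
func ParseProgress(msg string) (Progress, error) {
	var p Progress
	err := json.Unmarshal([]byte(msg), &p)
	return p, err
}

// Percent returns completion as a percentage, or 0 when the total is unknown.
func (p Progress) Percent() float64 {
	if p.TotalBytes <= 0 {
		return 0
	}
	return float64(p.DoneBytes) / float64(p.TotalBytes) * 100
}

func main() {
	// The two messages copied from the describe output in this issue.
	msgs := []string{
		`{"totalBytes":52428800000,"doneBytes":11760435200}`,
		`{"totalBytes":52428800000,"doneBytes":15443755008}`,
	}
	for _, m := range msgs {
		p, err := ParseProgress(m)
		if err != nil {
			fmt.Println("skipping malformed message:", err)
			continue
		}
		fmt.Printf("%d / %d bytes (%.1f%%)\n", p.DoneBytes, p.TotalBytes, p.Percent())
	}
}
```

Note that both messages already report more bytes than the 5118558208 shown in the BYTES DONE column earlier, which is consistent with the CR lagging behind the events.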

Let's double-check whether this behavior is by design in k8s and whether there is a way to improve the responsiveness.

Describe the solution you'd like
The upload progress should be updated more frequently in the dataupload CR.

Environment:

Velero version (use velero version): v1.15.0
Kubernetes version (use kubectl version): v1.29.12-eks-2d5f260
Kubernetes installer & version:
Cloud provider or hardware configuration: EKS

Vote on this issue!

This is an invitation to the Velero community to vote on issues; you can see the project's top voted issues listed here.
Use the "reaction smiley face" up to the right of this comment to vote.

👍 for "The project would be better with this feature added"
👎 for "This feature will not enhance the project in a meaningful way"

reasonerjt added the Area/CSI (Related to Container Storage Interface support) and area/datamover labels Jan 6, 2025

reasonerjt assigned Lyndon-Li Jan 6, 2025

Lyndon-Li added this to the v1.16 milestone Jan 8, 2025

Lyndon-Li mentioned this issue Jan 8, 2025

Issue 8579 - set event burst #8590

Merged

Lyndon-Li closed this as completed Jan 10, 2025
