Made segment upload to remote async #18333

Open
wants to merge 1 commit into
base: main

@ask-kamal-nayan commented May 19, 2025

Description

This PR changes the runAfterRefreshWithPermit execution to run asynchronously via the thread pool scheduler, so the refresh thread is no longer blocked while segments are uploaded to remote storage. Primary shards become searchable as soon as the refresh completes, without waiting for the segments to be uploaded to remote storage.

Changes

  • Modified the runAfterRefreshWithPermit execution to run asynchronously using threadPool.schedule
  • The operation will be scheduled with zero delay to maintain immediate execution while being async
  • Uses the retry thread pool for scheduling the task
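
The scheduling pattern described in the list above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the PR's code: a java.util.concurrent.ScheduledExecutorService stands in for OpenSearch's thread pool, and the class AsyncAfterRefreshSketch and its methods are hypothetical names. It shows only that a zero-delay schedule call returns to the caller before the scheduled work finishes.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the refresh listener; not OpenSearch code.
class AsyncAfterRefreshSketch {

    // Assumption: plays the role of the retry thread pool named in the PR.
    private final ScheduledExecutorService scheduler;

    AsyncAfterRefreshSketch(ScheduledExecutorService scheduler) {
        this.scheduler = scheduler;
    }

    // Called on the "refresh thread". Hands the upload to the scheduler with
    // zero delay and returns immediately instead of running it inline.
    void afterRefresh(Runnable uploadSegments) {
        scheduler.schedule(uploadSegments, 0, TimeUnit.MILLISECONDS);
    }

    // Returns true if afterRefresh returned while the upload was still pending
    // and the upload then completed once it was allowed to proceed.
    static boolean refreshReturnsBeforeUploadCompletes() {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch releaseUpload = new CountDownLatch(1);
        CountDownLatch uploadDone = new CountDownLatch(1);
        try {
            AsyncAfterRefreshSketch listener = new AsyncAfterRefreshSketch(scheduler);
            listener.afterRefresh(() -> {
                try {
                    releaseUpload.await();   // simulate a slow remote upload
                    uploadDone.countDown();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            // Back on the calling thread: the upload cannot have finished yet.
            boolean returnedEarly = uploadDone.getCount() == 1;
            releaseUpload.countDown();
            boolean completed = uploadDone.await(5, TimeUnit.SECONDS);
            return returnedEarly && completed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            scheduler.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("refresh returned before upload completed: "
            + refreshReturnsBeforeUploadCompletes());
    }
}
```

One consequence worth noting: once the work is handed off, any failure inside the upload surfaces on the scheduler thread rather than on the refresh thread, so callers that previously observed the result synchronously would need another way to learn the outcome.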

Check List

  • [ Yes] Functionality includes testing.
    • Tested by ingesting docs and checking that they remain searchable from primary as well as replica shards.
    • Tested by checking that the primary and replica sizes become the same after doc ingestion.
  • [ No] API changes companion pull request
    • Not applicable - This change is internal implementation only and doesn't affect public APIs
  • [ Yes] Public documentation issue/PR [created](ToDo: block fetch doc?)

Potential Risks

  • Monitor scenarios where multiple calls hit runAfterRefreshWithPermit within a very short span of time.
  • Monitor the segment upload metrics to check for any anomaly caused by this change.
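
To make the first risk concrete, here is a hedged sketch of how closely spaced calls might interact with a permit. It assumes the permit behaves like a single-permit semaphore acquired without blocking, which the PR text does not confirm; PermitGuardSketch and its counters are hypothetical names introduced for illustration. The overlap is simulated by nesting a call inside a running task so the demo stays deterministic.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; names and behavior are illustrative, not OpenSearch's.
class PermitGuardSketch {

    // Assumption: one permit, so at most one task runs at a time.
    private final Semaphore permit = new Semaphore(1);
    private final AtomicInteger executed = new AtomicInteger();
    private final AtomicInteger skipped = new AtomicInteger();

    // Runs the task only if the permit is free; otherwise records a skip.
    boolean runWithPermit(Runnable task) {
        if (!permit.tryAcquire()) {
            skipped.incrementAndGet();
            return false;
        }
        try {
            task.run();
            executed.incrementAndGet();
            return true;
        } finally {
            permit.release();
        }
    }

    int executed() {
        return executed.get();
    }

    int skipped() {
        return skipped.get();
    }

    // Returns {executed, skipped} after one overlapping call and one later call.
    static int[] overlappingCalls() {
        PermitGuardSketch guard = new PermitGuardSketch();
        // The inner call arrives while the outer one still holds the permit,
        // so it is skipped (a Semaphore is not reentrant).
        guard.runWithPermit(() -> guard.runWithPermit(() -> {}));
        // The permit has been released by now, so this call runs.
        guard.runWithPermit(() -> {});
        return new int[] { guard.executed(), guard.skipped() };
    }

    public static void main(String[] args) {
        int[] counts = overlappingCalls();
        System.out.println("executed=" + counts[0] + " skipped=" + counts[1]);
    }
}
```

A skipped-call counter of this kind is one possible signal to watch when checking whether asynchronous scheduling increases contention on the permit.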

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Contributor

❌ Gradle check result for d09228f: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@ashking94
Member

❌ Gradle check result for d09228f: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

I see 285 test failures. Can we fix them, please?

@ashking94
Member

Let's make sure we rerun the failing tests for around 1k iterations locally to confirm once we have fixed them.

Contributor

❌ Gradle check result for d09228f: FAILURE

Contributor

❌ Gradle check result for 08fe462: FAILURE

Contributor

❌ Gradle check result for 25b3eb3: FAILURE

Contributor

❌ Gradle check result for 3823c96: FAILURE

Contributor

❌ Gradle check result for 1d048ec: FAILURE

Contributor

❌ Gradle check result for 7c4e809: FAILURE

Signed-off-by: Kamal Nayan <[email protected]>

Minor format and nit fixes; also updated the ITs

Signed-off-by: Kamal Nayan <[email protected]>

Updated the integ tests

Signed-off-by: Kamal Nayan <[email protected]>
Contributor

❌ Gradle check result for 14ba1e3: FAILURE

2 participants