Slow startup when many threads exist #1678

Open
abbeyj opened this issue Apr 16, 2025 · 2 comments
Labels
enhancement Extension or improvement to existing feature Linux 🐧 Linux related issues


abbeyj commented Apr 16, 2025

This may be related to #1535.

When you have a large number of threads running (e.g. 10000), htop takes a noticeable amount of time to start up and display the first output.

I have the settings "Hide kernel threads" and "Hide userland process threads" turned on. I have the setting "Highlight out-dated/removed programs (red) / libraries (yellow)" turned off. I am not displaying the M_LRS column. I'm measuring startup time with `time ./htop < /dev/null`. To create a lot of threads for testing, I'm using `perl -Mthreads -e 'sub thd { sleep; } threads->create(\&thd) for (1 .. 8000); print "All threads created...\n"; sleep;'`

When there are only a handful of threads running, htop will start up quickly, in 0.1s or less. When many threads are running, this gets slower, taking a second or several seconds. I think there are a few related factors that are contributing to this.

First, LinuxProcessTable_recurseProcTree will try to recurse into the task directory. At the top level this makes sense because /proc/N/task will exist and will contain a list of the threads for PID N. At the next level, this will try to open /proc/N/task/M/task. At least on my system this never exists and always fails with ENOENT. All those extra syscalls trying to check for a directory that does not exist add up. Changing

LinuxProcessTable_recurseProcTree(this, procFd, lhost, "task", lp);
to

      if (!mainTask)
        LinuxProcessTable_recurseProcTree(this, procFd, lhost, "task", lp);

short-circuits this and avoids the extra open attempts. This shows an improvement of 15% or so for me. It may show a bigger effect if there is a virus scanner hooking into all open calls and adding extra overhead to them. If there is some situation in which the task directory might show up at the second level then this change wouldn't work.
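
To get a feel for how many of these second-level opens are wasted on a given machine, the following is a minimal sketch and not htop code. The type `ProbeStats` and the function `probe_nested_task_dirs` are hypothetical names introduced here for illustration. It walks a task directory, tries to open the nested `task` directory under each thread, and counts how many attempts fail with ENOENT.

```c
#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <errno.h>
#include <stdio.h>

/* Counters for the second-level probes (hypothetical helper type). */
typedef struct {
   unsigned long attempted; /* nested "task" directories we tried to open */
   unsigned long missing;   /* attempts that failed with ENOENT */
} ProbeStats;

/* For every entry in taskDirPath whose name starts with a digit (thread IDs),
 * try to open "<taskDirPath>/<tid>/task" and record whether it exists.
 * Returns 0 on success, -1 if taskDirPath itself cannot be opened. */
static int probe_nested_task_dirs(const char* taskDirPath, ProbeStats* stats) {
   assert(taskDirPath != NULL && stats != NULL);

   stats->attempted = 0;
   stats->missing = 0;

   DIR* dir = opendir(taskDirPath);
   if (!dir)
      return -1;

   const struct dirent* entry;
   while ((entry = readdir(dir)) != NULL) {
      /* Skip ".", ".." and anything else that is not a thread ID. */
      if (!isdigit((unsigned char)entry->d_name[0]))
         continue;

      char nested[4096];
      int len = snprintf(nested, sizeof(nested), "%s/%s/task", taskDirPath, entry->d_name);
      if (len < 0 || (size_t)len >= sizeof(nested))
         continue;

      stats->attempted++;
      errno = 0;
      DIR* sub = opendir(nested);
      if (sub)
         closedir(sub);
      else if (errno == ENOENT)
         stats->missing++;
   }

   closedir(dir);
   return 0;
}
```

Calling `probe_nested_task_dirs("/proc/N/task", &stats)` for the PID of the thread-heavy test process and comparing `stats.missing` with `stats.attempted` would show whether every nested open is wasted, as described above. If `missing` is ever lower than `attempted`, the nested directory does exist on that system and the short-circuit would need another look.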

Second, Machine_scanTables is called 3 times on startup. Two calls are from CommandLine_run and one is from ScreenManager_run => checkRecalculation. The two calls in CommandLine_run are intentional according to eb196f8. The fact that this is being run 3 times instead of 2 is seemingly not intended though. Avoiding the call in checkRecalculation looked difficult so I removed the second call in CommandLine_run with no apparent ill effects. That resulted in another ~15% improvement on top of the first change.

If this function was always quick to run then running it twice would be fine. But since it is sometimes slow to run, maybe the decision to run it twice should be revisited? I know that I'd personally prefer to have the UI drawn and responsive earlier if the downside was that some meters might need an additional update cycle. I might not even be displaying those meters. Or I might be running htop on this particular occasion to look at the process list and will not even be looking at the meter with the problem at all.

Removing both calls in CommandLine_run (and the call to CommandLine_delay) makes this even faster, getting an extra 20% on top of the version that only removes one call. With all changes combined this results in a ~48% reduction in startup time.

Member

BenBE commented Apr 16, 2025

IIRC the third call may be related to highlighting of new/terminated processes. The first call is made with a timestamp of 0 intentionally to set the process creation time far into the past.

The proposed change to skip the recursion when not inside the main task seems reasonable.

@BenBE BenBE added enhancement Extension or improvement to existing feature Linux 🐧 Linux related issues labels Apr 16, 2025
BenBE added a commit to BenBE/htop that referenced this issue May 9, 2025
The view of tasks/threads that belong to a process is presented as a
flat list in the process' task subdirectory in procfs. There is no
need to dig any deeper than that.

Fixes: htop-dev#1678

Co-Authored-By: James Abbatiello <[email protected]>
BenBE added a commit to BenBE/htop that referenced this issue May 9, 2025
As detailed in commit eb196f8 the UI should receive at least
two information gathering cycles before the first data is displayed.

Previously, as of the commit mentioned above, there were two explicit
call sites that updated the Machine/Process information. These were
located as follows:

- CommandLine_run (First):
    Initial scan; used to gather the list of processes as a baseline.
    Also established a baseline for many of the available meters.

- CommandLine_run (Second, removed in this commit):
    Follow-up scan to refresh information.
    This scan is the first to use a proper monotonic clock reference.

- ScreenManager_run -> checkRecalculation (Third):
    Run for every refresh cycle at runtime.
    First scan that allows for proper rate information to be calculated.

With the change in this commit the second of these calls is removed,
causing the first set of values displayed to lack proper rate information.

Fixes: htop-dev#1678

Co-Authored-By: James Abbatiello <[email protected]>
@BenBE BenBE mentioned this issue May 9, 2025