When you have a large number of threads running (e.g. 10000), htop takes a noticeable amount of time to start up and display the first output.
I have the settings "Hide kernel threads" and "Hide userland process threads" turned on. I have the setting "Highlight out-dated/removed programs (red) / libraries (yellow)" turned off. I am not displaying the M_LRS column. I'm measuring startup time with time ./htop < /dev/null. To create a lot of threads for testing, I'm using perl -Mthreads -e 'sub thd { sleep; } threads->create(\&thd) for (1 .. 8000); print "All threads created...\n"; sleep;'
When there are only a handful of threads running, htop starts up quickly, in 0.1s or less. When many threads are running, startup gets slower, taking a second or several seconds. I think there are a few related factors contributing to this.
First, LinuxProcessTable_recurseProcTree will try to recurse into the task directory. At the top level this makes sense because /proc/N/task will exist and will contain a list of the threads for PID N. At the next level, this will try to open /proc/N/task/M/task. At least on my system this never exists and always fails with ENOENT. All those extra syscalls checking for a directory that does not exist add up. Changing the code at htop/linux/LinuxProcessTable.c line 1603 (in 987a47f) to
    if (!mainTask)
        LinuxProcessTable_recurseProcTree(this, procFd, lhost, "task", lp);
short-circuits this and avoids the extra open attempts. This shows an improvement of 15% or so for me. It may show a bigger effect if there is a virus scanner hooking into all open calls and adding extra overhead to them. If there is some situation in which the task directory might show up at the second level then this change wouldn't work.
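To get a feel for how many open attempts the guard saves, here is a toy model in C. It is only a sketch: it does not touch /proc and is not htop's code. The names ScanStats, scanLevel and countTaskOpens are made up for illustration, and the model assumes that /proc/N/task/M/task never exists, as observed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: counts how often a scanner would try to open a "task"
 * directory. Hypothetical names; not htop's actual data structures. */
typedef struct {
    size_t openAttempts;   /* every attempt to open <dir>/task */
    size_t failedAttempts; /* attempts assumed to end in ENOENT */
} ScanStats;

/* Walk one directory level that holds `entries` numeric entries.
 * insideTask is false for /proc and true for /proc/N/task.
 * guard models the proposed `if (!mainTask)` check. */
static void scanLevel(size_t entries, size_t threadsPerProcess,
                      bool insideTask, bool guard, ScanStats* stats) {
    assert(stats != NULL);
    for (size_t i = 0; i < entries; i++) {
        if (guard && insideTask)
            continue; /* proposed change: never look for task/M/task */

        stats->openAttempts++;
        if (insideTask) {
            /* Assumption: the nested task directory does not exist */
            stats->failedAttempts++;
        } else {
            /* /proc/N/task exists and lists the threads of process N */
            scanLevel(threadsPerProcess, threadsPerProcess, true, guard, stats);
        }
    }
}

/* Count open attempts for `processes` processes with the same thread count each */
ScanStats countTaskOpens(size_t processes, size_t threadsPerProcess, bool guard) {
    ScanStats stats = {0, 0};
    scanLevel(processes, threadsPerProcess, false, guard, &stats);
    return stats;
}
```

For one process with 8000 threads, countTaskOpens(1, 8000, false) counts 8001 attempts, 8000 of them failing, while countTaskOpens(1, 8000, true) counts a single attempt. How much wall-clock time that saves depends on the cost of each failed open on the system in question.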
Second, Machine_scanTables is called 3 times on startup. Two calls are from CommandLine_run and one is from ScreenManager_run => checkRecalculation. The two calls in CommandLine_run are intentional according to eb196f8. The fact that this is being run 3 times instead of 2 is seemingly not intended though. Avoiding the call in checkRecalculation looked difficult so I removed the second call in CommandLine_run with no apparent ill effects. That resulted in another ~15% improvement on top of the first change.
If this function was always quick to run then running it twice would be fine. But since it is sometimes slow to run, maybe the decision to run it twice should be revisited? I know that I'd personally prefer to have the UI drawn and responsive earlier if the downside was that some meters might need an additional update cycle. I might not even be displaying those meters. Or I might be running htop on this particular occasion to look at the process list and will not even be looking at the meter with the problem at all.
Removing both calls in CommandLine_run (and the call to CommandLine_delay) makes this even faster, getting an extra 20% on top of the version that only removes one call. With all changes combined this results in a ~48% reduction in startup time.
IIRC the third call may be related to highlighting of new/terminated processes. The first call is intentionally made with a timestamp of 0 to set the process creation time far into the past.
The proposed change to avoid recursing when not inside the main task seems reasonable.
The view of tasks/threads that belong to a process is presented as a
flat list in the process' task subdirectory in procfs. There is no
need to dig any deeper than that.
Fixes: htop-dev#1678
Co-Authored-By: James Abbatiello <[email protected]>
BenBE added a commit to BenBE/htop that referenced this issue on May 9, 2025
As detailed in commit eb196f8 the UI should receive at least
two information gathering cycles before the first data is displayed.
Previously, as of the commit mentioned above, there were two explicit
call sites that updated the Machine/Process information. These were
located as follows:
- CommandLine_run (First):
Initial scan; used to gather the list of processes as a baseline.
Also established a baseline for many of the available meters.
- CommandLine_run (Second, removed in this commit):
Follow-up scan to refresh information.
This scan is the first to use a proper monotonic clock reference.
- ScreenManager_run -> checkRecalculation (Third):
Run for every refresh cycle at runtime.
First scan that allows for proper rate information to be calculated.
With the change in this commit the second of these calls is removed,
causing the first set of values displayed to lack proper rate information.
Fixes: htop-dev#1678
Co-Authored-By: James Abbatiello <[email protected]>
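The point about rate information can be illustrated with a small sketch: a rate is a difference between two samples, so the first scan can only establish a baseline. The RateSampler type and RateSampler_update function below are hypothetical and are not part of htop; they only assume a cumulative counter and a monotonic timestamp in milliseconds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sampler, not an htop type */
typedef struct {
    bool hasBaseline;    /* set once the first scan has been recorded */
    uint64_t lastValue;  /* cumulative counter seen in the previous scan */
    uint64_t lastTimeMs; /* monotonic timestamp of the previous scan */
} RateSampler;

/* Record one scan. Returns true and writes *ratePerSec only when a
 * previous scan exists to compare against; the first call returns false. */
bool RateSampler_update(RateSampler* s, uint64_t value, uint64_t nowMs,
                        double* ratePerSec) {
    assert(s != NULL && ratePerSec != NULL);

    bool valid = false;
    if (s->hasBaseline && nowMs > s->lastTimeMs && value >= s->lastValue) {
        double delta = (double)(value - s->lastValue);
        double elapsedMs = (double)(nowMs - s->lastTimeMs);
        *ratePerSec = delta * 1000.0 / elapsedMs;
        valid = true;
    }

    /* This scan becomes the baseline for the next one */
    s->hasBaseline = true;
    s->lastValue = value;
    s->lastTimeMs = nowMs;
    return valid;
}
```

With a zero-initialized sampler, the first update only stores the baseline and reports no rate; a second update 500 ms later with the counter advanced by 500 yields 1000 units per second. A display that draws after a single scan would therefore have nothing meaningful to show for such a value.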
This may be related to #1535.