Slow startup when many threads exist #1678

abbeyj · 2025-04-16T20:41:35Z

This may be related to #1535.

When you have a large number of threads running (e.g. 10000), htop takes a noticeable amount of time to start up and display the first output.

I have the settings "Hide kernel threads" and "Hide userland process threads" turned on. I have the setting "Highlight out-dated/removed programs (red) / libraries (yellow)" turned off. I am not displaying the M_LRS column. I'm measuring startup time with time ./htop < /dev/null. To create a lot of threads for testing, I'm using perl -Mthreads -e 'sub thd { sleep; } threads->create(\&thd) for (1 .. 8000); print "All threads created...\n"; sleep;'

When there are only a handful of threads running, htop will start up quickly, in 0.1s or less. When many threads are running, this gets slower, taking a second or several seconds. I think there are a few related factors that are contributing to this.

First, LinuxProcessTable_recurseProcTree will try to recurse into the task directory. At the top level this makes sense because /proc/N/task will exist and will contain a list of the threads for PID N. At the next level, this will try to open /proc/N/task/M/task. At least on my system this never exists and always fails with ENOENT. All those extra syscalls trying to check for a directory that does not exist add up. Changing

htop/linux/LinuxProcessTable.c

Line 1603 in 987a47f

LinuxProcessTable_recurseProcTree(this, procFd, lhost, "task", lp);

to

      if (!mainTask)
        LinuxProcessTable_recurseProcTree(this, procFd, lhost, "task", lp);

short-circuits this and avoids the extra open attempts. This shows an improvement of 15% or so for me. It may show a bigger effect if there is a virus scanner hooking into all open calls and adding extra overhead to them. If there is some situation in which the task directory might show up at the second level then this change wouldn't work.
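To make the effect of the guard concrete, here is a minimal standalone sketch of a procfs-style walk. It is not htop's code: `WalkStats`, `walk`, and the `topLevel`/`guard` parameters are hypothetical names made up for this illustration. It only assumes the directory layout described above (numeric entries, each top-level entry holding a flat `task` directory).

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* Counters for this illustration (hypothetical, not part of htop). */
typedef struct {
   unsigned tasks;        /* numeric entries visited */
   unsigned failedOpens;  /* openat() calls on a directory that failed */
} WalkStats;

/* Walk the numeric entries of `dirname` below `parentFd`.
 * `topLevel` is true while scanning processes and false while scanning
 * the threads of one process. With `guard` set, the "task" directory is
 * only opened for top-level entries. */
static void walk(int parentFd, const char* dirname, bool topLevel, bool guard, WalkStats* stats) {
   assert(stats != NULL);

   int dirFd = openat(parentFd, dirname, O_RDONLY | O_DIRECTORY);
   if (dirFd < 0) {
      stats->failedOpens++;
      return;
   }

   DIR* dir = fdopendir(dirFd);
   if (!dir) {
      close(dirFd);
      return;
   }

   const struct dirent* entry;
   while ((entry = readdir(dir)) != NULL) {
      /* Only numeric names are PIDs/TIDs; this also skips "." and "..". */
      if (!isdigit((unsigned char)entry->d_name[0]))
         continue;

      int entryFd = openat(dirFd, entry->d_name, O_RDONLY | O_DIRECTORY);
      if (entryFd < 0)
         continue;

      stats->tasks++;

      /* Without the guard, every thread entry costs one failing open. */
      if (topLevel || !guard)
         walk(entryFd, "task", false, guard, stats);

      close(entryFd);
   }

   closedir(dir);  /* also closes dirFd */
}
```

Calling `walk(AT_FDCWD, "/proc", true, false, &stats)` and then again with `guard` set to true should show `failedOpens` dropping by roughly the number of threads on a system where the nested `task` directory never exists.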

Second, Machine_scanTables is called 3 times on startup. Two calls are from CommandLine_run and one is from ScreenManager_run => checkRecalculation. The two calls in CommandLine_run are intentional according to eb196f8. The fact that this is being run 3 times instead of 2 is seemingly not intended though. Avoiding the call in checkRecalculation looked difficult so I removed the second call in CommandLine_run with no apparent ill effects. That resulted in another ~15% improvement on top of the first change.

If this function was always quick to run then running it twice would be fine. But since it is sometimes slow to run, maybe the decision to run it twice should be revisited? I know that I'd personally prefer to have the UI drawn and responsive earlier if the downside was that some meters might need an additional update cycle. I might not even be displaying those meters. Or I might be running htop on this particular occasion to look at the process list and will not even be looking at the meter with the problem at all.

Removing both calls in CommandLine_run (and the call to CommandLine_delay) makes this even faster, getting an extra 20% on top of the version that only removes one call. With all changes combined this results in a ~48% reduction in startup time.


BenBE · 2025-04-16T21:15:09Z

IIRC the third call may be related to highlighting of new/terminated processes. The first call is made with a timestamp of 0 intentionally to set the process creation time far into the past.
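As a rough illustration of why a timestamp of 0 helps here, consider a check of the following shape. This is only a sketch under assumptions: `is_recently_seen` and its parameters are hypothetical names, and htop's real highlighting logic may differ in its details.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not htop's actual code: an entry counts as "new"
 * while no more than `highlightDelayMs` has passed since it was first seen. */
static bool is_recently_seen(uint64_t seenStampMs, uint64_t nowMs, uint64_t highlightDelayMs) {
   assert(highlightDelayMs > 0);

   /* A stamp from the future cannot be "recent"; avoid unsigned wrap-around. */
   if (nowMs < seenStampMs)
      return false;

   return nowMs - seenStampMs <= highlightDelayMs;
}
```

With a stamp of 0 recorded during the first scan, every process that already existed at startup looks old as soon as the monotonic clock exceeds the delay, so only processes first seen in later scans would be highlighted as new.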

The proposed change to skip the recursion when not in the main task seems reasonable.

The view of tasks/threads that belong to a process is presented as a flat list in the process' task subdirectory in procfs. There is no need to dig any deeper than that.

Fixes: htop-dev#1678
Co-Authored-By: James Abbatiello <[email protected]>

As detailed in commit eb196f8 the UI should receive at least two information gathering cycles before the first data is displayed.

Previously, as of the commit mentioned above, there were two explicit call sites that updated the Machine/Process information. These were located as follows:

- CommandLine_run (First): Initial scan; used to gather the list of processes as a baseline. Also established a baseline for many of the available meters.
- CommandLine_run (Second, removed in this commit): Follow-up scan to refresh information. This scan is the first to use a proper monotonic clock reference.
- ScreenManager_run -> checkRecalculation (Third): Run for every refresh cycle at runtime. First scan that allows for proper rate information to be calculated.

With the change in this commit the second of these calls is removed, causing the first set of values displayed to lack proper rate information.

Fixes: htop-dev#1678
Co-Authored-By: James Abbatiello <[email protected]>

BenBE added the enhancement (Extension or improvement to existing feature) and Linux 🐧 (Linux related issues) labels on Apr 16, 2025

BenBE mentioned this issue May 9, 2025

Startup boost #1696

Open
