> The Linux kernel retains the term "task" for "execution context"
This has historical roots in the first UNIX versions written for DEC computers. Although the UNIX authors usually preferred the term "process", the DEC documentation for their computers and operating systems always used the term "task" for "process". So the term was also used in UNIX in various places.
> I don't remember any time that I wanted to join on a group of threads
This is not surprising, because there are many kinds of multi-threaded applications and many styles of programming them.
I do not doubt that what you say is correct for your applications, but my experience has been the opposite. I have never encountered a case where I wanted to join a single thread, but I have encountered many cases where I wanted to join any of a group of threads (e.g. to keep the number of active threads matched to the number of available cores), and a smaller number of cases where I wanted to join all of a group of threads, typically at the end of an operation. The latter case is less important, because it can be done by repeating a single-thread join, even if that is much less efficient than a single wait for all of them.
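To make the "join any of a group" case concrete, here is a minimal sketch in Python, chosen only because its standard library exposes a wait-for-any primitive directly (`concurrent.futures.wait` with `FIRST_COMPLETED`). The helper name `run_bounded` and its `limit` parameter are made up for illustration, and the executor would cap concurrency by itself; the explicit loop is there only to show where the join-any step goes.

```python
import os
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED


def run_bounded(jobs, limit=None):
    """Run the callables in `jobs`, keeping at most `limit` in flight.

    `run_bounded` and `limit` are hypothetical names for this sketch.
    """
    limit = limit or os.cpu_count() or 1
    results = []
    pending = set()
    with ThreadPoolExecutor(max_workers=limit) as pool:
        for job in jobs:
            if len(pending) >= limit:
                # "Join any": block until at least one in-flight job finishes,
                # then refill the freed slot on the next line.
                done, pending = wait(pending, return_when=FIRST_COMPLETED)
                results.extend(f.result() for f in done)
            pending.add(pool.submit(job))
        # "Join all" at the end of the operation.
        done, _ = wait(pending)
        results.extend(f.result() for f in done)
    return results
```

Results come back in completion order, not submission order, which is usually what you want when the goal is just to keep the cores busy.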
> are not part of the C or C++ language, but Windows system calls
This is precisely what I have already said in my first post, i.e. that standardized languages like C/C++ are forced to specify only the minimal features that are available on all operating systems, so they cannot include WaitForMultipleEvents, while PL/I was free of such portability concerns, so it could specify more powerful features.
> This is precisely what I have already said in my first post, i.e. that standardized languages like C/C++ are forced to specify only the minimal features that are available on all operating systems, so they cannot include WaitForMultipleEvents, while PL/I was free of such portability concerns,
I think you missed my point. WaitForMultipleEvents is not part of a thread API on any platform. It's a part of the platform API, and is used by single-threaded and multi-threaded code. There's no reason for pthreads (or any other thread API) to represent this system call, because the system call either exists, and can be used directly, or does not exist, and cannot be used.
In essence, you're really just noting that POSIX (not pthreads) never had a wait-for-just-about-anything API. That's a legitimate complaint, just not very relevant for multithreaded programming.
> This is not surprising
Well, given that you said "This is almost never what you want.", I'd count it as at least a little surprising. My point was that multi-join is not "almost never what you want", but has always been "useful in certain contexts". I have never come across a multi-join API that blocks until all threads have completed (they typically return when any of the specified threads completes), and so the efficiency of this version of multi-join is essentially identical to that of a loop of single joins.
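For reference, the loop+single-join form I have in mind looks like the sketch below (Python for brevity; `join_all` and `demo` are names invented for this example, and the delays are arbitrary). Each `join` returns immediately if its thread has already finished, so the loop as a whole blocks roughly as long as the slowest thread, whatever order the threads are joined in.

```python
import threading
import time


def join_all(threads):
    """Join-all expressed as a loop of single joins."""
    for t in threads:
        # Returns at once if this thread has already finished.
        t.join()


def demo():
    delays = [0.3, 0.1, 0.2]  # arbitrary illustrative workloads, in seconds
    threads = [threading.Thread(target=time.sleep, args=(d,)) for d in delays]
    start = time.monotonic()
    for t in threads:
        t.start()
    join_all(threads)
    elapsed = time.monotonic() - start
    return elapsed, [t.is_alive() for t in threads]
```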
> This has historical roots from the first UNIX versions written for DEC computers.
I don't see much evidence for this claim. task_t exists in early versions of AIX and Mach, and the terminology was already common in Multics (as you know). I don't think that Linux's use of task_t has any relationship to the Ultrix use, but maybe you have some specific insight here?
> Even if the UNIX authors usually preferred the term "process"
The OS I learned on was called CTOS, which used the term "process" to refer to "the basic unit of code that competes in the scheduler for access to the CPU". A "task" was essentially what we'd now call a program, complete with libraries and sub-processes. We didn't use the term "thread". I think CTOS dates to about 1981.