bitwiseworks / pthread-os2
OS/2 implementation of pthread
Currently, error codes from Dos APIs are either not analyzed or generalized too broadly into something not very useful, such as EINVAL or ENOMEM. LIBCn now provides __libc_native2errno, which does a good job of converting OS/2 errors to POSIX errors. It should be used in appropriate places (with a reference to the POSIX documentation regarding the ERRORS that pthread APIs should return in particular cases).
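A minimal sketch of the idea, assuming a per-API wrapper that first handles the codes POSIX names explicitly and only then falls back to a generic converter. The names join_rc_to_errno and native_to_errno_stub are hypothetical; the stub merely stands in for __libc_native2errno so the sketch builds outside OS/2, and the numeric OS/2 codes are redefined locally for the same reason.

```c
#include <assert.h>
#include <errno.h>

/* OS/2 return codes, redefined here only so the sketch builds anywhere;
 * real code would take them from the OS/2 toolkit headers. */
#define NO_ERROR                 0
#define ERROR_INVALID_HANDLE     6
#define ERROR_NOT_ENOUGH_MEMORY  8
#define ERROR_INVALID_THREADID   309

/* Hypothetical stand-in for __libc_native2errno (generic fallback mapping). */
static int native_to_errno_stub(unsigned long rc)
{
    switch (rc) {
    case NO_ERROR:                return 0;
    case ERROR_INVALID_HANDLE:    return EBADF;
    case ERROR_NOT_ENOUGH_MEMORY: return ENOMEM;
    default:                      return EINVAL;
    }
}

/* Hypothetical helper: map a DosWaitThread result to the errors POSIX lists
 * for pthread_join, falling back to the generic converter otherwise. */
int join_rc_to_errno(unsigned long rc)
{
    if (rc == ERROR_INVALID_THREADID)
        return ESRCH;   /* POSIX: no thread found for the given thread ID */
    return native_to_errno_stub(rc);
}
```

Keeping the API-specific cases in front of the generic conversion is one way to make sure each pthread function only returns values from its own ERRORS list.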
This is needed for e.g. V8 sources in Chromium (bitwiseworks/qtwebengine-chromium-os2#3).
Our implementation is based on pthreads4w, formerly known as pthreads-win32. There have been many changes since we last synced in 2010 (see e4528d9, 171b54d). In particular, the latest win32 implementation supports PTHREAD_MUTEX_ROBUST and other nice things.
Once this is done, #9 should no longer be necessary (and could be undone).
Not urgent (we don't have projects that strongly depend on the new stuff), so postponed.
This is a regression of ed15fbd.
The story is that pthread_condattr_init used to be a no-op in previous builds, so apps calling it did nothing to their pthread_condattr_t variable instead of properly initializing it. In previous pthread builds, pthread_cond_init also simply ignored its attr argument, but now it no longer does so and assumes attr was properly initialized. What follows depends on luck: if the argument the app passed to pthread_condattr_init contains garbage that is an invalid memory location, we get a crash like this:
______________________________________________________________________
Exception Report - created 2021/09/28 18:07:13
______________________________________________________________________
LIBC: Killed by SIGSEGV
Hostname: novator
OS2/eCS Version: 2.45
# of Processors: 4
Physical Memory: 3260 mb
Virt Addr Limit: 3072 mb
Exceptq Version: 7.11.5-shl BETA8 (Jun 1 2020 18:37:02)
______________________________________________________________________
Exception C0000005 - Access Violation
______________________________________________________________________
Process: C:\USR\BIN\CREATEREPO_C.EXE (09/23/2021 16:42:07 22,398)
PID: 29D (669)
TID: 01 (1)
Priority: 200
Filename: C:\USR\LIB\PTHR01.DLL (09/27/2021 16:05:25 7,640)
Address: 005B:1DDE079B (0001:0000079B)
Cause: Attempted to read from 00000038
(not a valid address)
______________________________________________________________________
Failing Instruction
______________________________________________________________________
1DDE0793 JZ 0x1dde07d8 (74 43)
1DDE0795 MOV EDX, [EDI] (8b17)
1DDE0797 TEST EDX, EDX (85d2)
1DDE0799 JZ 0x1dde07d8 (74 3d)
1DDE079B >MOV EDX, [EDX] (8b12)
1DDE079D MOV [EAX], EDX (8910)
1DDE079F MOV DWORD [ESP+0xc], 0x0 (c74424 0c 00000000)
1DDE07A7 MOV DWORD [ESP+0x8], 0x800 (c74424 08 00080000)
______________________________________________________________________
Registers
______________________________________________________________________
EAX : 20037A34 EBX : 2005514C ECX : 00000001 EDX : 00000038
ESI : 20037A28 EDI : 0012FC8C
ESP : 0012FC30 EBP : 0012FC68 EIP : 1DDE079B EFLG : 00010202
CS : 005B CSLIM: FFFFFFFF SS : 0053 SSLIM: FFFFFFFF
EAX : read/write memory allocated by LIBCN0
EBX : read/write memory allocated by LIBCN0
ECX : not a valid address
EDX : not a valid address
ESI : read/write memory allocated by LIBCN0
EDI : read/write memory on this thread's stack
______________________________________________________________________
Stack Info for Thread 01
______________________________________________________________________
Size Base ESP Max Top
00100000 00130000 -> 0012FC30 -> 0012D000 -> 00030000
______________________________________________________________________
Call Stack
______________________________________________________________________
EBP Address Module Obj:Offset Nearest Public Symbol
-------- --------- -------- ------------- -----------------------
Trap -> 1DDE079B PTHR01 0001:0000079B my_os2cond.c#73 _pthread_cond_init + 6B 0001:00000730 (.\pthread-os2-0.2.5\src\my_os2cond.c)
Offset Name Type Hex Value
8 cond pointer to type 0x202 20037A28
12 attr pointer to type 0x208 12FC8C
0012FC68 1D41537C GLIB20 0001:000B537C
0012FC98 1D41575A GLIB20 0001:000B575A
0012FCA8 1D364322 GLIB20 0001:00004322
0012FCC8 1D36435B GLIB20 0001:0000435B
0012FCE8 1D3BB06D GLIB20 0001:0005B06D
0012FD28 0001169F CREATERE 0001:0000169F
0012FF40 00010047 CREATERE 0001:00000047
0012FF64 1E19F621 LIBCX0 0001:0000F621 ___init_app + 11 0001:0000F610 (main.obj)
0012FFE0 1E05384B LIBCN0 0001:0003384B appinit.s#16 ___init_app + B 0001:00033840 (appinit.obj)
However, the garbage might accidentally point to some valid memory, and touching it can cause data corruption with unpredictable results. This needs a proper fix for backward compatibility.
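For reference, this is the call sequence the current pthread_cond_init relies on, as a minimal sketch using only standard POSIX calls. The helper name make_cond is hypothetical. It only helps when pthread_condattr_init really initializes the attribute object, which is exactly what apps built against the old no-op version do not get.

```c
#include <assert.h>
#include <pthread.h>

/* Returns 0 on success, or the first error code reported by pthread. */
int make_cond(pthread_cond_t *cond)
{
    pthread_condattr_t attr;
    int rc;

    rc = pthread_condattr_init(&attr);   /* must really initialize attr */
    if (rc != 0)
        return rc;

    rc = pthread_cond_init(cond, &attr); /* reads attr, so it must be valid */
    pthread_condattr_destroy(&attr);     /* attr is no longer needed */
    return rc;
}
```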
Sometimes a thread exits after map->done was checked (and found to be false) but before DosWaitThread() is called. As a result, DosWaitThread() fails because the thread does not exist anymore, and pthread_join() returns ESRCH.
Here is the patch.
0001-double-check-map-done-when-joining-thread.patch.txt
From 73b73e781ee079a6d0cae568cfaa63230ef5360c Mon Sep 17 00:00:00 2001
From: KO Myung-Hun <[email protected]>
Date: Sun, 19 Jul 2020 23:07:43 +0900
Subject: [PATCH] double check map->done when joining thread
Sometimes a thread exit after checking map->done which was false and
before DosWaitThread(). As a result, DosWaitThread() fails because the
thread does not exist anymore. Then, pthread_join() returns ESRCH.
However, this is wrong! Because the thread ended already without any
problems. Instead, check map->done once more to check the thread ended
already.
git-status fails on vlc because of this, complaining like this:
fatal: unable to join threaded lstat
---
src/my_os2thread.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/src/my_os2thread.c b/src/my_os2thread.c
index 427bc3a..d92fe20 100644
--- a/src/my_os2thread.c
+++ b/src/my_os2thread.c
@@ -301,8 +301,11 @@ int pthread_join( pthread_t thread, void **status)
// (by DosKillThread in LIBC which delivers Posix signals this way)
if (rc == ERROR_INTERRUPT)
continue;
- if (rc == ERROR_INVALID_THREADID)
+ if (rc == ERROR_INVALID_THREADID) {
+ if (map->done)
+ break;
return ESRCH;
+ }
if (rc != NO_ERROR)
return EINVAL;
break;
--
2.22.0
The current pthread build crashes like this when running tst-flock2:
Running test tst-flock2...
child locked file
file locked by child
Killed by SIGSEGV
pid=0x19f5 ppid=0x19f4 tid=0x0003 slot=0x00ad pri=0x0200 mc=0x0001 ps=0x0010
D:\CODING\LIBCX\MASTER-BUILD\STAGE\BIN\TST-FLOCK2.EXE
TST-FLOC 0:00000510
cs:eip=0000:00010510 ss:esp=0000:00000000 ebp=00000053
ds=0000 es=0000 fs=0000 gs=0000 efl=00000000
eax=0246fdb4 ebx=1ffc9d7c ecx=0246fdc8 edx=0246fdec edi=00000000 esi=0246ffd4
Creating 19F5_03.TRP
Exit code = 127
Elapsed time = 3.440000s
ERROR: Executing "D:\Coding\libcx\env.cmd" failed with code 127.
This surely did not happen before, so some recent change must be the culprit.
There is a pthread key destructor implementation in c485249 (by @komh), but it is not thread-safe: if two threads attempt to create keys at the same time, the behavior is undefined (data corruption most likely).
Needed for bitwiseworks/qtwebengine-chromium-os2#3 as Chromium may use it from multiple threads (e.g. at startup time, see its thread_local_storage.cc).
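One way to make key creation safe is to serialize access to the key table, sketched below. Everything here is hypothetical (key_table, MAX_KEYS, key_slot_alloc) and does not reflect the actual layout in c485249; a pthread mutex is used only so the sketch builds anywhere, whereas inside the library itself a LIBC or OS/2 lock would take its place.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_KEYS 64                      /* hypothetical table size */

typedef void (*key_dtor_t)(void *);

static struct {
    int        used;
    key_dtor_t dtor;
} key_table[MAX_KEYS];

static pthread_mutex_t key_table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical allocator: returns a slot index, or -1 if the table is full. */
int key_slot_alloc(key_dtor_t dtor)
{
    int slot = -1;

    pthread_mutex_lock(&key_table_lock);
    for (int i = 0; i < MAX_KEYS; i++) {
        if (!key_table[i].used) {
            /* Claim the slot while holding the lock, so two threads
             * can never be handed the same index. */
            key_table[i].used = 1;
            key_table[i].dtor = dtor;
            slot = i;
            break;
        }
    }
    pthread_mutex_unlock(&key_table_lock);
    return slot;
}
```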
Maybe I am overlooking something but these functions:
"pthread_create"
"pthread_self"
(possibly others, I have not checked all)
dynamically allocate memory, and I can see no function or cleanup routine that would free that memory under any condition (except when an error occurs).
Of course, the user of the library can call "free" on the returned pointers (at the places he sees fit) but I doubt that this matches the intended design of the library.
"pthread_attr_init" on the other hand has a corresponding counterpart "pthread_attr_destroy".
There is a pthread key destructor implementation in c485249 (by @komh) but it only works for threads started with pthread_create and not for the ones started with LIBC beginthread (and also not for the main thread). This is not what a POSIX app expects.
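The behavior a POSIX app relies on can be sketched with standard calls only: a key's destructor runs when a thread exits while holding a non-NULL value for that key. The names below (demo_key, run_destructor_demo and so on) are hypothetical, and the sketch uses pthread_create, i.e. the case that already works; the threads started by other means are the ones that currently miss this step.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_key_t demo_key;
static int destructor_calls;   /* written by the exiting thread, read after join */
static int token;              /* any non-NULL per-thread value will do */

static void demo_destructor(void *value)
{
    (void)value;
    destructor_calls++;
}

static void *worker(void *arg)
{
    (void)arg;
    /* A non-NULL value makes the destructor run when this thread exits. */
    pthread_setspecific(demo_key, &token);
    return NULL;
}

/* Returns how many times the destructor ran for one worker, or -1 on error. */
int run_destructor_demo(void)
{
    pthread_t t;

    destructor_calls = 0;
    if (pthread_key_create(&demo_key, demo_destructor) != 0)
        return -1;
    if (pthread_create(&t, NULL, worker, NULL) != 0)
        return -1;
    if (pthread_join(t, NULL) != 0)   /* the thread has fully terminated here */
        return -1;
    pthread_key_delete(demo_key);
    return destructor_calls;
}
```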
Needed for bitwiseworks/qtwebengine-chromium-os2#3.
We need this feature in the glib2 port, at least in the update to version 2.44.
This is also a part of #10; even the win32 pthread doesn't support it right now, which is a bit strange IMHO.
Continuation from bitwiseworks/qtwebengine-chromium-os2#32 (comment) .
After updating pthread to 0.2.3-1, yum crashes every time with a libc panic "fmutex deadlock: Recursive mutex!" and a trp-file is created. It is not even possible to install the debuginfo-packages you requested, and neither does (for example) "yum downgrade pthread" work. Only after succeeding in downgrading pthread to 0.2.2-1 (not in an officially approved way) did yum again work normally.
Starting situation is with libc 0.1.6-1 , libcx 0.6.9-2, pthread 0.2.2-1, gettext 0.19.8.1-3, rpm 4.13.0-19 and python 2.7.6-25. Debuginfo packages are installed for libc 0.1.7-1, pthread 0.2.3-1, and for the mentioned versions of the other packages (so only libc and pthread have debuginfo packages installed for their updated versions).
Executing "yum update pthread" seems to work normally and updates both pthread to 0.2.3-1 and also libc to 0.1.7-1. Then reboot. Next try to execute "yum update libcx". That proceeds normally up to the final question "Is this ok [y/N]:". Answering Y produces the line "Downloading Packages" and next the libc panic occurs and the trp-file is created.
The full libc panic is copied to the beginning of the trp-file.
Given the snippet below:
#include <pthread.h>

/* Note: foo_mutex is never explicitly initialized. */
static pthread_mutex_t foo_mutex;

int main(void)
{
    pthread_mutex_lock(&foo_mutex);
    /* Do work. */
    pthread_mutex_unlock(&foo_mutex);
    return 0;
}
it crashes on our pthread implementation while it works on *nix and Mac.
The POSIX man page says to initialise the struct first with either:
static pthread_mutex_t foo_mutex = PTHREAD_MUTEX_INITIALIZER;
or
pthread_mutex_init(&foo_mutex, NULL);
but some Unix developers do not seem to care, and this then gives the nice crashes for us.
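Both initialization forms mentioned above can be sketched as follows, using only standard POSIX calls. The function names guarded_increment and dynamic_mutex_roundtrip are hypothetical and exist just to make the two variants callable.

```c
#include <assert.h>
#include <pthread.h>

/* Static initialization: valid before any thread touches the mutex. */
static pthread_mutex_t foo_mutex = PTHREAD_MUTEX_INITIALIZER;
static int counter;

/* Returns the counter value after one guarded increment, or -1 on error. */
int guarded_increment(void)
{
    int value;

    if (pthread_mutex_lock(&foo_mutex) != 0)
        return -1;
    value = ++counter;
    pthread_mutex_unlock(&foo_mutex);
    return value;
}

/* Dynamic initialization: init, use, then destroy. Returns 0 on success. */
int dynamic_mutex_roundtrip(void)
{
    pthread_mutex_t m;
    int rc = pthread_mutex_init(&m, NULL);

    if (rc != 0)
        return rc;
    pthread_mutex_lock(&m);
    /* Do work. */
    pthread_mutex_unlock(&m);
    return pthread_mutex_destroy(&m);
}
```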
Currently, the pthread_mutex API implementation internally uses OS/2 mutexes (DosCreateMutexSem). This is not a good idea because Linux applications may create LOTS of pthread mutexes (one example is bitwiseworks/qtwebengine-chromium-os2#31) and on OS/2 there may be only 65512 of them per process (due to kernel 16-bitness). Besides that, the mutex is a rather expensive kernel primitive.
I will replace OS/2 mutexes with LIBC _fmutex primitives because they are in fact cheaper: they only create an OS/2 kernel primitive when there is a definite need for it, that is, when two threads happen to request (lock) the mutex at the very same time. In this case, the thread which is a bit slower will create an OS/2 event semaphore (which is also cheaper than an OS/2 mutex) and wait on it until the first thread releases the lock. If only one thread locks and unlocks the mutex at any given time (with no overlap with other threads wanting to do the same), then the semaphore is not created and locking is guarded by an atomic CPU operation.
Note that the original pthreads-win32 code our implementation is based upon uses exactly the same optimization: an atomic flag variable and a Win32 event object. However, when @ydario imported it 10 years ago (see e4528d9), he decided to go with an OS/2 mutex instead, for simplicity, I guess. But now it has backfired on us.
As an alternative, I could reimport it now and implement it using an atomic flag and an OS/2 event semaphore as well. This would also bring things like PTHREAD_MUTEX_ROBUST attr support (and all the other "current" pthread stuff) but it looks like a lot of work for now, given that I'm totally occupied with Chromium. So I will go with _fmutex for now and postpone the task of resyncing for later.
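The atomic fast path described above can be illustrated with a portable sketch. This is not the _fmutex or pthreads-win32 code; it is the general "atomic counter plus semaphore" pattern written with C11 atomics and a POSIX semaphore, and all names (fast_mutex, run_fast_mutex_demo) are hypothetical. For brevity the semaphore is created eagerly here, whereas _fmutex is described above as creating its event semaphore only on first contention.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdatomic.h>

typedef struct {
    atomic_int count;   /* number of threads holding or waiting for the lock */
    sem_t      wake;    /* only touched when two threads overlap */
} fast_mutex;

int fast_mutex_init(fast_mutex *m)
{
    atomic_init(&m->count, 0);
    return sem_init(&m->wake, 0, 0);
}

void fast_mutex_lock(fast_mutex *m)
{
    /* Fast path: a previous count of 0 means nobody held the lock. */
    if (atomic_fetch_add(&m->count, 1) > 0) {
        /* Slow path: sleep until the owner posts; retry on signal interrupts. */
        while (sem_wait(&m->wake) == -1 && errno == EINTR)
            ;
    }
}

void fast_mutex_unlock(fast_mutex *m)
{
    /* A previous count above 1 means another thread waits or is about to. */
    if (atomic_fetch_sub(&m->count, 1) > 1)
        sem_post(&m->wake);
}

static fast_mutex demo_lock;
static long demo_total;

static void *adder(void *arg)
{
    (void)arg;
    for (int i = 0; i < 10000; i++) {
        fast_mutex_lock(&demo_lock);
        demo_total++;
        fast_mutex_unlock(&demo_lock);
    }
    return NULL;
}

/* Runs four threads doing 10000 guarded increments each; -1 on setup error. */
long run_fast_mutex_demo(void)
{
    pthread_t t[4];

    demo_total = 0;
    if (fast_mutex_init(&demo_lock) != 0)
        return -1;
    for (int i = 0; i < 4; i++)
        if (pthread_create(&t[i], NULL, adder, NULL) != 0)
            return -1;
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);
    sem_destroy(&demo_lock.wake);
    return demo_total;
}
```

In the uncontended case lock and unlock are one atomic operation each, which is why a large number of such mutexes is far cheaper than the same number of kernel mutex objects.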
pthread.h defines pthread_key_t as a pointer to an opaque pthread_key_t_ struct. While POSIX does not require it to be an integer, it looks like every pthread implementation out there defines it like that. Having a pointer instead breaks many POSIX apps, which often initialise pthread_key_t values with 0 or -1 without a cast.
We should be POSIX friendly and use an integer too. Needed for bitwiseworks/qtwebengine-chromium-os2#3.
There is TlsFreeThreadLocalMemory (which I believe has been there since the MySQL days), which is intended to free all global resources (like the OS/2 TLS slot) allocated on behalf of the pthread_key implementation. However, this function is currently never called, on the assumption that the resources are freed anyway upon application termination (which is mostly true).
Still, it makes sense to call TlsFreeThreadLocalMemory explicitly from pthread's DLL_InitTerm for the sake of proper resource cleanup. This routine should also destroy the key with index 0 (which pthread itself creates for its pthread_self() implementation).
Not very urgent so postponed.