Reduces severe performance degradation due to false-sharing. Note that this
does not account for hardware which can perform adjacent cacheline prefetch.
[mjg: massaged the commit message and the patch to use aligned_alloc
instead of malloc]
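A minimal sketch of the aligned_alloc approach mentioned in the note above, not the actual libthr code. The 64-byte line size, the struct layout, and all demo_* names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed cache line size; real hardware may differ. */
#define	DEMO_CACHE_LINE_SIZE	64

/* Hypothetical lock object; this is not the libthr layout. */
struct demo_lock {
	int	state;
	int	readers;
};

/* Round n up to a multiple of the assumed cache line size. */
static size_t
demo_round_up_line(size_t n)
{
	return ((n + DEMO_CACHE_LINE_SIZE - 1) / DEMO_CACHE_LINE_SIZE *
	    DEMO_CACHE_LINE_SIZE);
}

/*
 * Allocate a zeroed lock that starts on a cache line boundary and
 * occupies whole lines, so no other allocation shares its line.
 */
static struct demo_lock *
demo_lock_alloc(void)
{
	size_t size = demo_round_up_line(sizeof(struct demo_lock));
	struct demo_lock *l;

	/* aligned_alloc() wants size to be a multiple of the alignment. */
	assert(size % DEMO_CACHE_LINE_SIZE == 0);
	l = aligned_alloc(DEMO_CACHE_LINE_SIZE, size);
	if (l != NULL)
		memset(l, 0, size);
	return (l);
}
```

Rounding the size up as well as aligning the start keeps neighbouring allocations off the same line. As the message notes, this does not help when the hardware prefetches the adjacent line.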
PR: 272238
MFC after: 1 week
The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.
Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix
In libthr we use PAGE_SIZE when allocating memory with mmap and to check
that various structs will fit into a single page so we can use this
allocator for them.
Ask the kernel for the page size on init for use by the page allocator
and add a new machine-dependent macro to hold the smallest page size
the architecture supports, to check that the structure is small enough.
This allows us to use the same libthr on arm64 with either 4k or 16k
pages.
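The split between a compile-time minimum and a run-time page size could look roughly like the sketch below. DEMO_PAGE_SIZE_MIN, struct demo_obj, and the use of sysconf() are assumptions for illustration; the commit itself does not say which interface libthr uses to ask the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical stand-in for the machine-dependent minimum page size. */
#define	DEMO_PAGE_SIZE_MIN	4096

/* Hypothetical structure that the page allocator must be able to hold. */
struct demo_obj {
	char	pad[512];
};

/* Compile-time check against the smallest page size supported. */
_Static_assert(sizeof(struct demo_obj) <= DEMO_PAGE_SIZE_MIN,
    "demo_obj must fit into the smallest supported page");

/* Actual page size, filled in once at init time. */
static size_t demo_page_size;

static void
demo_page_init(void)
{
	long ps = sysconf(_SC_PAGESIZE);

	assert(ps > 0);
	demo_page_size = (size_t)ps;
}
```

The static assertion only needs the minimum, so the same binary stays correct whether the kernel reports 4k or 16k pages at run time.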
Reviewed by: kib, markj, imp
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34984
In some corner cases of static linking and unexpected library order
on the linker command line, a libc symbol might preempt the same libthr
symbol, in which case the libthr jump table points back to libc, causing
either infinite recursion or an infinite loop. Handle all such symbols by
using private libthr names for them, ensuring that the right pointers
are installed into the table.
In collaboration with: arichardson
PR: 239475
Tested by: pho
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D21088
This makes use of the C99 restrict keyword, and also adds some 'const's
to four threading functions: pthread_mutexattr_gettype(),
pthread_mutexattr_getprioceiling(), pthread_mutexattr_getprotocol(), and
pthread_mutex_getprioceiling(). The changes are in accordance with
POSIX/SUSv4-2018.
Hinted by: DragonFlyBSD
Relnotes: yes
MFC after: 1 month
Differential Revision: D16722
Mainly focus on files that use the BSD 2-Clause license. However, the
tool I was using misidentified many licenses, so this was mostly a
manual, error-prone task.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well-known
open source licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
supersede or replace the license texts.
This was prompted by a compiler warning about 'ret' shadowing
a local variable in the callers of the macro.
Reviewed by: kib
MFC after: 3 days
Sponsored by: Dell EMC
Differential Revision: https://reviews.freebsd.org/D10832
breaking the ABI. A special value is stored in the lock pointer to
indicate a shared lock, and an offline page in the shared memory is
allocated to store the actual lock.
Reviewed by: vangyzen (previous version)
Discussed with: deischen, emaste, jhb, rwatson,
Martin Simmons <martin@lispworks.com>
Tested by: pho
Sponsored by: The FreeBSD Foundation
a silly rwlock deadlock problem. The deadlock is caused by writer
waiters: if a thread has already locked a reader lock and wants to
acquire another reader lock, it will be blocked by writer waiters.
However, we had already fixed it years ago.
functions set or get the pthread_rwlock type. The currently supported
types are:
PTHREAD_RWLOCK_PREFER_READER_NP,
PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP,
PTHREAD_RWLOCK_PREFER_WRITER_NP.
The default is PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP, which
maintains binary compatibility with old code.
same null value, so the code cannot distinguish between them. To
fix the problem, a destroyed object is now assigned a non-null
value, and it will be rejected by some pthread functions.
PTHREAD_ADAPTIVE_MUTEX_INITIALIZER_NP is changed to number 1, so that
an adaptive mutex can be statically initialized correctly.
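The idea of giving a destroyed object its own non-null value can be sketched as follows. This is a toy model, not the libthr implementation: the handle type, the demo_* names, and the sentinel value 2 are all assumptions chosen for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical backing object behind a pointer-sized handle. */
struct demo_impl {
	int	locked;
};
typedef struct demo_impl *demo_handle_t;

/* Static initializer: null, allocated lazily on first use. */
#define	DEMO_INITIALIZER	((demo_handle_t)NULL)
/* Distinct non-null value marking a destroyed object (assumed value). */
#define	DEMO_DESTROYED		((demo_handle_t)2)

static int
demo_lock(demo_handle_t *h)
{
	assert(h != NULL);
	if (*h == DEMO_DESTROYED)
		return (EINVAL);	/* destroyed objects are rejected */
	if (*h == DEMO_INITIALIZER) {
		/* First use of a statically initialized object. */
		*h = calloc(1, sizeof(**h));
		if (*h == NULL)
			return (ENOMEM);
	}
	if ((*h)->locked)
		return (EBUSY);
	(*h)->locked = 1;
	return (0);
}

static int
demo_destroy(demo_handle_t *h)
{
	assert(h != NULL);
	if (*h == DEMO_DESTROYED)
		return (EINVAL);
	if (*h != DEMO_INITIALIZER)
		free(*h);
	*h = DEMO_DESTROYED;
	return (0);
}
```

Because the destroyed marker differs from the null initializer, a later demo_lock() on a destroyed handle fails with EINVAL instead of silently re-initializing it.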
The most notable is that it is not bumped in rwlock_rdlock_common() when
the hard path (__thr_rwlock_rdlock()) returns successfully.
This can lead to deadlocks in libthr when rwlock recursion in read mode
happens.
Fix the affected parts by correctly handling rdlock_count.
PR: threads/136345
Reported by: rink
Tested by: rink
Reviewed by: jeff
Approved by: re (kib)
MFC: 2 weeks
1. fast simple type mutex.
2. __thread tls works.
3. asynchronous cancellation works (using a signal).
4. thread synchronization is fully based on umtx; mainly, condition
variables and other synchronization objects were rewritten to use
umtx directly. Those objects can be shared between processes via
shared memory, but that requires an ABI change, which has not
happened yet.
5. default stack size is increased to 1M on 32-bit platforms, 2M on
64-bit platforms.
As a result, some mysql super-smack benchmarks show that performance is
massively improved.
Okayed by: jeff, mtm, rwatson, scottl
o In the rwlock code: move a duplicated check inside an if...else to after
the if...else clause.
o When initializing a static rwlock, move the initialization check
inside the lock.
o In thr_setschedparam.c: When breaking out of the trylock...retry if busy
loop, make sure to reset the mtx pointer to null if the mutex is no longer
in a queue.
a PTHREAD_RWLOCK_INITIALIZER to do for rwlocks what
a similarly named symbol does for statically initialized mutexes.
This symbol was dropped in The Open Group Base Specifications Issue 6
and does not exist in IEEE Std 1003.1, 2003, but it should still be
supported for backwards compatibility.
Pointy hat: mtm
what do I get for my troubles? libc breaks, of course!
Reimplement a hack (in libthr) that allows libc to use
rwlocks without initializing them first. The hack was reimplemented
so that only a private libc version of the rwlock locking functions
initializes an uninitialized rwlock. The application version will
correctly fail.
a list in the thread structure to keep track of the locks and
how many times they have been locked. This list is checked
on every lock and unlock. The traversal through the list is
O(n). Most applications don't hold so many locks at once that
this will become a problem. However, if it does become a problem
it might be a good idea to review this once libthr is
off probation and in the optimization cycle.
This fixes:
o deadlock when a thread tries to recursively acquire a
read lock when a writer is waiting on the lock.
o a thread could previously successfully unlock a lock it did not own
o deadlock when a thread tries to acquire a write lock on
a lock it already owns for reading or writing [ this is admittedly
not required by POSIX, but is nice to have ]
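The per-thread bookkeeping described above can be sketched as a singly linked list of (lock, count) records. This is an illustrative model only; the structure and the demo_* function names are hypothetical, and the real libthr data structures may differ.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* One record per distinct lock the thread currently holds. */
struct demo_held {
	const void		*lock;	/* identity of the lock */
	int			 count;	/* times this thread locked it */
	struct demo_held	*next;
};

/* Hypothetical per-thread structure carrying the list head. */
struct demo_thread {
	struct demo_held	*held;
};

/* O(n) lookup, as in the description above. */
static struct demo_held *
demo_find(struct demo_thread *td, const void *lock)
{
	struct demo_held *h;

	for (h = td->held; h != NULL; h = h->next)
		if (h->lock == lock)
			return (h);
	return (NULL);
}

/* Record one more acquisition of lock by this thread. */
static int
demo_note_lock(struct demo_thread *td, const void *lock)
{
	struct demo_held *h = demo_find(td, lock);

	if (h != NULL) {
		h->count++;		/* recursive acquisition */
		return (0);
	}
	h = malloc(sizeof(*h));
	if (h == NULL)
		return (ENOMEM);
	h->lock = lock;
	h->count = 1;
	h->next = td->held;
	td->held = h;
	return (0);
}

/* Record a release; EPERM if this thread does not hold the lock. */
static int
demo_note_unlock(struct demo_thread *td, const void *lock)
{
	struct demo_held **hp, *h;

	for (hp = &td->held; (h = *hp) != NULL; hp = &h->next) {
		if (h->lock != lock)
			continue;
		if (--h->count == 0) {
			*hp = h->next;	/* last release: drop the record */
			free(h);
		}
		return (0);
	}
	return (EPERM);
}
```

A lookup before each lock lets the implementation detect a recursive read acquisition, and a lookup before each unlock lets it refuse to release a lock the thread does not own.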
code and simply return EINVAL (which is allowed by the standard) in
all those pthread functions that previously initialized it.
o Refactor the pthread_rwlock_[try]rdlock() and pthread_rwlock_[try]wrlock()
functions. They are now completely condensed into rwlock_rdlock_common()
and rwlock_wrlock_common(), respectively.
o If the application tries to destroy an rwlock that is currently
held by a thread, return EBUSY where it previously went ahead and
freed all resources associated with the lock.
o Refactor _pthread_rwlock_init() to make it look (relatively) sane.
o When obtaining a read lock on an rwlock, the check for whether it
would exceed the maximum allowed read locks should happen *before*
we obtain the lock.
o The pthread_rwlock_* functions shall *never* return EINTR, so make
sure to requeue/resuspend the thread if it encounters such an error.
o Make a note that pthread_rwlock_unlock() needs to ensure it holds a
lock on an rwlock it tries to unlock. It will be implemented in a
separate commit because it requires some additional rwlock infrastructure.