Commit graph

91 commits

Author SHA1 Message Date
Andrew Turner dd4155bec7 rtld: Add arch_digest_dynamic
This will be used to handle the DT_AARCH64_VARIANT_PCS tag.

Reviewed by:	kib
Sponsored by:	Arm Ltd
Differential Revision:	https://reviews.freebsd.org/D45117
2024-05-17 09:37:12 +00:00
Andrew Turner 06db20ffec rtld: Add MD_OBJ_ENTRY to extend Struct_Obj_Entry
Add a macro the architectures can use to add per-arch fields to
Struct_Obj_Entry.

Reviewed by:	kib
Sponsored by:	Arm Ltd
Differential Revision:	https://reviews.freebsd.org/D45116
2024-05-17 09:36:08 +00:00
Andrew Turner d8925a5f42 Support BTI in rtld
Read the elf note to decide when to set the guard page on arm64.

Reviewed by:	kib
Sponsored by:	Arm Ltd
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D39452
2024-04-12 14:30:44 +00:00
Stephen J. Kiernan 95335dd3c1 rtld: introduce STATIC_TLS_EXTRA
The new STATIC_TLS_EXTRA variable provides a means for applications
to increases the size of the extra static TLS space allocated by
rtld beyond the default of '128'. This extra static TLS space is used
for objects loaded with dlopen.

The value specified in the variable must be no less than the default
value and no greater than the maximum allowed value for size_t type.

If an invalid value is specified, rtld will ignore it and just use
the default value.

The rtld(1) man page is updated to document this new option.

Obtained from:  Juniper Networks, Inc.
Differential Revision:  https://reviews.freebsd.org/D42025
2023-10-30 13:42:05 -04:00
Warner Losh d0b2dbfa0e Remove $FreeBSD$: one-line sh pattern
Remove /^\s*#[#!]?\s*\$FreeBSD\$.*$\n/
2023-08-16 11:55:03 -06:00
Warner Losh 42b388439b Remove $FreeBSD$: one-line .h pattern
Remove /^\s*\*+\s*\$FreeBSD\$.*$\n/
2023-08-16 11:54:23 -06:00
Warner Losh b3e7694832 Remove $FreeBSD$: two-line .h pattern
Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/
2023-08-16 11:54:16 -06:00
Dmitry Chagin e541cf8316 rtld: Annotate .rtld_start on i386
Add a stop indicator to rtld_start to satisfy unwinders:
The right unwinding stop indicator should be CFI-undefined PC.
https://dwarfstd.org/doc/Dwarf3.pdf - page 118:
If a Return Address register is defined in the virtual unwind table,
and its rule is undefined (for example, by DW_CFA_undefined), then
there is no return address and no call address, and the virtual
unwind of stack activations is complete.

That is allows gdb and libunwind successfully stop when unwinding stack
from global constructors and destructors.

Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D40949
2023-07-11 15:10:32 +03:00
Dmitry Chagin 86c63225ea rtld: Microoptimize rtld_start on i386
Initial stack pointer is preserved in calle-saved %esi,
use it bellow to pass initial stack pointer to _rtld().

Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D40950
2023-07-11 15:10:08 +03:00
Konstantin Belousov 283a4f4097 rtld: rename tls_done to tls_static
The meaning of the flag is that static TLS allocation was done.

Taken from NetBSD Joerg Sonnenberger change for src/libexec/ld.elf_so/tls.c
rev. 1.18.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2023-06-05 22:33:17 +03:00
Warner Losh 4d846d260e spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD
The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with:		pfg
MFC After:		3 days
Sponsored by:		Netflix
2023-05-12 10:44:03 -06:00
John Baldwin b069d3e019 rtld: Revert "When loading dso without PT_GNU_STACK phdr, only call"
After the removal of ia64 and sparc64, all current architectures
support executable stacks at an architectural level.

This reverts commit 1290d38ac5.

Reviewed by:	kib
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D37904
2023-01-04 14:55:00 -08:00
John Baldwin 8bcdb144eb TLS: Use <machine/tls.h> for libc and rtld.
- Include <machine/tls.h> in MD rtld_machdep.h headers.

- Remove local definitions of TLS_* constants from rtld_machdep.h
  headers and libc using the values from <machine/tls.h> instead.

- Use _tcb_set() instead of inlined versions in MD
  allocate_initial_tls() routines in rtld.  The one exception is amd64
  whose _tcb_set() invokes the amd64_set_fsbase ifunc.  rtld cannot
  use ifuncs, so amd64 inlines the logic to optionally write to fsbase
  directly.

- Use _tcb_set() instead of _set_tp() in libc.

- Use '&_tcb_get()->tcb_dtv' instead of _get_tp() in both rtld and libc.
  This permits removing _get_tp.c from rtld.

- Use TLS_TCB_SIZE and TLS_TCB_ALIGN with allocate_tls() in MD
  allocate_initial_tls() routines in rtld.

Reviewed by:	kib, jrtc27 (earlier version)
Differential Revision:	https://reviews.freebsd.org/D33353
2021-12-09 13:23:05 -08:00
Fangrui Song 8f63fa78e8 rtld: Remove calculate_tls_end
Variant I architectures use off and Variant II ones use size + off.
Define TLS_VARIANT_I/TLS_VARIANT_II symbols similarly to how libc
handles it.

Reviewed by:	kib
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31539
Differential revision:	https://reviews.freebsd.org/D31541
2021-08-16 13:55:35 +03:00
Fangrui Song e6c7696203 rtld: Fix i386/amd64 TP offset when p_vaddr % p_align != 0
For a Variant II architecture, the TP offset of a TLS symbol is st_value -
tlsoffset + r_addend. tlsoffset is computed by either calculate_tls_offset
or calculate_first_tls_offset.

The return value of calculate_first_tls_offset is the smallest integer
satisfying res >= size and (-res) % p_align = p_vaddr % p_align
(= p_offset % p_align).  (The formula is a bit contrived. The basic idea
is to subtract the minimum integer from size + align - 1 so that the result
ihas the expected remainder.)

Reviewed by:	kib
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31538
Differential revision:	https://reviews.freebsd.org/D31541
2021-08-16 13:55:34 +03:00
Konstantin Belousov e8b9c508b7 rtld: use _get_tp() in __tls_get_addr()
This eliminates some non-trivial amount of code duplication, where done.
Only x86 and mips are handled right now.

Tested by:      bdragon (powerpc), mhorne (riscv)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D29623
2021-04-09 23:46:24 +03:00
Konstantin Belousov 99c2ce7ef1 rtld: define TLS_DTV_OFFSET on all architectures
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D29623
2021-04-09 23:46:24 +03:00
Konstantin Belousov f61ecf60cf rtld/x86/reloc.c: style
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D29623
2021-04-09 23:46:23 +03:00
Marius Strobl b58c853edf rtld-elf(1): remove obsolete pre_init() hook
It's no longer used since 600ee699ed
and r358358 respectively.
2020-12-25 19:47:46 +01:00
Konstantin Belousov e5c3405ce8 Align initial-exec TLS segments to the p_vaddr % align.
This is continuation of D21163/r359634, which handled the alignment
for global mode.

Non-x86 arches are not handled, maintainers are welcomed.

Tested by:	emaste
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D24366
2020-04-19 09:28:59 +00:00
Konstantin Belousov e3741c01c6 r357895: fix typo in the relocation name for i386 IRELATIVE.
Reported by: antoine
Sponsored by:	The FreeBSD Foundation
MFC after:	6 days
2020-02-14 12:59:27 +00:00
Konstantin Belousov c5ca0d1132 Handle non-plt IRELATIVE relocations, at least for x86.
lld 10.0 seems to generate this relocation for rdtsc_mb() ifunc in our libc.

Reported, reviewed, and tested by:	dim (amd64, previous version)
Discussed with:	emaste
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D23652
2020-02-13 23:42:09 +00:00
Konstantin Belousov 7e3300e505 rtld: clean up Makefile.
Move all MD statements into $MACHINE_ARCH/Makefile.inc.
Unconditionally apply version script to rtld, the interpreter is not
functional without it for long time.

Reviewed by:	brooks, emaste
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D23083
2020-01-11 09:18:58 +00:00
Konstantin Belousov 4105901933 rtld-elf: Remove x86 elf_rtld.x linker scripts.
First, amd64 version of the script cannot work at least due to the
wrong architecture specification.  Second, kernel can activate shared
objects for long time, due to PIE support.

It seems the intent was to allow ld-elf.so.1 to be build and used as
an executable.  Since we have direct exec mode implemented for dso
ld-elf.so.1, the non-functional and commented out scripts can be
finally removed.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2019-08-04 21:43:34 +00:00
Eric van Gyzen ac818ca644 rtld: pacify -Wmaybe-uninitialized from gcc6
Sponsored by:	Dell EMC Isilon
2019-02-01 23:16:59 +00:00
Michal Meloun 4849c3a570 Improve R_AARCH64_TLSDESC relocation.
The original code did not support dynamically loaded libraries and used
suboptimal access to TLS variables.
New implementation removes lazy resolving of TLS relocation - due to flaw
in TLSDESC design is impossible to switch resolver function at runtime
without expensive locking.

Due to this, 3 specialized resolvers are implemented:
 - load time resolver for TLS relocation from libraries loaded with main
   executable (thus with known TLS offset).
 - resolver for undefined thread weak symbols.
 - slower lazy resolver for dynamically loaded libraries with fast path for
   already resolved symbols.

PR:		228892, 232149, 233204, 232311
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D18417
2018-12-15 10:38:07 +00:00
Alex Richardson 903e0ffd07 rtld-elf: compile with WANRS=4 warnings other than -Wcast-align
Reviewed By:	kib
Approved By:	brooks (mentor)
Differential Revision: https://reviews.freebsd.org/D17153
2018-10-29 21:08:19 +00:00
Marius Strobl 41fc6f680b o Let rtld(1) set up psABI user trap handlers prior to executing the
objects' init functions instead of doing the setup via a constructor
  in libc as the init functions may already depend on these handlers
  to be in place. This gets us rid of:
  - the undefined order in which libc constructors as __guard_setup()
    and jemalloc_constructor() are executed WRT __sparc_utrap_setup(),
  - the requirement to link libc last so __sparc_utrap_setup() gets
    called prior to constructors in other libraries (see r122883).
  For static binaries, crt1.o still sets up the user trap handlers.
o Move misplaced prototypes for MD functions in to the MD prototype
  section of rtld.h.
o Sprinkle nitems().
2018-02-03 23:14:11 +00:00
Pedro F. Giffuni e6209940de libexec: adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

No functional change intended.
2017-11-27 15:25:02 +00:00
Konstantin Belousov e35ddbe448 Implement LD_BIND_NOT knob for rtld.
From the manpage:
When set to a nonempty string, prevents modifications of the PLT slots
when doing bindings.  As result, each call of the PLT-resolved
function is resolved.  In combination with debug output, this provides
complete account of all bind actions at runtime.

Same feature exists on Linux and Solaris.

Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2017-03-15 21:11:57 +00:00
Konstantin Belousov d27078f990 Adjust r308689 to make rtld compilable with either in-tree or
(hopefully) stock gcc 4.2.1 on i386 and other arches.

In particular:
- Do not use %ebx in the asm constraints on i386, since rtld is
  compiled with -fPIC and gcc cannot handle GOT-base register reload
  (clang and newer gcc can).
- Avoid direct use of [static N] construct in the function
  declaration/definion.  In-tree gcc was patched to support this, but
  stock 4.2.1 cannot handle the feature.

Requested by:	bde
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-11-21 14:13:57 +00:00
Konstantin Belousov 4352999e0e Pass CPUID[1] %edx (cpu_feature), %ecx (cpu_feature2) and
CPUID[7].%ebx (cpu_stdext_feature), %ecx (cpu_stdext_feature2) to the
ifunc resolvers on x86.

It is much more clean to use CPUID instruction in usermode to retrieve
this information than to pass AT_HWCAP aux vector from kernel, on
x86.  Still, the change does allow for use of AT_HWCAP on arches where it is
needed, by passing aux array to ifunc_init() initializer which should
prepare arguments for ifunc resolvers.

Current signature for resolvers on x86 is
	func_t iresolve(uint32_t cpu_feature, uint32_t cpu_feature2,
	    uint32_t cpu_stdext_feature, uint32_t cpu_stdext_feature2);
where arguments have identical meaning as the kernel variables of the
same name.  The ABIs allow to use resolvers with the void or shortened
list of arguments.

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D8448
2016-11-15 09:43:26 +00:00
Konstantin Belousov 9fee0541f2 Do not call callbacks for dl_iterate_phdr(3) with the rtld bind and
phdr locks locked.  This allows to call rtld services from the
callback, which is only reasonable for dlopen(path, RTLD_NOLOAD) to
test existence of the library in the image, and for dlsym().  The
later might still be not quite safe, due to the lazy resolution of
filters.

To allow dropping the locks around iteration in dl_iterate_phdr(3), we
insert markers to track current position between relocks.  The global
objects list is converted to tailq and all iterators skip markers,
globallist_next() and globallist_curr() helpers are added.

Reported and tested by:	davide
Reviewed by:	kan
Sponsored by:	The FreeBSD Foundation
MFC after:	3 weeks
2016-01-20 07:21:33 +00:00
Warner Losh 8fd53f4577 Create a generalized exec hook that different architectures can hook
into if they need to, but default to no action.

Differential Review: https://reviews.freebsd.org/D2718
2016-01-03 04:32:02 +00:00
Eric van Gyzen ddab052725 Disable SSE in libthr
Clang emits SSE instructions on amd64 in the common path of
pthread_mutex_unlock.  If the thread does not otherwise use SSE,
this usage incurs a context-switch of the FPU/SSE state, which
reduces the performance of multiple real-world applications by a
non-trivial amount (3-5% in one application).

Instead of this change, I experimented with eagerly switching the
FPU state at context-switch time.  This did not help.  Most of the
cost seems to be in the read/write of memory--as kib@ stated--and
not in the #NM handling.  I tested on machines with and without
XSAVEOPT.

One counter-argument to this change is that most applications already
use SIMD, and the number of applications and amount of SIMD usage
are only increasing.  This is absolutely true.  I agree that--in
general and in principle--this change is in the wrong direction.
However, there are applications that do not use enough SSE to offset
the extra context-switch cost.  SSE does not provide a clear benefit
in the current libthr code with the current compiler, but it does
provide a clear loss in some cases.  Therefore, disabling SSE in
libthr is a non-loss for most, and a gain for some.

I refrained from disabling SSE in libc--as was suggested--because
I can't make the above argument for libc.  It provides a wide variety
of code; each case should be analyzed separately.

https://lists.freebsd.org/pipermail/freebsd-current/2015-March/055193.html

Suggestions from:	dim, jmg, rpaulo
Approved by:	kib (mentor)
MFC after:	2 weeks
Sponsored by:	Dell Inc.
2015-08-05 12:53:55 +00:00
Konstantin Belousov 0c4f9ecde3 Change compiler setting to make default visibility of the symbols for
rtld on x86 to be hidden.  This is a micro-optimization, which allows
intrinsic references inside rtld to be handled without indirection
through PLT.  The visibility of rtld symbols for other objects in the
symbol namespace is controlled by a version script.

Reviewed by:	kan, jilles
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2015-03-29 18:53:21 +00:00
Konstantin Belousov 74b0daf4f9 Optimize r270798, only do the second pass over non-plt relocations
when the first pass found IFUNCs.

Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-08-29 10:43:56 +00:00
Konstantin Belousov 14c3564759 IFUNC symbol type shall be processed for non-PLT relocations,
e.g. when a global variable is initialized with a pointer to ifunc.
Add symbol type check and call resolver for STT_GNU_IFUNC symbol types
when processing non-PLT relocations, but only after non-IFUNC
relocations are done.  The two-phase proceessing is required since
resolvers may reference other symbols, which must be ready to use when
resolver calls are done.

Restructure reloc_non_plt() on x86 to call find_symdef() and handle
IFUNC in single place.

For non-x86 reloc_non_plt(), check for call for IFUNC relocation and
do nothing, to avoid processing relocs twice.

PR:	193048
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-08-29 09:29:10 +00:00
Konstantin Belousov f62651920d Add GNU hash support for rtld.
Based on dragonflybsd support for GNU hash by John Marino <draco marino st>
Reviewed by:	kan
Tested by:	bapt
MFC after:	2 weeks
2012-04-30 13:31:10 +00:00
Konstantin Belousov 082f959ac8 Fix several problems with our ELF filters implementation.
Do not relocate twice an object which happens to be needed by loaded
binary (or dso) and some filtee opened due to symbol resolution when
relocating need objects.  Record the state of the relocation
processing in Obj_Entry and short-circuit relocate_objects() if
current object already processed.

Do not call constructors for filtees loaded during the early
relocation processing before image is initialized enough to run
user-provided code.  Filtees are loaded using dlopen_object(), which
normally performs relocation and initialization.  If filtee is
lazy-loaded during the relocation of dso needed by the main object,
dlopen_object() runs too earlier, when most runtime services are not
yet ready.

Postpone the constructors call to the time when main binary and
depended libraries constructors are run, passing the new flag
RTLD_LO_EARLY to dlopen_object().  Symbol lookups callers inform
symlook_* functions about early stage of initialization with
SYMLOOK_EARLY.  Pass flags through all functions participating in
object relocation.

Use the opportunity and fix flags argument to find_symdef() in
arch-specific reloc.c to use proper name SYMLOOK_IN_PLT instead of
true, which happen to have the same numeric value.

Reported and tested by:	theraven
Reviewed by:	kan
MFC after:	2 weeks
2012-03-20 13:20:49 +00:00
Konstantin Belousov 83aa9cc00c Add support for preinit, init and fini arrays. Some ABIs, in
particular on ARM, do require working init arrays.

Traditional FreeBSD crt1 calls _init and _fini of the binary, instead
of allowing runtime linker to arrange the calls.  This was probably
done to have the same crt code serve both statically and dynamically
linked binaries.  Since ABI mandates that first is called preinit
array functions, then init, and then init array functions, the init
have to be called from rtld now.

To provide binary compatibility to old FreeBSD crt1, which calls _init
itself, rtld only calls intializers and finalizers for main binary if
binary has a note indicating that new crt was used for linking.  Add
parsing of ELF notes to rtld, and cache p_osrel value since we parsed
it anyway.

The patch is inspired by init_array support for DragonflyBSD, written
by John Marino.

Reviewed by:	kan
Tested by:	andrew (arm, previous version), flo (sparc64, previous version)
MFC after:	3 weeks
2012-03-11 20:03:09 +00:00
Ed Schouten 581f58e7a3 Remove unneeded dtv variable.
It is only assigned and not used at all. The object files stay identical
when the variables are removed.

Approved by:	kib
2012-01-17 21:55:20 +00:00
Konstantin Belousov 5734c46c68 _rtld_bind() read-locks the bind lock, and possible plt resolution
from the dispatcher would also acquire bind lock in read mode, which
is the supported operation. plt is explicitely designed to allow safe
multithreaded updates, so the shared lock do not cause problems.

The error in r228435 is that it allows read lock acquisition after the
write lock for the bind block.  If we dlopened the shared object that
contains IRELATIVE or jump slot which target is STT_GNU_IFUNC, then
possible recursive plt resolve from the dispatcher would cause it.

Postpone the resolution for irelative/ifunc right before initializers
are called, and drop bind lock around calls to dispatcher.  Use
initlist to iterate over the objects instead of the ->next, due to
drop of the bind lock in iteration.

For i386/reloc.c:reloc_iresolve(), fix calculation of the dispatch
function address for dso, by taking into account possible non-zero
relocbase.

MFC after:	3 weeks
2011-12-14 16:47:53 +00:00
Konstantin Belousov 6be4b69715 Add support for STT_GNU_IFUNC and R_MACHINE_IRELATIVE GNU extensions to
rtld on 386 and amd64. This adds runtime bits neccessary for the use
of the dispatch functions from the dynamically-linked executables and
shared libraries.

To allow use of external references from the dispatch function, resolution
of the R_MACHINE_IRESOLVE relocations in PLT is postponed until GOT entries
for PLT are prepared, and normal resolution of the GOT entries is finished.
Similar to how it is done by GNU, IRELATIVE relocations are resolved in
advance, instead of normal lazy handling for PLT.

Move the init_pltgot() call before the relocations for the object are
processed.

MFC after:	3 weeks
2011-12-12 11:03:14 +00:00
Eitan Adler 36daf0495a - change "is is" to "is" or "it is"
- change "the the" to "the"

Approved by:	lstewart
Approved by:	sahil (mentor)
MFC after:	3 days
2011-10-16 14:30:28 +00:00
Konstantin Belousov ef9cbd91d0 Handle the R_386_TLS_TPOFF32 relocation, which is similar to R_386_TLS_TPOFF,
but with negative relocation value.

Found by:	mpfr test suite, pointed to by ale
Reviewed by:	kan
MFC after:	1 week
2011-10-08 12:42:19 +00:00
Konstantin Belousov cb38d4941c When loading dso without PT_GNU_STACK phdr, only call
__pthread_map_stacks_exec() on architectures that allow executable
stacks.

Reported and tested by:	marcel (ia64)
2011-01-25 21:12:31 +00:00
Konstantin Belousov 3ad6376e56 Add section .note.GNU-stack for assembly files used by 386 and amd64. 2011-01-07 16:07:05 +00:00
Dimitry Andric 9a17b89ccf Sort -mno-(mmx|3dnow|sse|sse2|sse3) options consistently throughout the
tree.

Submitted by:	arundel
2011-01-05 21:23:26 +00:00
Dimitry Andric e172464728 On amd64 and i386, tell the compiler to refrain from generating SSE,
3DNow, MMX and floating point instructions in rtld-elf.

Otherwise, _rtld_bind() (and whatever it calls) could possibly clobber
function arguments that are passed in SSE/3DNow/MMX/FP registers,
usually floating point values.  This can happen, for example, when clang
generates SSE code for memset() or memcpy() calls.

One symptom of this is sshd dying early on amd64 with "PRNG not seeded",
which is ultimately caused by libcrypto.so.6 calling RAND_add() with a
double parameter.  That parameter is passed via %xmm0, which gets wiped
out by an SSE memset() in _rtld_bind().

Reviewed by:	kib, kan
2011-01-04 20:51:28 +00:00