Go to file
GONG, Ruiqi 3c61529405 Randomized slab caches for kmalloc()
When exploiting memory vulnerabilities, "heap spraying" is a common
technique targeting those related to dynamic memory allocation (i.e. the
"heap"), and it plays an important role in a successful exploitation.
Basically, it is to overwrite the memory area of vulnerable object by
triggering allocation in other subsystems or modules and therefore
getting a reference to the targeted memory location. It's usable on
various types of vulnerablity including use after free (UAF), heap out-
of-bound write and etc.

There are (at least) two reasons why the heap can be sprayed: 1) generic
slab caches are shared among different subsystems and modules, and
2) dedicated slab caches could be merged with the generic ones.
Currently these two factors cannot be prevented at a low cost: the first
one is a widely used memory allocation mechanism, and shutting down slab
merging completely via `slub_nomerge` would be overkill.

To efficiently prevent heap spraying, we propose the following approach:
to create multiple copies of generic slab caches that will never be
merged, and random one of them will be used at allocation. The random
selection is based on the address of code that calls `kmalloc()`, which
means it is static at runtime (rather than dynamically determined at
each time of allocation, which could be bypassed by repeatedly spraying
in brute force). In other words, the randomness of cache selection will
be with respect to the code address rather than time, i.e. allocations
in different code paths would most likely pick different caches,
although kmalloc() at each place would use the same cache copy whenever
it is executed. In this way, the vulnerable object and memory allocated
in other subsystems and modules will (most probably) be on different
slab caches, which prevents the object from being sprayed.

Meanwhile, the static random selection is further enhanced with a
per-boot random seed, which prevents the attacker from finding a usable
kmalloc that happens to pick the same cache with the vulnerable
subsystem/module by analyzing the open source code. In other words, with
the per-boot seed, the random selection is static during each time the
system starts and runs, but not across different system startups.

The overhead of performance has been tested on a 40-core x86 server by
comparing the results of `perf bench all` between the kernels with and
without this patch based on the latest linux-next kernel, which shows
minor difference. A subset of benchmarks are listed below:

                sched/  sched/  syscall/       mem/       mem/
             messaging    pipe     basic     memcpy     memset
                 (sec)   (sec)     (sec)   (GB/sec)   (GB/sec)

control1         0.019   5.459     0.733  15.258789  51.398026
control2         0.019   5.439     0.730  16.009221  48.828125
control3         0.019   5.282     0.735  16.009221  48.828125
control_avg      0.019   5.393     0.733  15.759077  49.684759

experiment1      0.019   5.374     0.741  15.500992  46.502976
experiment2      0.019   5.440     0.746  16.276042  51.398026
experiment3      0.019   5.242     0.752  15.258789  51.398026
experiment_avg   0.019   5.352     0.746  15.678608  49.766343

The overhead of memory usage was measured by executing `free` after boot
on a QEMU VM with 1GB total memory, and as expected, it's positively
correlated with # of cache copies:

           control  4 copies  8 copies  16 copies

total       969.8M    968.2M    968.2M     968.2M
used         20.0M     21.9M     24.1M      26.7M
free        936.9M    933.6M    931.4M     928.6M
available   932.2M    928.8M    926.6M     923.9M

Co-developed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: GONG, Ruiqi <gongruiqi@huaweicloud.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Acked-by: Dennis Zhou <dennis@kernel.org> # percpu
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2023-07-18 10:07:47 +02:00
arch - Do FPU AP initialization on Xen PV too which got missed by the recent 2023-07-09 10:13:32 -07:00
block block-6.5-2023-07-03 2023-07-03 18:48:38 -07:00
certs KEYS: Add missing function documentation 2023-04-24 16:15:52 +03:00
crypto This update includes the following changes: 2023-06-30 21:27:13 -07:00
Documentation Fixes for pci_clean_master, error handling in driver inits, and various 2023-07-09 09:35:51 -07:00
drivers Fixes for pci_clean_master, error handling in driver inits, and various 2023-07-09 09:35:51 -07:00
fs Minor cleanups for 6.5: 2023-07-09 09:50:42 -07:00
include Randomized slab caches for kmalloc() 2023-07-18 10:07:47 +02:00
init Kbuild updates for v6.5 2023-07-01 09:24:31 -07:00
io_uring io_uring-6.5-2023-07-03 2023-07-03 18:43:10 -07:00
ipc Merge branch 'work.namespace' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2023-02-24 19:20:07 -08:00
kernel dma-mapping fixes for Linux 6.5 2023-07-09 10:24:22 -07:00
lib 16 hotfixes. Six are cc:stable and the remainder address post-6.4 issues. 2023-07-08 14:30:25 -07:00
LICENSES LICENSES: Add the copyleft-next-0.3.1 license 2022-11-08 15:44:01 +01:00
mm Randomized slab caches for kmalloc() 2023-07-18 10:07:47 +02:00
net Including fixes from bluetooth, bpf and wireguard. 2023-07-05 15:44:45 -07:00
rust rust: error: impl Debug for Error with errname() integration 2023-06-13 01:24:42 +02:00
samples Including fixes from bluetooth, bpf and wireguard. 2023-07-05 15:44:45 -07:00
scripts A half-dozen late arriving docs patches. They are mostly fixes, but we 2023-07-06 22:15:38 -07:00
security + Bug Fixes 2023-07-07 09:55:31 -07:00
sound sound fixes for 6.5-rc1 2023-07-07 15:40:17 -07:00
tools perf tools changes and fixes for v6.5: 2nd batch 2023-07-08 10:21:51 -07:00
usr initramfs: Encode dependency on KBUILD_BUILD_TIMESTAMP 2023-06-06 17:54:49 +09:00
virt ARM64: 2023-07-03 15:32:22 -07:00
.clang-format iommu: Add for_each_group_device() 2023-05-23 08:15:51 +02:00
.cocciconfig
.get_maintainer.ignore get_maintainer: add Alan to .get_maintainer.ignore 2022-08-20 15:17:44 -07:00
.gitattributes .gitattributes: set diff driver for Rust source code files 2023-05-31 17:48:25 +02:00
.gitignore Revert ".gitignore: ignore *.cover and *.mbx" 2023-07-04 15:05:12 -07:00
.mailmap 16 hotfixes. Six are cc:stable and the remainder address post-6.4 issues. 2023-07-08 14:30:25 -07:00
.rustfmt.toml rust: add .rustfmt.toml 2022-09-28 09:02:20 +02:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS - Address -Wmissing-prototype warnings 2023-06-26 16:43:54 -07:00
Kbuild Kbuild updates for v6.1 2022-10-10 12:00:45 -07:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS MAINTAINERS 2: Electric Boogaloo 2023-07-09 10:29:53 -07:00
Makefile Linux 6.5-rc1 2023-07-09 13:53:13 -07:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.