linux/kernel/sched
Waiman Long da01903281 sched: Enforce user requested affinity
It was found that the user requested affinity via sched_setaffinity()
can be easily overwritten by other kernel subsystems without an easy way
to reset it back to what the user requested. For example, any change
to the current cpuset hierarchy may reset the cpumask of the tasks in
the affected cpusets to the default cpuset value even if those tasks
have pre-existing user requested affinity. That is especially easy to
trigger under a cgroup v2 environment where writing "+cpuset" to the
root cgroup's cgroup.subtree_control file will reset the cpus affinity
of all the processes in the system.

That is problematic in a nohz_full environment where the tasks running
in the nohz_full CPUs usually have their cpus affinity explicitly set
and will behave incorrectly if cpus affinity changes.

Fix this problem by looking at user_cpus_ptr in __set_cpus_allowed_ptr()
and use it to restrcit the given cpumask unless there is no overlap. In
that case, it will fallback to the given one. The SCA_USER flag is
reused to indicate intent to set user_cpus_ptr and so user_cpus_ptr
masking should be skipped. In addition, masking should also be skipped
if any of the SCA_MIGRATE_* flag is set.

All callers of set_cpus_allowed_ptr() will be affected by this change.
A scratch cpumask is added to percpu runqueues structure for doing
additional masking when user_cpus_ptr is set.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220922180041.1768141-4-longman@redhat.com
2022-10-27 11:01:22 +02:00
..
autogroup.c sched/all: Change all BUG_ON() instances in the scheduler to WARN_ON_ONCE() 2022-08-12 11:25:10 +02:00
autogroup.h
build_policy.c sched: Fix missing prototype warnings 2022-05-01 10:03:43 +02:00
build_utility.c sched: Fix missing prototype warnings 2022-05-01 10:03:43 +02:00
clock.c sched/clock: Use try_cmpxchg64 in sched_clock_{local,remote} 2022-05-19 23:46:09 +02:00
completion.c sched/completion: Add wait_for_completion_state() 2022-09-07 21:53:49 +02:00
core.c sched: Enforce user requested affinity 2022-10-27 11:01:22 +02:00
core_sched.c sched: Rename task_running() to task_on_cpu() 2022-09-07 21:53:47 +02:00
cpuacct.c Merge branch 'sched/fast-headers' into sched/core 2022-03-15 09:05:05 +01:00
cpudeadline.c sched/core: Introduce sched_asym_cpucap_active() 2022-08-02 12:32:45 +02:00
cpudeadline.h
cpufreq.c sched/headers: Introduce kernel/sched/build_utility.c and build multiple .c files there 2022-02-23 10:58:33 +01:00
cpufreq_schedutil.c cpufreq: schedutil: Move max CPU capacity to sugov_policy 2022-08-23 20:03:33 +02:00
cpupri.c sched/all: Change all BUG_ON() instances in the scheduler to WARN_ON_ONCE() 2022-08-12 11:25:10 +02:00
cpupri.h
cputime.c sched/core: add forced idle accounting for cgroups 2022-07-04 09:23:07 +02:00
deadline.c sched: Introduce affinity_context 2022-10-27 11:01:21 +02:00
debug.c - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in 2022-10-10 17:53:04 -07:00
fair.c sched/fair: Check if prev_cpu has highest spare cap in feec() 2022-10-27 11:01:20 +02:00
features.h sched/fair: Introduce SIS_UTIL to search idle CPU based on sum of util_avg 2022-06-28 09:08:30 +02:00
idle.c context_tracking: Take idle eqs entrypoints over RCU 2022-07-05 13:32:16 -07:00
isolation.c sched/headers: Introduce kernel/sched/build_utility.c and build multiple .c files there 2022-02-23 10:58:33 +01:00
loadavg.c sched/headers: Introduce kernel/sched/build_utility.c and build multiple .c files there 2022-02-23 10:58:33 +01:00
Makefile sched/headers: Introduce kernel/sched/build_policy.c and build multiple .c files there 2022-02-23 10:58:33 +01:00
membarrier.c sched/headers: Introduce kernel/sched/build_utility.c and build multiple .c files there 2022-02-23 10:58:33 +01:00
pelt.c sched/headers: Introduce kernel/sched/build_policy.c and build multiple .c files there 2022-02-23 10:58:33 +01:00
pelt.h sched/fair: Decay task PELT values during wakeup migration 2022-06-28 09:17:46 +02:00
psi.c PSI updates for v6.1: 2022-10-14 13:03:00 -07:00
rt.c sched: Introduce struct balance_callback to avoid CFI mismatches 2022-10-17 16:41:25 +02:00
sched-pelt.h
sched.h sched: Enforce user requested affinity 2022-10-27 11:01:22 +02:00
smp.h smp: Rename flush_smp_call_function_from_idle() 2022-05-01 10:03:43 +02:00
stats.c sched/headers: Introduce kernel/sched/build_utility.c and build multiple .c files there 2022-02-23 10:58:33 +01:00
stats.h sched/psi: Add PSI_IRQ to track IRQ/SOFTIRQ pressure 2022-09-09 11:08:32 +02:00
stop_task.c sched: Add update_current_exec_runtime helper 2022-08-27 00:05:35 +02:00
swait.c sched/headers: Introduce kernel/sched/build_utility.c and build multiple .c files there 2022-02-23 10:58:33 +01:00
topology.c sched/numa: Adjust imb_numa_nr to a better approximation of memory channels 2022-06-13 10:30:00 +02:00
wait.c sched/headers: Introduce kernel/sched/build_utility.c and build multiple .c files there 2022-02-23 10:58:33 +01:00
wait_bit.c wait_on_bit: add an acquire memory barrier 2022-08-26 09:30:25 -07:00