linux/kernel/sched
Kirill Tkhai 0e59bdaea7 sched/fair: Disable runtime_enabled on dying rq
We kill rq->rd on the CPU_DOWN_PREPARE stage:

	cpuset_cpu_inactive -> cpuset_update_active_cpus -> partition_sched_domains ->
	-> cpu_attach_domain -> rq_attach_root -> set_rq_offline

This unthrottles all throttled cfs_rqs.

But the cpu is still able to call schedule() till

	take_cpu_down->__cpu_disable()

is called from stop_machine.

This case the tasks from just unthrottled cfs_rqs are pickable
in a standard scheduler way, and they are picked by dying cpu.
The cfs_rqs becomes throttled again, and migrate_tasks()
in migration_call skips their tasks (one more unthrottle
in migrate_tasks()->CPU_DYING does not happen, because rq->rd
is already NULL).

Patch sets runtime_enabled to zero. This guarantees, the runtime
is not accounted, and the cfs_rqs won't exceed given
cfs_rq->runtime_remaining = 1, and tasks will be pickable
in migrate_tasks(). runtime_enabled is recalculated again
when rq becomes online again.

Ben Segall also noticed, we always enable runtime in
tg_set_cfs_bandwidth(). Actually, we should do that for online
cpus only. To prevent races with unthrottle_offline_cfs_rqs()
we take get_online_cpus() lock.

Reviewed-by: Ben Segall <bsegall@google.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
CC: Konstantin Khorenko <khorenko@parallels.com>
CC: Paul Turner <pjt@google.com>
CC: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1403684382.3462.42.camel@tkhai
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-05 11:17:42 +02:00
..
auto_group.c sched: Replace hardcoding of -20 and 19 with MIN_NICE and MAX_NICE 2014-02-22 18:15:54 +01:00
auto_group.h Revert "sched/autogroup: Fix crash on reboot when autogroup is disabled" 2012-12-11 10:23:45 +01:00
clock.c kernel: use macros from compiler.h instead of __attribute__((...)) 2014-04-07 16:36:11 -07:00
completion.c sched: Move completion code from core.c to completion.c 2013-11-06 07:49:19 +01:00
core.c sched/fair: Disable runtime_enabled on dying rq 2014-07-05 11:17:42 +02:00
cpuacct.c cgroup: remove css_parent() 2014-05-16 13:22:48 -04:00
cpuacct.h sched/cpuacct: Initialize root cpuacct earlier 2013-04-10 13:54:20 +02:00
cpudeadline.c sched/deadline: Replace NR_CPUS arrays 2014-05-22 10:21:28 +02:00
cpudeadline.h sched/deadline: Replace NR_CPUS arrays 2014-05-22 10:21:28 +02:00
cpupri.c Merge commit '3cf2f34' into sched/core, to fix build error 2014-06-12 13:46:37 +02:00
cpupri.h sched/cpupri: Replace NR_CPUS arrays 2014-05-22 10:21:29 +02:00
cputime.c sched: Sanitize irq accounting madness 2014-05-07 11:51:30 +02:00
deadline.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-06-12 19:42:15 -07:00
debug.c Merge branch 'for-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2014-04-03 13:05:42 -07:00
fair.c sched/fair: Disable runtime_enabled on dying rq 2014-07-05 11:17:42 +02:00
features.h sched: Rename capacity related flags 2014-06-05 11:52:32 +02:00
idle.c sched/idle: Drop !! while calculating 'broadcast' 2014-07-05 11:17:31 +02:00
idle_task.c sched/fair: Push down check for high priority class task into idle_balance() 2014-03-11 12:05:37 +01:00
Makefile sched/idle: Move cpu/idle.c to sched/idle.c 2014-02-11 09:58:30 +01:00
proc.c sched: Change get_rq_runnable_load() to static and inline 2013-06-27 10:07:44 +02:00
rt.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-06-12 19:42:15 -07:00
sched.h sched/fair: Implement fast idling of CPUs when the system is partially loaded 2014-07-05 11:17:32 +02:00
stats.c kernel: audit/fix non-modular users of module_init in core code 2014-04-03 16:21:07 -07:00
stats.h sched: Micro-optimize by dropping unnecessary task_rq() calls 2013-09-25 13:51:06 +02:00
stop_task.c sched, nohz: Change rq->nr_running to always use wrappers 2014-05-22 11:16:33 +02:00
wait.c arch: Mass conversion of smp_mb__*() 2014-04-18 14:20:48 +02:00