linux/mm
Greg Thelen 6920a1bd03 memcg: remove incorrect underflow check
When a memcg is deleted mem_cgroup_reparent_charges() moves charged
memory to the parent memcg.  As of v3.11-9444-g3ea67d0 "memcg: add per
cgroup writeback pages accounting" there's bad pointer read.  The goal
was to check for counter underflow.  The counter is a per cpu counter
and there are two problems with the code:

 (1) per cpu access function isn't used, instead a naked pointer is used
     which easily causes oops.
 (2) the check doesn't sum all cpus

Test:
  $ cd /sys/fs/cgroup/memory
  $ mkdir x
  $ echo 3 > /proc/sys/vm/drop_caches
  $ (echo $BASHPID >> x/tasks && exec cat) &
  [1] 7154
  $ grep ^mapped x/memory.stat
  mapped_file 53248
  $ echo 7154 > tasks
  $ rmdir x
  <OOPS>

The fix is to remove the check.  It's currently dangerous and isn't
worth fixing it to use something expensive, such as
percpu_counter_sum(), for each reparented page.  __this_cpu_read() isn't
enough to fix this because there's no guarantees of the current cpus
count.  The only guarantees is that the sum of all per-cpu counter is >=
nr_pages.

Fixes: 3ea67d06e4 ("memcg: add per cgroup writeback pages accounting")
Reported-and-tested-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Greg Thelen <gthelen@google.com>
Reviewed-by: Sha Zhengju <handai.szj@taobao.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-01 12:22:28 -07:00
..
backing-dev.c mm/backing-dev.c: check user buffer length before copying data to the related user buffer 2013-09-11 15:58:03 -07:00
balloon_compaction.c
bootmem.c mm: kill free_all_bootmem_node() 2013-07-03 16:07:39 -07:00
bounce.c mm/bounce.c: fix a regression where MS_SNAP_STABLE (stable pages snapshotting) was ignored 2013-09-30 14:31:02 -07:00
cleancache.c mm: cleancache: clean up cleancache_enabled 2013-04-30 17:04:01 -07:00
compaction.c mm/compaction.c: periodically schedule when freeing pages 2013-09-30 14:31:01 -07:00
debug-pagealloc.c
dmapool.c
fadvise.c
failslab.c
filemap.c mm: memcg: handle non-error OOM situations more gracefully 2013-10-16 21:35:53 -07:00
filemap_xip.c
fremap.c mm: save soft-dirty bits on file pages 2013-08-13 17:57:48 -07:00
frontswap.c frontswap: fix incorrect zeroing and allocation size for frontswap_map 2013-06-12 16:29:46 -07:00
highmem.c
huge_memory.c mm: Close races between THP migration and PMD numa clearing 2013-10-29 11:38:05 +01:00
hugetlb.c mm: hugetlb: initialize PG_reserved for tail pages of gigantic compound pages 2013-10-16 21:35:52 -07:00
hugetlb_cgroup.c cgroup: pass around cgroup_subsys_state instead of cgroup in file methods 2013-08-08 20:11:24 -04:00
hwpoison-inject.c mm/hwpoison: fix the lack of one reference count against poisoned page 2013-09-30 14:31:03 -07:00
init-mm.c
internal.h mm: vmscan: fix do_try_to_free_pages() livelock 2013-09-11 15:58:01 -07:00
interval_tree.c
Kconfig powerpc: Fix memory hotplug with sparse vmemmap 2013-10-03 17:21:38 +10:00
Kconfig.debug
kmemcheck.c
kmemleak-test.c
kmemleak.c mm: replace strict_strtoul() with kstrtoul() 2013-09-11 15:57:11 -07:00
ksm.c mm: replace strict_strtoul() with kstrtoul() 2013-09-11 15:57:11 -07:00
list_lru.c mm: list_lru: fix almost infinite loop causing effective livelock 2013-10-30 12:57:46 -07:00
maccess.c
madvise.c mm/hwpoison: fix traversal of hugetlbfs pages to avoid printk flood 2013-09-30 14:31:02 -07:00
Makefile list: add a new LRU list type 2013-09-10 18:56:30 -04:00
memblock.c memblock, numa: binary search node id 2013-09-11 15:57:51 -07:00
memcontrol.c memcg: remove incorrect underflow check 2013-11-01 12:22:28 -07:00
memory-failure.c mm/hwpoison: fix false report on 2nd attempt at page recovery 2013-09-30 14:31:02 -07:00
memory.c mm: numa: Sanitize task_numa_fault() callsites 2013-10-29 11:37:52 +01:00
memory_hotplug.c ACPI and power management fixes for 3.12-rc1 2013-09-12 11:22:45 -07:00
mempolicy.c mbind: add BUG_ON(!vma) in new_vma_page() 2013-09-11 15:57:50 -07:00
mempool.c mm/mempool.c: convert kmalloc_node(...GFP_ZERO...) to kzalloc_node(...) 2013-09-11 15:58:14 -07:00
migrate.c mm: Close races between THP migration and PMD numa clearing 2013-10-29 11:38:05 +01:00
mincore.c
mlock.c mm/mlock.c: prevent walking off the end of a pagetable in no-pmd configuration 2013-09-30 14:31:02 -07:00
mm_init.c mm: tune vm_committed_as percpu_counter batching size 2013-07-03 16:07:32 -07:00
mmap.c mm/mmap: remove unnecessary assignment 2013-09-11 15:58:13 -07:00
mmu_context.c mm: remove old aio use_mm() comment 2013-05-07 18:38:27 -07:00
mmu_notifier.c treewide: relase -> release 2013-06-28 14:34:33 +02:00
mmzone.c
mprotect.c mm: Account for a THP NUMA hinting update as one PTE update 2013-10-29 11:38:17 +01:00
mremap.c mm: revert mremap pud_free anti-fix 2013-10-16 21:35:53 -07:00
msync.c
nobootmem.c mm: concentrate modification of totalram_pages into the mm core 2013-07-03 16:07:33 -07:00
nommu.c mm: remove free_area_cache 2013-07-10 18:11:34 -07:00
oom_kill.c mm: memcg: handle non-error OOM situations more gracefully 2013-10-16 21:35:53 -07:00
page-writeback.c writeback: fix negative bdi max pause 2013-10-16 21:35:53 -07:00
page_alloc.c revert "mm/memory-hotplug: fix lowmem count overflow when offline pages" 2013-09-30 14:31:01 -07:00
page_cgroup.c
page_io.c aio: Kill aio_rw_vect_retry() 2013-07-30 11:53:12 -04:00
page_isolation.c mm: memory-hotplug: enable memory hotplug to handle hugepage 2013-09-11 15:57:48 -07:00
pagewalk.c mm/pagewalk.c: fix walk_page_range() access of wrong PTEs 2013-10-30 14:27:03 -07:00
percpu-km.c
percpu-vm.c
percpu.c
pgtable-generic.c mm: move pgtable related functions to right place 2013-09-11 15:57:30 -07:00
process_vm_access.c
quicklist.c
readahead.c readahead: make context readahead more conservative 2013-09-11 15:57:39 -07:00
rmap.c thp: account anon transparent huge pages into NR_ANON_PAGES 2013-09-12 15:38:03 -07:00
shmem.c initmpfs: make rootfs use tmpfs when CONFIG_TMPFS enabled 2013-09-11 15:59:37 -07:00
slab.c kernel: delete __cpuinit usage from all core kernel files 2013-07-14 19:36:59 -04:00
slab.h memcg: check that kmem_cache has memcg_params before accessing it 2013-08-28 19:26:38 -07:00
slab_common.c slab_common: Do not check for duplicate slab names 2013-09-28 09:47:41 +03:00
slob.c mm/sl[aou]b: Move kmallocXXX functions to common code 2013-09-04 20:51:33 +03:00
slub.c Merge branch 'slab/next' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux 2013-09-15 07:15:06 -04:00
sparse-vmemmap.c sparse-vmemmap: specify vmemmap population range in bytes 2013-04-29 15:54:35 -07:00
sparse.c mm/sparse: introduce alloc_usemap_and_memmap 2013-09-11 15:58:01 -07:00
swap.c mm: make lru_add_drain_all() selective 2013-09-12 15:38:02 -07:00
swap_state.c lib/radix-tree.c: make radix_tree_node_alloc() work correctly within interrupt 2013-09-11 15:59:36 -07:00
swapfile.c swap: fix set_blocksize race during swapon/swapoff 2013-10-16 21:35:53 -07:00
truncate.c truncate: drop 'oldsize' truncate_pagecache() parameter 2013-09-12 15:38:02 -07:00
util.c swap: clean-up #ifdef in page_mapping() 2013-09-11 15:57:31 -07:00
vmalloc.c mm/vmalloc: use wrapper function get_vm_area_size to caculate size of vm area 2013-09-11 15:58:02 -07:00
vmpressure.c Merge branch 'for-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2013-09-03 18:25:03 -07:00
vmscan.c mm/vmscan.c: don't forget to free shrinker->nr_deferred 2013-10-16 21:35:52 -07:00
vmstat.c mm: vmscan: fix do_try_to_free_pages() livelock 2013-09-11 15:58:01 -07:00
zbud.c mm/zbud: fix some trivial typos in comments 2013-09-11 15:57:35 -07:00
zswap.c mm/zswap: bugfix: memory leak when re-swapon 2013-10-16 21:35:52 -07:00