linux

mirror of https://github.com/torvalds/linux synced 2024-09-28 15:31:19 +00:00

Author	SHA1	Message	Date
Johannes Weiner	759496ba64	arch: mm: pass userspace fault flag to generic fault handler Unlike global OOM handling, memory cgroup code will invoke the OOM killer in any OOM situation because it has no way of telling faults occuring in kernel context - which could be handled more gracefully - from user-triggered faults. Pass a flag that identifies faults originating in user space from the architecture-specific fault handlers to generic code so that memcg OOM handling can be improved. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Michal Hocko <mhocko@suse.cz> Cc: David Rientjes <rientjes@google.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: azurIt <azurit@pobox.sk> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-12 15:38:01 -07:00
David S. Miller	0fbebed682	sparc64: Fix tsb_grow() in atomic context. If our first THP installation for an MM is via the set_pmd_at() done during khugepaged's collapsing we'll end up in tsb_grow() trying to do a GFP_KERNEL allocation with several locks held. Simply using GFP_ATOMIC in this situation is not the best option because we really can't have this fail, so we'd really like to keep this an order 0 GFP_KERNEL allocation if possible. Also, doing the TSB allocation from khugepaged is a really bad idea because we'll allocate it potentially from the wrong NUMA node in that context. So what we do is defer the hugepage TSB allocation until the first TLB miss we take on a hugepage. This is slightly tricky because we have to handle two unusual cases: 1) Taking the first hugepage TLB miss in the window trap handler. We'll call the winfix_trampoline when that is detected. 2) An initial TSB allocation via TLB miss races with a hugetlb fault on another cpu running the same MM. We handle this by unconditionally loading the TSB we see into the current cpu even if it's non-NULL at hugetlb_setup time. Reported-by: Meelis Roos <mroos@ut.ee> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-20 09:46:08 -08:00
David S. Miller	f88620b9c5	sparc64: Fix deficiencies in sun4v error reporting. Missing error types, attributes, and report fields. Pad out to 64-bytes. Make string reporting cleaner and easier to extend in the future using "const char *" arrays that index by either bit position, or absolute field value. Report the raw 64-byte error report as a sequence of u64s before the annotated version. Only report fields which are valid, given the context and the attribute bits which are set. For shutdown requests, use the local copy of the error report not the one we just freed up back to the queue. Also, use orderly_poweroff() just like the Domain Services shutdown request code does. If the real-address reported is "-1" (unknown) try to disassemble the instruction to report the effective address of the access. Only do this in privileged mode. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-10 17:19:32 -07:00
David Miller	9e695d2ecc	sparc64: Support transparent huge pages. This is relatively easy since PMD's now cover exactly 4MB of memory. Our PMD entries are 32-bits each, so we use a special encoding. The lowest bit, PMD_ISHUGE, determines the interpretation. This is possible because sparc64's page tables are purely software entities so we can use whatever encoding scheme we want. We just have to make the TLB miss assembler page table walkers aware of the layout. set_pmd_at() works much like set_pte_at() but it has to operate in two page from a table of non-huge PTEs, so we have to queue up TLB flushes based upon what mappings are valid in the PTE table. In the second regime we are going from huge-page to non-huge-page, and in that case we need only queue up a single TLB flush to push out the huge page mapping. We still have 5 bits remaining in the huge PMD encoding so we can very likely support any new pieces of THP state tracking that might get added in the future. With lots of help from Johannes Weiner. Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-09 16:23:06 +09:00
Shaohua Li	45cac65b0f	readahead: fault retry breaks mmap file read random detection .fault now can retry. The retry can break state machine of .fault. In filemap_fault, if page is miss, ra->mmap_miss is increased. In the second try, since the page is in page cache now, ra->mmap_miss is decreased. And these are done in one fault, so we can't detect random mmap file access. Add a new flag to indicate .fault is tried once. In the second try, skip ra->mmap_miss decreasing. The filemap_fault state machine is ok with it. I only tested x86, didn't test other archs, but looks the change for other archs is obvious, but who knows :) Signed-off-by: Shaohua Li <shaohua.li@fusionio.com> Cc: Rik van Riel <riel@redhat.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-09 16:22:47 +09:00
Kautuk Consul	7358e51082	sparc/mm/fault_64.c: Port OOM changes to do_sparc64_fault Commit `d065bd810b` (mm: retry page fault when blocking on disk transfer) and commit `37b23e0525` (x86,mm: make pagefault killable) The above commits introduced changes into the x86 pagefault handler for making the page fault handler retryable as well as killable. These changes reduce the mmap_sem hold time, which is crucial during OOM killer invocation. Port these changes to 64-bit sparc. Signed-off-by: Kautuk Consul <consul.kautuk@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-04 15:42:24 -07:00
Peter Zijlstra	a8b0ca17b8	perf: Remove the nmi parameter from the swevent and overflow interface The nmi parameter indicated if we could do wakeups from the current context, if not, we would set some state and self-IPI and let the resulting interrupt do the wakeup. For the various event classes: - hardware: nmi=0; PMI is in fact an NMI or we run irq_work_run from the PMI-tail (ARM etc.) - tracepoint: nmi=0; since tracepoint could be from NMI context. - software: nmi=[0,1]; some, like the schedule thing cannot perform wakeups, and hence need 0. As one can see, there is very little nmi=1 usage, and the down-side of not using it is that on some platforms some software events can have a jiffy delay in wakeup (when arch_irq_work_raise isn't implemented). The up-side however is that we can remove the nmi parameter and save a bunch of conditionals in fast paths. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Michael Cree <mcree@orcon.net.nz> Cc: Will Deacon <will.deacon@arm.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Cc: Anton Blanchard <anton@samba.org> Cc: Eric B Munson <emunson@mgebm.net> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: David S. Miller <davem@davemloft.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Don Zickus <dzickus@redhat.com> Link: http://lkml.kernel.org/n/tip-agjev8eu666tvknpb3iaj0fg@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-01 11:06:35 +02:00
David S. Miller	4b17764737	sparc: Support show_unhandled_signals. Just faults right now, will add other traps later. Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-01 00:02:23 -08:00
David S. Miller	a084b6678a	sparc: Add missing SW perf fault events. Signed-off-by: David S. Miller <davem@davemloft.net>	2010-01-20 16:23:03 -08:00
David S. Miller	4ed5d5e429	sparc64: Add some missing __kprobes annotations to kernel fault paths. Signed-off-by: David S. Miller <davem@davemloft.net>	2009-12-10 18:08:29 -08:00
David S. Miller	135d082171	sparc64: Use kprobes_built_in() to avoid ifdefs in fault_64.c Signed-off-by: David S. Miller <davem@davemloft.net>	2009-12-10 18:02:19 -08:00
David S. Miller	a923c28fc5	sparc: Use page_fault_out_of_memory() for VM_FAULT_OOM. As noted by Nick Piggin. Signed-off-by: David S. Miller <davem@davemloft.net>	2009-08-02 19:17:15 -07:00
Linus Torvalds	d06063cc22	Move FAULT_FLAG_xyz into handle_mm_fault() callers This allows the callers to now pass down the full set of FAULT_FLAG_xyz flags to handle_mm_fault(). All callers have been (mechanically) converted to the new calling convention, there's almost certainly room for architectures to clean up their code and then add FAULT_FLAG_RETRY when that support is added. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-06-21 13:08:22 -07:00
David S. Miller	9b02605826	sparc64: Kill bogus TPC/address truncation during 32-bit faults. This builds upon `eeabac7386` ("sparc64: Validate kernel generated fault addresses on sparc64.") Upon further consideration, we actually should never see any fault addresses for 32-bit tasks with the upper 32-bits set. If it does every happen, by definition it's a bug. Whatever context created that fault would only have that fault satisfied if we used the full 64-bit address. If we truncate it, we'll always fault the wrong address and we'll always loop faulting forever. So catch such conditions and mark them as errors always. Log the error and fail the fault. Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-03 16:28:23 -08:00
David S. Miller	eeabac7386	sparc64: Validate kernel generated fault addresses on sparc64. In order to handle all of the cases of address calculation overflow properly, we run sparc 32-bit processes in "address masking" mode when running on a 64-bit kernel. Address masking mode zeros out the top 32-bits of the address calculated for every load and store instruction. However, when we're in privileged mode we have to run with that address masking mode disabled even when accessing userspace from the kernel. To "simulate" the address masking mode we clear the top-bits by hand for 32-bit processes in the fault handler. It is the responsibility of code in the compat layer to properly zero extend addresses used to access userspace. If this isn't followed properly we can get into a fault loop. Say that the user address is 0xf0000000 but for whatever reason the kernel code sign extends this to 64-bit, and then the kernel tries to access the result. In such a case we'll fault on address 0xfffffffff0000000 but the fault handler will process that fault as if it were to address 0xf0000000. We'll loop faulting forever because the fault never gets satisfied. So add a check specifically for this case, when the kernel is faulting on a user address access and the addresses don't match up. This code path is sufficiently slow path, and this bug is sufficiently painful to diagnose, that this kind of bug check is warranted. Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-02 22:08:15 -08:00
Sam Ravnborg	27137e5285	sparc,sparc64: unify mm/ - move all sparc64/mm/ files to arch/sparc/mm/ - commonly named files are named _64.c - add files to sparc/mm/Makefile preserving link order - delete now unused sparc64/mm/Makefile - sparc64 now finds mm/ in sparc Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-04 09:16:59 -08:00

16 commits