Remove the duplicated code and use the cpuidle common code for initialization.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Remove the duplicated code and use the cpuidle common code for initialization.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Remove the duplicated code and use the cpuidle common code for initialization.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Reviewed-by: Kevin Hilman <khilman@linaro.org>
Tested-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Remove the duplicated code and use the cpuidle common code for initialization.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Remove the duplicated code and use the cpuidle common code for initialization.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Remove the duplicated code and use the cpuidle common code for initialization.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Reviewed-by: Kevin Hilman <khilman@linaro.org>
Tested-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Remove the duplicated code and use the cpuidle common code for initialization.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Remove the duplicate code and use the cpuidle common code for initialization.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
All the drivers are using, in their initialization function, the
for_each_possible_cpu macro.
Using for_each_online_cpu means the driver must handle the initialization
of the cpuidle device when a cpu is up which is not the case here.
Change the macro to for_each_possible_cpu as that fix the hotplug
initialization and make the initialization routine consistent with the
rest of the code in the different drivers.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The en_core_tk_irqen flag is set in all the cpuidle driver which
means it is not necessary to specify this flag.
Remove the flag and the code related to it.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Kevin Hilman <khilman@linaro.org> # for mach-omap2/*
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit edc7cb2 (usb: phy: make it a menuconfig) makes USB_MXS_PHY
be a sub-item of menuconfig symbol USB_PHY. This change gets the
selection of CONFIG_USB_MXS_PHY in mxs_defconfig lost. Hence the
boot stops at the point below.
[ 1.600867] ci_hdrc ci_hdrc.0: doesn't support gadget
[ 1.606282] ci_hdrc ci_hdrc.0: EHCI Host Controller
[ 1.613522] ci_hdrc ci_hdrc.0: new USB bus registered, assigned bus number 1
Add CONFIG_USB_PHY to have the CONFIG_USB_MXS_PHY selection back to
work.
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Commit edc7cb2 (usb: phy: make it a menuconfig) makes USB_MXS_PHY
be a sub-item of menuconfig symbol USB_PHY. This change gets the
selection of CONFIG_USB_MXS_PHY in imx_v6_v7_defconfig lost. Hence the
boot stops at the point below.
ci_hdrc ci_hdrc.0: doesn't support gadget
ci_hdrc ci_hdrc.0: EHCI Host Controller
ci_hdrc ci_hdrc.0: new USB bus registered, assigned bus number 1
Add CONFIG_USB_PHY to have the CONFIG_USB_MXS_PHY selection back to
work.
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Drivers use cmpxchg64, cmpxchg64_local to perform 64-bit operation, so
they can cross 32-bit and 64-bit platforms (it is a standard way).
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Kay Sievers reported that coreutils' stat tool has a problem with
s390's statfs[64] definition:
> The definition of struct statfs::f_type needs a fix. s390 is the only
> architecture in the kernel that uses an int and expects magic
> constants lager than INT_MAX to fit into.
>
> A fix is needed to make Fedora boot on s390, it currently fails to do
> so. Userspace does not want to add code to paper-over this issue.
[...]
> Even coreutils cannot handle it:
> #define RAMFS_MAGIC 0x858458f6
> # stat -f -c%t /
> ffffffff858458f6
>
> #define BTRFS_SUPER_MAGIC 0x9123683E
> # stat -f -c%t /mnt
> ffffffff9123683e
The bug is caused by an implicit sign extension within the stat tool:
out_uint_x (pformat, prefix_len, statfsbuf->f_type);
where the format finally will be "%lx".
A similar problem can be found in the 'tail' tool.
s390 is the only architecture which has an int type f_type member in
struct statfs[64]. Other architectures have either unsigned ints or
long values, so that the problem doesn't occur there.
Therefore change the type of the f_type member to unsigned int, so
that we get zero extension instead of sign extension when assignment to
a long value happens.
This patch changes the s390 uapi struct stafs[64] definition in the kernel
to contain only unsigned values.
This was true for 32 bit builds anyway, since we use the generic uapi
header file in that case. So lets not include conditionally the generic
uapi header file but have the s390 implementation completely independent.
Also fix the types of struct compat_stafs to match reality and move the
definition of struct compat_statfs64 to asm/compat.h since it is not part
of the api.
Reported-by: Kay Sievers <kay@vrfy.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
For offset > PAGE_SIZE, s390_dma_map_pages() will issue a warning
and return a wrong dma address.
This patch removes the warning and fixes the dma return address
calculation.
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The compat definitions are not part of the uapi. So move them to
s390's private compat header file.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fix this one for !COMPAT:
compat.h: In function ‘arch_compat_alloc_user_space’:
compat.h:292:2: error: implicit declaration of function ‘is_compat_task’
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The f_spare field within struct compat_statfs is four bytes larger
than within the native 31 bit struct statfs.
compat_sys_statfs() clears the f_spare field in user space which
means that in compat mode four bytes that are behind the user space
supplied struct compat_statfs will be corrupted (zeroed).
According to Thomas Gleixner's Linux 2.6 history tree this bug is
present since v2.5.74 87880da124 "[PATCH] s390: 31 bit compat.".
So it get's fixed shortly before its 10th anniversary. Tough luck.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The gmap_map_segment function creates a special invalid segment table
entry with the address of the requested target location in the process
address space. The first access will create the connection between the
gmap segment table and the target page table of the main process.
If two threads do this concurrently both will walk the page tables and
allocate a gmap_rmap structure for the same segment table entry.
To avoid the race recheck the segment table entry after taking to page
table lock.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Implement gmap_translate() function which translates a guest absolute address
to a user space process address without establishing the guest page table
entries.
This is useful for kvm guest address translations where no memory access
is expected to happen soon (e.g. tprot exception handler).
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Pull MIPS fix from Ralf Baechle:
"Revert the change of the definition of PAGE_MASK which was prettier
but broke a few relativly rare platforms"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
Revert "MIPS: page.h: Provide more readable definition for PAGE_MASK."
This reverts commit c17a655478.
Manuel Lauss writes:
lmo commit c17a6554 (MIPS: page.h: Provide more readable definition for
PAGE_MASK) apparently breaks ioremap of 36-bit addresses on my Alchemy
systems (PCI and PCMCIA) The reason is that in arch/mips/mm/ioremap.c
line 157 (phys_addr &= PAGE_MASK) bits 32-35 are cut off. Seems the
new PAGE_MASK is explicitly 32bit, or one could make it signed instead
of unsigned long.
The builtin .dtb.S intermediate file needs to be marked with .SECONDARY
so that it isn't automatically deleted (which causes it to be
regenerated on every build). Also add *.dtb.S to clean-files so it gets
cleaned up by make clean.
Similarly, if the specified builtin dtb isn't already in dtb-y (e.g.
imported into the tree and specified in CONFIG_METAG_BUILTIN_DTB_NAME)
it too will be treated as an intermediate and deleted automatically
(again causing it to be regenerated on every build), so add it to dtb-y
so it gets added to targets and the dtbs target.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Sam Ravnborg <sam@ravnborg.org>
Reviewed-by: Stephen Warren <swarren@nvidia.com>
Borislav Petkov reported a lockdep splat warning about kzalloc()
done in an IPI (hardirq) handler.
This is a real bug, do not call kzalloc() in a smp_call_function_single()
handler because it can schedule and crash.
Reported-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Jacob Shin <jacob.shin@amd.com>
Tested-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: <eranian@google.com>
Cc: <a.p.zijlstra@chello.nl>
Cc: <acme@ghostprotocols.net>
Cc: <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20130421180627.GA21049@jshin-Toonie
Signed-off-by: Ingo Molnar <mingo@kernel.org>
In a previous commit the en_core_tk_irqen flag has been added but we missed
the cpuidle_wrap_enter which was doing the job to measure the time for the
'omap3_enter_idle' function.
Actually, I don't see any reason to use this wrapper in the code. In the better
case, the time computation is not correctly done because of the different
operations done in omap3_enter_idle_bm which were not taken into account
before the en_core_tk_irqen flag was set.
As the time is reflected for the state overridden by the omap3_enter_idle_bm,
using the wrapper is pointless now, so removing it.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Kevin Hilman <khilman@linaro.org>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit 688036b538 removed the function
'shmobile_enter_wfi' but we forgot to remove the definition in the header file.
Note this function is just an alias to 'cpu_do_idle()' wrapped into a cpuidle
function callback prototype which already exists with the default WFI state
and the arm_simple_enter function.
Remove the function prototype.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Remove the shmobile_enter_wfi function which is the same as the
common WFI enter function from the arm cpuidle driver defined
with the ARM_CPUIDLE_WFI_STATE macro.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Registering the driver, or the device, can fail, let's check the return code
and return the error code to the PM layer.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Like all the other drivers, let's initialize the structure a compile time
instead of init time.
The states #1 and #2 are not enabled by default. The init function will
check the features of the board in order to enable the state.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The CPUIDLE_DRIVER_STATE_START constant is only set when the kernel compilation
option CONFIG_ARCH_HAS_CPU_RELAX is set, but this is only relatated to x86, so
it is always zero.
Remove the reference to this constant in the code.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The driver is a global static variable automatically initialized to zero.
Removing the useless initialization in the init function.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Support for NB counters, MSRs 0xc0010240 ~ 0xc0010247, got
moved to perf_event_amd_uncore.c in the following commit:
c43ca5091a perf/x86/amd: Add support for AMD NB and L2I "uncore" counters
AMD Family 10h NB events (events 0xe0 ~ 0xff, on MSRs 0xc001000 ~
0xc001007) will still continue to be handled by perf_event_amd.c
Signed-off-by: Jacob Shin <jacob.shin@amd.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Jacob Shin <jacob.shin@amd.com>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Link: http://lkml.kernel.org/r/1366046483-1765-2-git-send-email-jacob.shin@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
check_hw_exists() has a number of checks which go to two exit
paths: msr_fail and bios_fail. Checks classified as msr_fail
will cause check_hw_exists() to return false, causing the PMU
not to be used; bios_fail checks will only cause a warning to be
printed, but will return true.
The problem is that if there are both msr failures and bios
failures, and the routine hits a bios_fail check first, it will
exit early and return true, not finishing the rest of the msr
checks. If those msrs are in fact broken, it will cause them to
be used erroneously.
In the case of a Xen PV VM, the guest OS has read access to all
the MSRs, but write access is white-listed to supported
features. Writes to unsupported MSRs have no effect. The PMU
MSRs are not (typically) supported, because they are expensive
to save and restore on a VM context switch. One of the
"msr_fail" checks is supposed to detect this circumstance (ether
for Xen or KVM) and disable the harware PMU.
However, on one of my AMD boxen, there is (apparently) a broken
BIOS which triggers one of the bios_fail checks. In particular,
MSR_K7_EVNTSEL0 has the ARCH_PERFMON_EVENTSEL_ENABLE bit set.
The guest kernel detects this because it has read access to all
MSRs, and causes it to skip the rest of the checks and try to
use the non-existent hardware PMU. This minimally causes a lot
of useless instruction emulation and Xen console spam; it may
cause other issues with the watchdog as well.
This changset causes check_hw_exists() to go through all of the
msr checks, failing and returning false if any of them fail.
This makes sure that a guest running under Xen without a virtual
PMU will detect that there is no functioning PMU and not attempt
to use it.
This problem affects kernels as far back as 3.2, and should thus
be considered for backport.
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Konrad Wilk <konrad.wilk@oracle.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Link: http://lkml.kernel.org/r/1365000388-32448-1-git-send-email-george.dunlap@eu.citrix.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Add support for AMD Family 15h [and above] northbridge
performance counters. MSRs 0xc0010240 ~ 0xc0010247 are shared
across all cores that share a common northbridge.
Add support for AMD Family 16h L2 performance counters. MSRs
0xc0010230 ~ 0xc0010237 are shared across all cores that share a
common L2 cache.
We do not enable counter overflow interrupts. Sampling mode and
per-thread events are not supported.
Signed-off-by: Jacob Shin <jacob.shin@amd.com>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Stephane Eranian <eranian@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20130419213428.GA8229@jshin-Toonie
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The existing code assumes all Cbox and PCU events are using
filter, but actually the filter is event specific. Furthermore
the filter is sub-divided into multiple fields which are used
by different events.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: peterz@infradead.org
Cc: ak@linux.intel.com
Link: http://lkml.kernel.org/r/1366113067-3262-3-git-send-email-zheng.z.yan@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reported-by: Stephane Eranian <eranian@google.com>
Conflicts:
arch/x86/kernel/cpu/perf_event_intel.c
Merge in the latest fixes before applying new patches, resolve the conflict.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull kdump fixes from Peter Anvin:
"The kexec/kdump people have found several problems with the support
for loading over 4 GiB that was introduced in this merge cycle. This
is partly due to a number of design problems inherent in the way the
various pieces of kdump fit together (it is pretty horrifically manual
in many places.)
After a *lot* of iterations this is the patchset that was agreed upon,
but of course it is now very late in the cycle. However, because it
changes both the syntax and semantics of the crashkernel option, it
would be desirable to avoid a stable release with the broken
interfaces."
I'm not happy with the timing, since originally the plan was to release
the final 3.9 tomorrow. But apparently I'm doing an -rc8 instead...
* 'x86-kdump-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
kexec: use Crash kernel for Crash kernel low
x86, kdump: Change crashkernel_high/low= to crashkernel=,high/low
x86, kdump: Retore crashkernel= to allocate under 896M
x86, kdump: Set crashkernel_low automatically
Pull x86 fixes from Peter Anvin:
"Three groups of fixes:
1. Make sure we don't execute the early microcode patching if family
< 6, since it would touch MSRs which don't exist on those
families, causing crashes.
2. The Xen partial emulation of HyperV can be dealt with more
gracefully than just disabling the driver.
3. More EFI variable space magic. In particular, variables hidden
from runtime code need to be taken into account too."
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, microcode: Verify the family before dispatching microcode patching
x86, hyperv: Handle Xen emulation of Hyper-V more gracefully
x86,efi: Implement efi_no_storage_paranoia parameter
efi: Export efi_query_variable_store() for efivars.ko
x86/Kconfig: Make EFI select UCS2_STRING
efi: Distinguish between "remaining space" and actually used space
efi: Pass boot services variable info to runtime code
Move utf16 functions to kernel core and rename
x86,efi: Check max_size only if it is non-zero.
x86, efivars: firmware bug workarounds should be in platform code
Pull ARM fixes from Russell King:
"A set of fixes from various people - Will Deacon gets a prize for
removing code this time around. The biggest fix in this lot is
sorting out the ARM740T mess. The rest are relatively small fixes."
* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
ARM: 7699/1: sched_clock: Add more notrace to prevent recursion
ARM: 7698/1: perf: fix group validation when using enable_on_exec
ARM: 7697/1: hw_breakpoint: do not use __cpuinitdata for dbg_cpu_pm_nb
ARM: 7696/1: Fix kexec by setting outer_cache.inv_all for Feroceon
ARM: 7694/1: ARM, TCM: initialize TCM in paging_init(), instead of setup_arch()
ARM: 7692/1: iop3xx: move IOP3XX_PERIPHERAL_VIRT_BASE
ARM: modules: don't export cpu_set_pte_ext when !MMU
ARM: mm: remove broken condition check for v4 flushing
ARM: mm: fix numerous hideous errors in proc-arm740.S
ARM: cache: remove ARMv3 support code
ARM: tlbflush: remove ARMv3 support
Pull sparc fixes from David Miller:
1) Fix race in sparc64 TLB shootdowns, we have to synchronize with the
sibling cpus completing if we are passing them a reference via
pointer to a data structure.
2) Fix cleaning of bitmaps in sparc32, from Akinobu Mita.
3) Fix various sparc header mistakes, some of which resulted in
userland build breakage. From Sam Ravnborg.
4) Kill ghost declarations and defines missed when several bits of code
got deleted recently.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: Fix race in TLB batch processing.
sparc: use asm-generic version of types.h
bbc_i2c: fix section mismatch warning
sparc: use generic headers
sparc:cleanup unused code in smp_32.h
sparc/iommu: fix typo s/265KB/256KB/
sparc/srmmu: clear trailing edge of bitmap properly
sparc:remove unused declaration smp_boot_cpus()
Matt Fleming (1):
x86, efivars: firmware bug workarounds should be in platform
code
Matthew Garrett (3):
Move utf16 functions to kernel core and rename
efi: Pass boot services variable info to runtime code
efi: Distinguish between "remaining space" and actually used
space
Richard Weinberger (2):
x86,efi: Check max_size only if it is non-zero.
x86,efi: Implement efi_no_storage_paranoia parameter
Sergey Vlasov (2):
x86/Kconfig: Make EFI select UCS2_STRING
efi: Export efi_query_variable_store() for efivars.ko
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
For each CPU vendor that implements CPU microcode patching, there will
be a minimum family for which this is implemented. Verify this
minimum level of support.
This can be done in the dispatch function or early in the application
functions. Doing the latter turned out to be somewhat awkward because
of the ineviable split between the BSP and the AP paths, and rather
than pushing deep into the application functions, do this in
the dispatch function.
Reported-by: "Bryan O'Donoghue" <bryan.odonoghue.lkml@nexus-software.ie>
Suggested-by: Borislav Petkov <bp@alien8.de>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/1366392183-4149-1-git-send-email-bryan.odonoghue.lkml@nexus-software.ie
As reported by Dave Kleikamp, when we emit cross calls to do batched
TLB flush processing we have a race because we do not synchronize on
the sibling cpus completing the cross call.
So meanwhile the TLB batch can be reset (tb->tlb_nr set to zero, etc.)
and either flushes are missed or flushes will flush the wrong
addresses.
Fix this by using generic infrastructure to synchonize on the
completion of the cross call.
This first required getting the flush_tlb_pending() call out from
switch_to() which operates with locks held and interrupts disabled.
The problem is that smp_call_function_many() cannot be invoked with
IRQs disabled and this is explicitly checked for with WARN_ON_ONCE().
We get the batch processing outside of locked IRQ disabled sections by
using some ideas from the powerpc port. Namely, we only batch inside
of arch_{enter,leave}_lazy_mmu_mode() calls. If we're not in such a
region, we flush TLBs synchronously.
1) Get rid of xcall_flush_tlb_pending and per-cpu type
implementations.
2) Do TLB batch cross calls instead via:
smp_call_function_many()
tlb_pending_func()
__flush_tlb_pending()
3) Batch only in lazy mmu sequences:
a) Add 'active' member to struct tlb_batch
b) Define __HAVE_ARCH_ENTER_LAZY_MMU_MODE
c) Set 'active' in arch_enter_lazy_mmu_mode()
d) Run batch and clear 'active' in arch_leave_lazy_mmu_mode()
e) Check 'active' in tlb_batch_add_one() and do a synchronous
flush if it's clear.
4) Add infrastructure for synchronous TLB page flushes.
a) Implement __flush_tlb_page and per-cpu variants, patch
as needed.
b) Likewise for xcall_flush_tlb_page.
c) Implement smp_flush_tlb_page() to invoke the cross-call.
d) Wire up global_flush_tlb_page() to the right routine based
upon CONFIG_SMP
5) It turns out that singleton batches are very common, 2 out of every
3 batch flushes have only a single entry in them.
The batch flush waiting is very expensive, both because of the poll
on sibling cpu completeion, as well as because passing the tlb batch
pointer to the sibling cpus invokes a shared memory dereference.
Therefore, in flush_tlb_pending(), if there is only one entry in
the batch perform a completely asynchronous global_flush_tlb_page()
instead.
Reported-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Dave Kleikamp <dave.kleikamp@oracle.com>
cyc_to_sched_clock() is called by sched_clock() and cyc_to_ns()
is called by cyc_to_sched_clock(). I suspect that some compilers
inline both of these functions into sched_clock() and so we've
been getting away without having a notrace marking. It seems that
my compiler isn't inlining cyc_to_sched_clock() though, so I'm
hitting a recursion bug when I enable the function graph tracer,
causing my system to crash. Marking these functions notrace fixes
it. Technically cyc_to_ns() doesn't need the notrace because it's
already marked inline, but let's just add it so that if we ever
remove inline from that function it doesn't blow up.
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Only one remaining fix for arm-soc platforms at this time, a small
bugfix for cpu hotplug on highbank platforms that has become much
easier to hit as of late. Details in the patch description, but it's
small and well-contained and definitely impacts users of the platform,
so 3.9 seems appropriate.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJRcYGyAAoJEIwa5zzehBx3sYYP/iiongz+96eVHVxBMGMWHg0f
BcTtHw2ef0h4vCDMNUXmPEmVu74hN5VMCutUd2qeKvckyC7gEIrbVnLFwej05RiA
AFwIJ0A2pAjdvsAkRfTNXPpFs7bvZ+8r6cUJd4JGCP7IC8Zi4sj4dET1ZNEsaccD
vMQ+Y8O3dvVu1lEFfGT87huppQgCr/jzc6O9oc3eHDZ242v8tVLS31PpuZe8Qt25
8QvKKsY/UG82/aiz+ijlcddDJz132byNzrOvmY0DkcZ5ZMxbTnGUNFUqFszFkBYu
FrtAyJyQEyUYwY7r6geogfgtj17mQzbq1/4azcApmDQHadBhVbXdDuTW0EPO63QC
sQzf9JgWR66H5hOYeDp2ka1RbBh00k6byvh7T5adzgoJDtbHtJJ8OxW16OR/eoCQ
umCZ2rQxAQCpw11qRnA8LDwnmujr5qXFMuj5NqepUaDCLbWAq2VWeNTicu0LMgCN
RJZ7ifk94o+uCQETd0D2ZrZZtqvrbLtaLAgfa54PiMYecV0rRF44FUVDCIbUBRKq
5ouN76JfvbOQutLj41nA/QBr0ATCAw4mPoxaeaIwie1G7lXvZvMRRkjKLxDhe/qs
+3gZivBhQQuKEe3CYooLx3xoC/bs1VoMsyiRXa8YgdKsZbY0bIsQAUZp8dYjAgoW
hYf3zRgRd69+k0uqWA1f
=j3Hq
-----END PGP SIGNATURE-----
Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"Only one remaining fix for arm-soc platforms at this time, a small
bugfix for cpu hotplug on highbank platforms that has become much
easier to hit as of late.
Details in the patch description, but it's small and well-contained
and definitely impacts users of the platform, so 3.9 seems
appropriate."
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: highbank: fix cache flush ordering for cpu hotplug
when compiling with allmodconfig, CONFIG_64BIT=y the file
drivers/base/regmap/regmap-mmio.c will use readq and writeq so we need
implement these functions.
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Add code to handle DRAM ECC errors decoding for Fam16h.
Tested on Fam16h with ECC turned on using the mce_amd_inj facility and
works fine.
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
[ Boris: cleanups and clarifications ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Factor out the SP810 clocking code into a separate driver,
selecting better (faster) parent at clk_prepare() time.
This is to avoid problems with clocking infrastructure
initialisation order, in particular to avoid dependency
of fixed clock being initialized before SP810. It also
makes vexpress platform OF-based clock initialisation code
unnecessary.
Signed-off-by: Pawel Moll <pawel.moll@arm.com>
Tested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mike Turquette <mturquette@linaro.org>
[mturquette@linaro.org: add .unprepare, FIXME comment, cleaned up code]
The L1 data cache flush needs to be after highbank_set_cpu_jump call which
pollutes the cache with the l2x0_lock. This causes other cores to deadlock
waiting for the l2x0_lock. Moving the flush of the entire data cache after
highbank_set_cpu_jump fixes the problem. Use flush_cache_louis instead of
flush_cache_all are that is sufficient to flush only the L1 data cache.
flush_cache_louis did not exist when highbank_cpu_die was originally
written.
With PL310 errata 769419 enabled, a wmb is inserted into idle which takes
the l2x0_lock. This makes the problem much more easily hit and causes
reset to hang.
Reported-by: Paolo Pisati <p.pisati@gmail.com>
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
Install the Hyper-V specific interrupt handler only when needed. This would
permit us to get rid of the Xen check. Note that when the vmbus drivers invokes
the call to register its handler, we are sure to be running on Hyper-V.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Link: http://lkml.kernel.org/r/1366299886-6399-1-git-send-email-kys@microsoft.com
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
fixed the following compile error when use avr32 atstk1006_defconfig:
drivers/mtd/nand/atmel_nand.c: In function 'pmecc_err_location':
drivers/mtd/nand/atmel_nand.c:639: error: implicit declaration of function 'writel_relaxed'
which was introduced by commit 1c7b874d33 ("mtd: at91: atmel_nand: add
Programmable Multibit ECC controller support"). The PMECC for nand
flash code uses writel_relaxed(). But in avr32, there is no macro
"writel_relaxed" defined.
This patch add writex_relaxed macro definitions.
Signed-off-by: Josh Wu <josh.wu@atmel.com>
Acked-by: Havard Skinnemoen <havard@skinnemoen.net>
Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: David Woodhouse <David.Woodhouse@intel.com>
Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In the very unlikely event where a guest would be foolish enough to
*read* from a write-only cache maintainance register, we end up
with preemption disabled, due to a misplaced get_cpu().
Just move the "is_write" test outside of the critical section.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Per hpa, use crashkernel=X,high crashkernel=Y,low instead of
crashkernel_hign=X crashkernel_low=Y. As that could be extensible.
-v2: according to Vivek, change delimiter to ;
-v3: let hign and low only handle simple form and it conforms to
description in kernel-parameters.txt
still keep crashkernel=X override any crashkernel=X,high
crashkernel=Y,low
-v4: update get_last_crashkernel returning and add more strict
checking in parse_crashkernel_simple() found by HATAYAMA.
-v5: Change delimiter back to , according to HPA.
also separate parse_suffix from parse_simper according to vivek.
so we can avoid @pos in that path.
-v6: Tight the checking about crashkernel=X,highblahblah,high
found by HTYAYAMA.
Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/1366089828-19692-5-git-send-email-yinghai@kernel.org
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Vivek found old kexec-tools does not work new kernel anymore.
So change back crashkernel= back to old behavoir, and add crashkernel_high=
to let user decide if buffer could be above 4G, and also new kexec-tools will
be needed.
-v2: let crashkernel=X override crashkernel_high=
update description about _high will be ignored by crashkernel=X
-v3: update description about kernel-parameters.txt according to Vivek.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/1366089828-19692-4-git-send-email-yinghai@kernel.org
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Chao said that kdump does does work well on his system on 3.8
without extra parameter, even iommu does not work with kdump.
And now have to append crashkernel_low=Y in first kernel to make
kdump work.
We have now modified crashkernel=X to allocate memory beyong 4G (if
available) and do not allocate low range for crashkernel if the user
does not specify that with crashkernel_low=Y. This causes regression
if iommu is not enabled. Without iommu, swiotlb needs to be setup in
first 4G and there is no low memory available to second kernel.
Set crashkernel_low automatically if the user does not specify that.
For system that does support IOMMU with kdump properly, user could
specify crashkernel_low=0 to save that 72M low ram.
-v3: add swiotlb_size() according to Konrad.
-v4: add comments what 8M is for according to hpa.
also update more crashkernel_low= in kernel-parameters.txt
-v5: update changelog according to Vivek.
-v6: Change description about swiotlb referring according to HATAYAMA.
Reported-by: WANG Chao <chaowang@redhat.com>
Tested-by: WANG Chao <chaowang@redhat.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/1366089828-19692-2-git-send-email-yinghai@kernel.org
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Events may be created with attr->disabled == 1 and attr->enable_on_exec
== 1, which confuses the group validation code because events with the
PERF_EVENT_STATE_OFF are not considered candidates for scheduling, which
may lead to failure at group scheduling time.
This patch fixes the validation check for ARM, so that events in the
OFF state are still considered when enable_on_exec is true.
Cc: stable@vger.kernel.org
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Reported-by: Sudeep KarkadaNagesha <Sudeep.KarkadaNagesha@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
We must not declare dbg_cpu_pm_nb as __cpuinitdata as we need it after
system initialization for Suspend and CPUIdle.
This was done in commit 9a6eb310ea ("ARM: hw_breakpoint: Debug powerdown
support for self-hosted debug").
Cc: stable@vger.kernel.org
Cc: Dietmar Eggemann <Dietmar.Eggemann@arm.com>
Signed-off-by: Bastian Hecht <hechtb+renesas@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
On Feroceon the L2 cache becomes non-coherent with the CPU
when the L1 caches are disabled. Thus the L2 needs to be invalidated
after both L1 caches are disabled.
On kexec before the starting the code for relocation the kernel,
the L1 caches are disabled in cpu_froc_fin (cpu_v7_proc_fin for Feroceon),
but after L2 cache is never invalidated, because inv_all is not set
in cache-feroceon-l2.c.
So kernel relocation and decompression may has (and usually has) errors.
Setting the function enables L2 invalidation and fixes the issue.
Cc: <stable@vger.kernel.org>
Signed-off-by: Illia Ragozin <illia.ragozin@grapecom.com>
Acked-by: Jason Cooper <jason@lakedaemon.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
tcm_init() call iotable_init() and it use early_alloc variants which
do memblock allocation. Directly using memblock allocation after
initializing bootmem should not permitted, because bootmem can't know
where are additinally reserved.
So move tcm_init() to a safe place before initalizing bootmem.
(On the U300)
Tested-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Commit b4cbb197c7 ("vm: add vm_iomap_memory() helper function") added
a helper function wrapper around io_remap_pfn_range(), and every other
architecture defined it in <asm/pgtable.h>.
The s390 choice of <asm/io.h> may make sense, but is not very convenient
for this case, and gratuitous differences like that cause unexpected errors like this:
mm/memory.c: In function 'vm_iomap_memory':
mm/memory.c:2439:2: error: implicit declaration of function 'io_remap_pfn_range' [-Werror=implicit-function-declaration]
Glory be the kbuild test robot who noticed this, bisected it, and
reported it to the guilty parties (ie me).
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
For accurate accounting call contextidr_thread_switch before a
task is scheduled, rather than after, when the 'next' variable has a
different meaning since we switched the stacks.
Signed-off-by: Christopher Covington <cov@codeaurora.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The ESR_EL1 decoding process is a bit cryptic, and KVM has also
a need for the same constants.
Add a new esr.h file containing the appropriate exception classes
constants, and change entry.S to use it. Fix a small bug in the
EL1 breakpoint check while we're at it.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Using this parameter one can disable the storage_size/2 check if
he is really sure that the UEFI does sane gc and fulfills the spec.
This parameter is useful if a devices uses more than 50% of the
storage by default.
The Intel DQSW67 desktop board is such a sucker for exmaple.
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
pci_disable_device is called by a driver after it stops using the pci
function - e.g. during the removal of the driver. The current
implementation removes the architecture specific information of this
function such that even after a call to pci_enable_device the pci
function is no longer usable. Just remove pcibios_disable_device.
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Disable pci on s390. Enable with pci=on.
Suggested-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Access to pci config space via pci_ops should not fail silently.
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
If a pci load instruction fails the content of the register where the
data is stored is possibly unchanged. Fix the inline assembly wrapper
__pcilg to not return stale data. Additionally fix the callers of this
function who access uninitialized variables.
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Don't let pci_load and friends crash the kernel when called with
e.g. an invalid offset. Return -ENXIO instead.
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use distinct (and hopefully sane) names for the pci instruction
wrappers.
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Uninline pci related instruction wrappers to de-bloat the code:
add/remove: 15/0 grow/shrink: 2/24 up/down: 1326/-12628 (-11302)
This is especially useful for the inlined pci read and write functions
which are used all over the kernel. Also remove the unused __stpcifc
while at it.
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use pcibios_add_device to do arch specific device initialization.
This function will be called during pci_bus_add_device.
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Don't modify function handles to get a disabled handle - call
clp_disable_fh. With this change we also do no longer deconfigure
enabled functions.
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use the debugfs to keep track of a pci function's status changes.
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The hash used for mapping irq numbers to msi descriptors does not
utilize all buckets that were allocated. Fix this by using the same
value (computed by the number of bits used for the hash function) at
relevant places.
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This patch adds the last breaking event address as parameter
for 31 bit compat program signal handlers as it is already
done for 64 bit programs.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
force_console is used to wake up the CCW based console device to
print a panic message in case something goes wrong in a suspend
or resume cycle. Stop using the static console_subchannel and add
a parameter to this function to specify which ccw device we have
to wake up.
Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
wait_cons_dev is used to busy wait for an interrupt on the console
ccw device. Stop using the static console_subchannel and add a
parameter to this function to specify on which ccw device/subchannel
we have to do the polling.
While at it rename the function to ccw_device_wait_idle and
move it to device.c
Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Since commit 5f954c34 ([S390] hibernation: fix lowcore handling)
the absolute zero lowcore is lost during suspend/resume.
For example, this leads to the problem that the re-IPL device
for kdump is no longer set after resume.
With this patch during suspend a buffer is allocated in the new PM
notifier "suspend_pm_cb" and then the absolute zero lowcore is saved
to that buffer. The resume code then copies back this buffer to
absolute zero and afterwards the PM notifier releases the memory.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Remove unused __BITOPS_ALIGN, and replace __BITOPS_WORDSIZE with
BITS_PER_LONG.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use sske with multiple block control to initialize storage keys within
a 1 MB frame at once.
It turned out that the sske with mb=1 is an order of magnitude faster
than pfmf. This is only an issue for very large systems (several 100GB)
where storage key initialization could last more than a minute.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
dumpstack() did not always print a sane callchain when being called.
The reason is that show_trace() accessed register 15 directly to get
the current stack pointer and passed that pointer to __show_trace()
which expects a valid stack frame pointer as argument.
However due to tail call optimization the stack frame may not exist
anymore when __show_trace() gets called and therefore an invalid
stack frame pointer gets passed.
To prevent that disable tail call optimization for call chain walking
functions.
So move all the show_* functions to a dumpstack.c file like other
architectures have it already and add a -fno-optimize-sibling-calls
compile flag to both dumpstack.c and stacktrace.c to prevent tail
call optimization.
Fixes callchains that looked e.g. like this:
[ 12.868258] Call Trace:
[ 12.868262] ([<0000000000008000>] 0x8000)
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Used PTR_RET function instead of IS_ERR and PTR_ERR.
Patch found using coccinelle.
Signed-off-by: Alexandru Gheorghiu <gheorghiuandru@gmail.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Rewrote conditional statement and eliminated the out_kthread label.
Signed-off-by: Alexandru Gheorghiu <gheorghiuandru@gmail.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Pass buffer length in extra parameter.
Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Using kmem_cache_zalloc() instead of kmem_cache_alloc() and memset().
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
To avoid cache synonyms on System zEC12 32 independent zero pages are
required, one for each combination for bits 2**12 to 2**16 of the virtual
address. To avoid wasting too much memory on small virtual systems the
number of zero pages is limited to 4 if the memory size is less or equal
to 64MB.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Protection exception usually are suppressing and the fault handler
needs to rewind the PSW by the instruction length to get the correct
fault address. Except for protection exceptions while the CPU is in
the middle of a transaction. The CPU stores the transaction abort
PSW at the start of the transaction, if the transaction is aborted
the PSW is already correct and may not be modified by the fault
handler.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
In commit d166991234
idle: Implement generic idle function
Thomas Gleixner cleaned up many things but perturbed some
fragile code that was keeping ia64 alive. So we started
seeing:
WARNING: at kernel/cpu/idle.c:94 cpu_idle_loop+0x360/0x380()
and other unpleasantness like system hangs during boot.
We really shouldn't ever halt with interrupts disabled.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: magnus.damm@gmail.com
Link: http://lkml.kernel.org/r/516d9a0c26048eae9c@agluck-desk.sc.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull ARM fix from Russell King:
"A build fix for an incomplete change to the ARM cpu suspend code"
* branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
ARM: Do 15e0d9e37c (ARM: pm: let platforms select cpu_suspend support) properly
Pull kvm fixes from Marcelo Tosatti:
"PPC and ARM KVM fixes"
* git://git.kernel.org/pub/scm/virt/kvm/kvm:
ARM: KVM: fix L_PTE_S2_RDWR to actually be Read/Write
ARM: KVM: fix KVM_CAP_ARM_SET_DEVICE_ADDR reporting
kvm/ppc/e500: eliminate tlb_refs
kvm/ppc/e500: g2h_tlb1_map: clear old bit before setting new bit
kvm/ppc/e500: h2g_tlb1_rmap: esel 0 is valid
kvm/powerpc/e500mc: fix tlb invalidation on cpu migration
this merge window.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQIcBAABCAAGBQJRbQJxAAoJEIn5HApB1cB6wPAP/Rb/Nm9byySw12YvXz2tdPKp
DFJ3J+yMhbh0Tjx5+K+iVO4lCqNk6EYXtOwCuvF9dfpp7IWtx0QwhoTM9U5VeFDT
y6+wgKY3CaG9X2IQx+8mUybEiSugsWpMtzOMIOWZLM1iDUtCWtRXybbc1bYWQrOq
YOdJICB8EefffghHhABMdtb9g9F/qzati4q+4jd53zJGF5jC9OVDKRCx1TiVyWHF
VD0stZoQ8Tb693gC0rLKkSSmg4G2Nj9tuL28YIRW1Un+xJ3eGY2RJ4yz6Pt49tKL
czvtU5zHJfIDfVAjMTScentTSBzh+5QxwnPwPYE7zirKKF6+SMSktfF36qTMDkXr
nI4x8FMr1iUNOZhK7Y5AAE1sWVtpUWTrR292xGIZeK80lhJDFFuYGd2ToH6ZMwJ/
wDe8Lp2wCbmMExOoI3yAhNRcQia7AyqYBwTfnDCMlmG9OG6f+erCYt8RFBy+hAhG
r+/IGrpQEldx+URWBnpc1SqjPCegM9ldw6K+xx0bg0l/dxgd7jnLSFiaOGfa5MRv
tdKt+cNv7jf9AM85nJ0fnm6iJwdubbyBRHVkuJXDd5z5twuBqJV9XYxrebh3Td9G
tkk5fprBbn7/mD0tr9eBju9eN+/vDDBUE3ENPufIP7j1wOcg2+1UR5BDkEBgJah0
P98LmOINkt+19poWm2P4
=I7AW
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sfr/next-fixes
Pull powerpc fixes from Stephen Rothwell:
"Three regresions in the PowerPC code. One from v3.7 the others from
this merge window."
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sfr/next-fixes:
powerpc: add a missing label in resume_kernel
powerpc: Fix audit crash due to save/restore PPR changes
powerpc: fix compiling CONFIG_PPC_TRANSACTIONAL_MEM when CONFIG_ALTIVEC=n
Looks like our L_PTE_S2_RDWR definition is slightly wrong,
and is actually write only (see ARM ARM Table B3-9, Stage 2 control
of access permissions). Didn't make a difference for normal pages,
as we OR the flags together, but I'm still wondering how it worked
for Stage-2 mapped devices, such as the GIC.
Brown paper bag time, again.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
Commit 3401d54696 (KVM: ARM: Introduce KVM_ARM_SET_DEVICE_ADDR
ioctl) added support for the KVM_CAP_ARM_SET_DEVICE_ADDR capability,
but failed to add a break in the relevant case statement, returning
the number of CPUs instead.
Luckilly enough, the CONFIG_NR_CPUS=0 patch hasn't been merged yet
(https://lkml.org/lkml/diff/2012/3/31/131/1), so the bug wasn't
noticed.
Just give it a break!
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
Since the ELF structures and access macros change size based on 32 vs
64 bits, build a separate 32-bit relocs tool (for handling realmode
and 32-bit relocations), and a 64-bit relocs tool (for handling 64-bit
kernel relocations).
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: http://lkml.kernel.org/r/1365797627-20874-5-git-send-email-keescook@chromium.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
This adds the ability to process relocations from the 64-bit kernel ELF,
if built with ELF_BITS=64 defined. The special case for the percpu area is
handled, along with some other symbols specific to the 64-bit kernel.
Based on work by Neill Clift and Michael Davidson.
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: http://lkml.kernel.org/r/1365797627-20874-4-git-send-email-keescook@chromium.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Instead of counting and then processing relocations, do it in a single
pass. This splits the processing logic into separate functions for
realmode and 32-bit (and paves the way for 64-bit). Also extracts helper
functions when emitting relocations.
Based on work by Neill Clift and Michael Davidson.
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: http://lkml.kernel.org/r/1365797627-20874-3-git-send-email-keescook@chromium.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
In preparation for making the reloc tool operate on 64-bit relocations,
generalize the structure names for easy recompilation via #defines.
Based on work by Neill Clift and Michael Davidson.
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: http://lkml.kernel.org/r/1365797627-20874-2-git-send-email-keescook@chromium.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
There is no need to use the PV version of the IRQ_WORKER mechanism
as under PVHVM we are using the native version. The native
version is using the SMP API.
They just sit around unused:
69: 0 0 xen-percpu-ipi irqwork0
83: 0 0 xen-percpu-ipi irqwork1
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
See git commit f10cd522c5
(xen: disable PV spinlocks on HVM) for details.
But we did not disable it everywhere - which means that when
we boot as PVHVM we end up allocating per-CPU irq line for
spinlock. This fixes that.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
The default (uninitialized) value of the IRQ line is -1.
Check if we already have allocated an spinlock interrupt line
and if somebody is trying to do it again. Also set it to -1
when we offline the CPU.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
If the timer interrupt has been de-init or is just now being
initialized, the default value of -1 should be preset as
interrupt line. Check for that and if something is odd
WARN us.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
When we online the CPU, we get this splat:
smpboot: Booting Node 0 Processor 1 APIC 0x2
installing Xen timer for CPU 1
BUG: sleeping function called from invalid context at /home/konrad/ssd/konrad/linux/mm/slab.c:3179
in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/1
Pid: 0, comm: swapper/1 Not tainted 3.9.0-rc6upstream-00001-g3884fad #1
Call Trace:
[<ffffffff810c1fea>] __might_sleep+0xda/0x100
[<ffffffff81194617>] __kmalloc_track_caller+0x1e7/0x2c0
[<ffffffff81303758>] ? kasprintf+0x38/0x40
[<ffffffff813036eb>] kvasprintf+0x5b/0x90
[<ffffffff81303758>] kasprintf+0x38/0x40
[<ffffffff81044510>] xen_setup_timer+0x30/0xb0
[<ffffffff810445af>] xen_hvm_setup_cpu_clockevents+0x1f/0x30
[<ffffffff81666d0a>] start_secondary+0x19c/0x1a8
The solution to that is use kasprintf in the CPU hotplug path
that 'online's the CPU. That is, do it in in xen_hvm_cpu_notify,
and remove the call to in xen_hvm_setup_cpu_clockevents.
Unfortunatly the later is not a good idea as the bootup path
does not use xen_hvm_cpu_notify so we would end up never allocating
timer%d interrupt lines when booting. As such add the check for
atomic() to continue.
CC: stable@vger.kernel.org
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
GCC can replace a strncat() call with constant second argument into a
strlen + store, which results in a link error:
ERROR: "strlen" [net/ipv4/ip_tunnel.ko] undefined!
The inline function is a simple for loop in C. Other architectures
either use an asm optimized variant, or use the generic function from
lib/string.c.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Add platform devices used by the isp116x-hcd driver for EtherNAT and
NetUSBee. Note that the NetUSBee also contains a RTL8019 Ethernet chip,
so its platform device is used to cover the EtherNEC case, too.
Register definitions thanks to David Galvez <dgalvez75@gmail.com>
[Geert] Conditionalize isp1160_delay() definition
Signed-off-by: Michael Schmitz <schmitz@debian.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Add a ndelay macro modeled after the Coldfire udelay(). The ISP1160
driver needs a 150ns delay, so we need to have ndelay().
Signed-off-by: Michael Schmitz <schmitz@debian.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Add a dedicated interrupt chip definition for the EtherNAT CPLD interrupts.
SMC91C111 and ISP1160 chips have separate interrupts that can be enabled
and disabled in a CPLD register at offset 0x23 from the card base.
Note the CPLD interrupt control register is mapped on demand, whenever any
interrupt enable/disable action is requested. The EtherNAT USB driver still
needs interrupts disabled around reset and start actions.
In particular, we cannot entirely rely on the irq_startup being called
first.
The smc91x and isp116x-hcd drivers will use this feature.
Signed-off-by: Michael Schmitz <schmitz@debian.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Add platform device for the Atari ROM port ethernet adapter, EtherNEC.
This platform device will be used by the ne.c driver.
[Geert] Conditionalize platform device data structures
Signed-off-by: Michael Schmitz <schmitz@debian.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Add platform device and interrupt definitions necessary for the EtherNAT
Ethernet/USB adapter for the Falcon extension port. EtherNAT interrupt
numbers are 139/140 so the max. interrupt number for Atari has to be
increased.
[Geert] Conditionalize platform device data structures
Signed-off-by: Michael Schmitz <schmitz@debian.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
While we don't use the spinlock interrupt line (see for details
commit f10cd522c5 -
xen: disable PV spinlocks on HVM) - we should still do the proper
init / deinit sequence. We did not do that correctly and for the
CPU init for PVHVM guest we would allocate an interrupt line - but
failed to deallocate the old interrupt line.
This resulted in leakage of an irq_desc but more importantly this splat
as we online an offlined CPU:
genirq: Flags mismatch irq 71. 0002cc20 (spinlock1) vs. 0002cc20 (spinlock1)
Pid: 2542, comm: init.late Not tainted 3.9.0-rc6upstream #1
Call Trace:
[<ffffffff811156de>] __setup_irq+0x23e/0x4a0
[<ffffffff81194191>] ? kmem_cache_alloc_trace+0x221/0x250
[<ffffffff811161bb>] request_threaded_irq+0xfb/0x160
[<ffffffff8104c6f0>] ? xen_spin_trylock+0x20/0x20
[<ffffffff813a8423>] bind_ipi_to_irqhandler+0xa3/0x160
[<ffffffff81303758>] ? kasprintf+0x38/0x40
[<ffffffff8104c6f0>] ? xen_spin_trylock+0x20/0x20
[<ffffffff810cad35>] ? update_max_interval+0x15/0x40
[<ffffffff816605db>] xen_init_lock_cpu+0x3c/0x78
[<ffffffff81660029>] xen_hvm_cpu_notify+0x29/0x33
[<ffffffff81676bdd>] notifier_call_chain+0x4d/0x70
[<ffffffff810bb2a9>] __raw_notifier_call_chain+0x9/0x10
[<ffffffff8109402b>] __cpu_notify+0x1b/0x30
[<ffffffff8166834a>] _cpu_up+0xa0/0x14b
[<ffffffff816684ce>] cpu_up+0xd9/0xec
[<ffffffff8165f754>] store_online+0x94/0xd0
[<ffffffff8141d15b>] dev_attr_store+0x1b/0x20
[<ffffffff81218f44>] sysfs_write_file+0xf4/0x170
[<ffffffff811a2864>] vfs_write+0xb4/0x130
[<ffffffff811a302a>] sys_write+0x5a/0xa0
[<ffffffff8167ada9>] system_call_fastpath+0x16/0x1b
cpu 1 spinlock event irq -16
smpboot: Booting Node 0 Processor 1 APIC 0x2
And if one looks at the /proc/interrupts right after
offlining (CPU1):
70: 0 0 xen-percpu-ipi spinlock0
71: 0 0 xen-percpu-ipi spinlock1
77: 0 0 xen-percpu-ipi spinlock2
There is the oddity of the 'spinlock1' still being present.
CC: stable@vger.kernel.org
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
In the PVHVM path when we do CPU online/offline path we would
leak the timer%d IRQ line everytime we do a offline event. The
online path (xen_hvm_setup_cpu_clockevents via
x86_cpuinit.setup_percpu_clockev) would allocate a new interrupt
line for the timer%d.
But we would still use the old interrupt line leading to:
kernel BUG at /home/konrad/ssd/konrad/linux/kernel/hrtimer.c:1261!
invalid opcode: 0000 [#1] SMP
RIP: 0010:[<ffffffff810b9e21>] [<ffffffff810b9e21>] hrtimer_interrupt+0x261/0x270
.. snip..
<IRQ>
[<ffffffff810445ef>] xen_timer_interrupt+0x2f/0x1b0
[<ffffffff81104825>] ? stop_machine_cpu_stop+0xb5/0xf0
[<ffffffff8111434c>] handle_irq_event_percpu+0x7c/0x240
[<ffffffff811175b9>] handle_percpu_irq+0x49/0x70
[<ffffffff813a74a3>] __xen_evtchn_do_upcall+0x1c3/0x2f0
[<ffffffff813a760a>] xen_evtchn_do_upcall+0x2a/0x40
[<ffffffff8167c26d>] xen_hvm_callback_vector+0x6d/0x80
<EOI>
[<ffffffff81666d01>] ? start_secondary+0x193/0x1a8
[<ffffffff81666cfd>] ? start_secondary+0x18f/0x1a8
There is also the oddity (timer1) in the /proc/interrupts after
offlining CPU1:
64: 1121 0 xen-percpu-virq timer0
78: 0 0 xen-percpu-virq timer1
84: 0 2483 xen-percpu-virq timer2
This patch fixes it.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: stable@vger.kernel.org
Add a special irq_chip for the Atari MFP timer D interrupt,
which is used as a polling timer for EtherNEC and NetUSBee
Signed-off-by: Michael Schmitz <schmitz@debian.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Atari ROM port ISA adapter support for EtherNEC and NetUSBee adapters
16 bit access for ROM port adapters follows debugging and
clarification by David Galvez <dgalvez75@gmail.com>. The NetUSBee
ISP1160 USB chip uses these macros.
Signed-off-by: Michael Schmitz <schmitz@debian.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
If CONFIG_RMW_INSNS=y:
drivers/block/blockconsole.c: In function ‘bcon_advance_console_bytes’:
drivers/block/blockconsole.c:164: error: implicit declaration of function ‘cmpxchg64’
Map cmpxchg64 to cmpxchg64_local, which is already mapped to
__cmpxchg64_local_generic, just like for the CONFIG_RMW_INSNS=n case.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
For quite a few Xen versions, this wasn't the IRQ vector anymore
anyway, and it is not being used by the kernel for anything. Hence
drop the field from struct irq_info, and respective function
parameters.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
During early setup of a dom0 kernel, populate boot_params with the
Enhanced Disk Drive (EDD) and MBR signature data. This makes
information on the BIOS boot device available in /sys/firmware/edd/.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* pci/jiang-subdrivers:
PCI/ACPI: Remove support of ACPI PCI subdrivers
PCI: acpiphp: Protect acpiphp data structures from concurrent updates
PCI: acpiphp: Use normal list to simplify implementation
PCI: acpiphp: Do not use ACPI PCI subdriver mechanism
PCI: acpiphp: Convert acpiphp to be builtin only, not modular
PCI/ACPI: Handle PCI slot devices when creating/destroying PCI buses
x86/PCI: Implement pcibios_{add|remove}_bus() hooks
ia64/PCI: Implement pcibios_{add|remove}_bus() hooks
PCI/ACPI: Prepare stub functions to handle ACPI PCI (hotplug) slots
PCI: Add pcibios hooks for adding and removing PCI buses
PCI: acpiphp: Replace local macros with standard ACPI macros
PCI: acpiphp: Remove all functions even if function 0 doesn't exist
PCI: acpiphp: Use list_for_each_entry_safe() in acpiphp_sanitize_bus()
PCI: Clean up usages of pci_bus->is_added
PCI: When removing bus, always remove legacy files & unregister
Fixes build with CONFIG_EFI_VARS=m which was broken after the commit
"x86, efivars: firmware bug workarounds should be in platform code".
Signed-off-by: Sergey Vlasov <vsu@altlinux.ru>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
The commit "efi: Distinguish between "remaining space" and actually used
space" added usage of ucs2_*() functions to arch/x86/platform/efi/efi.c,
but the only thing which selected UCS2_STRING was EFI_VARS, which is
technically optional and can be built as a module.
Signed-off-by: Sergey Vlasov <vsu@altlinux.ru>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
The valid mask for both offcore_response_0 and
offcore_response_1 was wrong for SNB/SNB-EP,
IVB/IVB-EP. It was possible to write to
reserved bit and cause a GP fault crashing
the kernel.
This patch fixes the problem by correctly marking the
reserved bits in the valid mask for all the processors
mentioned above.
A distinction between desktop and server parts is introduced
because bits 24-30 are only available on the server parts.
This version of the patch is just a rebase to perf/urgent tree
and should apply to older kernels as well.
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: jolsa@redhat.com
Cc: gregkh@linuxfoundation.org
Cc: security@kernel.org
Cc: ak@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The idea with those routines is to slowly phase them out and not call
them on anything else besides K8. They even have a check for that which,
when called too early, fails. Let me explain:
It gets the cpuinfo_x86 pointer from the per_cpu array and when this
happens for cpu0, before its boot_cpu_data has been copied back to the
per_cpu array in smp_store_boot_cpu_info(), we get an empty struct and
thus the check fails.
Use boot_cpu_data directly instead.
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1365436666-9837-4-git-send-email-bp@alien8.de
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Pull uprobes updates from Oleg Nesterov:
- "uretprobes" - an optimization to uprobes, like kretprobes are an optimization
to kprobes. "perf probe -x file sym%return" now works like kretprobes.
- PowerPC fixes plus a couple of cleanups/optimizations in uprobes and trace_uprobes.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The memblock_find_in_range() return value addr is guaranteed
to be within "addr + aper_size" and not beyond GART_MAX_ADDR.
Signed-off-by: Wang YanQing <udknight@gmail.com>
Cc: yinghai@kernel.org
Link: http://lkml.kernel.org/r/20130416013734.GA14641@udknight
Signed-off-by: Ingo Molnar <mingo@kernel.org>
EFI implementations distinguish between space that is actively used by a
variable and space that merely hasn't been garbage collected yet. Space
that hasn't yet been garbage collected isn't available for use and so isn't
counted in the remaining_space field returned by QueryVariableInfo().
Combined with commit 68d9298 this can cause problems. Some implementations
don't garbage collect until the remaining space is smaller than the maximum
variable size, and as a result check_var_size() will always fail once more
than 50% of the variable store has been used even if most of that space is
marked as available for garbage collection. The user is unable to create
new variables, and deleting variables doesn't increase the remaining space.
The problem that 68d9298 was attempting to avoid was one where certain
platforms fail if the actively used space is greater than 50% of the
available storage space. We should be able to calculate that by simply
summing the size of each available variable and subtracting that from
the total storage space. With luck this will fix the problem described in
https://bugzilla.kernel.org/show_bug.cgi?id=55471 without permitting
damage to occur to the machines 68d9298 was attempting to fix.
Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
EFI variables can be flagged as being accessible only within boot services.
This makes it awkward for us to figure out how much space they use at
runtime. In theory we could figure this out by simply comparing the results
from QueryVariableInfo() to the space used by all of our variables, but
that fails if the platform doesn't garbage collect on every boot. Thankfully,
calling QueryVariableInfo() while still inside boot services gives a more
reliable answer. This patch passes that information from the EFI boot stub
up to the efi platform code.
Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
For s390 the page table mapping for the crashkernel memory is removed to
protect the pre-loaded kdump kernel and ramdisk. Because the crashkernel
memory is not included in the page tables for suspend/resume it is not
included in the suspend image. Therefore after resume the resumed system
does no longer contain the pre-loaded kdump kernel and when kdump is
triggered it fails.
This patch adds a PM notifier that creates the page tables before suspend
is done and removes them for resume. This ensures that the kdump kernel
is included in the suspend image.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
As suggested by Peter Anvin.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: H . Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Apparently 'byts' should be 'bytes'.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: H . Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
A label 0 was missed in the patch a9c4e541 (powerpc/kprobe: Complete
kprobe and migrate exception frame). This will cause the kernel
branch to an undetermined address if there really has a conflict when
updating the thread flags.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Cc: stable@vger.kernel.org
Acked-By: Tiejun Chen <tiejun.chen@windriver.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
The current mainline crashes when hitting userspace with the following:
kernel BUG at kernel/auditsc.c:1769!
cpu 0x1: Vector: 700 (Program Check) at [c000000023883a60]
pc: c0000000001047a8: .__audit_syscall_entry+0x38/0x130
lr: c00000000000ed64: .do_syscall_trace_enter+0xc4/0x270
sp: c000000023883ce0
msr: 8000000000029032
current = 0xc000000023800000
paca = 0xc00000000f080380 softe: 0 irq_happened: 0x01
pid = 1629, comm = start_udev
kernel BUG at kernel/auditsc.c:1769!
enter ? for help
[c000000023883d80] c00000000000ed64 .do_syscall_trace_enter+0xc4/0x270
[c000000023883e30] c000000000009b08 syscall_dotrace+0xc/0x38
--- Exception: c00 (System Call) at 0000008010ec50dc
Bisecting found the following patch caused it:
commit 44e9309f1f
Author: Haren Myneni <haren@linux.vnet.ibm.com>
powerpc: Implement PPR save/restore
It was found this patch corrupted r9 when calling
SET_DEFAULT_THREAD_PPR()
Using r10 as a scratch register instead of r9 solved the problem.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
This patch replaces V4L2_STD_525_60/625_50 with V4L2_STD_NTSC/PAL
respectively as this are the proper video standards.
Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com>
Reported-by: Hans Verkuil <hans.verkuil@cisco.com>
Acked-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
add support for V4L2 video display to DM355 EVM.
Support for SD modes is provided, along with Composite
output
Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com>
Acked-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Create platform devices for various video modules like venc,osd,
vpbe and v4l2 driver for dm355.
Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com>
Acked-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
add support for V4L2 video display to DM365 EVM.
Support for SD and ED modes is provided, along with Composite
and Component outputs.
Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com>
Acked-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Create platform devices for various video modules like venc,osd,
vpbe and v4l2 driver for dm365.
Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com>
Acked-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
By default the VPSS clocks were enabled in capture driver
for davinci family which creates duplicates for dm355/dm365/dm644x.
This patch adds support to enable the VPSS clocks in VPSS driver,
which avoids duplication of code and also adding clock aliases.
This patch uses PM runtime API to enable/disable clock, instead
of DaVinci clock framework. con_ids for master and slave clocks of
vpss is added in pm_domain.
This patch cleanups the VPSS clock enabling in the capture driver,
and also removes the clock alias in machine file. Along side adds
a vpss slave clock for DM365 as mentioned by Sekhar
(https://patchwork.kernel.org/patch/1221261/).
The Suspend/Resume in dm644x_ccdc.c which enabled/disabled the VPSS clock
is now implemented as part of the VPSS driver.
Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com>
Acked-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Pull x86 fixes from Ingo Molnar:
"Misc fixes"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Flush lazy MMU when DEBUG_PAGEALLOC is set
x86/mm/cpa/selftest: Fix false positive in CPA self test
x86/mm/cpa: Convert noop to functional fix
x86, mm: Patch out arch_flush_lazy_mmu_mode() when running on bare metal
x86, mm, paravirt: Fix vmalloc_fault oops during lazy MMU updates
Pull m68knommu fix from Greg Ungerer:
"This contains only a single compilation fix for ColdFire m68k targets
that use local non-GPIOLIB support."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
m68k: define a local gpio_request_one() function
Hijack the return address and replace it with a trampoline address.
PowerPC implementation.
Signed-off-by: Anton Arapov <anton@redhat.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Hijack the return address and replace it with a trampoline address.
Signed-off-by: Anton Arapov <anton@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
This patch attempts to fix:
https://bugzilla.kernel.org/show_bug.cgi?id=56461
The symptom is a crash and messages like this:
chrome: Corrupted page table at address 34a03000
*pdpt = 0000000000000000 *pde = 0000000000000000
Bad pagetable: 000f [#1] PREEMPT SMP
Ingo guesses this got introduced by commit 611ae8e3f5 ("x86/tlb:
enable tlb flush range support for x86") since that code started to free
unused pagetables.
On x86-32 PAE kernels, that new code has the potential to free an entire
PMD page and will clear one of the four page-directory-pointer-table
(aka pgd_t entries).
The hardware aggressively "caches" these top-level entries and invlpg
does not actually affect the CPU's copy. If we clear one we *HAVE* to
do a full TLB flush, otherwise we might continue using a freed pmd page.
(note, we do this properly on the population side in pud_populate()).
This patch tracks whenever we clear one of these entries in the 'struct
mmu_gather', and ensures that we follow up with a full tlb flush.
BTW, I disassembled and checked that:
if (tlb->fullmm == 0)
and
if (!tlb->fullmm && !tlb->need_flush_all)
generate essentially the same code, so there should be zero impact there
to the !PAE case.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Artem S Tashkinov <t.artem@mailcity.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The spi-s3c64xx uses a Samsung proprietary interface for
talking to the DMA engine, which does not work with
multiplatform kernels.
This version of the patch leaves the old code in place,
behind an #ifdef. This can be removed in the future,
after the s3c64xx platform start supporting the regular
dmaengine interface. An earlier version of this patch was
tested successfully on exynos5250 by Padma Venkat.
The conversion was rather mechanical, since the samsung
interface is just a shallow wrapper around the dmaengine
interface.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The last users of FIX_CYCLONE_TIMER were removed in v2.6.18. We
can remove this unneeded constant.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Link: http://lkml.kernel.org/r/1365698982.1427.3.camel@x61.thuisdomein
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When CONFIG_DEBUG_PAGEALLOC is set page table updates made by
kernel_map_pages() are not made visible (via TLB flush)
immediately if lazy MMU is on. In environments that support lazy
MMU (e.g. Xen) this may lead to fatal page faults, for example,
when zap_pte_range() needs to allocate pages in
__tlb_remove_page() -> tlb_next_batch().
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: konrad.wilk@oracle.com
Link: http://lkml.kernel.org/r/1365703192-2089-1-git-send-email-boris.ostrovsky@oracle.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
If the pmd is not present, _PAGE_PSE will not be set anymore.
Fix the false positive.
Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Stefan Bader <stefan.bader@canonical.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/1365687369-30802-1-git-send-email-aarcange@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Some important bug fixes that came in over the last 10 days,
mostly mvebu and imx:
- Multiple regressions on i.mx following the conversion of
the clock code, hopefully the last we are seeing of those.
- a regression in the mvebu irq handling code
- An incorrect register offset in the rewritten s3c24xx irq code.
- Two bugs in setting up the iomega_ix2_200 machine
- Turning on an extra bus clock on imx
- A MAINTAINERS file entry for Roland Stigge
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIVAwUAUWbTVmCrR//JCVInAQKnAA/6AnxY2FRgtR/XuDTBkaJ/VUObbYBbsnRW
uvO9DRYzACIy9f4bAOtg8SIB29Zt/tqJCnqj6nG4tbTn3dqQZXSaKktwQB8e9Q3j
+ttiq4an9CFfl4/9hxTfwadg1G9coUND1774Y4qv3csHYbyd5jBLLx7MEfevQ1ry
lmzDIQRnf4j4tp3q1d2kZ3sh9O09f+V2Hxnle0JySOsXt3NrfMOQdR6PYt3ZbsyO
mhuXLsSmHk/5J4rrZD6OuThlZzDEat22gjK/HbBTa/OyVqdYyHMsOWf4O2HR0MGs
FBqg2dcTap8s/lHQ0jId0Zvt31e4OgLs00ehD7V2ZYeQjG/d2PYrvDaGuFY6kIir
eNUhAozWTF5FmOe/LJ46waUC7HoYsHqrSkFWcvGWLjLXvQT6G5jie5/lpp9yfq4G
Ou2RTG+tJ+jiPnsk7E3+Z4/YQXyDhIxvo94xF+kSR3zWgJIKjFjgQ41pSql97tMa
5AQ7eJgWsmPfsVvIIxnRYAMV3/4jfJq2gM4BviA6OjZ8nHUWNXukLIHPtcADCYYQ
J0jSbFBX7qCd6UtDDo9t/YvAcLM34FUFsqGP4BjBLZe4XAkI6aVkz6jAE5amEUVm
1HgTfE5U1QI5u0TgXbaEYii6lbgcLM9Lsdl0wFsXjvXI9rdIfzJYByeV4Mx5J1dB
V4D42xojJWE=
=J404
-----END PGP SIGNATURE-----
Merge tag 'arm-soc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC bug fixes from Arnd Bergmann:
"A little later during the week than the last few pull requests, since
there was very little that came in before 3.9-rc6. At least things
have calmed down again here.
Some important bug fixes that came in over the last 10 days, mostly
mvebu and imx:
- Multiple regressions on i.mx following the conversion of the clock
code, hopefully the last we are seeing of those.
- a regression in the mvebu irq handling code
- An incorrect register offset in the rewritten s3c24xx irq code.
- Two bugs in setting up the iomega_ix2_200 machine
- Turning on an extra bus clock on imx
- A MAINTAINERS file entry for Roland Stigge"
* tag 'arm-soc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
arm: mvebu: Fix the irq map function in SMP mode
Fix GE0/GE1 init on ix2-200 as GE0 has no PHY
ARM: S3C24XX: Fix interrupt pending register offset of the EINT controller
ARM: S3C24XX: Correct NR_IRQS definition for s3c2440
ARM i.MX6: Fix ldb_di clock selection
ARM: imx: provide twd clock lookup from device tree
ARM: imx35 Bugfix admux clock
ARM: clk-imx35: Bugfix iomux clock
ARM: mxs: Slow down the I2C clock speed
MAINTAINERS: Add maintainer for LPC32xx
ARM: Kirkwood: Fix typo in the definition of ix2-200 rebuild LED
We check the TSS descriptor before we try to dereference it.
Also we document what the value '9' actually means using the
AMD64 Architecture Programmer's Manual Volume 2, pg 90:
"Hex value 9: Available 64-bit TSS" and pg 91:
"The available 32-bit TSS (09h), which is redefined as the
available 64-bit TSS."
Without this, on Xen, where the GDT is available as R/O (to
protect the hypervisor from the guest modifying it), we end up
with a pagetable fault.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Link: http://lkml.kernel.org/r/1365194544-14648-5-git-send-email-konrad.wilk@oracle.com
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
The two use-cases where we needed to store the GDT were during ACPI S3 suspend
and resume. As the patches:
x86/gdt/i386: store/load GDT for ACPI S3 or hibernation/resume path is not needed
x86/gdt/64-bit: store/load GDT for ACPI S3 or hibernate/resume path is not needed.
have demonstrated - there are other mechanism by which the GDT is
saved and reloaded during early resume path.
Hence we do not need to worry about the pvops call-chain for saving the
GDT and can and can eliminate it. The other areas where the store_gdt is
used are never going to be hit when running under the pvops platforms.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Link: http://lkml.kernel.org/r/1365194544-14648-4-git-send-email-konrad.wilk@oracle.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
During the ACPI S3 suspend, we store the GDT in the wakup_header (see
wakeup_asm.s) field called 'pmode_gdt'.
Which is then used during the resume path and has the same exact
value as what the store/load_gdt do with the saved_context
(which is saved/restored via save/restore_processor_state()).
The flow during resume from ACPI S3 is simpler than the 64-bit
counterpart. We only use the early bootstrap once (wakeup_gdt) and
do various checks in real mode.
After the checks are completed, we load the saved GDT ('pmode_gdt') and
continue on with the resume (by heading to startup_32 in trampoline_32.S) -
which quickly jumps to what was saved in 'pmode_entry'
aka 'wakeup_pmode_return'.
The 'wakeup_pmode_return' restores the GDT (saved_gdt) again (which was
saved in do_suspend_lowlevel initially). After that it ends up calling
the 'ret_point' which calls 'restore_processor_state()'.
We have two opportunities to remove code where we restore the same GDT
twice.
Here is the call chain:
wakeup_start
|- lgdtl wakeup_gdt [the work-around broken BIOSes]
|
| - lgdtl pmode_gdt [the real one]
|
\-- startup_32 (in trampoline_32.S)
\-- wakeup_pmode_return (in wakeup_32.S)
|- lgdtl saved_gdt [the real one]
\-- ret_point
|..
|- call restore_processor_state
The hibernate path is much simpler. During the saving of the hibernation
image we call save_processor_state() and save the contents of that
along with the rest of the kernel in the hibernation image destination.
We save the EIP of 'restore_registers' (restore_jump_address) and
cr3 (restore_cr3).
During hibernate resume, the 'restore_registers' (via the
'restore_jump_address) in hibernate_asm_32.S is invoked which
restores the contents of most registers. Naturally the resume path benefits
from already being in 32-bit mode, so it does not have to reload the GDT.
It only reloads the cr3 (from restore_cr3) and continues on. Note
that the restoration of the restore image page-tables is done prior to
this.
After the 'restore_registers' it returns and we end up called
restore_processor_state() - where we reload the GDT. The reload of
the GDT is not needed as bootup kernel has already loaded the GDT
which is at the same physical location as the the restored kernel.
Note that the hibernation path assumes the GDT is correct during its
'restore_registers'. The assumption in the code is that the restored
image is the same as saved - meaning we are not trying to restore
an different kernel in the virtual address space of a new kernel.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Link: http://lkml.kernel.org/r/1365194544-14648-3-git-send-email-konrad.wilk@oracle.com
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
During the ACPI S3 resume path the trampoline code handles it already.
During the ACPI S3 suspend phase (acpi_suspend_lowlevel) we set:
early_gdt_descr.address = (..)get_cpu_gdt_table(smp_processor_id());
which is then used during the resume path and has the same exact
value as what the store/load_gdt do with the saved_context
(which is saved/restored via save/restore_processor_state()).
The flow during resume is complex and for 64-bit kernels we use three GDTs
- one early bootstrap GDT (wakeup_igdt) that we load to workaround
broken BIOSes, an early Protected Mode to Long Mode transition one
(tr_gdt), and the final one - early_gdt_descr (which points to the real GDT).
The early ('wakeup_gdt') is loaded in 'trampoline_start' for working
around broken BIOSes, and then when we end up in Protected Mode in the
startup_32 (in trampoline_64.s, not head_32.s) we use the 'tr_gdt'
(still in trampoline_64.s). This 'tr_gdt' has a a 32-bit code segment,
64-bit code segment with L=1, and a 32-bit data segment.
Once we have transitioned from Protected Mode to Long Mode we then
set the GDT to 'early_gdt_desc' and then via an iretq emerge in
wakeup_long64 (set via 'initial_code' variable in acpi_suspend_lowlevel).
In the wakeup_long64 we end up restoring the %rip (which is set to
'resume_point') and jump there.
In 'resume_point' we call 'restore_processor_state' which does
the load_gdt on the saved context. This load_gdt is redundant as the
GDT loaded via early_gdt_desc is the same.
Here is the call-chain:
wakeup_start
|- lgdtl wakeup_gdt [the work-around broken BIOSes]
|
\-- trampoline_start (trampoline_64.S)
|- lgdtl tr_gdt
|
\-- startup_32 (trampoline_64.S)
|
\-- startup_64 (trampoline_64.S)
|
\-- secondary_startup_64
|- lgdtl early_gdt_desc
| ...
|- movq initial_code(%rip), %eax
|-.. lretq
\-- wakeup_64
|-- other registers are reloaded
|-- call restore_processor_state
The hibernate path is much simpler. During the saving of the hibernation
image we call save_processor_state() and save the contents of that along
with the rest of the kernel in the hibernation image destination.
We save the EIP of 'restore_registers' (restore_jump_address) and cr3
(restore_cr3).
During hibernate resume, the 'restore_registers' (via the
'restore_jump_address) in hibernate_asm_64.S is invoked which restores
the contents of most registers. Naturally the resume path benefits from
already being in 64-bit mode, so it does not have to load the GDT.
It only reloads the cr3 (from restore_cr3) and continues on. Note that
the restoration of the restore image page-tables is done prior to this.
After the 'restore_registers' it returns and we end up called
restore_processor_state() - where we reload the GDT. The reload of
the GDT is not needed as bootup kernel has already loaded the GDT which
is at the same physical location as the the restored kernel.
Note that the hibernation path assumes the GDT is correct during its
'restore_registers'. The assumption in the code is that the restored
image is the same as saved - meaning we are not trying to restore
an different kernel in the virtual address space of a new kernel.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Link: http://lkml.kernel.org/r/1365194544-14648-2-git-send-email-konrad.wilk@oracle.com
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
The definitions have moved to include/linux/usb/samsung-usb-phy.h,
and plat/usb-phy.h is unavailable from drivers in a multiplatform
configuration.
Also fix up the plat/usb-phy.h header file to use the definitions
from the new header instead of providing a separate copy.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Make a copy of the IDT (as seen via the "sidt" instruction) read-only.
This primarily removes the IDT from being a target for arbitrary memory
write attacks, and has the added benefit of also not leaking the kernel
base offset, if it has been relocated.
We already did this on vendor == Intel and family == 5 because of the
F0 0F bug -- regardless of if a particular CPU had the F0 0F bug or
not. Since the workaround was so cheap, there simply was no reason to
be very specific. This patch extends the readonly alias to all CPUs,
but does not activate the #PF to #UD conversion code needed to deliver
the proper exception in the F0 0F case except on Intel family 5
processors.
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: http://lkml.kernel.org/r/20130410192422.GA17344@www.outflux.net
Cc: Eric Northup <digitaleric@google.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
The registers for the Samsung S3C serial port are currently defined in
the platform specific arch/arm/plat-samsung/include/plat/regs-serial.h
file, which is not visible to multiplatform capable drivers.
Unfortunately, it is not possible to move the file into a more local
place as we should normally try to, because the same registers
may be used in one of four places:
* In the driver itself
* In platform-independent ARM code for early debug output
* In platform_data definitions
* In the Samsung platform power management code
I have also found no way to logically split out a platform_data
file, other than possibly move everything into
include/linux/platform_data, which also felt wrong. The only
part of this file that makes sense to keep specific to the s3c24xx
platform are the virtual and physical addresses defined here,
which are needed in no other location.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
- Kirkwood
- a couple of small fixes for the Iomega ix2-200 board (ether and led)
- mvebu
- allow GPIO button to work on Mirabox when running SMP
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQEcBAABAgAGBQJRZZxwAAoJEAi3KVZQDZAerwcIAKR1xRKuagtXvfij+xfVVRQ6
1G645hTIthFXmNiEeo3Y/mswHhVooZvu8SWh3o3J2Wsbnzeh+ch6jTsl+g6Tnb3E
KmRFU0mcalRmsyYsh4PH9nDi8/oWyP+i3ZytgfcMlsSPNiE+Ek35NdTTWMLCTT8V
MkGysVc8MWoOmMf47mNboy5UUUTXRdnSUJSjv1ubrsTK33LT7Ii9Ce+eoNvpvF+5
+6RenfRMzRSwkZf9AaCRPAhXISQKbMAwz6lKGo2GGAW+73Z+JclXXmiCfQ8pWie2
pfyqiEYigZFqe6Ly5BUtGoVfDjmOLDs+ATTUDOlOj0uaEc7pOwwKoAtpVRdck24=
=yK1f
-----END PGP SIGNATURE-----
Merge tag 'mvebu_fixes_for_v3.9_round3' of git://git.infradead.org/users/jcooper/linux into fixes
From Jason Cooper <jason@lakedaemon.net>:
mvebu fixes for v3.9 round 3
- Kirkwood
- a couple of small fixes for the Iomega ix2-200 board (ether and led)
- mvebu
- allow GPIO button to work on Mirabox when running SMP
* tag 'mvebu_fixes_for_v3.9_round3' of git://git.infradead.org/users/jcooper/linux:
arm: mvebu: Fix the irq map function in SMP mode
Fix GE0/GE1 init on ix2-200 as GE0 has no PHY
ARM: Kirkwood: Fix typo in the definition of ix2-200 rebuild LED
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Some EFI implementations return always a MaximumVariableSize of 0,
check against max_size only if it is non-zero.
My Intel DQ67SW desktop board has such an implementation.
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Commit 523f0e5421 ("KVM: PPC: E500:
Explicitly mark shadow maps invalid") began using E500_TLB_VALID
for guest TLB1 entries, and skipping invalidations if it's not set.
However, when E500_TLB_VALID was set for such entries, it was on a
fake local ref, and so the invalidations never happen. gtlb_privs
is documented as being only for guest TLB0, though we already violate
that with E500_TLB_BITMAP.
Now that we have MMU notifiers, and thus don't need to actually
retain a reference to the mapped pages, get rid of tlb_refs, and
use gtlb_privs for E500_TLB_VALID in TLB1.
Since we can have more than one host TLB entry for a given tlbe_ref,
be careful not to clear existing flags that are relevant to other
host TLB entries when preparing a new host TLB entry.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
It's possible that we're using the same host TLB1 slot to map (a
presumably different portion of) the same guest TLB1 entry. Clear
the bit in the map before setting it, so that if the esels are the same
the bit will remain set.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Add one to esel values in h2g_tlb1_rmap, so that "no mapping" can be
distinguished from "esel 0". Note that we're not saved by the fact
that host esel 0 is reserved for non-KVM use, because KVM host esel
numbering is not the raw host numbering (see to_htlb1_esel).
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
The s3c-fb driver requires header files from the samsung platforms
to find its platform_data definition, but this no longer works on
multiplatform kernels, so let's move the data into a new header
file under include/linux/platform_data.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: linux-fbdev@vger.kernel.org
Acked-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Commit:
a8aed3e075 ("x86/mm/pageattr: Prevent PSE and GLOABL leftovers to confuse pmd/pte_present and pmd_huge")
introduced a valid fix but one location that didn't trigger the bug that
lead to finding those (small) problems, wasn't updated using the
right variable.
The wrong variable was also initialized for no good reason, that
may have been the source of the confusion. Remove the noop
initialization accordingly.
Commit a8aed3e075 also erroneously removed one canon_pgprot pass meant
to clear pmd bitflags not supported in hardware by older CPUs, that
automatically gets corrected by this patch too by applying it to the right
variable in the new location.
Reported-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Mel Gorman <mgorman@suse.de>
Link: http://lkml.kernel.org/r/1365600505-19314-1-git-send-email-aarcange@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
- Early bootup issue found on DL380 machines
- Fix for the timer interrupt not being processed right away.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (GNU/Linux)
iQEcBAABAgAGBQJRZbq0AAoJEFjIrFwIi8fJTnsIAIWYw7g9j0T9gijc/t5wEZrK
KpPBITlGFAeM7liEaUh5X5M2B86tBoI77uV5EGCvDDwth+FD5WsgeMesxV9KlMdj
vbWLGubJpmd8zy6Q1f/T3LsxGHGCjz8jASeN7YTPRdBqITOQDqXjj2VC/4n7AQCh
Le3ml3A/NZTZMiz2PK8lDzjpzY2lDgrIloevahVoYLe8Jxg2aW5JTaZQg7oPA6ir
lqC1Sgju6RDKR0kmPmM8wl5TOIMCkrygriP62B+Ww9wl9HlS+X5/JlK3zVj0vJXo
oxNyDKEK96M54oO5t/v7qzdfX2Xj+S/6JPZOlegCWGClS9rQoK3uDBZupGLiAh8=
=jaNW
-----END PGP SIGNATURE-----
Merge tag 'stable/for-linus-3.9-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
Pull Xen fixes from Konrad Rzeszutek Wilk:
"Two bug-fixes:
- Early bootup issue found on DL380 machines
- Fix for the timer interrupt not being processed right awaym leading
to quite delayed time skew on certain workloads"
* tag 'stable/for-linus-3.9-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/mmu: On early bootup, flush the TLB when changing RO->RW bits Xen provided pagetables.
xen/events: Handle VIRQ_TIMER before any other hardirq in event loop.
The existing check handles the case where we've migrated to a different
core than we last ran on, but it doesn't handle the case where we're
still on the same cpu we last ran on, but some other vcpu has run on
this cpu in the meantime.
Without this, guest segfaults (and other misbehavior) have been seen in
smp guests.
Cc: stable@vger.kernel.org # 3.8.x
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Commit 92702df357 ("ARM: OMAP4: PM: fix PM regression introduced by recent
clock cleanup") makes the 'ocp2scp_usb_phy_phy_48m' as optional
functional clock causing regression in MUSB. But this 48MHz clock is a
mandatory clock for usb phy attached to ocp2scp and hence made as the main
clock for ocp2scp.
Cc: Keerthy <j-keerthy@ti.com>
Cc: Benoît Cousson <b-cousson@ti.com>
Cc: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
[paul@pwsan.com: add comment to the hwmod data to try to prevent any
future mistakes here]
Signed-off-by: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Invoking arch_flush_lazy_mmu_mode() results in calls to
preempt_enable()/disable() which may have performance impact.
Since lazy MMU is not used on bare metal we can patch away
arch_flush_lazy_mmu_mode() so that it is never called in such
environment.
[ hpa: the previous patch "Fix vmalloc_fault oops during lazy MMU
updates" may cause a minor performance regression on
bare metal. This patch resolves that performance regression. It is
somewhat unclear to me if this is a good -stable candidate. ]
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: http://lkml.kernel.org/r/1364045796-10720-2-git-send-email-konrad.wilk@oracle.com
Tested-by: Josh Boyer <jwboyer@redhat.com>
Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: <stable@vger.kernel.org> SEE NOTE ABOVE
In paravirtualized x86_64 kernels, vmalloc_fault may cause an oops
when lazy MMU updates are enabled, because set_pgd effects are being
deferred.
One instance of this problem is during process mm cleanup with memory
cgroups enabled. The chain of events is as follows:
- zap_pte_range enables lazy MMU updates
- zap_pte_range eventually calls mem_cgroup_charge_statistics,
which accesses the vmalloc'd mem_cgroup per-cpu stat area
- vmalloc_fault is triggered which tries to sync the corresponding
PGD entry with set_pgd, but the update is deferred
- vmalloc_fault oopses due to a mismatch in the PUD entries
The OOPs usually looks as so:
------------[ cut here ]------------
kernel BUG at arch/x86/mm/fault.c:396!
invalid opcode: 0000 [#1] SMP
.. snip ..
CPU 1
Pid: 10866, comm: httpd Not tainted 3.6.10-4.fc18.x86_64 #1
RIP: e030:[<ffffffff816271bf>] [<ffffffff816271bf>] vmalloc_fault+0x11f/0x208
.. snip ..
Call Trace:
[<ffffffff81627759>] do_page_fault+0x399/0x4b0
[<ffffffff81004f4c>] ? xen_mc_extend_args+0xec/0x110
[<ffffffff81624065>] page_fault+0x25/0x30
[<ffffffff81184d03>] ? mem_cgroup_charge_statistics.isra.13+0x13/0x50
[<ffffffff81186f78>] __mem_cgroup_uncharge_common+0xd8/0x350
[<ffffffff8118aac7>] mem_cgroup_uncharge_page+0x57/0x60
[<ffffffff8115fbc0>] page_remove_rmap+0xe0/0x150
[<ffffffff8115311a>] ? vm_normal_page+0x1a/0x80
[<ffffffff81153e61>] unmap_single_vma+0x531/0x870
[<ffffffff81154962>] unmap_vmas+0x52/0xa0
[<ffffffff81007442>] ? pte_mfn_to_pfn+0x72/0x100
[<ffffffff8115c8f8>] exit_mmap+0x98/0x170
[<ffffffff810050d9>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e
[<ffffffff81059ce3>] mmput+0x83/0xf0
[<ffffffff810624c4>] exit_mm+0x104/0x130
[<ffffffff8106264a>] do_exit+0x15a/0x8c0
[<ffffffff810630ff>] do_group_exit+0x3f/0xa0
[<ffffffff81063177>] sys_exit_group+0x17/0x20
[<ffffffff8162bae9>] system_call_fastpath+0x16/0x1b
Calling arch_flush_lazy_mmu_mode immediately after set_pgd makes the
changes visible to the consistency checks.
Cc: <stable@vger.kernel.org>
RedHat-Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=914737
Tested-by: Josh Boyer <jwboyer@redhat.com>
Reported-and-Tested-by: Krishna Raman <kraman@redhat.com>
Signed-off-by: Samu Kallio <samu.kallio@aberdeencloud.com>
Link: http://lkml.kernel.org/r/1364045796-10720-1-git-send-email-konrad.wilk@oracle.com
Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
This patch fix the regression introduced by the commit 3202bf0157
"arm: mvebu: Improve the SMP support of the interrupt controller":
GPIO IRQ were no longer delivered to the CPUs.
To be delivered to a CPU an interrupt must be enabled at CPU level and
at interrupt source level. Before the offending patch, all the
interrupts were enabled at source level during map() function. Mask()
and unmask() was done by handling the per-CPU part. It was fine when
running in UP with only one CPU.
The offending patch added support for SMP, in this case mask() and
unmask() was done by handling the interrupt source level part. The
per-CPU level part was handled by the affinity API to select the CPU
which will receive the interrupt. (Due to some hardware limitation
only one CPU at a time can received a given interrupt).
For "normal" interrupt __setup_irq() was called when an irq was
registered. irq_set_affinity() is called from this function, which
enabled the interrupt on one of the CPUs. Whereas for GPIO IRQ which
were chained interrupts, the irq_set_affinity() was never called and
none of the CPUs was selected to receive the interrupt.
With this patch all the interrupt are enable on the current CPU during
map() function. Enabling the interrupts on a CPU doesn't depend
anymore on irq_set_affinity() and then the chained irq are not anymore
a special case. However the CPU which will receive the irq can still
be modify later using irq_set_affinity().
Tested with Mirabox (A370) and Openblocks AX3 (AXP), rootfs mounted
over NFS, compiled with CONFIG_SMP=y/N.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Reported-by: Ryan Press <ryan@presslab.us>
Investigated-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Tested-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Tested-by: Ryan Press <ryan@presslab.us>
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
The of_node field of the device assigned to a
PCI bus is used during scanning of the PCI bus.
However on MIPS, the of_node field is assigned
only after the bus has been scanned.
Implement the architecture specific version of
'pcibios_get_phb_of_node'. Which ensures that the
PCI driver core will initialize the of_node field
before starting the scan.
Also remove the local assignment of bus->dev.of_node,
it is not needed after the patch.
Signed-off-by: Gabor Juhos <juhosg@openwrt.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add CYCLE_ACTIVITY.CYCLES_NO_DISPATCH/CYCLES_L1D_PENDING constraints.
These recently documented events have restrictions to counter
0-3 and counter 2 respectively. The perf scheduler needs to know
that to schedule them correctly.
IvyBridge already has the necessary constraints.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: a.p.zijlstra@chello.nl
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1362784968-12542-1-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
CONFIG_INVLPG got removed in commit
094ab1db7c ("x86, 386 removal:
Remove CONFIG_INVLPG").
That commit left one instance of CONFIG_INVLPG untouched, effectively
disabling DEBUG_TLBFLUSH for X86_32. Since all currently supported
x86 CPUs should now be able to support that option, just drop the entire
sub-dependency.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Link: http://lkml.kernel.org/r/1363262077.1335.71.camel@x61.thuisdomein
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The Kconfig symbol ARCH_HAS_DEFAULT_IDLE is unused. Commit
a0bfa13738 ("cpuidle: stop
depending on pm_idle") removed the only place were it was
actually used. But it did not remove its Kconfig entries (for sh
and x86). Remove those two entries now.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Cc: Len Brown <len.brown@intel.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Link: http://lkml.kernel.org/r/1363869683.1390.134.camel@x61.thuisdomein
Signed-off-by: Ingo Molnar <mingo@kernel.org>
So basically we're generating the pte_t * from a struct page and
we're handing it down to the __split_large_page() internal version
which then goes and gets back struct page * from it because it
needs it.
Change the caller to hand down struct page * directly and the
callee can compute the pte_t itself.
Net save is one virt_to_page() call and simpler code. While at
it, make __split_large_page() static.
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1363886217-24703-1-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
As multi-platform build is being adopted by more and more ARM platforms,
initcall function should be used very carefully. For example, when
CONFIG_ARM_OMAP2PLUS_CPUFREQ is built in the kernel, omap_cpufreq_init()
will be called on all the platforms to initialize omap-cpufreq driver.
Further, on OMAP, we now use Soc generic cpufreq-cpu0 driver using device
tree entries. To allow cpufreq-cpu0 and omap-cpufreq drivers to co-exist
for OMAP in a single image, we need to ensure the following:
1. With device tree boot, we use cpufreq-cpu0
2. With non device tree boot, we use omap-cpufreq
In the case of (1), we will have cpu OPPs and regulator registered
as part of the device tree nodes, to ensure that omap-cpufreq
and cpufreq-cpu0 don't conflict in managing the frequency of the
same CPU, we should not permit omap-cpufreq to be probed.
In the case of (2), we will not have the cpufreq-cpu0 device, hence
only omap-cpufreq will be active.
To eliminate this undesired these effects, we change omap-cpufreq
driver to have it instantiated as a platform_driver and register
"omap-cpufreq" device only when booted without device tree nodes on
OMAP platforms.
This allows the following:
a) Will only run on platforms that create the platform_device
"omap-cpufreq".
b) Since the platform_device is registered only when device tree nodes
are *not* populated, omap-cpufreq driver does not conflict with
the usage of cpufreq-cpu0 driver which is used on OMAP platforms when
device tree nodes are present.
Inspired by commit 5553f9e26f
(cpufreq: instantiate cpufreq-cpu0 as a platform_driver)
[robherring2@gmail.com: reported conflict of omap-cpufreq vs other
driver in an non-device tree supported boot]
Reported-by: Rob Herring <robherring2@gmail.com>
Signed-off-by: Nishanth Menon <nm@ti.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>