Commit graph

409 commits

Author SHA1 Message Date
Dave Hansen 6ba48ff46f x86: Remove arbitrary instruction size limit in instruction decoder
The current x86 instruction decoder steps along through the
instruction stream but always ensures that it never steps farther
than the largest possible instruction size (MAX_INSN_SIZE).

The MPX code is now going to be doing some decoding of userspace
instructions.  We copy those from userspace in to the kernel and
they're obviously completely untrusted coming from userspace.  In
addition to the constraint that instructions can only be so long,
we also have to be aware of how long the buffer is that came in
from userspace.  This _looks_ to be similar to what the perf and
kprobes is doing, but it's unclear to me whether they are
affected.

The whole reason we need this is that it is perfectly valid to be
executing an instruction within MAX_INSN_SIZE bytes of an
unreadable page. We should be able to gracefully handle short
reads in those cases.

This adds support to the decoder to record how long the buffer
being decoded is and to refuse to "validate" the instruction if
we would have gone over the end of the buffer to decode it.

The kprobes code probably needs to be looked at here a bit more
carefully.  This patch still respects the MAX_INSN_SIZE limit
there but the kprobes code does look like it might be able to
be a bit more strict than it currently is.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Jim Keniston <jkenisto@us.ibm.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: x86@kernel.org
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Link: http://lkml.kernel.org/r/20141114153957.E6B01535@viggo.jf.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-18 00:58:52 +01:00
Linus Torvalds 3b91270a0a x86-64: make csum_partial_copy_from_user() error handling consistent
Al Viro pointed out that the x86-64 csum_partial_copy_from_user() is
somewhat confused about what it should do on errors, notably it mostly
clears the uncopied end result buffer, but misses that for the initial
alignment case.

All users should check for errors, so it's dubious whether the clearing
is even necessary, and Al also points out that we should probably clean
up the calling conventions, but regardless of any future changes to this
function, the fact that it is inconsistent is just annoying.

So make the __get_user() failure path use the same error exit as all the
other errors do.

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: David Miller <davem@davemloft.net>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-11-16 11:00:42 -08:00
Linus Torvalds 197fe6b0e6 Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 asm updates from Ingo Molnar:
 "The changes in this cycle were:

   - Speed up the x86 __preempt_schedule() implementation
   - Fix/improve low level asm code debug info annotations"

* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Unwind-annotate thunk_32.S
  x86: Improve cmpxchg8b_emu.S
  x86: Improve cmpxchg16b_emu.S
  x86/lib/Makefile: Remove the unnecessary "+= thunk_64.o"
  x86: Speed up ___preempt_schedule*() by using THUNK helpers
2014-10-13 18:14:50 +02:00
Jan Beulich f74954f01e x86: Unwind-annotate thunk_32.S
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Link: http://lkml.kernel.org/r/542291CA0200007800038085@mail.emea.novell.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-10-08 12:31:45 +02:00
Jan Beulich 5f1d919a8c x86: Improve cmpxchg8b_emu.S
- don't include unneeded headers
- drop redundant entry point label
- complete unwind annotations
- use .L prefix on local labels to not clutter the symbol table

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Link: http://lkml.kernel.org/r/5422917E0200007800038081@mail.emea.novell.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-10-08 10:05:49 +02:00
Jan Beulich 3f63572187 x86: Improve cmpxchg16b_emu.S
- don't include unneeded headers
- don't open-code PER_CPU_VAR()
- drop redundant entry point label
- complete unwind annotations
- use .L prefix on local label to not clutter the symbol table

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Link: http://lkml.kernel.org/r/542290BC020000780003807D@mail.emea.novell.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-10-08 10:05:49 +02:00
Oleg Nesterov 212be3b232 x86/lib/Makefile: Remove the unnecessary "+= thunk_64.o"
Trivial. We have "lib-y += thunk_$(BITS).o" at the start, no
need to add thunk_64.o if !CONFIG_X86_32.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140921184232.GB23727@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-09-24 15:15:39 +02:00
Oleg Nesterov 0ad6e3c519 x86: Speed up ___preempt_schedule*() by using THUNK helpers
___preempt_schedule() does SAVE_ALL/RESTORE_ALL but this is
suboptimal, we do not need to save/restore the callee-saved
register. And we already have arch/x86/lib/thunk_*.S which
implements the similar asm wrappers, so it makes sense to
redefine ___preempt_schedule() as "THUNK ..." and remove
preempt.S altogether.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140921184153.GA23727@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-09-24 15:15:38 +02:00
Waiman Long 6157c7e1bb locking/rwlock, x86: Delete unused asm/rwlock.h and rwlock.S
This patch removes the unused asm/rwlock.h and rwlock.S files.

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1408037251-45918-3-git-send-email-Waiman.Long@hp.com
Cc: Scott J Norton <scott.norton@hp.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Francesco Fusco <ffusco@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Graf <tgraf@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-09-10 11:46:39 +02:00
Linus Torvalds 3737a12761 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull more perf updates from Ingo Molnar:
 "A second round of perf updates:

   - wide reaching kprobes sanitization and robustization, with the hope
     of fixing all 'probe this function crashes the kernel' bugs, by
     Masami Hiramatsu.

   - uprobes updates from Oleg Nesterov: tmpfs support, corner case
     fixes and robustization work.

   - perf tooling updates and fixes from Jiri Olsa, Namhyung Ki, Arnaldo
     et al:
        * Add support to accumulate hist periods (Namhyung Kim)
        * various fixes, refactorings and enhancements"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (101 commits)
  perf: Differentiate exec() and non-exec() comm events
  perf: Fix perf_event_comm() vs. exec() assumption
  uprobes/x86: Rename arch_uprobe->def to ->defparam, minor comment updates
  perf/documentation: Add description for conditional branch filter
  perf/x86: Add conditional branch filtering support
  perf/tool: Add conditional branch filter 'cond' to perf record
  perf: Add new conditional branch filter 'PERF_SAMPLE_BRANCH_COND'
  uprobes: Teach copy_insn() to support tmpfs
  uprobes: Shift ->readpage check from __copy_insn() to uprobe_register()
  perf/x86: Use common PMU interrupt disabled code
  perf/ARM: Use common PMU interrupt disabled code
  perf: Disable sampled events if no PMU interrupt
  perf: Fix use after free in perf_remove_from_context()
  perf tools: Fix 'make help' message error
  perf record: Fix poll return value propagation
  perf tools: Move elide bool into perf_hpp_fmt struct
  perf tools: Remove elide setup for SORT_MODE__MEMORY mode
  perf tools: Fix "==" into "=" in ui_browser__warning assignment
  perf tools: Allow overriding sysfs and proc finding with env var
  perf tools: Consider header files outside perf directory in tags target
  ...
2014-06-12 19:18:49 -07:00
Ingo Molnar ec00010972 Merge branch 'perf/urgent' into perf/core, to resolve conflict and to prepare for new patches
Conflicts:
	arch/x86/kernel/traps.c

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-06-06 07:55:06 +02:00
Borislav Petkov 1b1ded57a4 x86, boot: Carve out early cmdline parsing function
Carve out early cmdline parsing function into .../lib/cmdline.c so it
can be used by early code in the kernel proper as well.

Adapted from arch/x86/boot/cmdline.c.

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1400525957-11525-2-git-send-email-bp@alien8.de
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2014-05-20 20:21:24 -07:00
Andres Freund 722a0d22d0 x86: Fix typo preventing msr_set/clear_bit from having an effect
Due to a typo the msr accessor function introduced in
22085a66c2 didn't have any lasting
effects because they accidentally wrote the old value back.

After c0a639ad0b this at the very least
this causes cpuid limits not to be lifted on some cpus leading to
missing capabilities for those.

Signed-off-by: Andres Freund <andres@anarazel.de>
Link: http://lkml.kernel.org/r/1399598957-7011-2-git-send-email-andres@anarazel.de
Cc: Borislav Petkov <bp@suse.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2014-05-09 08:42:32 -07:00
Masami Hiramatsu 98def1dedd kprobes, x86: Prohibit probing on thunk functions and restore
thunk/restore functions are also used for tracing irqoff etc.
and those are involved in kprobe's exception handling.
Prohibit probing on them to avoid kernel crash.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/20140417081726.26341.3872.stgit@ltc230.yrl.intra.hitachi.co.jp
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-04-24 10:02:58 +02:00
Linus Torvalds 176ab02d49 Merge branch 'x86-asmlinkage-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 LTO changes from Peter Anvin:
 "More infrastructure work in preparation for link-time optimization
  (LTO).  Most of these changes is to make sure symbols accessed from
  assembly code are properly marked as visible so the linker doesn't
  remove them.

  My understanding is that the changes to support LTO are still not
  upstream in binutils, but are on the way there.  This patchset should
  conclude the x86-specific changes, and remaining patches to actually
  enable LTO will be fed through the Kbuild tree (other than keeping up
  with changes to the x86 code base, of course), although not
  necessarily in this merge window"

* 'x86-asmlinkage-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
  Kbuild, lto: Handle basic LTO in modpost
  Kbuild, lto: Disable LTO for asm-offsets.c
  Kbuild, lto: Add a gcc-ld script to let run gcc as ld
  Kbuild, lto: add ld-version and ld-ifversion macros
  Kbuild, lto: Drop .number postfixes in modpost
  Kbuild, lto, workaround: Don't warn for initcall_reference in modpost
  lto: Disable LTO for sys_ni
  lto: Handle LTO common symbols in module loader
  lto, workaround: Add workaround for initcall reordering
  lto: Make asmlinkage __visible
  x86, lto: Disable LTO for the x86 VDSO
  initconst, x86: Fix initconst mistake in ts5500 code
  initconst: Fix initconst mistake in dcdbas
  asmlinkage: Make trace_hardirqs_on/off_caller visible
  asmlinkage, x86: Fix 32bit memcpy for LTO
  asmlinkage Make __stack_chk_failed and memcmp visible
  asmlinkage: Mark rwsem functions that can be called from assembler asmlinkage
  asmlinkage: Make main_extable_sort_needed visible
  asmlinkage, mutex: Mark __visible
  asmlinkage: Make trace_hardirq visible
  ...
2014-03-31 14:13:25 -07:00
Linus Torvalds d9fcca40eb Merge branch 'x86-hash-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 hashing changes from Ingo Molnar:
 "Small fixes and cleanups to the librarized arch_fast_hash() methods,
  used by the net/openvswitch code"

* 'x86-hash-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, hash: Simplify switch, add __init annotation
  x86, hash: Swap arguments passed to crc32_u32()
  x86, hash: Fix build failure with older binutils
2014-03-31 12:27:32 -07:00
Jan Beulich 7a5917e978 x86, hash: Simplify switch, add __init annotation
Minor cleanups:

- simplify switch statement
- add __init annotation to setup_arch_fast_hash()

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Link: http://lkml.kernel.org/r/530F09CE020000780011FBEF@nat28.tlf.novell.com
Cc: Francesco Fusco <ffusco@redhat.com>
Cc: Thomas Graf <tgraf@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-03-19 16:51:04 -07:00
Jan Beulich c5cdfdf909 x86, hash: Swap arguments passed to crc32_u32()
... to match the function's parameters. While reportedly commutative,
using the proper order allows for leveraging the instruction permitting
the source operand to be in memory.

[ hpa: This code originated in the dpdk toolkit.  This was a bug in dpdk
  which has recently been fixed in part due to an earlier version of
  this patch. ]

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Link: http://lkml.kernel.org/r/530F09B6020000780011FBEB@nat28.tlf.novell.com
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Francesco Fusco <ffusco@redhat.com>
Cc: Thomas Graf <tgraf@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-03-19 16:51:04 -07:00
Jan Beulich 06325190bd x86, hash: Fix build failure with older binutils
Just like for other ISA extension instruction uses we should check
whether the assembler actually supports them. The fallback here simply
is to encode an instruction  with fixed operands (%eax and %ecx).

[ hpa: tagging for -stable as a build fix ]

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Link: http://lkml.kernel.org/r/530F0996020000780011FBE7@nat28.tlf.novell.com
Cc: Francesco Fusco <ffusco@redhat.com>
Cc: Thomas Graf <tgraf@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: <stable@vger.kernel.org> # v3.14
2014-03-19 16:51:04 -07:00
Borislav Petkov 22085a66c2 x86: Add another set of MSR accessor functions
We very often need to set or clear a bit in an MSR as a result of doing
some sort of a hardware configuration. Add generic versions of that
repeated functionality in order to save us a bunch of duplicated code in
the early CPU vendor detection/config code.

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1394384725-10796-2-git-send-email-bp@alien8.de
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-03-13 15:34:45 -07:00
Andi Kleen a9143296dd asmlinkage, x86: Fix 32bit memcpy for LTO
These functions can be called implicitely from gcc, and thus need to be
visible.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1391845930-28580-11-git-send-email-ak@linux.intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-02-13 18:14:46 -08:00
Linus Torvalds 4ba9920e5e Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) BPF debugger and asm tool by Daniel Borkmann.

 2) Speed up create/bind in AF_PACKET, also from Daniel Borkmann.

 3) Correct reciprocal_divide and update users, from Hannes Frederic
    Sowa and Daniel Borkmann.

 4) Currently we only have a "set" operation for the hw timestamp socket
    ioctl, add a "get" operation to match.  From Ben Hutchings.

 5) Add better trace events for debugging driver datapath problems, also
    from Ben Hutchings.

 6) Implement auto corking in TCP, from Eric Dumazet.  Basically, if we
    have a small send and a previous packet is already in the qdisc or
    device queue, defer until TX completion or we get more data.

 7) Allow userspace to manage ipv6 temporary addresses, from Jiri Pirko.

 8) Add a qdisc bypass option for AF_PACKET sockets, from Daniel
    Borkmann.

 9) Share IP header compression code between Bluetooth and IEEE802154
    layers, from Jukka Rissanen.

10) Fix ipv6 router reachability probing, from Jiri Benc.

11) Allow packets to be captured on macvtap devices, from Vlad Yasevich.

12) Support tunneling in GRO layer, from Jerry Chu.

13) Allow bonding to be configured fully using netlink, from Scott
    Feldman.

14) Allow AF_PACKET users to obtain the VLAN TPID, just like they can
    already get the TCI.  From Atzm Watanabe.

15) New "Heavy Hitter" qdisc, from Terry Lam.

16) Significantly improve the IPSEC support in pktgen, from Fan Du.

17) Allow ipv4 tunnels to cache routes, just like sockets.  From Tom
    Herbert.

18) Add Proportional Integral Enhanced packet scheduler, from Vijay
    Subramanian.

19) Allow openvswitch to mmap'd netlink, from Thomas Graf.

20) Key TCP metrics blobs also by source address, not just destination
    address.  From Christoph Paasch.

21) Support 10G in generic phylib.  From Andy Fleming.

22) Try to short-circuit GRO flow compares using device provided RX
    hash, if provided.  From Tom Herbert.

The wireless and netfilter folks have been busy little bees too.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2064 commits)
  net/cxgb4: Fix referencing freed adapter
  ipv6: reallocate addrconf router for ipv6 address when lo device up
  fib_frontend: fix possible NULL pointer dereference
  rtnetlink: remove IFLA_BOND_SLAVE definition
  rtnetlink: remove check for fill_slave_info in rtnl_have_link_slave_info
  qlcnic: update version to 5.3.55
  qlcnic: Enhance logic to calculate msix vectors.
  qlcnic: Refactor interrupt coalescing code for all adapters.
  qlcnic: Update poll controller code path
  qlcnic: Interrupt code cleanup
  qlcnic: Enhance Tx timeout debugging.
  qlcnic: Use bool for rx_mac_learn.
  bonding: fix u64 division
  rtnetlink: add missing IFLA_BOND_AD_INFO_UNSPEC
  sfc: Use the correct maximum TX DMA ring size for SFC9100
  Add Shradha Shah as the sfc driver maintainer.
  net/vxlan: Share RX skb de-marking and checksum checks with ovs
  tulip: cleanup by using ARRAY_SIZE()
  ip_tunnel: clear IPCB in ip_tunnel_xmit() in case dst_link_failure() is called
  net/cxgb4: Don't retrieve stats during recovery
  ...
2014-01-25 11:17:34 -08:00
Linus Torvalds c9cdd9a6ae Merge branch 'x86/mpx' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cpufeature and mpx updates from Peter Anvin:
 "This includes the basic infrastructure for MPX (Memory Protection
  Extensions) support, but does not include MPX support itself.  It is,
  however, a prerequisite for KVM support for MPX, which I believe will
  be pushed later this merge window by the KVM team.

  This includes moving the functionality in
  futex_atomic_cmpxchg_inatomic() into a new function in uaccess.h so it
  can be reused - this will be used by the final MPX patches.

  The actual MPX functionality (map management and so on) will be pushed
  in a future merge window, when ready"

* 'x86/mpx' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/intel/mpx: Remove unused LWP structure
  x86, mpx: Add MPX related opcodes to the x86 opcode map
  x86: replace futex_atomic_cmpxchg_inatomic() with user_atomic_cmpxchg_inatomic
  x86: add user_atomic_cmpxchg_inatomic at uaccess.h
  x86, xsave: Support eager-only xsave features, add MPX support
  x86, cpufeature: Define the Intel MPX feature flag
2014-01-20 14:46:32 -08:00
Linus Torvalds 2a0fede97f Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cleanups from Ingo Molnar:
 "Misc cleanups"

* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, cpu, amd: Fix a shadowed variable situation
  um, x86: Fix vDSO build
  x86: Delete non-required instances of include <linux/init.h>
  x86, realmode: Pointer walk cleanups, pull out invariant use of __pa()
  x86/traps: Clean up error exception handler definitions
2014-01-20 12:03:57 -08:00
Qiaowei Ren fb09b78151 x86, mpx: Add MPX related opcodes to the x86 opcode map
This patch adds all the MPX instructions to x86 opcode map, so the x86
instruction decoder can decode MPX instructions.

Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
Link: http://lkml.kernel.org/r/1389518403-7715-4-git-send-email-qiaowei.ren@intel.com
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-01-17 11:04:09 -08:00
Paul Gortmaker 663b55b9b3 x86: Delete non-required instances of include <linux/init.h>
None of these files are actually using any __init type directives
and hence don't need to include <linux/init.h>.  Most are just a
left over from __devinit and __cpuinit removal, or simply due to
code getting copied from one driver to the next.

[ hpa: undid incorrect removal from arch/x86/kernel/head_32.S ]

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Link: http://lkml.kernel.org/r/1389054026-12947-1-git-send-email-paul.gortmaker@windriver.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2014-01-06 21:25:18 -08:00
Francesco Fusco 71ae8aac3e lib: introduce arch optimized hash library
We introduce a new hashing library that is meant to be used in
the contexts where speed is more important than uniformity of the
hashed values. The hash library leverages architecture specific
implementation to achieve high performance and fall backs to
jhash() for the generic case.

On Intel-based x86 architectures, the library can exploit the crc32l
instruction, part of the Intel SSE4.2 instruction set, if the
instruction is supported by the processor. This implementation
is twice as fast as the jhash() implementation on an i7 processor.

Additional architectures, such as Arm64 provide instructions for
accelerating the computation of CRC, so they could be added as well
in follow-up work.

Signed-off-by: Francesco Fusco <ffusco@redhat.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Thomas Graf <tgraf@redhat.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-17 14:27:17 -05:00
H. Peter Anvin 661c80192d x86-64, copy_user: Use leal to produce 32-bit results
When we are using lea to produce a 32-bit result, we can use the leal
form, rather than using leaq and worry about truncation elsewhere.

Make the leal explicit, both to be more obvious and since that is what
gcc generates and thus is less likely to trigger obscure gas bugs.

Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1384634221-6006-1-git-send-email-fenghua.yu@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-11-20 13:57:07 -08:00
Fenghua Yu f4cb1cc18f x86-64, copy_user: Remove zero byte check before copy user buffer.
Operation of rep movsb instruction handles zero byte copy. As pointed out by
Linus, there is no need to check zero size in kernel. Removing this redundant
check saves a few cycles in copy user functions.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/1384634221-6006-1-git-send-email-fenghua.yu@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-11-16 18:00:58 -08:00
Linus Torvalds f9300eaaac ACPI and power management updates for 3.13-rc1
- New power capping framework and the the Intel Running Average Power
    Limit (RAPL) driver using it from Srinivas Pandruvada and Jacob Pan.
 
  - Addition of the in-kernel switching feature to the arm_big_little
    cpufreq driver from Viresh Kumar and Nicolas Pitre.
 
  - cpufreq support for iMac G5 from Aaro Koskinen.
 
  - Baytrail processors support for intel_pstate from Dirk Brandewie.
 
  - cpufreq support for Midway/ECX-2000 from Mark Langsdorf.
 
  - ARM vexpress/TC2 cpufreq support from Sudeep KarkadaNagesha.
 
  - ACPI power management support for the I2C and SPI bus types from
    Mika Westerberg and Lv Zheng.
 
  - cpufreq core fixes and cleanups from Viresh Kumar, Srivatsa S Bhat,
    Stratos Karafotis, Xiaoguang Chen, Lan Tianyu.
 
  - cpufreq drivers updates (mostly fixes and cleanups) from Viresh Kumar,
    Aaro Koskinen, Jungseok Lee, Sudeep KarkadaNagesha, Lukasz Majewski,
    Manish Badarkhe, Hans-Christian Egtvedt, Evgeny Kapaev.
 
  - intel_pstate updates from Dirk Brandewie and Adrian Huang.
 
  - ACPICA update to version 20130927 includig fixes and cleanups and
    some reduction of divergences between the ACPICA code in the kernel
    and ACPICA upstream in order to improve the automatic ACPICA patch
    generation process.  From Bob Moore, Lv Zheng, Tomasz Nowicki,
    Naresh Bhat, Bjorn Helgaas, David E Box.
 
  - ACPI IPMI driver fixes and cleanups from Lv Zheng.
 
  - ACPI hotplug fixes and cleanups from Bjorn Helgaas, Toshi Kani,
    Zhang Yanfei, Rafael J Wysocki.
 
  - Conversion of the ACPI AC driver to the platform bus type and
    multiple driver fixes and cleanups related to ACPI from Zhang Rui.
 
  - ACPI processor driver fixes and cleanups from Hanjun Guo, Jiang Liu,
    Bartlomiej Zolnierkiewicz, Mathieu Rhéaume, Rafael J Wysocki.
 
  - Fixes and cleanups and new blacklist entries related to the ACPI
    video support from Aaron Lu, Felipe Contreras, Lennart Poettering,
    Kirill Tkhai.
 
  - cpuidle core cleanups from Viresh Kumar and Lorenzo Pieralisi.
 
  - cpuidle drivers fixes and cleanups from Daniel Lezcano, Jingoo Han,
    Bartlomiej Zolnierkiewicz, Prarit Bhargava.
 
  - devfreq updates from Sachin Kamat, Dan Carpenter, Manish Badarkhe.
 
  - Operation Performance Points (OPP) core updates from Nishanth Menon.
 
  - Runtime power management core fix from Rafael J Wysocki and update
    from Ulf Hansson.
 
  - Hibernation fixes from Aaron Lu and Rafael J Wysocki.
 
  - Device suspend/resume lockup detection mechanism from Benoit Goby.
 
  - Removal of unused proc directories created for various ACPI drivers
    from Lan Tianyu.
 
  - ACPI LPSS driver fix and new device IDs for the ACPI platform scan
    handler from Heikki Krogerus and Jarkko Nikula.
 
  - New ACPI _OSI blacklist entry for Toshiba NB100 from Levente Kurusa.
 
  - Assorted fixes and cleanups related to ACPI from Andy Shevchenko,
    Al Stone, Bartlomiej Zolnierkiewicz, Colin Ian King, Dan Carpenter,
    Felipe Contreras, Jianguo Wu, Lan Tianyu, Yinghai Lu, Mathias Krause,
    Liu Chuansheng.
 
  - Assorted PM fixes and cleanups from Andy Shevchenko, Thierry Reding,
    Jean-Christophe Plagniol-Villard.
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQIcBAABCAAGBQJSfPKLAAoJEILEb/54YlRxH6YQAJwDKi25RCZziFSIenXuqzC/
 c6JxoH/tSnDHJHhcTgqh7H7Raa+zmatMDf0m2oEv2Wjfx4Lt4BQK4iefhe/zY4lX
 yJ8uXDg+U8DYhDX2XwbwnFpd1M1k/A+s2gIHDTHHGnE0kDngXdd8RAFFktBmooTZ
 l5LBQvOrTlgX/ZfqI/MNmQ6lfY6kbCABGSHV1tUUsDA6Kkvk/LAUTOMSmptv1q22
 hcs6k55vR34qADPkUX5GghjmcYJv+gNtvbDEJUjcmCwVoPWouF415m7R5lJ8w3/M
 49Q8Tbu5HELWLwca64OorS8qh/P7sgUOf1BX5IDzHnJT+TGeDfvcYbMv2Z275/WZ
 /bqhuLuKBpsHQ2wvEeT+lYV3FlifKeTf1FBxER3ApjzI3GfpmVVQ+dpEu8e9hcTh
 ZTPGzziGtoIsHQ0unxb+zQOyt1PmIk+cU4IsKazs5U20zsVDMcKzPrb19Od49vMX
 gCHvRzNyOTqKWpE83Ss4NGOVPAG02AXiXi/BpuYBHKDy6fTH/liKiCw5xlCDEtmt
 lQrEbupKpc/dhCLo5ws6w7MZzjWJs2eSEQcNR4DlR++pxIpYOOeoPTXXrghgZt2X
 mmxZI2qsJ7GAvPzII8OBeF3CRO3fabZ6Nez+M+oEZjGe05ZtpB3ccw410HwieqBn
 dYpJFt/BHK189odhV9CM
 =JCxk
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI and power management updates from Rafael J Wysocki:

 - New power capping framework and the the Intel Running Average Power
   Limit (RAPL) driver using it from Srinivas Pandruvada and Jacob Pan.

 - Addition of the in-kernel switching feature to the arm_big_little
   cpufreq driver from Viresh Kumar and Nicolas Pitre.

 - cpufreq support for iMac G5 from Aaro Koskinen.

 - Baytrail processors support for intel_pstate from Dirk Brandewie.

 - cpufreq support for Midway/ECX-2000 from Mark Langsdorf.

 - ARM vexpress/TC2 cpufreq support from Sudeep KarkadaNagesha.

 - ACPI power management support for the I2C and SPI bus types from Mika
   Westerberg and Lv Zheng.

 - cpufreq core fixes and cleanups from Viresh Kumar, Srivatsa S Bhat,
   Stratos Karafotis, Xiaoguang Chen, Lan Tianyu.

 - cpufreq drivers updates (mostly fixes and cleanups) from Viresh
   Kumar, Aaro Koskinen, Jungseok Lee, Sudeep KarkadaNagesha, Lukasz
   Majewski, Manish Badarkhe, Hans-Christian Egtvedt, Evgeny Kapaev.

 - intel_pstate updates from Dirk Brandewie and Adrian Huang.

 - ACPICA update to version 20130927 includig fixes and cleanups and
   some reduction of divergences between the ACPICA code in the kernel
   and ACPICA upstream in order to improve the automatic ACPICA patch
   generation process.  From Bob Moore, Lv Zheng, Tomasz Nowicki, Naresh
   Bhat, Bjorn Helgaas, David E Box.

 - ACPI IPMI driver fixes and cleanups from Lv Zheng.

 - ACPI hotplug fixes and cleanups from Bjorn Helgaas, Toshi Kani, Zhang
   Yanfei, Rafael J Wysocki.

 - Conversion of the ACPI AC driver to the platform bus type and
   multiple driver fixes and cleanups related to ACPI from Zhang Rui.

 - ACPI processor driver fixes and cleanups from Hanjun Guo, Jiang Liu,
   Bartlomiej Zolnierkiewicz, Mathieu Rhéaume, Rafael J Wysocki.

 - Fixes and cleanups and new blacklist entries related to the ACPI
   video support from Aaron Lu, Felipe Contreras, Lennart Poettering,
   Kirill Tkhai.

 - cpuidle core cleanups from Viresh Kumar and Lorenzo Pieralisi.

 - cpuidle drivers fixes and cleanups from Daniel Lezcano, Jingoo Han,
   Bartlomiej Zolnierkiewicz, Prarit Bhargava.

 - devfreq updates from Sachin Kamat, Dan Carpenter, Manish Badarkhe.

 - Operation Performance Points (OPP) core updates from Nishanth Menon.

 - Runtime power management core fix from Rafael J Wysocki and update
   from Ulf Hansson.

 - Hibernation fixes from Aaron Lu and Rafael J Wysocki.

 - Device suspend/resume lockup detection mechanism from Benoit Goby.

 - Removal of unused proc directories created for various ACPI drivers
   from Lan Tianyu.

 - ACPI LPSS driver fix and new device IDs for the ACPI platform scan
   handler from Heikki Krogerus and Jarkko Nikula.

 - New ACPI _OSI blacklist entry for Toshiba NB100 from Levente Kurusa.

 - Assorted fixes and cleanups related to ACPI from Andy Shevchenko, Al
   Stone, Bartlomiej Zolnierkiewicz, Colin Ian King, Dan Carpenter,
   Felipe Contreras, Jianguo Wu, Lan Tianyu, Yinghai Lu, Mathias Krause,
   Liu Chuansheng.

 - Assorted PM fixes and cleanups from Andy Shevchenko, Thierry Reding,
   Jean-Christophe Plagniol-Villard.

* tag 'pm+acpi-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (386 commits)
  cpufreq: conservative: fix requested_freq reduction issue
  ACPI / hotplug: Consolidate deferred execution of ACPI hotplug routines
  PM / runtime: Use pm_runtime_put_sync() in __device_release_driver()
  ACPI / event: remove unneeded NULL pointer check
  Revert "ACPI / video: Ignore BIOS initial backlight value for HP 250 G1"
  ACPI / video: Quirk initial backlight level 0
  ACPI / video: Fix initial level validity test
  intel_pstate: skip the driver if ACPI has power mgmt option
  PM / hibernate: Avoid overflow in hibernate_preallocate_memory()
  ACPI / hotplug: Do not execute "insert in progress" _OST
  ACPI / hotplug: Carry out PCI root eject directly
  ACPI / hotplug: Merge device hot-removal routines
  ACPI / hotplug: Make acpi_bus_hot_remove_device() internal
  ACPI / hotplug: Simplify device ejection routines
  ACPI / hotplug: Fix handle_root_bridge_removal()
  ACPI / hotplug: Refuse to hot-remove all objects with disabled hotplug
  ACPI / scan: Start matching drivers after trying scan handlers
  ACPI: Remove acpi_pci_slot_init() headers from internal.h
  ACPI / blacklist: fix name of ThinkPad Edge E530
  PowerCap: Fix build error with option -Werror=format-security
  ...

Conflicts:
	arch/arm/mach-omap2/opp.c
	drivers/Kconfig
	drivers/spi/spi.c
2013-11-14 13:41:48 +09:00
Linus Torvalds 014d595c23 Merge branch 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 boot changes from Ingo Molnar:
 "Two changes that prettify and compactify the SMP bootup output from:

     smpboot: Booting Node   0, Processors  #1 #2 #3 OK
     smpboot: Booting Node   1, Processors  #4 #5 #6 #7 OK
     smpboot: Booting Node   2, Processors  #8 #9 #10 #11 OK
     smpboot: Booting Node   3, Processors  #12 #13 #14 #15 OK
     Brought up 16 CPUs

  to something like:

     x86: Booting SMP configuration:
     .... node  #0, CPUs:        #1  #2  #3
     .... node  #1, CPUs:    #4  #5  #6  #7
     .... node  #2, CPUs:    #8  #9 #10 #11
     .... node  #3, CPUs:   #12 #13 #14 #15
     x86: Booted up 4 nodes, 16 CPUs"

* 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/boot: Further compress CPUs bootup message
  x86: Improve the printout of the SMP bootup CPU table
2013-11-12 10:41:10 +09:00
Linus Torvalds ae795fe760 Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 user access changes from Ingo Molnar:
 "This tree contains two copy_[from/to]_user() build time checking
  changes/enhancements from Jan Beulich.

  The desired outcome is to get better compiler warnings with
  CONFIG_DEBUG_STRICT_USER_COPY_CHECKS=y, to keep people from
  introducing bugs such as overflows and information leaks"

* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Unify copy_to_user() and add size checking to it
  x86: Unify copy_from_user() size checking
2013-11-12 10:39:08 +09:00
Peter Zijlstra 0a196848ca perf: Fix arch_perf_out_copy_user default
The arch_perf_output_copy_user() default of
__copy_from_user_inatomic() returns bytes not copied, while all other
argument functions given DEFINE_OUTPUT_COPY() return bytes copied.

Since copy_from_user_nmi() is the odd duck out by returning bytes
copied where all other *copy_{to,from}* functions return bytes not
copied, change it over and ammend DEFINE_OUTPUT_COPY() to expect bytes
not copied.

Oddly enough DEFINE_OUTPUT_COPY() already returned bytes not copied
while expecting its worker functions to return bytes copied.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: will.deacon@arm.com
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20131030201622.GR16117@laptop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 12:34:25 +01:00
Peter Zijlstra e00b12e64b perf/x86: Further optimize copy_from_user_nmi()
Now that we can deal with nested NMI due to IRET re-enabling NMIs and
can deal with faults from NMI by making sure we preserve CR2 over NMIs
we can in fact simply access user-space memory from NMI context.

So rewrite copy_from_user_nmi() to use __copy_from_user_inatomic() and
rework the fault path to do the minimal required work before taking
the in_atomic() fault handler.

In particular avoid perf_sw_event() which would make perf recurse on
itself (it should be harmless as our recursion protections should be
able to deal with this -- but why tempt fate).

Also rename notify_page_fault() to kprobes_fault() as that is a much
better name; there is no notifier in it and its specific to kprobes.

Don measured that his worst case NMI path shrunk from ~300K cycles to
~150K cycles.

Cc: Stephane Eranian <eranian@google.com>
Cc: jmario@redhat.com
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: dave.hansen@linux.intel.com
Tested-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20131024105206.GM2490@laptop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-29 12:02:54 +01:00
Jan Beulich 7a3d9b0f3a x86: Unify copy_to_user() and add size checking to it
Similarly to copy_from_user(), where the range check is to
protect against kernel memory corruption, copy_to_user() can
benefit from such checking too: Here it protects against kernel
information leaks.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: <arjan@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/5265059502000078000FC4F6@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
2013-10-26 12:27:37 +02:00
Jan Beulich 3df7b41aa5 x86: Unify copy_from_user() size checking
Commits 4a31276930 ("x86: Turn the
copy_from_user check into an (optional) compile time warning")
and 63312b6a6f ("x86: Add a
Kconfig option to turn the copy_from_user warnings into errors")
touched only the 32-bit variant of copy_from_user(), whereas the
original commit 9f0cf4adb6 ("x86:
Use __builtin_object_size() to validate the buffer size for
copy_from_user()") also added the same code to the 64-bit one.

Further the earlier conversion from an inline WARN() to the call
to copy_from_user_overflow() went a little too far: When the
number of bytes to be copied is not a constant (e.g. [looking at
3.11] in drivers/net/tun.c:__tun_chr_ioctl() or
drivers/pci/pcie/aer/aer_inject.c:aer_inject_write()), the
compiler will always have to keep the funtion call, and hence
there will always be a warning. By using __builtin_constant_p()
we can avoid this.

And then this slightly extends the effect of
CONFIG_DEBUG_STRICT_USER_COPY_CHECKS in that apart from
converting warnings to errors in the constant size case, it
retains the (possibly wrong) warnings in the non-constant size
case, such that if someone is prepared to get a few false
positives, (s)he'll be able to recover the current behavior
(except that these diagnostics now will never be converted to
errors).

Since the 32-bit variant (intentionally) didn't call
might_fault(), the unification results in this being called
twice now. Adding a suitable #ifdef would be the alternative if
that's a problem.

I'd like to point out though that with
__compiletime_object_size() being restricted to gcc before 4.6,
the whole construct is going to become more and more pointless
going forward. I would question however that commit
2fb0815c9e ("gcc4: disable
__compiletime_object_size for GCC 4.6+") was really necessary,
and instead this should have been dealt with as is done here
from the beginning.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/5265056D02000078000FC4F3@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-26 12:27:36 +02:00
Jacob Pan 1a6b991a98 x86 / msr: add 64bit _on_cpu access functions
Having 64-bit MSR access methods on given CPU can avoid shifting and
simplify MSR content manipulation. We already have other combinations
of rdmsrl_xxx and wrmsrl_xxx but missing the _on_cpu version.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-10-17 00:36:06 +02:00
Borislav Petkov a17bce4d1d x86/boot: Further compress CPUs bootup message
Turn it into (for example):

[    0.073380] x86: Booting SMP configuration:
[    0.074005] .... node   #0, CPUs:          #1   #2   #3   #4   #5   #6   #7
[    0.603005] .... node   #1, CPUs:     #8   #9  #10  #11  #12  #13  #14  #15
[    1.200005] .... node   #2, CPUs:    #16  #17  #18  #19  #20  #21  #22  #23
[    1.796005] .... node   #3, CPUs:    #24  #25  #26  #27  #28  #29  #30  #31
[    2.393005] .... node   #4, CPUs:    #32  #33  #34  #35  #36  #37  #38  #39
[    2.996005] .... node   #5, CPUs:    #40  #41  #42  #43  #44  #45  #46  #47
[    3.600005] .... node   #6, CPUs:    #48  #49  #50  #51  #52  #53  #54  #55
[    4.202005] .... node   #7, CPUs:    #56  #57  #58  #59  #60  #61  #62  #63
[    4.811005] .... node   #8, CPUs:    #64  #65  #66  #67  #68  #69  #70  #71
[    5.421006] .... node   #9, CPUs:    #72  #73  #74  #75  #76  #77  #78  #79
[    6.032005] .... node  #10, CPUs:    #80  #81  #82  #83  #84  #85  #86  #87
[    6.648006] .... node  #11, CPUs:    #88  #89  #90  #91  #92  #93  #94  #95
[    7.262005] .... node  #12, CPUs:    #96  #97  #98  #99 #100 #101 #102 #103
[    7.865005] .... node  #13, CPUs:   #104 #105 #106 #107 #108 #109 #110 #111
[    8.466005] .... node  #14, CPUs:   #112 #113 #114 #115 #116 #117 #118 #119
[    9.073006] .... node  #15, CPUs:   #120 #121 #122 #123 #124 #125 #126 #127
[    9.679901] x86: Booted up 16 nodes, 128 CPUs

and drop useless elements.

Change num_digits() to hpa's division-avoiding, cell-phone-typed
version which he went at great lengths and pains to submit on a
Saturday evening.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: huawei.libin@huawei.com
Cc: wangyijing@huawei.com
Cc: fenghua.yu@intel.com
Cc: guohanjun@huawei.com
Cc: paul.gortmaker@windriver.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20130930095624.GB16383@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-01 10:52:30 +02:00
Borislav Petkov 646e29a178 x86: Improve the printout of the SMP bootup CPU table
As the new x86 CPU bootup printout format code maintainer, I am
taking immediate action to improve and clean (and thus indulge
my OCD) the reporting of the cores when coming up online.

Fix padding to a right-hand alignment, cleanup code and bind
reporting width to the max number of supported CPUs on the
system, like this:

 [    0.074509] smpboot: Booting Node   0, Processors:      #1  #2  #3  #4  #5  #6  #7 OK
 [    0.644008] smpboot: Booting Node   1, Processors:  #8  #9 #10 #11 #12 #13 #14 #15 OK
 [    1.245006] smpboot: Booting Node   2, Processors: #16 #17 #18 #19 #20 #21 #22 #23 OK
 [    1.864005] smpboot: Booting Node   3, Processors: #24 #25 #26 #27 #28 #29 #30 #31 OK
 [    2.489005] smpboot: Booting Node   4, Processors: #32 #33 #34 #35 #36 #37 #38 #39 OK
 [    3.093005] smpboot: Booting Node   5, Processors: #40 #41 #42 #43 #44 #45 #46 #47 OK
 [    3.698005] smpboot: Booting Node   6, Processors: #48 #49 #50 #51 #52 #53 #54 #55 OK
 [    4.304005] smpboot: Booting Node   7, Processors: #56 #57 #58 #59 #60 #61 #62 #63 OK
 [    4.961413] Brought up 64 CPUs

and this:

 [    0.072367] smpboot: Booting Node   0, Processors:    #1 #2 #3 #4 #5 #6 #7 OK
 [    0.686329] Brought up 8 CPUs

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Libin <huawei.libin@huawei.com>
Cc: wangyijing@huawei.com
Cc: fenghua.yu@intel.com
Cc: guohanjun@huawei.com
Cc: paul.gortmaker@windriver.com
Link: http://lkml.kernel.org/r/20130927143554.GF4422@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-28 10:10:26 +02:00
Linus Torvalds f357a82048 Merge branch 'x86-smap-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 SMAP fixes from Ingo Molnar:
 "Fixes for Intel SMAP support, to fix SIGSEGVs during bootup"

* 'x86-smap-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Introduce [compat_]save_altstack_ex() to unbreak x86 SMAP
  x86, smap: Handle csum_partial_copy_*_user()
2013-09-04 11:08:32 -07:00
Linus Torvalds 2a475501b8 Merge branch 'x86-asmlinkage-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/asmlinkage changes from Ingo Molnar:
 "As a preparation for Andi Kleen's LTO patchset (link time
  optimizations using GCC's -flto which build time optimization has
  steadily increased in quality over the past few years and might
  eventually be usable for the kernel too) this tree includes a handful
  of preparatory patches that make function calling convention
  annotations consistent again:

   - Mark every function without arguments (or 64bit only) that is used
     by assembly code with asmlinkage()

   - Mark every function with parameters or variables that is used by
     assembly code as __visible.

  For the vanilla kernel this has documentation, consistency and
  debuggability advantages, for the time being"

* 'x86-asmlinkage-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/asmlinkage: Fix warning in xen asmlinkage change
  x86, asmlinkage, vdso: Mark vdso variables __visible
  x86, asmlinkage, power: Make various symbols used by the suspend asm code visible
  x86, asmlinkage: Make dump_stack visible
  x86, asmlinkage: Make 64bit checksum functions visible
  x86, asmlinkage, paravirt: Add __visible/asmlinkage to xen paravirt ops
  x86, asmlinkage, apm: Make APM data structure used from assembler visible
  x86, asmlinkage: Make syscall tables visible
  x86, asmlinkage: Make several variables used from assembler/linker script visible
  x86, asmlinkage: Make kprobes code visible and fix assembler code
  x86, asmlinkage: Make various syscalls asmlinkage
  x86, asmlinkage: Make 32bit/64bit __switch_to visible
  x86, asmlinkage: Make _*_start_kernel visible
  x86, asmlinkage: Make all interrupt handlers asmlinkage / __visible
  x86, asmlinkage: Change dotraplinkage into __visible on 32bit
  x86: Fix sys_call_table type in asm/syscall.h
2013-09-04 08:42:44 -07:00
H. Peter Anvin 7263dda41b x86, smap: Handle csum_partial_copy_*_user()
Add SMAP annotations to csum_partial_copy_to/from_user().  These
functions legitimately access user space and thus need to set the AC
flag.

TODO: add explicit checks that the side with the kernel space pointer
really points into kernel space.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/n/tip-2aps0u00eer658fd5xyanan7@git.kernel.org
Cc: <stable@vger.kernel.org> # v3.7+
2013-09-01 14:09:48 -07:00
Andi Kleen 277d5b40b7 x86, asmlinkage: Make several variables used from assembler/linker script visible
Plus one function, load_gs_index().

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1375740170-7446-10-git-send-email-andi@firstfloor.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-08-06 14:20:13 -07:00
Masami Hiramatsu 3e21bb092d x86, insn: Add new opcodes as of June, 2013
Add TSX-NI related instructions and new instructions to
x86-opcode-map.txt according to the Intel(R) 64 and IA-32
Architectures Software Developer's Manual Vol2C (June, 2013).
This also includes below updates.
 - Fix a typo of MWAIT (the lack of (11B)).
 - Change NOP Ev to prefetchw Ev
 - Add CRC32 new prefix style (66&F2)
 - Add ADCX, ADOX, RDSEED, CLAC and STAC instructions

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Link: http://lkml.kernel.org/r/20130806073750.4049.12365.stgit@udc4-manage.rcp.hitachi.co.jp
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-08-06 08:08:47 -07:00
Stephen Boyd 446f24d119 Kconfig: consolidate CONFIG_DEBUG_STRICT_USER_COPY_CHECKS
The help text for this config is duplicated across the x86, parisc, and
s390 Kconfig.debug files.  Arnd Bergman noted that the help text was
slightly misleading and should be fixed to state that enabling this
option isn't a problem when using pre 4.4 gcc.

To simplify the rewording, consolidate the text into lib/Kconfig.debug
and modify it there to be more explicit about when you should say N to
this config.

Also, make the text a bit more generic by stating that this option
enables compile time checks so we can cover architectures which emit
warnings vs.  ones which emit errors.  The details of how an
architecture decided to implement the checks isn't as important as the
concept of compile time checking of copy_from_user() calls.

While we're doing this, remove all the copy_from_user_overflow() code
that's duplicated many times and place it into lib/ so that any
architecture supporting this option can get the function for free.

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Helge Deller <deller@gmx.de>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-30 17:04:09 -07:00
Linus Torvalds 874f6d1be7 Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cleanups from Ingo Molnar:
 "Misc smaller cleanups"

* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/lib: Fix spelling, put space between a numeral and its units
  x86/lib: Fix spelling in the comments
  x86, quirks: Shut-up a long-standing gcc warning
  x86, msr: Unify variable names
  x86-64, docs, mm: Add vsyscall range to virtual address space layout
  x86: Drop KERNEL_IMAGE_START
  x86_64: Use __BOOT_DS instead_of __KERNEL_DS for safety
2013-04-30 08:34:07 -07:00
Andy Shevchenko d50ba3687b x86/lib: Fix spelling, put space between a numeral and its units
As suggested by Peter Anvin.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: H . Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-04-15 11:40:32 +02:00
Andy Shevchenko bb916ff7cd x86/lib: Fix spelling in the comments
Apparently 'byts' should be 'bytes'.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: H . Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-04-15 11:40:31 +02:00
CQ Tang 66db3feb48 x86-64: Fix the failure case in copy_user_handle_tail()
The increment of "to" in copy_user_handle_tail() will have incremented
before a failure has been noted.  This causes us to skip a byte in the
failure case.

Only do the increment when assured there is no failure.

Signed-off-by: CQ Tang <cq.tang@intel.com>
Link: http://lkml.kernel.org/r/20130318150221.8439.993.stgit@phlsvslse11.ph.intel.com
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: <stable@vger.kernel.org>
2013-03-18 11:32:03 -07:00
H. Peter Anvin 16640165c9 x86: Be consistent with data size in getuser.S
Consistently use the data register by name and use a sized assembly
instruction in getuser.S.  There is never any reason to macroize it,
and being inconsistent in the same file is just annoying.

No actual code change.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2013-02-11 23:14:48 -08:00
Ville Syrjälä 96477b4cd7 x86-32: Add support for 64bit get_user()
Implement __get_user_8() for x86-32. It will return the
64-bit result in edx:eax register pair, and ecx is used
to pass in the address and return the error value.

For consistency, change the register assignment for all
other __get_user_x() variants, so that address is passed in
ecx/rcx, the error value is returned in ecx/rcx, and eax/rax
contains the actual value.

[ hpa: I modified the patch so that it does NOT change the calling
  conventions for the existing callsites, this also means that the code
  is completely unchanged for 64 bits.

  Instead, continue to use eax for address input/error output and use
  the ecx:edx register pair for the output. ]

This is a partial refresh of a patch [1] by Jamie Lokier from
2004. Only the minimal changes to implement 64bit get_user()
were picked from the original patch.

[1] http://article.gmane.org/gmane.linux.kernel/198823

Originally-by: Jamie Lokier <jamie@shareable.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link:
http://lkml.kernel.org/r/1355312043-11467-1-git-send-email-ville.syrjala@linux.intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-02-07 15:07:28 -08:00
Greg Kroah-Hartman a18e3690a5 X86: drivers: remove __dev* attributes.
CONFIG_HOTPLUG is going away as an option.  As a result, the __dev*
markings need to be removed.

This change removes the use of __devinit, __devexit_p, __devinitconst,
and __devexit from these drivers.

Based on patches originally written by Bill Pemberton, but redone by me
in order to handle some of the coding style issues better, by hand.

Cc: Bill Pemberton <wfp5p@virginia.edu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Daniel Drake <dsd@laptop.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-03 15:57:04 -08:00
Linus Torvalds 743aa456c1 Merge branch 'x86-nuke386-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull "Nuke 386-DX/SX support" from Ingo Molnar:
 "This tree removes ancient-386-CPUs support and thus zaps quite a bit
  of complexity:

    24 files changed, 56 insertions(+), 425 deletions(-)

  ... which complexity has plagued us with extra work whenever we wanted
  to change SMP primitives, for years.

  Unfortunately there's a nostalgic cost: your old original 386 DX33
  system from early 1991 won't be able to boot modern Linux kernels
  anymore.  Sniff."

I'm not sentimental.  Good riddance.

* 'x86-nuke386-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, 386 removal: Document Nx586 as a 386 and thus unsupported
  x86, cleanups: Simplify sync_core() in the case of no CPUID
  x86, 386 removal: Remove CONFIG_X86_POPAD_OK
  x86, 386 removal: Remove CONFIG_X86_WP_WORKS_OK
  x86, 386 removal: Remove CONFIG_INVLPG
  x86, 386 removal: Remove CONFIG_BSWAP
  x86, 386 removal: Remove CONFIG_XADD
  x86, 386 removal: Remove CONFIG_CMPXCHG
  x86, 386 removal: Remove CONFIG_M386 from Kconfig
2012-12-11 19:59:32 -08:00
H. Peter Anvin a5c2a893db x86, 386 removal: Remove CONFIG_X86_WP_WORKS_OK
All 486+ CPUs support WP in supervisor mode, so remove the fallback
386 support code.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1354132230-21854-7-git-send-email-hpa@linux.intel.com
2012-11-29 13:23:03 -08:00
H. Peter Anvin d55c5a93db x86, 386 removal: Remove CONFIG_CMPXCHG
All 486+ CPUs support CMPXCHG, so remove the fallback 386 support
code.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1354132230-21854-3-git-send-email-hpa@linux.intel.com
2012-11-29 13:23:01 -08:00
Ma Ling 269833bd5a x86/asm: Clean up copy_page_*() comments and code
Modern CPUs use fast-string instruction to accelerate copy
performance, by combining data into 128 bit chunks.

Modify comments and coding style to match it.

Signed-off-by: Ma Ling <ling.ma@intel.com>
Cc: iant@google.com
Link: http://lkml.kernel.org/r/1350503565-19167-1-git-send-email-ling.ma@intel.com
[ Cleaned up the clean up. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-10-24 12:42:47 +02:00
David Howells c0522b6cc1 UAPI: x86: Fix insn_sanity build failure after UAPI split
Fix a build failure in the x86 insn_sanity program after the UAPI split.  The
problem is that insn_sanity.c #includes arch/x86/lib/insn.c - which uses the
kernel string header.  This leads to conflicts for various definitions against
the /usr/include/ headers.

linux/string.h can be replaced with the normal userspace string.h if __KERNEL__
is not specified.

  HOSTCC  arch/x86/tools/insn_sanity
In file included from /data/fs/linux-2.6-hdr/include/linux/string.h:6:0,
                 from /data/fs/linux-2.6-hdr/arch/x86/lib/insn.c:21,
                 from arch/x86/tools/insn_sanity.c:36:
/data/fs/linux-2.6-hdr/include/linux/types.h:14:26: error: conflicting types for 'fd_set'
/usr/include/sys/select.h:76:5: note: previous declaration of 'fd_set' was here
/data/fs/linux-2.6-hdr/include/linux/types.h:15:25: error: conflicting types for 'dev_t'
/usr/include/sys/types.h:61:17: note: previous declaration of 'dev_t' was here
/data/fs/linux-2.6-hdr/include/linux/types.h:25:26: error: conflicting types for 'timer_t'
/usr/include/time.h:104:19: note: previous declaration of 'timer_t' was here
/data/fs/linux-2.6-hdr/include/linux/types.h:45:26: error: conflicting types for 'loff_t'
/usr/include/sys/types.h:45:18: note: previous declaration of 'loff_t' was here
/data/fs/linux-2.6-hdr/include/linux/types.h:112:17: error: conflicting types for 'u_int64_t'
/usr/include/sys/types.h:204:1: note: previous declaration of 'u_int64_t' was here
/data/fs/linux-2.6-hdr/include/linux/types.h:113:17: error: conflicting types for 'int64_t'
/usr/include/sys/types.h:198:1: note: previous declaration of 'int64_t' was here
/data/fs/linux-2.6-hdr/include/linux/types.h:134:23: error: conflicting types for 'blkcnt_t'
/usr/include/sys/types.h:236:20: note: previous declaration of 'blkcnt_t' was here
In file included from /data/fs/linux-2.6-hdr/arch/x86/lib/insn.c:21:0,
                 from arch/x86/tools/insn_sanity.c:36:
/data/fs/linux-2.6-hdr/include/linux/string.h:38:12: error: expected identifier or '(' before '__extension__'
/data/fs/linux-2.6-hdr/include/linux/string.h:38:12: error: expected identifier or '(' before ')' token
/data/fs/linux-2.6-hdr/include/linux/string.h:41:12: error: expected identifier or '(' before '__extension__'
/data/fs/linux-2.6-hdr/include/linux/string.h:53:15: error: expected identifier or '(' before '__extension__'
/data/fs/linux-2.6-hdr/include/linux/string.h:61:28: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'skip_spaces'
/data/fs/linux-2.6-hdr/include/linux/string.h:65:28: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'char'
/data/fs/linux-2.6-hdr/include/linux/string.h:83:15: error: expected identifier or '(' before '__extension__'
/data/fs/linux-2.6-hdr/include/linux/string.h:83:15: error: expected identifier or '(' before ')' token
/data/fs/linux-2.6-hdr/include/linux/string.h:86:15: error: expected identifier or '(' before '__extension__'
/data/fs/linux-2.6-hdr/include/linux/string.h:86:15: error: expected identifier or '(' before ')' token
/data/fs/linux-2.6-hdr/include/linux/string.h:89:24: error: expected identifier or '(' before '__extension__'
/data/fs/linux-2.6-hdr/include/linux/string.h:89:24: error: expected identifier or '(' before ')' token
/data/fs/linux-2.6-hdr/include/linux/string.h:92:24: error: expected identifier or '(' before '__extension__'
/data/fs/linux-2.6-hdr/include/linux/string.h:92:24: error: expected identifier or '(' before ')' token

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Dave Jones <davej@redhat.com>
2012-10-02 18:01:56 +01:00
H. Peter Anvin 63bcff2a30 x86, smap: Add STAC and CLAC instructions to control user space access
When Supervisor Mode Access Prevention (SMAP) is enabled, access to
userspace from the kernel is controlled by the AC flag.  To make the
performance of manipulating that flag acceptable, there are two new
instructions, STAC and CLAC, to set and clear it.

This patch adds those instructions, via alternative(), when the SMAP
feature is enabled.  It also adds X86_EFLAGS_AC unconditionally to the
SYSCALL entry mask; there is simply no reason to make that one
conditional.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1348256595-29119-9-git-send-email-hpa@linux.intel.com
2012-09-21 12:45:27 -07:00
Ingo Molnar b0338e99b2 Merge branch 'x86/cpu' into perf/core
Merge this branch because we changed the wrmsr*_safe() API and there's
a conflict.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-07-05 21:12:11 +02:00
Linus Torvalds 15b77435ed Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar.

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, cpufeature: Remove stray %s, add -w to mkcapflags.pl
  x86, cpufeature: Catch duplicate CPU feature strings
  x86, cpufeature: Rename X86_FEATURE_DTS to X86_FEATURE_DTHERM
  x86: Fix kernel-doc warnings
  x86, compat: Use test_thread_flag(TIF_IA32) in compat signal delivery
2012-06-29 10:29:54 -07:00
Wanpeng Li c15acff337 x86: Fix kernel-doc warnings
Signed-off-by: Wanpeng Li <liwp@linux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Gavin Shan <shangw@linux.vnet.ibm.com>
Cc: Wanpeng Li <liwp.linux@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-18 10:53:18 +02:00
Stephane Eranian 25f4298582 perf/x86: Fix broken LBR fixup code
I noticed that the LBR fixups were not working anymore
on programs where they used to. I tracked this down to
a recent change to copy_from_user_nmi():

 db0dc75d64 ("perf/x86: Check user address explicitly in copy_from_user_nmi()")

This commit added a call to __range_not_ok() to the
copy_from_user_nmi() routine. The problem is that the logic
of the test must be reversed. __range_not_ok() returns 0 if the
range is VALID. We want to return early from copy_from_user_nmi()
if the range is NOT valid.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Arun Sharma <asharma@fb.com>
Link: http://lkml.kernel.org/r/20120611134426.GA7542@quad
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-13 15:00:28 +02:00
Andre Przywara 1f975f78c8 x86, pvops: Remove hooks for {rd,wr}msr_safe_regs
There were paravirt_ops hooks for the full register set variant of
{rd,wr}msr_safe which are actually not used by anyone anymore. Remove
them to make the code cleaner and avoid silent breakages when the pvops
members were uninitialized. This has been boot-tested natively and under
Xen with PVOPS enabled and disabled on one machine.

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Link: http://lkml.kernel.org/r/1338562358-28182-2-git-send-email-bp@amd64.org
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2012-06-07 11:41:08 -07:00
Arun Sharma db0dc75d64 perf/x86: Check user address explicitly in copy_from_user_nmi()
Signed-off-by: Arun Sharma <asharma@fb.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1334961696-19580-5-git-send-email-asharma@fb.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 17:08:04 +02:00
Masami Hiramatsu 436d03faf6 x86/decoder: Fix bsr/bsf/jmpe decoding with operand-size prefix
Fix the x86 instruction decoder to decode bsr/bsf/jmpe with
operand-size prefix (66h). This fixes the test case failure
reported by Linus, attached below.

bsf/bsr/jmpe have a special encoding. Opcode map in
Intel Software Developers Manual vol2 says they have
TZCNT/LZCNT variants if it has F3h prefix. However, there
is no information if it has other 66h or F2h prefixes.
Current instruction decoder supposes that those are
bad instructions, but it actually accepts at least
operand-size prefixes.

H. Peter Anvin further explains:

 " TZCNT/LZCNT are F3 + BSF/BSR exactly because the F2 and
   F3 prefixes have historically been no-ops with most instructions.
   This allows software to unconditionally use the prefixed versions
   and get TZCNT/LZCNT on the processors that have them if they don't
   care about the difference. "

This fixes errors reported by test_get_len:

  Warning: arch/x86/tools/test_get_len found difference at <em_bsf>:ffffffff81036d87
  Warning: ffffffff81036de5:	66 0f bc c2          	bsf    %dx,%ax
  Warning: objdump says 4 bytes, but insn_get_length() says 3
  Warning: arch/x86/tools/test_get_len found difference at <em_bsr>:ffffffff81036ea6
  Warning: ffffffff81036f04:	66 0f bd c2          	bsr    %dx,%ax
  Warning: objdump says 4 bytes, but insn_get_length() says 3
  Warning: decoded and checked 13298882 instructions with 2 warnings

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <yrl.pp-manager.tt@hitachi.com>
Link: http://lkml.kernel.org/r/20120604150911.22338.43296.stgit@localhost.localdomain
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 08:54:18 +02:00
Linus Torvalds 5723aa993d x86: use the new generic strnlen_user() function
This throws away the old x86-specific functions in favor of the generic
optimized version.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-26 11:33:54 -07:00
Linus Torvalds 4ae73f2d53 x86: use generic strncpy_from_user routine
The generic strncpy_from_user() is not really optimal, since it is
designed to work on both little-endian and big-endian.  And on
little-endian you can simplify much of the logic to find the first zero
byte, since little-endian arithmetic doesn't have to worry about the
carry bit propagating into earlier bytes (only later bytes, which we
don't care about).

But I have patches to make the generic routines use the architecture-
specific <asm/word-at-a-time.h> infrastructure, so that we can regain
the little-endian optimizations.  But before we do that, switch over to
the generic routines to make the patches each do just one well-defined
thing.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-26 10:14:39 -07:00
Linus Torvalds 269af9a1a0 Merge branch 'x86-extable-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull exception table generation updates from Ingo Molnar:
 "The biggest change here is to allow the build-time sorting of the
  exception table, to speed up booting.  This is achieved by the
  architecture enabling BUILDTIME_EXTABLE_SORT.  This option is enabled
  for x86 and MIPS currently.

  On x86 a number of fixes and changes were needed to allow build-time
  sorting of the exception table, in particular a relocation invariant
  exception table format was needed.  This required the abstracting out
  of exception table protocol and the removal of 20 years of accumulated
  assumptions about the x86 exception table format.

  While at it, this tree also cleans up various other aspects of
  exception handling, such as early(er) exception handling for
  rdmsr_safe() et al.

  All in one, as the result of these changes the x86 exception code is
  now pretty nice and modern.  As an added bonus any regressions in this
  code will be early and violent crashes, so if you see any of those,
  you'll know whom to blame!"

Fix up trivial conflicts in arch/{mips,x86}/Kconfig files due to nearby
modifications of other core architecture options.

* 'x86-extable-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (35 commits)
  Revert "x86, extable: Disable presorted exception table for now"
  scripts/sortextable: Handle relative entries, and other cleanups
  x86, extable: Switch to relative exception table entries
  x86, extable: Disable presorted exception table for now
  x86, extable: Add _ASM_EXTABLE_EX() macro
  x86, extable: Remove open-coded exception table entries in arch/x86/ia32/ia32entry.S
  x86, extable: Remove open-coded exception table entries in arch/x86/include/asm/xsave.h
  x86, extable: Remove open-coded exception table entries in arch/x86/include/asm/kvm_host.h
  x86, extable: Remove the now-unused __ASM_EX_SEC macros
  x86, extable: Remove open-coded exception table entries in arch/x86/xen/xen-asm_32.S
  x86, extable: Remove open-coded exception table entries in arch/x86/um/checksum_32.S
  x86, extable: Remove open-coded exception table entries in arch/x86/lib/usercopy_32.c
  x86, extable: Remove open-coded exception table entries in arch/x86/lib/putuser.S
  x86, extable: Remove open-coded exception table entries in arch/x86/lib/getuser.S
  x86, extable: Remove open-coded exception table entries in arch/x86/lib/csum-copy_64.S
  x86, extable: Remove open-coded exception table entries in arch/x86/lib/copy_user_nocache_64.S
  x86, extable: Remove open-coded exception table entries in arch/x86/lib/copy_user_64.S
  x86, extable: Remove open-coded exception table entries in arch/x86/lib/checksum_32.S
  x86, extable: Remove open-coded exception table entries in arch/x86/kernel/test_rodata.c
  x86, extable: Remove open-coded exception table entries in arch/x86/kernel/entry_64.S
  ...
2012-05-23 10:44:35 -07:00
Linus Torvalds 0749708352 x86: make word-at-a-time strncpy_from_user clear bytes at the end
This makes the newly optimized x86 strncpy_from_user clear the final
bytes in the word past the final NUL character, rather than copy them as
the word they were in the source.

NOTE! Unlike the silly semantics of the libc 'strncpy()' function, the
kernel strncpy_from_user() has never cleared all of the end of the
destination buffer.  And neither does it do so now: it only clears the
bytes at the end of the last word it copied.

So why make this change at all? It doesn't really cost us anything extra
(we have to calculate the mask to get the length anyway), and it means
that *if* any user actually cares about zeroing the whole buffer, they
can do a "memset()" before the strncpy_from_user(), and we will no
longer write random bytes after the NUL character.

In particular, the buffer contents will now at no point contain random
source data from beyond the end of the string.

In other words, it makes behavior a bit more repeatable at no new cost,
so it's a small cleanup.  I've been carrying this as a patch for the
last few weeks or so in my tree (done at the same time the sign error
was fixed in commit 12e993b894), I might as well commit it.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-28 14:27:38 -07:00
H. Peter Anvin 9c6751280b x86, extable: Remove open-coded exception table entries in arch/x86/lib/usercopy_32.c
Remove open-coded exception table entries in arch/x86/lib/usercopy_32.c,
and replace them with _ASM_EXTABLE() macros; this will allow us to
change the format and type of the exception table entries.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: David Daney <david.daney@cavium.com>
Link: http://lkml.kernel.org/r/CA%2B55aFyijf43qSu3N9nWHEBwaGbb7T2Oq9A=9EyR=Jtyqfq_cQ@mail.gmail.com
2012-04-20 13:51:39 -07:00
H. Peter Anvin a53a96e541 x86, extable: Remove open-coded exception table entries in arch/x86/lib/putuser.S
Remove open-coded exception table entries in arch/x86/lib/putuser.S,
and replace them with _ASM_EXTABLE() macros; this will allow us to
change the format and type of the exception table entries.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: David Daney <david.daney@cavium.com>
Link: http://lkml.kernel.org/r/CA%2B55aFyijf43qSu3N9nWHEBwaGbb7T2Oq9A=9EyR=Jtyqfq_cQ@mail.gmail.com
2012-04-20 13:51:39 -07:00
H. Peter Anvin 1a27bc0d99 x86, extable: Remove open-coded exception table entries in arch/x86/lib/getuser.S
Remove open-coded exception table entries in arch/x86/lib/getuser.S,
and replace them with _ASM_EXTABLE() macros; this will allow us to
change the format and type of the exception table entries.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: David Daney <david.daney@cavium.com>
Link: http://lkml.kernel.org/r/CA%2B55aFyijf43qSu3N9nWHEBwaGbb7T2Oq9A=9EyR=Jtyqfq_cQ@mail.gmail.com
2012-04-20 13:51:39 -07:00
H. Peter Anvin 015e6f11a9 x86, extable: Remove open-coded exception table entries in arch/x86/lib/csum-copy_64.S
Remove open-coded exception table entries in arch/x86/lib/csum-copy_64.S,
and replace them with _ASM_EXTABLE() macros; this will allow us to
change the format and type of the exception table entries.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: David Daney <david.daney@cavium.com>
Link: http://lkml.kernel.org/r/CA%2B55aFyijf43qSu3N9nWHEBwaGbb7T2Oq9A=9EyR=Jtyqfq_cQ@mail.gmail.com
2012-04-20 13:51:39 -07:00
H. Peter Anvin 0d8559feaf x86, extable: Remove open-coded exception table entries in arch/x86/lib/copy_user_nocache_64.S
Remove open-coded exception table entries in arch/x86/lib/copy_user_nocache_64.S,
and replace them with _ASM_EXTABLE() macros; this will allow us to
change the format and type of the exception table entries.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: David Daney <david.daney@cavium.com>
Link: http://lkml.kernel.org/r/CA%2B55aFyijf43qSu3N9nWHEBwaGbb7T2Oq9A=9EyR=Jtyqfq_cQ@mail.gmail.com
2012-04-20 13:51:38 -07:00
H. Peter Anvin 9732da8ca8 x86, extable: Remove open-coded exception table entries in arch/x86/lib/copy_user_64.S
Remove open-coded exception table entries in arch/x86/lib/copy_user_64.S,
and replace them with _ASM_EXTABLE() macros; this will allow us to
change the format and type of the exception table entries.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: David Daney <david.daney@cavium.com>
Link: http://lkml.kernel.org/r/CA%2B55aFyijf43qSu3N9nWHEBwaGbb7T2Oq9A=9EyR=Jtyqfq_cQ@mail.gmail.com
2012-04-20 13:51:38 -07:00
H. Peter Anvin 5f2e8a84f0 x86, extable: Remove open-coded exception table entries in arch/x86/lib/checksum_32.S
Remove open-coded exception table entries in arch/x86/lib/checksum_32.S,
and replace them with _ASM_EXTABLE() macros; this will allow us to
change the format and type of the exception table entries.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: David Daney <david.daney@cavium.com>
Link: http://lkml.kernel.org/r/CA%2B55aFyijf43qSu3N9nWHEBwaGbb7T2Oq9A=9EyR=Jtyqfq_cQ@mail.gmail.com
2012-04-20 13:51:38 -07:00
Linus Torvalds 4643b05662 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar.

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Handle failures of parsing immediate operands in the instruction decoder
  perf archive: Correct cutting of symbolic link
  perf tools: Ignore auto-generated bison/flex files
  perf tools: Fix parsers' rules to dependencies
  perf tools: fix NO_GTK2 Makefile config error
  perf session: Skip event correctly for unknown id/machine
2012-04-16 18:35:21 -07:00
Masami Hiramatsu 6c7b8e82aa x86: Handle failures of parsing immediate operands in the instruction decoder
This can happen if the instruction is much longer than the maximum length,
or if insn->opnd_bytes is manually changed.

This patch also fixes warnings from -Wswitch-default flag.

Reported-by: Prashanth Nageshappa <prashanth@linux.vnet.ibm.com>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Jim Keniston <jkenisto@linux.vnet.ibm.com>
Cc: Linux-mm <linux-mm@kvack.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Anton Arapov <anton@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: yrl.pp-manager.tt@hitachi.com
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20120413032427.32577.42602.stgit@localhost.localdomain
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-04-16 08:56:11 +02:00
Linus Torvalds 12e993b894 x86-32: fix up strncpy_from_user() sign error
The 'max' range needs to be unsigned, since the size of the user address
space is bigger than 2GB.

We know that 'count' is positive in 'long' (that is checked in the
caller), so we will truncate 'max' down to something that fits in a
signed long, but before we actually do that, that comparison needs to be
done in unsigned.

Bug introduced in commit 92ae03f2ef ("x86: merge 32/64-bit versions of
'strncpy_from_user()' and speed it up").  On x86-64 you can't trigger
this, since the user address space is much smaller than 63 bits, and on
x86-32 it works in practice, since you would seldom hit the strncpy
limits anyway.

I had actually tested the corner-cases, I had only tested them on
x86-64.  Besides, I had only worried about the case of a pointer *close*
to the end of the address space, rather than really far away from it ;)

This also changes the "we hit the user-specified maximum" to return
'res', for the trivial reason that gcc seems to generate better code
that way.  'res' and 'count' are the same in that case, so it really
doesn't matter which one we return.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-15 17:23:00 -07:00
Linus Torvalds 92ae03f2ef x86: merge 32/64-bit versions of 'strncpy_from_user()' and speed it up
This merges the 32- and 64-bit versions of the x86 strncpy_from_user()
by just rewriting it in C rather than the ancient inline asm versions
that used lodsb/stosb and had been duplicated for (trivial) differences
between the 32-bit and 64-bit versions.

While doing that, it also speeds them up by doing the accesses a word at
a time.  Finally, the new routines also properly handle the case of
hitting the end of the address space, which we have never done correctly
before (fs/namei.c has a hack around it for that reason).

Despite all these improvements, it actually removes more lines than it
adds, due to the de-duplication.  Also, we no longer export (or define)
the legacy __strncpy_from_user() function (that was defined to not do
the user permission checks), since it's not actually used anywhere, and
the user address space checks are built in to the new code.

Other architecture maintainers have been notified that the old hack in
fs/namei.c will be going away in the 3.5 merge window, in case they
copied the x86 approach of being a bit cavalier about the end of the
address space.

Cc: linux-arch@vger.kernel.org
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Anvin" <hpa@zytor.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-11 09:41:28 -07:00
Linus Torvalds 1b674bf106 Merge branch 'x86-atomic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/atomic changes from Ingo Molnar.

* 'x86-atomic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: atomic64 assembly improvements
  x86: Adjust asm constraints in atomic64 wrappers
2012-03-22 09:23:57 -07:00
Linus Torvalds e17fdf5c67 Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/asm changes from Ingo Molnar

* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Include probe_roms.h in probe_roms.c
  x86/32: Print control and debug registers for kerenel context
  x86: Tighten dependencies of CPU_SUP_*_32
  x86/numa: Improve internode cache alignment
  x86: Fix the NMI nesting comments
  x86-64: Improve insn scheduling in SAVE_ARGS_IRQ
  x86-64: Fix CFI annotations for NMI nesting code
  bitops: Add missing parentheses to new get_order macro
  bitops: Optimise get_order()
  bitops: Adjust the comment on get_order() to describe the size==0 case
  x86/spinlocks: Eliminate TICKET_MASK
  x86-64: Handle byte-wise tail copying in memcpy() without a loop
  x86-64: Fix memcpy() to support sizes of 4Gb and above
  x86-64: Fix memset() to support sizes of 4Gb and above
  x86-64: Slightly shorten copy_page()
2012-03-22 09:13:24 -07:00
Linus Torvalds 9f3938346a Merge branch 'kmap_atomic' of git://github.com/congwang/linux
Pull kmap_atomic cleanup from Cong Wang.

It's been in -next for a long time, and it gets rid of the (no longer
used) second argument to k[un]map_atomic().

Fix up a few trivial conflicts in various drivers, and do an "evil
merge" to catch some new uses that have come in since Cong's tree.

* 'kmap_atomic' of git://github.com/congwang/linux: (59 commits)
  feature-removal-schedule.txt: schedule the deprecated form of kmap_atomic() for removal
  highmem: kill all __kmap_atomic() [swarren@nvidia.com: highmem: Fix ARM build break due to __kmap_atomic rename]
  drbd: remove the second argument of k[un]map_atomic()
  zcache: remove the second argument of k[un]map_atomic()
  gma500: remove the second argument of k[un]map_atomic()
  dm: remove the second argument of k[un]map_atomic()
  tomoyo: remove the second argument of k[un]map_atomic()
  sunrpc: remove the second argument of k[un]map_atomic()
  rds: remove the second argument of k[un]map_atomic()
  net: remove the second argument of k[un]map_atomic()
  mm: remove the second argument of k[un]map_atomic()
  lib: remove the second argument of k[un]map_atomic()
  power: remove the second argument of k[un]map_atomic()
  kdb: remove the second argument of k[un]map_atomic()
  udf: remove the second argument of k[un]map_atomic()
  ubifs: remove the second argument of k[un]map_atomic()
  squashfs: remove the second argument of k[un]map_atomic()
  reiserfs: remove the second argument of k[un]map_atomic()
  ocfs2: remove the second argument of k[un]map_atomic()
  ntfs: remove the second argument of k[un]map_atomic()
  ...
2012-03-21 09:40:26 -07:00
Cong Wang 8fd75e1216 x86: remove the second argument of k[un]map_atomic()
Acked-by: Avi Kivity <avi@redhat.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Cong Wang <amwang@redhat.com>
2012-03-20 21:48:15 +08:00
Ingo Molnar 35239e23c6 Merge branch 'perf/urgent' into perf/core
Merge reason: We are going to queue up a dependent patch.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-12 20:44:11 +01:00
Thomas Gleixner a7f4255f90 x86: Derandom delay_tsc for 64 bit
Commit f0fbf0abc0 ("x86: integrate delay functions") converted
delay_tsc() into a random delay generator for 64 bit.  The reason is
that it merged the mostly identical versions of delay_32.c and
delay_64.c.  Though the subtle difference of the result was:

 static void delay_tsc(unsigned long loops)
 {
-	unsigned bclock, now;
+	unsigned long bclock, now;

Now the function uses rdtscl() which returns the lower 32bit of the
TSC. On 32bit that's not problematic as unsigned long is 32bit. On 64
bit this fails when the lower 32bit are close to wrap around when
bclock is read, because the following check

       if ((now - bclock) >= loops)
       	  	break;

evaluated to true on 64bit for e.g. bclock = 0xffffffff and now = 0
because the unsigned long (now - bclock) of these values results in
0xffffffff00000001 which is definitely larger than the loops
value. That explains Tvortkos observation:

"Because I am seeing udelay(500) (_occasionally_) being short, and
 that by delaying for some duration between 0us (yep) and 491us."

Make those variables explicitely u32 again, so this works for both 32
and 64 bit.

Reported-by: Tvrtko Ursulin <tvrtko.ursulin@onelan.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org # >= 2.6.27
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-09 12:43:27 -08:00
Ingo Molnar 458ce2910a Merge branch 'linus' into x86/asm
Sync up the latest NMI fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-02-28 10:27:36 +01:00
Masami Hiramatsu f8d98f1095 x86: Fix to decode grouped AVX with VEX pp bits
Fix to decode grouped AVX with VEX pp bits which should be
handled as same as last-prefixes. This fixes below warnings
in posttest with CONFIG_CRYPTO_SHA1_SSSE3=y.

 Warning: arch/x86/tools/test_get_len found difference at <sha1_transform_avx>:ffffffff810d5fc0
 Warning: ffffffff810d6069:	c5 f9 73 de 04       	vpsrldq $0x4,%xmm6,%xmm0
 Warning: objdump says 5 bytes, but insn_get_length() says 4
 ...

With this change, test_get_len can decode it correctly.

 $ arch/x86/tools/test_get_len -v -y
 ffffffff810d6069:       c5 f9 73 de 04          vpsrldq $0x4,%xmm6,%xmm0
 Succeed: decoded and checked 1 instructions

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: yrl.pp-manager.tt@hitachi.com
Link: http://lkml.kernel.org/r/20120210053340.30429.73410.stgit@localhost.localdomain
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-02-11 15:11:35 +01:00
Jan Beulich 9d8e22777e x86-64: Handle byte-wise tail copying in memcpy() without a loop
While hard to measure, reducing the number of possibly/likely
mis-predicted branches can generally be expected to be slightly
better.

Other than apparent at the first glance, this also doesn't grow
the function size (the alignment gap to the next function just
gets smaller).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/4F218584020000780006F422@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-01-26 21:19:20 +01:00
Jan Beulich 2ab560911a x86-64: Fix memcpy() to support sizes of 4Gb and above
While currently there doesn't appear to be any reachable in-tree
case where such large memory blocks may be passed to memcpy(),
we already had hit the problem in our Xen kernels. Just like
done recently for mmeset(), rather than working around it,
prevent others from falling into the same trap by fixing this
long standing limitation.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/4F21846F020000780006F3FA@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-01-26 21:19:18 +01:00
Jan Beulich 5d7244e7c9 x86-64: Fix memset() to support sizes of 4Gb and above
While currently there doesn't appear to be any reachable in-tree
case where such large memory blocks may be passed to memset()
(alloc_bootmem() being the primary non-reachable one, as it gets
called with suitably large sizes in FLATMEM configurations), we
have recently hit the problem a second time in our Xen kernels.

Rather than working around it a second time, prevent others from
falling into the same trap by fixing this long standing
limitation.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/4F05D992020000780006AA09@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-01-26 11:50:04 +01:00
Jan Beulich cb8095bba6 x86: atomic64 assembly improvements
In the "xchg" implementation, %ebx and %ecx don't need to be copied
into %eax and %edx respectively (this is only necessary when desiring
to only read the stored value).

In the "add_unless" implementation, swapping the use of %ecx and %esi
for passing arguments allows %esi to become an input only (i.e.
permitting the register to be re-used to address the same object
without reload).

In "{add,sub}_return", doing the initial read64 through the passed in
%ecx decreases a register dependency.

In "inc_not_zero", a branch can be eliminated by or-ing together the
two halves of the current (64-bit) value, and code size can be further
reduced by adjusting the arithmetic slightly.

v2: Undo the folding of "xchg" and "set".

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Link: http://lkml.kernel.org/r/4F19A2BC020000780006E0DC@nat28.tlf.novell.com
Cc: Luca Barbieri <luca@luca-barbieri.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-01-20 17:29:49 -08:00
Jan Beulich 819165fb34 x86: Adjust asm constraints in atomic64 wrappers
Eric pointed out overly restrictive constraints in atomic64_set(), but
there are issues throughout the file. In the cited case, %ebx and %ecx
are inputs only (don't get changed by either of the two low level
implementations). This was also the case elsewhere.

Further in many cases early-clobber indicators were missing.

Finally, the previous implementation rolled a custom alternative
instruction macro from scratch, rather than using alternative_call()
(which was introduced with the commit that the description of the
change in question actually refers to). Adjusting has the benefit of
not hiding referenced symbols from the compiler, which however requires
them to be declared not just in the exporting source file (which, as a
desirable side effect, in turn allows that exporting file to become a
real 5-line stub).

This patch does not eliminate the overly restrictive memory clobbers,
however: Doing so would occasionally make the compiler set up a second
register for accessing the memory object (to satisfy the added "m"
constraint), and it's not clear which of the two non-optimal
alternatives is better.

v2: Re-do the declaration and exporting of the internal symbols.

Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Link: http://lkml.kernel.org/r/4F19A2A5020000780006E0D9@nat28.tlf.novell.com
Cc: Luca Barbieri <luca@luca-barbieri.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-01-20 17:29:31 -08:00
Linus Torvalds 567e47935a Merge branches 'sched-urgent-for-linus', 'perf-urgent-for-linus' and 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/accounting, proc: Fix /proc/stat interrupts sum

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tracepoints/module: Fix disabling tracepoints with taint CRAP or OOT
  x86/kprobes: Add arch/x86/tools/insn_sanity to .gitignore
  x86/kprobes: Fix typo transferred from Intel manual

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, syscall: Need __ARCH_WANT_SYS_IPC for 32 bits
  x86, tsc: Fix SMI induced variation in quick_pit_calibrate()
  x86, opcode: ANDN and Group 17 in x86-opcode-map.txt
  x86/kconfig: Move the ZONE_DMA entry under a menu
  x86/UV2: Add accounting for BAU strong nacks
  x86/UV2: Ack BAU interrupt earlier
  x86/UV2: Remove stale no-resources test for UV2 BAU
  x86/UV2: Work around BAU bug
  x86/UV2: Fix BAU destination timeout initialization
  x86/UV2: Fix new UV2 hardware by using native UV2 broadcast mode
  x86: Get rid of dubious one-bit signed bitfield
2012-01-19 14:53:06 -08:00
Ulrich Drepper ce79dac861 x86, opcode: ANDN and Group 17 in x86-opcode-map.txt
The Intel documentation at

http://software.intel.com/file/36945

shows the ANDN opcode and Group 17 with encoding f2 and f3 encoding
respectively.  The current version of x86-opcode-map.txt shows them
with f3 and f4.  Unless someone can point to documentation which shows
the currently used encoding the following patch be applied.

Signed-off-by: Ulrich Drepper <drepper@gmail.com>
Link: http://lkml.kernel.org/r/CAOPLpQdq5SuVo9=023CYhbFLAX9rONyjmYq7jJkqc5xwctW5eA@mail.gmail.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2012-01-17 12:11:54 -08:00
Ulrich Drepper 8d973b624e x86/kprobes: Fix typo transferred from Intel manual
The arch/x86/lib/x86-opcode-map.txt file [used by the
kprobes instruction decoder] contains the line:

  af: SCAS/W/D/Q rAX,Xv

This is what the Intel manuals show, but it's not correct.
The 'X' stands for:

  Memory addressed by the DS:rSI register pair (for example, MOVS, CMPS, OUTS, or LODS).

On the other hand 'Y' means (also see the ae byte entry for
SCASB):

  Memory addressed by the ES:rDI register pair (for example, MOVS, CMPS, INS, STOS, or SCAS).

Signed-off-by: Ulrich Drepper <drepper@gmail.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: yrl.pp-manager.tt@hitachi.com
Link: http://lkml.kernel.org/r/CAOPLpQfytPyDEBF1Hbkpo7ovUerEsstVGxBr%3DEpDL-BKEMaqLA@mail.gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-01-16 08:20:36 +01:00
Linus Torvalds 69734b644b Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
  x86: Fix atomic64_xxx_cx8() functions
  x86: Fix and improve cmpxchg_double{,_local}()
  x86_64, asm: Optimise fls(), ffs() and fls64()
  x86, bitops: Move fls64.h inside __KERNEL__
  x86: Fix and improve percpu_cmpxchg{8,16}b_double()
  x86: Report cpb and eff_freq_ro flags correctly
  x86/i386: Use less assembly in strlen(), speed things up a bit
  x86: Use the same node_distance for 32 and 64-bit
  x86: Fix rflags in FAKE_STACK_FRAME
  x86: Clean up and extend do_int3()
  x86: Call do_notify_resume() with interrupts enabled
  x86/div64: Add a micro-optimization shortcut if base is power of two
  x86-64: Cleanup some assembly entry points
  x86-64: Slightly shorten line system call entry and exit paths
  x86-64: Reduce amount of redundant code generated for invalidate_interruptNN
  x86-64: Slightly shorten int_ret_from_sys_call
  x86, efi: Convert efi_phys_get_time() args to physical addresses
  x86: Default to vsyscall=emulate
  x86-64: Set siginfo and context on vsyscall emulation faults
  x86: consolidate xchg and xadd macros
  ...
2012-01-06 13:59:14 -08:00
Jan Beulich 4269329090 x86-64: Slightly shorten copy_page()
%r13 got saved and restored without ever getting touched, so
there's no need to do so.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/4F05D9F9020000780006AA0D@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-01-06 12:25:37 +01:00
Alexey Dobriyan 890890cb8e x86/i386: Use less assembly in strlen(), speed things up a bit
Current i386 strlen() hardcodes NOT/DEC sequence. DEC is
mentioned to be suboptimal on Core2. So, put only REPNE SCASB
sequence in assembly, compiler can do the rest.

The difference in generated code is like below (MCORE2=y):

	<strlen>:
		push   %edi
		mov    $0xffffffff,%ecx
		mov    %eax,%edi
		xor    %eax,%eax
		repnz scas %es:(%edi),%al
		not    %ecx

	-	dec    %ecx
	-	mov    %ecx,%eax
	+	lea    -0x1(%ecx),%eax

		pop    %edi
		ret

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jan Beulich <JBeulich@suse.com>
Link: http://lkml.kernel.org/r/20111211181319.GA17097@p183.telecom.by
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-12-12 18:33:42 +01:00
Masami Hiramatsu a9c373d033 x86: Update instruction decoder to support new AVX formats
Since new Intel software developers manual introduces
new format for AVX instruction set (including AVX2),
it is important to update x86-opcode-map.txt to fit
those changes.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: yrl.pp-manager.tt@hitachi.com
Link: http://lkml.kernel.org/r/20111205120557.15475.13236.stgit@cloud
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-12-05 14:53:21 +01:00
Masami Hiramatsu 130b78b2bf x86: Fix instruction decoder to handle grouped AVX instructions
For reducing memory usage of attribute table, x86 instruction
decoder puts "Group" attribute only on "no-last-prefix"
attribute table (same as vex_p == 0 case).

Thus, the decoder should look no-last-prefix table first, and
then only if it is not a group, move on to "with-last-prefix"
table (vex_p != 0).

However, current implementation, inat_get_avx_attribute()
looks with-last-prefix directly. So, when decoding
a grouped AVX instruction, the decoder fails to find correct
group because there is no "Group" attribute on the table.
This ends up with the mis-decoding of instructions, as Ingo
reported in http://thread.gmane.org/gmane.linux.kernel/1214103

This patch fixes it to check no-last-prefix table first
even if that is an AVX instruction, and get an attribute from
"with last-prefix" table only if that is not a group.

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: yrl.pp-manager.tt@hitachi.com
Link: http://lkml.kernel.org/r/20111205120539.15475.91428.stgit@cloud
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-12-05 14:53:15 +01:00
Masami Hiramatsu 53a019a951 x86: Fix insn decoder for longer instruction
Fix x86 insn decoder for hardening against invalid length
instructions. This adds length checkings for each byte-read
site and if it exceeds MAX_INSN_SIZE, returns immediately.
This can happen when decoding user-space binary.

Caller can check whether it happened by checking insn.*.got
member is set or not.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: acme@redhat.com
Cc: ming.m.lin@intel.com
Cc: robert.richter@amd.com
Cc: ravitillo@lbl.gov
Cc: yrl.pp-manager.tt@hitachi.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20111007133155.10933.58577.stgit@localhost.localdomain
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-10-10 09:05:51 +02:00
Arun Sharma 60063497a9 atomic: use <linux/atomic.h>
This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>

Signed-off-by: Arun Sharma <asharma@fb.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-26 16:49:47 -07:00
Linus Torvalds 8e204874db Merge branch 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86-64, vdso: Do not allocate memory for the vDSO
  clocksource: Change __ARCH_HAS_CLOCKSOURCE_DATA to a CONFIG option
  x86, vdso: Drop now wrong comment
  Document the vDSO and add a reference parser
  ia64: Replace clocksource.fsys_mmio with generic arch data
  x86-64: Move vread_tsc and vread_hpet into the vDSO
  clocksource: Replace vread with generic arch data
  x86-64: Add --no-undefined to vDSO build
  x86-64: Allow alternative patching in the vDSO
  x86: Make alternative instruction pointers relative
  x86-64: Improve vsyscall emulation CS and RIP handling
  x86-64: Emulate legacy vsyscalls
  x86-64: Fill unused parts of the vsyscall page with 0xcc
  x86-64: Remove vsyscall number 3 (venosys)
  x86-64: Map the HPET NX
  x86-64: Remove kernel.vsyscall64 sysctl
  x86-64: Give vvars their own page
  x86-64: Document some of entry_64.S
  x86-64: Fix alignment of jiffies variable
2011-07-22 17:05:15 -07:00
Linus Torvalds eb47418dc5 Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Fix write lock scalability 64-bit issue
  x86: Unify rwsem assembly implementation
  x86: Unify rwlock assembly implementation
  x86, asm: Fix binutils 2.16 issue with __USER32_CS
  x86, asm: Cleanup thunk_64.S
  x86, asm: Flip RESTORE_ARGS arguments logic
  x86, asm: Flip SAVE_ARGS arguments logic
  x86, asm: Thin down SAVE/RESTORE_* asm macros
2011-07-22 17:02:24 -07:00
Robert Richter 1ac2e6ca44 x86, perf: Make copy_from_user_nmi() a library function
copy_from_user_nmi() is used in oprofile and perf. Moving it to other
library functions like copy_from_user(). As this is x86 code for 32
and 64 bits, create a new file usercopy.c for unified code.

Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110607172413.GJ20052@erda.amd.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-21 20:41:57 +02:00
Jan Beulich a750036f35 x86: Fix write lock scalability 64-bit issue
With the write lock path simply subtracting RW_LOCK_BIAS there
is, on large systems, the theoretical possibility of overflowing
the 32-bit value that was used so far (namely if 128 or more
CPUs manage to do the subtraction, but don't get to do the
inverse addition in the failure path quickly enough).

A first measure is to modify RW_LOCK_BIAS itself - with the new
value chosen, it is good for up to 2048 CPUs each allowed to
nest over 2048 times on the read path without causing an issue.
Quite possibly it would even be sufficient to adjust the bias a
little further, assuming that allowing for significantly less
nesting would suffice.

However, as the original value chosen allowed for even more
nesting levels, to support more than 2048 CPUs (possible
currently only for 64-bit kernels) the lock itself gets widened
to 64 bits.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/4E258E0D020000780004E3F0@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-21 09:03:36 +02:00
Jan Beulich a738669464 x86: Unify rwsem assembly implementation
Rather than having two functionally identical implementations
for 32- and 64-bit configurations, use the previously extended
assembly abstractions to fold the rwsem two implementations into
a shared one.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/4E258DF3020000780004E3ED@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-21 09:03:32 +02:00
Jan Beulich 4625cd6379 x86: Unify rwlock assembly implementation
Rather than having two functionally identical implementations
for 32- and 64-bit configurations, extend the existing assembly
abstractions enough to fold the two rwlock implementations into
a shared one.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/4E258DD7020000780004E3EA@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-21 09:03:31 +02:00
Andy Lutomirski 59e97e4d6f x86: Make alternative instruction pointers relative
This save a few bytes on x86-64 and means that future patches can
apply alternatives to unrelocated code.

Signed-off-by: Andy Lutomirski <luto@mit.edu>
Link: http://lkml.kernel.org/r/ff64a6b9a1a3860ca4a7b8b6dc7b4754f9491cd7.1310563276.git.luto@mit.edu
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-07-13 11:22:56 -07:00
Borislav Petkov 38e6b75d3b x86, asm: Cleanup thunk_64.S
Drop thunk_ra macro in favor of an additional argument to the thunk
macro since their bodies are almost identical. Do a whitespace scrubbing
and use CFI-aware macros for full annotation.

Signed-off-by: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/1306873314-32523-5-git-send-email-bp@alien8.de
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-06-03 14:38:55 -07:00
Linus Torvalds 17b141803c Merge branches 'x86-apic-for-linus', 'x86-asm-for-linus' and 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, apic: Print verbose error interrupt reason on apic=debug

* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Demacro CONFIG_PARAVIRT cpu accessors

* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Fix mrst sparse complaints
  x86: Fix spelling error in the memcpy() source code comment
  x86, mpparse: Remove unnecessary variable
2011-05-19 17:49:35 -07:00
Jiri Olsa 26afb7c661 x86, 64-bit: Fix copy_[to/from]_user() checks for the userspace address limit
As reported in BZ #30352:

  https://bugzilla.kernel.org/show_bug.cgi?id=30352

there's a kernel bug related to reading the last allowed page on x86_64.

The _copy_to_user() and _copy_from_user() functions use the following
check for address limit:

  if (buf + size >= limit)
	fail();

while it should be more permissive:

  if (buf + size > limit)
	fail();

That's because the size represents the number of bytes being
read/write from/to buf address AND including the buf address.
So the copy function will actually never touch the limit
address even if "buf + size == limit".

Following program fails to use the last page as buffer
due to the wrong limit check:

 #include <sys/mman.h>
 #include <sys/socket.h>
 #include <assert.h>

 #define PAGE_SIZE       (4096)
 #define LAST_PAGE       ((void*)(0x7fffffffe000))

 int main()
 {
        int fds[2], err;
        void * ptr = mmap(LAST_PAGE, PAGE_SIZE, PROT_READ | PROT_WRITE,
                          MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0);
        assert(ptr == LAST_PAGE);
        err = socketpair(AF_LOCAL, SOCK_STREAM, 0, fds);
        assert(err == 0);
        err = send(fds[0], ptr, PAGE_SIZE, 0);
        perror("send");
        assert(err == PAGE_SIZE);
        err = recv(fds[1], ptr, PAGE_SIZE, MSG_WAITALL);
        perror("recv");
        assert(err == PAGE_SIZE);
        return 0;
 }

The other place checking the addr limit is the access_ok() function,
which is working properly. There's just a misleading comment
for the __range_not_ok() macro - which this patch fixes as well.

The last page of the user-space address range is a guard page and
Brian Gerst observed that the guard page itself due to an erratum on K8 cpus
(#121 Sequential Execution Across Non-Canonical Boundary Causes Processor
Hang).

However, the test code is using the last valid page before the guard page.
The bug is that the last byte before the guard page can't be read
because of the off-by-one error. The guard page is left in place.

This bug would normally not show up because the last page is
part of the process stack and never accessed via syscalls.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Brian Gerst <brgerst@gmail.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@kernel.org>
Link: http://lkml.kernel.org/r/1305210630-7136-1-git-send-email-jolsa@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-18 12:49:00 +02:00
Fenghua Yu 2f19e06ac3 x86, mem: memset_64.S: Optimize memset by enhanced REP MOVSB/STOSB
Support memset() with enhanced rep stosb. On processors supporting enhanced
REP MOVSB/STOSB, the alternative memset_c_e function using enhanced rep stosb
overrides the fast string alternative memset_c and the original function.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/1305671358-14478-10-git-send-email-fenghua.yu@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-05-17 15:40:31 -07:00
Fenghua Yu 057e05c1d6 x86, mem: memmove_64.S: Optimize memmove by enhanced REP MOVSB/STOSB
Support memmove() by enhanced rep movsb. On processors supporting enhanced
REP MOVSB/STOSB, the alternative memmove() function using enhanced rep movsb
overrides the original function.

The patch doesn't change the backward memmove case to use enhanced rep
movsb.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/1305671358-14478-9-git-send-email-fenghua.yu@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-05-17 15:40:30 -07:00
Fenghua Yu 101068c1f4 x86, mem: memcpy_64.S: Optimize memcpy by enhanced REP MOVSB/STOSB
Support memcpy() with enhanced rep movsb. On processors supporting enhanced
rep movsb, the alternative memcpy() function using enhanced rep movsb overrides the original function and the fast string
function.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/1305671358-14478-8-git-send-email-fenghua.yu@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-05-17 15:40:29 -07:00
Fenghua Yu 4307bec934 x86, mem: copy_user_64.S: Support copy_to/from_user by enhanced REP MOVSB/STOSB
Support copy_to_user/copy_from_user() by enhanced REP MOVSB/STOSB.
On processors supporting enhanced REP MOVSB/STOSB, the alternative
copy_user_enhanced_fast_string function using enhanced rep movsb overrides the
original function and the fast string function.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/1305671358-14478-7-git-send-email-fenghua.yu@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-05-17 15:40:28 -07:00
Fenghua Yu e365c9df2f x86, mem: clear_page_64.S: Support clear_page() with enhanced REP MOVSB/STOSB
Intel processors are adding enhancements to REP MOVSB/STOSB and the use of
REP MOVSB/STOSB for optimal memcpy/memset or similar functions is recommended.
Enhancement availability is indicated by CPUID.7.0.EBX[9] (Enhanced REP MOVSB/
STOSB).

Support clear_page() with rep stosb for processor supporting enhanced REP MOVSB
/STOSB. On processors supporting enhanced REP MOVSB/STOSB, the alternative
clear_page_c_e function using enhanced REP STOSB overrides the original function
and the fast string function.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/1305671358-14478-6-git-send-email-fenghua.yu@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-05-17 15:40:27 -07:00
Ingo Molnar 4cb1f43ce8 Merge commit 'v2.6.39-rc6' into x86/cleanups
Merge reason: move to a (much) newer upstream base.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-07 10:51:48 +02:00
Bart Van Assche 9de4966a4d x86: Fix spelling error in the memcpy() source code comment
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: http://lkml.kernel.org/r/201105011409.21629.bvanassche@acm.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-01 19:16:18 +02:00
Christoph Lameter d7c3f8cee8 percpu: Omit segment prefix in the UP case for cmpxchg_double
Omit the segment prefix in the UP case. GS is not used then
and we will generate segfaults if cmpxchg16b is used otherwise.

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-27 19:25:36 -07:00
Linus Torvalds f2e1fbb5f2 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Flush TLB if PGD entry is changed in i386 PAE mode
  x86, dumpstack: Correct stack dump info when frame pointer is available
  x86: Clean up csum-copy_64.S a bit
  x86: Fix common misspellings
  x86: Fix misspelling and align params
  x86: Use PentiumPro-optimized partial_csum() on VIA C7
2011-03-18 10:45:21 -07:00
Ingo Molnar 2c76397bdd x86: Clean up csum-copy_64.S a bit
The many stray whitespaces and other uncleanlinesses made this code
almost unreadable to me - so fix those.

No changes to the code.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-18 10:44:26 +01:00
Lucas De Marchi 0d2eb44f63 x86: Fix common misspellings
They were generated by 'codespell' and then manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
Cc: trivial@kernel.org
LKML-Reference: <1300389856-1099-3-git-send-email-lucas.demarchi@profusion.mobi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-18 10:39:30 +01:00
Linus Torvalds 79d8a8f736 Merge branch 'for-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
* 'for-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  percpu, x86: Add arch-specific this_cpu_cmpxchg_double() support
  percpu: Generic support for this_cpu_cmpxchg_double()
  alpha: use L1_CACHE_BYTES for cacheline size in the linker script
  percpu: align percpu readmostly subsection to cacheline

Fix up trivial conflict in arch/x86/kernel/vmlinux.lds.S due to the
percpu alignment having changed ("x86: Reduce back the alignment of the
per-CPU data section")
2011-03-16 08:22:41 -07:00
Linus Torvalds d5d42399bd Merge branch 'x86-mem-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-mem-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86-64, mem: Convert memmove() to assembly file and fix return value bug
2011-03-15 19:41:42 -07:00
Jan Beulich e938c287ea x86: Fix a bogus unwind annotation in lib/semaphore_32.S
'simple' would have required specifying current frame address
and return address location manually, but that's obviously not
the case (and not necessary) here.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
LKML-Reference: <4D6D1082020000780003454C@vpn.id2.novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-02 08:16:44 +01:00
Jan Beulich 039e13890b x86: Remove unused bits from lib/thunk_*.S
Some of the items removed were apparently never used, others
simply didn't get removed with their last user.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
LKML-Reference: <4D6BD3A002000078000341F1@vpn.id2.novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-28 18:06:22 +01:00
Jan Beulich 60cf637a13 x86: Use {push,pop}_cfi in more places
Cleaning up and shortening code...

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Alexander van Heukelum <heukelum@fastmail.fm>
LKML-Reference: <4D6BD35002000078000341DA@vpn.id2.novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-28 18:06:22 +01:00
Jan Beulich 39f2205e1a x86-64: Add CFI annotations to lib/rwsem_64.S
These weren't part of the initial commit of this code.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Alexander van Heukelum <heukelum@fastmail.fm>
LKML-Reference: <4D6BCDFF02000078000341B0@vpn.id2.novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-28 18:06:21 +01:00
Christoph Lameter b9ec40af0e percpu, x86: Add arch-specific this_cpu_cmpxchg_double() support
Support this_cpu_cmpxchg_double() using the cmpxchg16b and cmpxchg8b
instructions.

-tj: s/percpu_cmpxchg16b/percpu_cmpxchg16b_double/ for consistency and
     other cosmetic changes.

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2011-02-28 11:20:49 +01:00
Fenghua Yu 9599ec0471 x86-64, mem: Convert memmove() to assembly file and fix return value bug
memmove_64.c only implements memmove() function which is completely written in
inline assembly code. Therefore it doesn't make sense to keep the assembly code
in .c file.

Currently memmove() doesn't store return value to rax. This may cause issue if
caller uses the return value. The patch fixes this issue.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
LKML-Reference: <1295314755-6625-1-git-send-email-fenghua.yu@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-01-25 16:58:39 -08:00
Christoph Lameter 357089fca9 x86: udelay: Use this_cpu_read to avoid address calculation
The code will use a segment prefix instead of doing the lookup and
calculation.

Signed-off-by: Christoph Lameter <cl@linux.com>
Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2011-01-04 06:08:55 +01:00
Ma Ling 3b4b682bec x86, mem: Optimize memmove for small size and unaligned cases
movs instruction will combine data to accelerate moving data,
however we need to concern two cases about it.

1. movs instruction need long lantency to startup,
   so here we use general mov instruction to copy data.
2. movs instruction is not good for unaligned case,
   even if src offset is 0x10, dest offset is 0x0,
   we avoid and handle the case by general mov instruction.

Signed-off-by: Ma Ling <ling.ma@intel.com>
LKML-Reference: <1284664360-6138-1-git-send-email-ling.ma@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-09-24 18:57:11 -07:00
Ma Ling 59daa706fb x86, mem: Optimize memcpy by avoiding memory false dependece
All read operations after allocation stage can run speculatively,
all write operation will run in program order, and if addresses are
different read may run before older write operation, otherwise wait
until write commit. However CPU don't check each address bit,
so read could fail to recognize different address even they
are in different page.For example if rsi is 0xf004, rdi is 0xe008,
in following operation there will generate big performance latency.
1. movq (%rsi),	%rax
2. movq %rax,	(%rdi)
3. movq 8(%rsi), %rax
4. movq %rax,	8(%rdi)

If %rsi and rdi were in really the same meory page, there are TRUE
read-after-write dependence because instruction 2 write 0x008 and
instruction 3 read 0x00c, the two address are overlap partially.
Actually there are in different page and no any issues,
but without checking each address bit CPU could think they are
in the same page, and instruction 3 have to wait for instruction 2
to write data into cache from write buffer, then load data from cache,
the cost time read spent is equal to mfence instruction. We may avoid it by
tuning operation sequence as follow.

1. movq 8(%rsi), %rax
2. movq %rax,	8(%rdi)
3. movq (%rsi),	%rax
4. movq %rax,	(%rdi)

Instruction 3 read 0x004, instruction 2 write address 0x010, no any
dependence.  At last on Core2 we gain 1.83x speedup compared with
original instruction sequence.  In this patch we first handle small
size(less 20bytes), then jump to different copy mode. Based on our
micro-benchmark small bytes from 1 to 127 bytes, we got up to 2X
improvement, and up to 1.5X improvement for 1024 bytes on Corei7.  (We
use our micro-benchmark, and will do further test according to your
requirment)

Signed-off-by: Ma Ling <ling.ma@intel.com>
LKML-Reference: <1277753065-18610-1-git-send-email-ling.ma@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-08-23 14:56:41 -07:00
Ma, Ling fdf4289679 x86, mem: Don't implement forward memmove() as memcpy()
memmove() allow source and destination address to be overlap, but
there is no such limitation for memcpy().  Therefore, explicitly
implement memmove() in both the forwards and backward directions, to
give us the ability to optimize memcpy().

Signed-off-by: Ma Ling <ling.ma@intel.com>
LKML-Reference: <C10D3FB0CD45994C8A51FEC1227CE22F0E483AD86A@shsmsx502.ccr.corp.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-08-23 14:14:27 -07:00
Linus Torvalds c029b55af7 Merge branch 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, asm: Use a lower case name for the end macro in atomic64_386_32.S
  x86, asm: Refactor atomic64_386_32.S to support old binutils and be cleaner
  x86: Document __phys_reloc_hide() usage in __pa_symbol()
  x86, apic: Map the local apic when parsing the MP table.
2010-08-13 10:35:48 -07:00
Luca Barbieri 417484d47e x86, asm: Use a lower case name for the end macro in atomic64_386_32.S
Use a lowercase name for the end macro, which somehow fixes a binutils 2.16
problem.

Signed-off-by: Luca Barbieri <luca@luca-barbieri.com>
LKML-Reference: <tip-30246557a06bb20618bed906a06d1e1e0faa8bb4@git.kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-08-12 07:04:16 -07:00
Luca Barbieri 30246557a0 x86, asm: Refactor atomic64_386_32.S to support old binutils and be cleaner
The old code didn't work on binutils 2.12 because setting a symbol to
a register apparently requires a fairly recent version.

This commit refactors the code to use the C preprocessor instead, and
in the process makes the whole code a bit easier to understand.

The object code produced is unchanged as expected.

This fixes kernel bugzilla 16506.

Reported-by: Dieter Stussy <kd6lvw+software@kd6lvw.ampr.org>
Signed-off-by: Luca Barbieri <luca@luca-barbieri.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: <stable@kernel.org> 2.6.35
LKML-Reference: <tip-*@git.kernel.org>
2010-08-11 21:03:28 -07:00
Linus Torvalds 66cd55d2b9 Merge branch 'x86-alternatives-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-alternatives-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, alternatives: BUG on encountering an invalid CPU feature number
  x86, alternatives: Fix one more open-coded 8-bit alternative number
  x86, alternatives: Use 16-bit numbers for cpufeature index
2010-08-06 16:24:17 -07:00
H. Peter Anvin a378d9338e x86, asm: Merge cmpxchg_486_u64() and cmpxchg8b_emu()
We have two functions for doing exactly the same thing -- emulating
cmpxchg8b on 486 and older hardware -- with different calling
conventions, and yet doing the same thing.  Drop the C version and use
the assembly version, via alternatives, for both the local and
non-local versions of cmpxchg8b.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
LKML-Reference: <AANLkTikAmaDPji-TVDarmG1yD=fwbffcsmEU=YEuP+8r@mail.gmail.com>
2010-07-28 17:05:11 -07:00
H. Peter Anvin 90c8f92f5c x86, asm: Move cmpxchg emulation code to arch/x86/lib
Move cmpxchg emulation code from arch/x86/kernel/cpu (which is
otherwise CPU identification) to arch/x86/lib, where other emulation
code lives already.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
LKML-Reference: <AANLkTikAmaDPji-TVDarmG1yD=fwbffcsmEU=YEuP+8r@mail.gmail.com>
2010-07-28 16:53:49 -07:00
H. Peter Anvin df378ccfc4 x86, alternatives: Fix one more open-coded 8-bit alternative number
Fix a missing case of an 8-bit alternative number, buried inside an
assembly macro.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Reported-by: Yinghai Lu <yinhai@kernel.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <4C3BDDA3.2060900@kernel.org>
2010-07-13 14:56:16 -07:00
H. Peter Anvin 83a7a2ad2a x86, alternatives: Use 16-bit numbers for cpufeature index
We already have cpufeature indicies above 255, so use a 16-bit number
for the alternatives index.  This consumes a padding field and so
doesn't add any size, but it means that abusing the padding field to
create assembly errors on overflow no longer works.  We can retain the
test simply by redirecting it to the .discard section, however.

[ v3: updated to include open-coded locations ]

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
LKML-Reference: <tip-f88731e3068f9d1392ba71cc9f50f035d26a0d4f@git.kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-07-07 10:36:28 -07:00
Linus Torvalds 93c9d7f60c Merge branch 'x86-atomic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-atomic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Fix LOCK_PREFIX_HERE for uniprocessor build
  x86, atomic64: In selftest, distinguish x86-64 from 586+
  x86-32: Fix atomic64_inc_not_zero return value convention
  lib: Fix atomic64_inc_not_zero test
  lib: Fix atomic64_add_unless return value convention
  x86-32: Fix atomic64_add_unless return value convention
  lib: Fix atomic64_add_unless test
  x86: Implement atomic[64]_dec_if_positive()
  lib: Only test atomic64_dec_if_positive on archs having it
  x86-32: Rewrite 32-bit atomic64 functions in assembly
  lib: Add self-test for atomic64_t
  x86-32: Allow UP/SMP lock replacement in cmpxchg64
  x86: Add support for lock prefix in alternatives
2010-05-18 08:40:05 -07:00
Ingo Molnar cce9131781 Merge branch 'perf/urgent' into perf/core
Merge reason: Resolve patch dependency

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-07 11:30:30 +02:00
David Howells a66f6375bd Fix the x86_64 implementation of call_rwsem_wait()
The x86_64 call_rwsem_wait() treats the active state counter part of the
R/W semaphore state as being 16-bit when it's actually 32-bit (it's half
of the 64-bit state).  It should do "decl %edx" not "decw %dx".

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-04 15:24:14 -07:00
H. Peter Anvin d9c5841e22 Merge branch 'x86/asm' into x86/atomic
Merge reason:
	Conflict between LOCK_PREFIX_HERE and relative alternatives
	pointers

Resolved Conflicts:
	arch/x86/include/asm/alternative.h
	arch/x86/kernel/alternative.c

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-04-29 16:53:17 -07:00
Ingo Molnar ba7e4d13fc perf, x86: Add INSTRUCTION_DECODER config flag
The PEBS+LBR decoding magic needs the insn_get_length() infrastructure
to be able to decode x86 instruction length.

So split it out of KPROBES dependency and make it enabled when either
KPROBES or PERF_EVENTS is enabled.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-10 13:34:12 +01:00
Luca Barbieri f3e8313146 x86-32: Fix atomic64_inc_not_zero return value convention
atomic64_inc_not_zero must return 1 if it perfomed the add and 0 otherwise.
It was doing the opposite thing.

Signed-off-by: Luca Barbieri <luca@luca-barbieri.com>
LKML-Reference: <1267469749-11878-6-git-send-email-luca@luca-barbieri.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-03-01 11:39:03 -08:00