Commit graph

7204 commits

Author SHA1 Message Date
Andreas Herrmann 7a6f9cbb37 x86: hpet: fix periodic mode programming on AMD 81xx
(See http://bugzilla.kernel.org/show_bug.cgi?id=12961)

It partially reverts commit c23e253e67
(x86: hpet: stop HPET_COUNTER when programming periodic mode)

HPET on AMD 81xx chipset needs a second write (with HPET_TN_SETVAL
cleared) to T0_CMP register to set the period in periodic mode.

With this patch HPET_COUNTER is still stopped but not reset when HPET
is programmed in periodic mode. This should help to avoid races when
HPET is programmed in periodic mode and fixes a boot time hang that
I've observed on a machine when using 1000HZ.

[ Impact: fix boot time hang on machines with AMD 81xx chipset ]

Reported-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Tested-by: Jeff Mahoney <jeffm@suse.com>
LKML-Reference: <20090421180037.GA2763@alberich.amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-22 15:53:40 +02:00
Jan Kiszka 888d256e9c KVM: Unregister cpufreq notifier on unload
Properly unregister cpufreq notifier on onload if it was registered
during init.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-04-22 13:54:33 +03:00
Joerg Roedel 7f1ea20896 KVM: x86: release time_page on vcpu destruction
Not releasing the time_page causes a leak of that page or the compound
page it is situated in.

Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-04-22 13:52:10 +03:00
Marcelo Tosatti bf47a760f6 KVM: MMU: disable global page optimization
Complexity to fix it not worthwhile the gains, as discussed
in http://article.gmane.org/gmane.comp.emulators.kvm.devel/28649.

Cc: stable@kernel.org
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-04-22 13:52:09 +03:00
Ingo Molnar 3568b71d46 Merge commit 'v2.6.30-rc3' into x86/urgent
Merge reason: hpet.c changed upstream, make sure we test against that

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-22 12:44:20 +02:00
David Howells 9b8de7479d FRV: Fix the section attribute on UP DECLARE_PER_CPU()
In non-SMP mode, the variable section attribute specified by DECLARE_PER_CPU()
does not agree with that specified by DEFINE_PER_CPU().  This means that
architectures that have a small data section references relative to a base
register may throw up linkage errors due to too great a displacement between
where the base register points and the per-CPU variable.

On FRV, the .h declaration says that the variable is in the .sdata section, but
the .c definition says it's actually in the .data section.  The linker throws
up the following errors:

kernel/built-in.o: In function `release_task':
kernel/exit.c:78: relocation truncated to fit: R_FRV_GPREL12 against symbol `per_cpu__process_counts' defined in .data section in kernel/built-in.o
kernel/exit.c:78: relocation truncated to fit: R_FRV_GPREL12 against symbol `per_cpu__process_counts' defined in .data section in kernel/built-in.o

To fix this, DECLARE_PER_CPU() should simply apply the same section attribute
as does DEFINE_PER_CPU().  However, this is made slightly more complex by
virtue of the fact that there are several variants on DEFINE, so these need to
be matched by variants on DECLARE.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-21 19:39:59 -07:00
Michael K. Johnson 2a3313f494 x86: more than 8 32-bit CPUs requires X86_BIGSMP
$ cat x86-more-than-8-cpus-requires-bigsmp.patch

Enforce NR_CPUS <= 8 limitation if X86_BIGSMP not set

Configuring more than 8 logical CPUs on 32-bit x86 requires
X86_BIGSMP to be set in order to boot successfully, if more than 8
logical CPUs are actually found at boot time.  The X86_BIGSMP help
text describes that it is required to be set if more than 8 CPUs
are configured, but this was previously not enforced.

This configuration error has affected multiple distributions:
    https://bugzilla.redhat.com/show_bug.cgi?id=480844
    https://issues.rpath.com/browse/RPL-3022

Signed-off-by: Michael K Johnson <johnsonm@rpath.com>
LKML-Reference: <20090422014448.GB32541@logo.rdu.rpath.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-04-21 19:34:54 -07:00
Magnus Damm 8e19608e8b clocksource: pass clocksource to read() callback
Pass clocksource pointer to the read() callback for clocksources.  This
allows us to share the callback between multiple instances.

[hugh@veritas.com: fix powerpc build of clocksource pass clocksource mods]
[akpm@linux-foundation.org: cleanup]
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Acked-by: John Stultz <johnstul@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-21 13:41:47 -07:00
Rusty Russell fcc5c4a2fe x86: avoid theoretical spurious NMI backtraces with CONFIG_CPUMASK_OFFSTACK=y
In theory (though not shown in practice) alloc_cpumask_var() doesn't zero
memory, so CPUs might print an "NMI backtrace for cpu %d" once on boot.

(Bug introduced in fcef8576d8).

[ Impact: avoid theoretical syslog noise in rare configs ]

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <alpine.DEB.2.00.0904202113520.10097@gandalf.stny.rr.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-21 10:09:50 +02:00
Rusty Russell 2f537a9f8e x86: fix boot crash in NMI watchdog with CONFIG_CPUMASK_OFFSTACK=y and flat APIC
fcef8576d8 converted backtrace_mask to a
cpumask_var_t, and assumed check_nmi_watchdog was called before
nmi_watchdog_tick was ever called.  Steven's oops shows I was wrong.

This is something of a bandaid: I'm not sure we *should* be calling
nmi_watchdog_tick before check_nmi_watchdog.  Note that gcc eliminates
this test for the CONFIG_CPUMASK_OFFSTACK=n case.

[ Impact: fix boot crash in rare configs ]

Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
LKML-Reference: <alpine.DEB.2.00.0904202113520.10097@gandalf.stny.rr.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-21 10:09:49 +02:00
Oleg Drokin 0112fc2229 Separate out common fstatat code into vfs_fstatat
This is a version incorporating Christoph's suggestion.

Separate out common *fstatat functionality into a single function
instead of duplicating it all over the code.

Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-04-20 23:02:51 -04:00
Suresh Siddha 06c38d5e36 x86-64: fix FPU corruption with signals and preemption
In 64bit signal delivery path, clear_used_math() was happening before saving
the current active FPU state on to the user stack for signal handling. Between
clear_used_math() and the state store on to the user stack, potentially we
can get a page fault for the user address and can block. Infact, while testing
we were hitting the might_fault() in __clear_user() which can do a schedule().

At a later point in time, we will schedule back into this process and
resume the save state (using "xsave/fxsave" instruction) which can lead
to DNA fault. And as used_math was cleared before, we will reinit the FP state
in the DNA fault and continue. This reinit will result in loosing the
FPU state of the process.

Move clear_used_math() to a point after the FPU state has been stored
onto the user stack.

This issue is present from a long time (even before the xsave changes
and the x86 merge). But it can easily be exposed in 2.6.28.x and 2.6.29.x
series because of the __clear_user() in this path, which has an explicit
__cond_resched() leading to a context switch with CONFIG_PREEMPT_VOLUNTARY.

[ Impact: fix FPU state corruption ]

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: <stable@kernel.org>			[2.6.28.x, 2.6.29.x]
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-04-20 14:33:00 -07:00
Jack Steiner fc61e6636d x86/uv: fix for no memory at paddr 0
Fix endcase where the memory at physical address 0 does not really
exist AND one of the sockets on blade 0 has no active cpus.

The memory that _appears_ to be at physical address 0 is actually
memory that located at a different address but has been remapped by
the chipset so that it appears to be at physical address 0.

When determining the UV pnode, the algorithm for determining the pnode
incorrectly used the relocated physical address instead of the actual
(global) address.

[ Impact: boot failure on partitioned systems ]

Signed-off-by: Jack Steiner <steiner@sgi.com>
LKML-Reference: <20090420132530.GA23156@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-20 18:09:18 +02:00
Ingo Molnar 62d1702909 Merge branch 'linus' into x86/urgent
Merge reason: We need the x86/uv updates from upstream, to queue up
              dependent fix.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-20 18:08:12 +02:00
Thomas Renninger d876dfbbf5 acpi-cpufreq: Do not let get_measured perf depend on internal variable
Take already available policy->cpuinfo.max_freq and get rid of acpi-cpufreq
specific max_freq variable.

This implies that P0 is always the highest frequency which should always
be true as ACPI spec says:
As a result, the zeroth entry describes the highest performance state

Signed-off-by: Thomas Renninger <trenn@suse.de>
Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-04-19 22:47:21 -04:00
Thomas Renninger d91758f5dd acpi-cpufreq: style-only: add parens to math expression
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-04-19 22:47:21 -04:00
Thomas Renninger e0e8c4e512 acpi-cpufreq: Cleanup: Use printk_once
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-04-19 22:47:20 -04:00
Pallipadi, Venkatesh 093f13e231 x86, acpi_cpufreq: Fix the NULL pointer dereference in get_measured_perf
Fix for a regression that was introduced by earlier commit
18b2646fe3 on Mon Apr 6 11:26:08 2009

Regression resulted in the below error happened on systems with
software coordination where per_cpu acpi data will not be initiated for
secondary CPUs in a P-state domain.

On Tue, 2009-04-14 at 23:01 -0700, Zhang, Yanmin wrote:
 My machine hanged with kernel 2.6.30-rc2 when script read
> /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor.
>
> opps happens in get_measured_perf:
>
>         cur.aperf.whole = readin.aperf.whole -
>                                 per_cpu(drv_data, cpu)->saved_aperf;
>
> Because per_cpu(drv_data, cpu)=NULL.
>
> So function get_measured_perf should check if (per_cpu(drv_data,
> cpu)==NULL)
> and return 0 if it's NULL.

--------------sys log------------------

BUG: unable to handle kernel NULL pointer dereference at
0000000000000020
IP: [<ffffffff8021af75>] get_measured_perf+0x4a/0xf9
PGD a7dd88067 PUD a7ccf5067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
CPU 0
Modules linked in: video output
Pid: 2091, comm: kondemand/0 Not tainted 2.6.30-rc2 #1 MP Server
RIP: 0010:[<ffffffff8021af75>]  [<ffffffff8021af75>]
get_measured_perf+0x4a/0xf9
RSP: 0018:ffff880a7d56de20  EFLAGS: 00010246
RAX: 0000000000000000 RBX: 00000046241a42b6 RCX: ffff88004d219000
RDX: 000000000000b660 RSI: 0000000000000020 RDI: 0000000000000001
RBP: ffff880a7f052000 R08: 00000046241a42b6 R09: ffffffff807639f0
R10: 00000000ffffffea R11: ffffffff802207f4 R12: ffff880a7f052000
R13: ffff88004d20e460 R14: 0000000000ddd5a6 R15: 0000000000000001
FS:  0000000000000000(0000) GS:ffff88004d200000(0000)
knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000020 CR3: 0000000a7f1bf000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kondemand/0 (pid: 2091, threadinfo ffff880a7d56c000, task
ffff880a7d4d18c0)
Stack:
 ffff880a7f052078 ffffffff803efd54 00000046241a42b6 000000462ffa9e95
 0000000000000001 0000000000000001 00000000ffffffea ffffffff8064f41a
 0000000000000012 0000000000000012 ffff880a7f052000 ffffffff80650547
Call Trace:
 [<ffffffff803efd54>] ? kobject_get+0x12/0x17
 [<ffffffff8064f41a>] ? __cpufreq_driver_getavg+0x42/0x57
 [<ffffffff80650547>] ? do_dbs_timer+0x147/0x272
 [<ffffffff80650400>] ? do_dbs_timer+0x0/0x272
 [<ffffffff802474ca>] ? worker_thread+0x15b/0x1f5
 [<ffffffff8024a02c>] ? autoremove_wake_function+0x0/0x2e
 [<ffffffff8024736f>] ? worker_thread+0x0/0x1f5
 [<ffffffff80249f0d>] ? kthread+0x54/0x83
 [<ffffffff8020c87a>] ? child_rip+0xa/0x20
 [<ffffffff80249eb9>] ? kthread+0x0/0x83
 [<ffffffff8020c870>] ? child_rip+0x0/0x20
Code: 99 a6 03 00 31 c9 85 c0 0f 85 c3 00 00 00 89 df 4c 8b 44 24 10 48
c7 c2 60 b6 00 00 48 8b 0c fd e0 30 a5 80 4c 89 c3 48 8b 04 0a <48> 2b
58 20 48 8b 44 24 18 48 89 1c 24 48 8b 34 0a 48 2b 46 28
RIP  [<ffffffff8021af75>] get_measured_perf+0x4a/0xf9
 RSP <ffff880a7d56de20>
CR2: 0000000000000020
---[ end trace 2b8fac9a49e19ad4 ]---

Tested-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-04-19 22:47:20 -04:00
Rusty Russell a489f0b555 lguest: fix guest crash on non-linear addresses in gdt pvops
Fixes guest crash 'lguest: bad read address 0x4800000 len 256'

The new per-cpu allocator ends up handing a non-linear address to
write_gdt_entry.  We do __pa() on it, and hand it to the host, which
kills us.

I've long wanted to make the hypercall "LOAD_GDT_ENTRY" to match the IDT
code, but had no pressing reason until now.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: lguest@ozlabs.org
2009-04-19 23:14:01 +09:30
Steven Rostedt 0300e7f1a5 lockdep, x86: account for irqs enabled in paranoid_exit
I hit the check_flags error of lockdep:

 WARNING: at kernel/lockdep.c:2893 check_flags+0x1a7/0x1d0()
 [...]
 hardirqs last  enabled at (12567): [<ffffffff8026206a>] local_bh_enable+0xaa/0x110
 hardirqs last disabled at (12569): [<ffffffff80610c76>] int3+0x16/0x40
 softirqs last  enabled at (12566): [<ffffffff80514d2b>] lock_sock_nested+0xfb/0x110
 softirqs last disabled at (12568): [<ffffffff8058454e>] tcp_prequeue_process+0x2e/0xa0

The check_flags warning of lockdep tells me that lockdep thought interrupts
were disabled, but they were really enabled.

The numbers in the above parenthesis show the order of events:

 12566: softirqs last enabled:  lock_sock_nested
 12567: hardirqs last enabled:  local_bh_enable
 12568: softirqs last disabled: tcp_prequeue_process
 12566: hardirqs last disabled: int3

int3 is a breakpoint!

Examining this further, I have CONFIG_NET_TCPPROBE enabled which adds
break points into the kernel.

The paranoid_exit of the return of int3 does not account for enabling
interrupts on return to kernel. This code is a bit tricky since it
is also used by the nmi handler (when lockdep is off), and we must be
careful about the swapgs. We can not call kernel code after the swapgs
has been performed.

[ Impact: fix lockdep check_flags warning + self-turn-off ]

Acked-by: Peter Zijlsta <a.p.zijlstra@chello.nl>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-18 09:04:28 +02:00
Jaswinder Singh Rajput a81b6314e0 x86: mm/numa_32.c calculate_numa_remap_pages should use __init
calculate_numa_remap_pages() is called only by __init initmem_init()
further calculate_numa_remap_pages is calling:
	__init find_e820_area() and __init reserve_early()

So calculate_numa_remap_pages() should be __init calculate_numa_remap_pages().

 WARNING: arch/x86/built-in.o(.text+0x82ea3): Section mismatch in reference from the function calculate_numa_remap_pages() to the function .init.text:find_e820_area()
 The function calculate_numa_remap_pages() references
 the function __init find_e820_area().
 This is often because calculate_numa_remap_pages lacks a __init
 annotation or the annotation of find_e820_area is wrong.

 WARNING: arch/x86/built-in.o(.text+0x82f5f): Section mismatch in reference from the function calculate_numa_remap_pages() to the function .init.text:reserve_early()
 The function calculate_numa_remap_pages() references
 the function __init reserve_early().
 This is often because calculate_numa_remap_pages lacks a __init
 annotation or the annotation of reserve_early is wrong.

[ Impact: save memory, address Section mismatch warning ]

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
LKML-Reference: <1239991281.3153.4.camel@ht.satnam>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-17 22:43:13 +02:00
H. Peter Anvin 1648e4f805 x86, kbuild: make "make install" not depend on vmlinux
It is common to use "make install" in restricted environments which
differ from the one which was actually used to build the kernel.  In
such environments it is highly undesirable to trigger a rebuild of any
part of the system.  Worse, the rebuild may be spurious, triggered by
differences in the environment.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
LKML-Reference: <20090415234642.GA28531@uranus.ravnborg.org>
2009-04-17 22:43:12 +02:00
Jack Steiner 27229ca632 x86/uv: fix init of cpu-less nodes
Fix an endcase in the UV initialization code for the "UV large system mode"
of apicids.  If node zero contains no cpus, cpus on another node will be the
boot cpu.  The percpu data that contains the extra apicid bits was not
being initialized early enough.

[ Impact: fix potential boot crash on cpu-less UV nodes ]

Signed-off-by: Jack Steiner <steiner@sgi.com>
LKML-Reference: <20090417142447.GA23759@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-17 22:43:11 +02:00
Jack Steiner dc09855191 x86/uv: fix init of memory-less nodes
Add support for nodes that have cpus but no memory.
The current code was failing to add these nodes
to the nodes_present_map.

v2: Fixes case caught by David Rientjes - missed support
    for the x2apic SRAT table.

[ Impact: fix potential boot crash on memory-less UV nodes. ]

Reported-by: David Rientjes <rientjes@google.com>
Signed-off-by: Jack Steiner <steiner@sgi.com>
LKML-Reference: <20090417142242.GA23743@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-17 22:42:12 +02:00
Linus Torvalds b9836e0837 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: fix microcode driver newly spewing warnings
  x86, PAT: Remove page granularity tracking for vm_insert_pfn maps
  x86: disable X86_PTRACE_BTS for now
  x86, documentation: kernel-parameters replace X86-32,X86-64 with X86
  x86: pci-swiotlb.c swiotlb_dma_ops should be static
  x86, PAT: Remove duplicate memtype reserve in devmem mmap
  x86, PAT: Consolidate code in pat_x_mtrr_type() and reserve_memtype()
  x86, PAT: Changing memtype to WC ensuring no WB alias
  x86, PAT: Handle faults cleanly in set_memory_ APIs
  x86, PAT: Change order of cpa and free in set_memory_wb
  x86, CPA: Change idmap attribute before ioremap attribute setup
2009-04-17 09:56:11 -07:00
Yinghai Lu ca713c2ab0 x86/irq: mark NUMA_MIGRATE_IRQ_DESC broken
It causes crash on system with lots of cards with MSI-X
when irq_balancer enabled...

The patches fixing it were both complex and fragile, according
to Eric they were also doing quite dangerous things to the
hardware.

Instead we now have patches that solve this problem via static
NUMA node mappings - not dynamic allocation and balancing.

The patches are much simpler than this method but are still too
large outside of the merge window, so we mark the dynamic balancer
as broken for now, and queue up the new approach for v2.6.31.

[ Impact: deactivate broken kernel feature ]

Reported-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
LKML-Reference: <49E68C41.4020801@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-17 03:20:01 +02:00
Linus Torvalds 20d9207849 Merge branch 'x86/uv' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86/uv' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: UV BAU distribution and payload MMRs
  x86: UV: BAU partition-relative distribution map
  x86, uv: add Kconfig dependency on NUMA for UV systems
  x86: prevent /sys/firmware/sgi_uv from being created on non-uv systems
  x86, UV: Fix for nodes with memory and no cpus
  x86, UV: system table in bios accessed after unmap
  x86: UV BAU messaging timeouts
  x86: UV BAU and nodes with no memory
2009-04-16 16:43:20 -07:00
Dmitry Adamushko 0917798d82 x86: fix microcode driver newly spewing warnings
Jeff Garzik reported this WARN_ON() noise:

> Kernel: 2.6.30-rc1-00306-g8371f87
> Hardware: ICH10 x86-64
>
> This is a regression from 2.6.29.  Microcode spews the following WARNING
> multiple times during boot:
>
> ------------[ cut here ]------------
> WARNING: at fs/sysfs/group.c:138 sysfs_remove_group+0xeb/0xf0()
> Hardware name:         sysfs group ffffffffa0209700 not found for
>  kobject 'cpu0'

Keep sysfs files around for cpus even when we failed to locate
microcode for them at the moment of module loading. The appropriate
microcode firmware can become available later on.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-17 01:11:20 +02:00
Pallipadi, Venkatesh 4b06504627 x86, PAT: Remove page granularity tracking for vm_insert_pfn maps
This change resolves the problem of too many single page entries
in pat_memtype_list and "freeing invalid memtype" errors with i915,
reported here:

  http://marc.info/?l=linux-kernel&m=123845244713183&w=2

Remove page level granularity track and untrack of vm_insert_pfn.
memtype tracking at page granularity does not scale and cleaner
approach would be for the driver to request a type for a bigger
IO address range or PCI io memory range for that device, either at
mmap time or driver init time and just use that type during
vm_insert_pfn.

This patch just removes the track/untrack of vm_insert_pfn. That
means we will be in same state as 2.6.28, with respect to these APIs.

Newer APIs for the drivers to request a memtype for a bigger region
is coming soon.

[ Impact: fix Xorg startup warnings and hangs ]

Reported-by: Arkadiusz Miskiewicz <a.miskiewicz@gmail.com>
Tested-by: Arkadiusz Miskiewicz <a.miskiewicz@gmail.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
LKML-Reference: <20090408223716.GC3493@linux-os.sc.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-17 00:44:22 +02:00
Cliff Wickman 4ea3c51d5b x86: UV BAU distribution and payload MMRs
This patch correctly sets BAU memory mapped registers to point
to the sending activation descriptor table and target payload table.

The "Broadcast Assist Unit" is used for TLB shootdown in UV.

The memory mapped registers that point to sending and receiving
memory structures contain node numbers.

In one case the __pa() function did not provide the node id of
memory on blade zero in configurations where that id is nonzero.
In another case, it was assumed that memory was allocated on
the local node.  That assumption is not true in a configuration
in which the node has no memory.

Tested on the UV hardware simulator.

[ Impact: fix possible runtime crash due to incorrect TLB logic ]

Signed-off-by: Cliff Wickman <cpw@sgi.com>
LKML-Reference: <E1LuR5Z-0007An-B8@eag09.americas.sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-16 19:44:16 +02:00
Ingo Molnar d45b41ae8d x86: disable X86_PTRACE_BTS for now
Oleg Nesterov found a couple of races in the ptrace-bts code
and fixes are queued up for it but they did not get ready in time
for the merge window. We'll merge them in v2.6.31 - until then
mark the feature as CONFIG_BROKEN. There's no user-space yet
making use of this so it's not a big issue.

Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-15 23:15:14 +02:00
Linus Torvalds ea34f43a07 acpi-cpufreq: fix 'smp_call_function_many()' confusion
It turns out that 'smp_call_function_many()' doesn't work at all like
'smp_call_function_single()', and my change to Andrew's patch to use it
rather than a loop over all CPU's acpi-cpufreq doesn't work.

My bad.

'smp_call_function_many()' has two "features" (aka "documented bugs"):

 (a) it needs to be called with preemption disabled, because it uses
     smp_processor_id() without guarding the CPU lookup with 'get_cpu()'
     and 'put_cpu()' like the 'single' variant does.

 (b) even if the current CPU is part of the CPU mask, it won't do the
     call on that CPU.

Still, we're better off trying to use 'smp_call_function_many()' than
looping over CPU's, since it at least in theory allows us to use a
broadcast IPI and do it all in parallel.  So let's just work around the
silly semantic bugs in that function.

Reported-and-tested-by: Ali Gholami Rudi <ali@rudi.ir>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Dave Jones <davej@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-15 08:41:16 -07:00
Hugh Dickins 6f66cbc630 x86 microcode: revert some work_on_cpu
Revert part of af5c820a31 ("x86: cpumask:
use work_on_cpu in arch/x86/kernel/microcode_core.c")

That change is causing only one Intel CPU's microcode to be updated e.g.
microcode: CPU3 updated from revision 0x9 to 0x17, date = 2005-04-22
where before it announced that also for CPU0 and CPU1 and CPU2.

We cannot use work_on_cpu() in the CONFIG_MICROCODE_OLD_INTERFACE code,
because Intel's request_microcode_user() involves a copy_from_user() from
/sbin/microcode_ctl, which therefore needs to be on that CPU at the time.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-14 11:45:59 -07:00
Cliff Wickman 94ca8e4852 x86: UV: BAU partition-relative distribution map
This patch enables each partition's BAU distribution bit map
to be partition-relative.

The distribution bitmap had been constructed assuming 0 as the base
node number.  That construct would not have allowed a total system of
greater than 256 nodes.
It also corrects an error that occurred when the first blade's nasid
was not zero.  That nasid was stored as the base node.
The base node number gets added by hardware to the node numbers implied
in the distribution bitmap, resulting in invalid target nasids.

Tested on the UV hardware simulator.

Signed-off-by: Cliff Wickman <cpw@sgi.com>
LKML-Reference: <E1Ltl0C-0004Ob-37@eag09.americas.sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-14 18:53:36 +02:00
Pallipadi, Venkatesh 6ec3cfeca0 x86, irq: Remove IRQ_DISABLED check in process context IRQ move
As discussed in the thread here:

  http://marc.info/?l=linux-kernel&m=123964468521142&w=2

Eric W. Biederman observed:

> It looks like some additional bugs have slipped in since last I looked.
>
> set_irq_affinity does this:
> ifdef CONFIG_GENERIC_PENDING_IRQ
>        if (desc->status & IRQ_MOVE_PCNTXT || desc->status & IRQ_DISABLED) {
>                cpumask_copy(desc->affinity, cpumask);
>                desc->chip->set_affinity(irq, cpumask);
>        } else {
>                desc->status |= IRQ_MOVE_PENDING;
>                cpumask_copy(desc->pending_mask, cpumask);
>        }
> #else
>
> That IRQ_DISABLED case is a software state and as such it has nothing to
> do with how safe it is to move an irq in process context.

[...]

>
> The only reason we migrate MSIs in interrupt context today is that there
> wasn't infrastructure for support migration both in interrupt context
> and outside of it.

Yes. The idea here was to force the MSI migration to happen in process
context. One of the patches in the series did

        disable_irq(dev->irq);
        irq_set_affinity(dev->irq, cpumask_of(dev->cpu));
        enable_irq(dev->irq);

with the above patch adding irq/manage code check for interrupt disabled
and moving the interrupt in process context.

IIRC, there was no IRQ_MOVE_PCNTXT when we were developing this HPET
code and we ended up having this ugly hack. IRQ_MOVE_PCNTXT was there
when we eventually submitted the patch upstream. But, looks like I did a
blind rebasing instead of using IRQ_MOVE_PCNTXT in hpet MSI code.

Below patch fixes this. i.e., revert commit 932775a4ab
and add PCNTXT to HPET MSI setup. Also removes copying of desc->affinity
in generic code as set_affinity routines are doing it internally.

Reported-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "Li Shaohua" <shaohua.li@intel.com>
Cc: Gary Hade <garyhade@us.ibm.com>
Cc: "lcm@us.ibm.com" <lcm@us.ibm.com>
Cc: suresh.b.siddha@intel.com
LKML-Reference: <20090413222058.GB8211@linux-os.sc.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-14 15:21:13 +02:00
Linus Torvalds 1c98aa7424 Fix quilt merge error in acpi-cpufreq.c
We ended up incorrectly using '&cur' instead of '&readin' in the
work_on_cpu() -> smp_call_function_single() transformation in commit
01599fca67 ("cpufreq: use
smp_call_function_[single|many]() in acpi-cpufreq.c").

Andrew explains:
 "OK, the acpi tree went and had conflicting changes merged into it after
  I'd written the patch and it appears that I incorrectly reverted part
  of 18b2646fe3 while fixing the resulting
  rejects.

  Switching it to `readin' looks correct."

Acked-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-13 18:09:20 -07:00
Jaswinder Singh Rajput ff6c6fed3a x86: pci-swiotlb.c swiotlb_dma_ops should be static
Impact: reduce kernel size a bit, address sparse warning

Addresses the problem pointed out by this sparse warning:

  arch/x86/kernel/pci-swiotlb.c:53:20: warning: symbol 'swiotlb_dma_ops' was not declared. Should it be static?

For x86: swiotlb_dma_ops can be static, because it's not used outside
of pci-swiotlb.c

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
LKML-Reference: <1239558861.3938.2.camel@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-14 02:51:04 +02:00
Linus Torvalds 2e1c63b7ed Merge branch 'for-rc1/xen/core' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen
* 'for-rc1/xen/core' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen:
  xen: add FIX_TEXT_POKE to fixmap
  xen: honour VCPU availability on boot
  xen: clean up gate trap/interrupt constants
  xen: set _PAGE_NX in __supported_pte_mask before pagetable construction
  xen: resume interrupts before system devices.
  xen/mmu: weaken flush_tlb_other test
  xen/mmu: some early pagetable cleanups
  Xen: Add virt_to_pfn helper function
  x86-64: remove PGE from must-have feature list
  xen: mask XSAVE from cpuid
  NULL noise: arch/x86/xen/smp.c
  xen: remove xen_load_gdt debug
  xen: make xen_load_gdt simpler
  xen: clean up xen_load_gdt
  xen: split construction of p2m mfn tables from registration
  xen: separate p2m allocation from setting
  xen: disable preempt for leave_lazy_mmu
2009-04-13 15:30:20 -07:00
Linus Torvalds b8256b45d1 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: add linux kernel support for YMM state
  x86: fix wrong section of pat_disable & make it static
  x86: Fix section mismatches in mpparse
  x86: fix set_fixmap to use phys_addr_t
  x86: Document get_user_pages_fast()
  x86, intr-remap: fix eoi for interrupt remapping without x2apic
2009-04-13 11:32:09 -07:00
Andrew Morton 01599fca67 cpufreq: use smp_call_function_[single|many]() in acpi-cpufreq.c
Atttempting to rid us of the problematic work_on_cpu().  Just use
smp_call_fuction_single() here.

This repairs a 10% sysbench(oltp)+mysql regression which Mike reported,
due to

  commit 6b44003e5c
  Author: Andrew Morton <akpm@linux-foundation.org>
  Date:   Thu Apr 9 09:50:37 2009 -0600

      work_on_cpu(): rewrite it to create a kernel thread on demand

It seems that the kernel calls these acpi-cpufreq functions at a quite
high frequency.

Valdis Kletnieks also reports that this causes 70-90 forks per second on
his hardware.

Cc: Valdis.Kletnieks@vt.edu
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Zhao Yakui <yakui.zhao@intel.com>
Acked-by: Dave Jones <davej@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Mike Galbraith <efault@gmx.de>
Cc: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
[ Made it use smp_call_function_many() instead of looping over cpu's
  with smp_call_function_single()    - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-13 11:09:46 -07:00
Suresh Siddha a30469e792 x86: add linux kernel support for YMM state
Impact: save/restore Intel-AVX state properly between tasks

Intel Advanced Vector Extensions (AVX) introduce 256-bit vector processing
capability. More about AVX at http://software.intel.com/sites/avx

Add OS support for YMM state management using xsave/xrstor infrastructure
to support AVX.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <1239402084.27006.8057.camel@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-12 13:08:56 +02:00
Marcin Slusarz 1ee4bd92a7 x86: fix wrong section of pat_disable & make it static
pat_disable cannot be __cpuinit anymore because it's called from pat_init
and the callchain looks like this:
pat_disable [cpuinit] <- pat_init <- generic_set_all <-
 ipi_handler <- set_mtrr <- (other non init/cpuinit functions)

WARNING: arch/x86/mm/built-in.o(.text+0x449e): Section mismatch in reference
from the function pat_init() to the function .cpuinit.text:pat_disable()
The function pat_init() references
the function __cpuinit pat_disable().
This is often because pat_init lacks a __cpuinit
annotation or the annotation of pat_disable is wrong.

Non CONFIG_X86_PAT version of pat_disable is static inline, so this version
can be static too (and there are no callers outside of this file).

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
LKML-Reference: <49DFB055.6070405@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-12 12:34:23 +02:00
Rakib Mullick 575922248c x86: Fix section mismatches in mpparse
Impact: fix section mismatch

In arch/x86/kernel/mpparse.c, smp_reserve_bootmem() has been called
and also refers to a function which is in .init section. Thus causes
the first warning. And check_irq_src() also requires an __init,
because it refers to an .init section.

Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <b9df5fa10904102004g51265d9axc8d07278bfdb6ba0@mail.gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-12 12:32:18 +02:00
Masami Hiramatsu 9b987aeb4a x86: fix set_fixmap to use phys_addr_t
Impact: fix kprobes crash on 32-bit with RAM above 4G

Use phys_addr_t for receiving a physical address argument
instead of unsigned long. This allows fixmap to handle
pages higher than 4GB on x86-32.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: systemtap-ml <systemtap@sources.redhat.com>
Cc: Gary Hade <garyhade@us.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <49DE3695.6040800@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-10 20:27:13 +02:00
Suresh Siddha 0c3c8a1836 x86, PAT: Remove duplicate memtype reserve in devmem mmap
/dev/mem mmap code was doing memtype reserve/free for a while now.
Recently we added memtype tracking in remap_pfn_range, and /dev/mem mmap
uses it indirectly. So, we don't need seperate tracking in /dev/mem code
any more. That means another ~100 lines of code removed :-).

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
LKML-Reference: <20090409212709.085210000@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-10 13:55:48 +02:00
Suresh Siddha b6ff32d9aa x86, PAT: Consolidate code in pat_x_mtrr_type() and reserve_memtype()
Fix pat_x_mtrr_type() to use UC_MINUS when the mtrr type return UC. This
is to be  consistent with ioremap() and ioremap_nocache() which uses
UC_MINUS.

Consolidate the code such that reserve_memtype() also uses
pat_x_mtrr_type() when the caller doesn't specify any special attribute
(non WB attribute).

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
LKML-Reference: <20090409212708.939936000@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-10 13:55:48 +02:00
venkatesh.pallipadi@intel.com 3869c4aa18 x86, PAT: Changing memtype to WC ensuring no WB alias
As per SDM, there should not be any aliasing of a WC with any cacheable
type across CPUs. That is if one CPU is changing the identity map
memtype to _WC, no other CPU at the time of this change should not have a
TLB for this page that carries a WB attribute. SDM suggests to make the
page not present. But for that we will have to handle any page faults
that can potentially happen due to these pages being not present.

Other way to deal with this without having any WB mapping is to change
the page first to UC and then to WC. This ensures that we meet the SDM
requirement of no cacheable alais to WC page. This also has same or
lower overhead than marking the page not present and making it present
later.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <20090409212708.797481000@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-10 13:55:47 +02:00
venkatesh.pallipadi@intel.com 9fa3ab390a x86, PAT: Handle faults cleanly in set_memory_ APIs
Handle faults and do proper cleanups in set_memory_*() functions. In
some cases, these functions were not doing proper free on failure paths.

With the changes to tracking memtype of RAM pages in struct page instead
of pat list, we do not need the changes in commits c5e147. This patch
reverts that change.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <20090409212708.653222000@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-10 13:55:47 +02:00
venkatesh.pallipadi@intel.com a5593e0b32 x86, PAT: Change order of cpa and free in set_memory_wb
To be free of aliasing due to races, set_memory_* interfaces should
follow ordering of reserving, changing memtype to UC/WC, changing
memtype back to WB followed by free.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <20090409212708.512280000@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-10 13:55:46 +02:00
Suresh Siddha 43a432b155 x86, CPA: Change idmap attribute before ioremap attribute setup
Change the identity mapping with the requested attribute first, before
we setup the virtual memory mapping with the new requested attribute.

This makes sure that there is no window when identity map'ed attribute
may disagree with ioremap range on the attribute type.

This also avoids doing cpa on the ioremap'ed address twice (first in
ioremap_page_range and then in ioremap_change_attr using vaddr), and
should improve ioremap performance a bit.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
LKML-Reference: <20090409212708.373330000@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-10 13:55:46 +02:00
Andy Grover a0d22f485a x86: Document get_user_pages_fast()
While better than get_user_pages(), the usage of gupf(),
especially the return values and the fact that it can
potentially only partially pin the range, warranted some
documentation.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
Cc: npiggin@suse.de
Cc: akpm@linux-foundation.org
LKML-Reference: <1239320729-3262-1-git-send-email-andy.grover@oracle.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-10 13:14:08 +02:00
Weidong Han 746cddd37d x86, intr-remap: fix eoi for interrupt remapping without x2apic
To simplify level irq migration in the presence of interrupt-remapping,
Suresh used a virtual vector (io-apic pin number) to eliminate io-apic
RTE modification. Level triggered interrupt will appear as an edge to
the local apic cpu but still as level to the IO-APIC. So in addition to
do the local apic EOI, it still needs to do IO-APIC directed EOI to clear
the remote IRR bit in the IO-APIC RTE. Pls refer to Suresh's patch for
more details (commit 0280f7c416).

Now interrupt remapping is decoupled from x2apic, it also needs to do the
directed EOI for apic. Otherwise, apic interrupts won't work correctly.

Signed-off-by: Weidong Han <weidong.han@intel.com>
Cc: iommu@lists.linux-foundation.org
Cc: Weidong Han <weidong.han@intel.com>
Cc: suresh.b.siddha@intel.com
Cc: dwmw2@infradead.org
Cc: allen.m.kay@intel.com
LKML-Reference: <1239355037-22856-1-git-send-email-weidong.han@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-10 13:12:17 +02:00
Masami Hiramatsu 3b3809ac53 x86: fix set_fixmap to use phys_addr_t
Use phys_addr_t for receiving a physical address argument instead of
unsigned long.  This allows fixmap to handle pages higher than 4GB on
x86-32.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-09 16:41:45 -07:00
Linus Torvalds e66dd19092 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: cpu_debug remove execute permission
  x86: smarten /proc/interrupts output for new counters
  x86: DMI match for the Dell DXP061 as it needs BIOS reboot
  x86: make 64 bit to use default_inquire_remote_apic
  x86, setup: un-resequence mode setting for VGA 80x34 and 80x60 modes
  x86, intel-iommu: fix X2APIC && !ACPI build failure
2009-04-09 10:38:23 -07:00
Linus Torvalds c2ea122cd7 Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  tracing: consolidate documents
  blktrace: pass the right pointer to kfree()
  tracing/syscalls: use a dedicated file header
  tracing: append a comma to INIT_FTRACE_GRAPH
2009-04-09 10:37:46 -07:00
Jaswinder Singh Rajput f20ab9c38f x86: cpu_debug remove execute permission
It seems by mistake these files got execute permissions so removing it.

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
LKML-Reference: <1239211186.9037.2.camel@ht.satnam>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-09 06:34:02 +02:00
Frederic Weisbecker 47788c58e6 tracing/syscalls: use a dedicated file header
Impact: fix build warnings and possibe compat misbehavior on IA64

Building a kernel on ia64 might trigger these ugly build warnings:

CC      arch/ia64/ia32/sys_ia32.o
In file included from arch/ia64/ia32/sys_ia32.c:55:
arch/ia64/ia32/ia32priv.h:290:1: warning: "elf_check_arch" redefined
In file included from include/linux/elf.h:7,
                 from include/linux/module.h:14,
                 from include/linux/ftrace.h:8,
                 from include/linux/syscalls.h:68,
                 from arch/ia64/ia32/sys_ia32.c:18:
arch/ia64/include/asm/elf.h:19:1: warning: this is the location of the previous definition
[...]

sys_ia32.c includes linux/syscalls.h which in turn includes linux/ftrace.h
to import the syscalls tracing prototypes.

But including ftrace.h can pull too much things for a low level file,
especially on ia64 where the ia32 private headers conflict with higher
level headers.

Now we isolate the syscall tracing headers in their own lightweight file.

Reported-by: Tony Luck <tony.luck@intel.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Jason Baron <jbaron@redhat.com>
Cc: "Frank Ch. Eigler" <fche@redhat.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Jiaying Zhang <jiayingz@google.com>
Cc: Michael Rubin <mrubin@google.com>
Cc: Martin Bligh <mbligh@google.com>
Cc: Michael Davidson <md@google.com>
LKML-Reference: <20090408184058.GB6017@nowhere>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-09 05:43:32 +02:00
Jeremy Fitzhardinge 3ecb1b7df9 xen: add FIX_TEXT_POKE to fixmap
FIX_TEXT_POKE[01] are used to map kernel addresses, so they're mapping
pfns, not mfns.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-04-08 17:57:19 -07:00
Jeremy Fitzhardinge 2b2a733447 xen: clean up gate trap/interrupt constants
Use GATE_INTERRUPT/TRAP rather than 0xe/f.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-04-08 14:25:50 -07:00
Jeremy Fitzhardinge bc6081ff98 xen: set _PAGE_NX in __supported_pte_mask before pagetable construction
Some 64-bit machines don't support the NX flag in ptes.
Check for NX before constructing the kernel pagetables.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-04-08 14:25:49 -07:00
Jeremy Fitzhardinge e3f8a74e3a xen/mmu: weaken flush_tlb_other test
Impact: fixes crashing bug

There's no particular problem with getting an empty cpu mask,
so just shortcut-return if we get one.

Avoids crash reported by Christophe Saout <christophe@saout.de>

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-04-08 14:25:46 -07:00
Jeremy Fitzhardinge b96229b50d xen/mmu: some early pagetable cleanups
1. make sure early-allocated ptes are pinned, so they can be later
   unpinned
2. don't pin pmd+pud, just make them RO
3. scatter some __inits around

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-04-08 14:25:45 -07:00
Alex Nixon b40bf53eff Xen: Add virt_to_pfn helper function
Signed-off-by: Alex Nixon <alex.nixon@citrix.com>
2009-04-08 11:51:46 -07:00
Jeremy Fitzhardinge 10eceebeaa x86-64: remove PGE from must-have feature list
PGE may not be available when running paravirtualized, so test the cpuid
bit before using it.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-04-08 11:51:46 -07:00
Jeremy Fitzhardinge 191216b928 xen: mask XSAVE from cpuid
Xen leaves XSAVE set in cpuid, but doesn't allow cr4.OSXSAVE
to be set.  This confuses the kernel and it ends up crashing on
an xsetbv instruction.

At boot time, try to set cr4.OSXSAVE, and mask XSAVE out of
cpuid it we can't.  This will produce a spurious error from Xen,
but allows us to support XSAVE if/when Xen does.

This also factors out the cpuid mask decisions to boot time.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-04-08 11:51:46 -07:00
Hannes Eder 1207cf8eb9 NULL noise: arch/x86/xen/smp.c
Fix this sparse warnings:
  arch/x86/xen/smp.c:316:52: warning: Using plain integer as NULL pointer
  arch/x86/xen/smp.c:421:60: warning: Using plain integer as NULL pointer

Signed-off-by: Hannes Eder <hannes@hanneseder.net>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-04-08 11:51:45 -07:00
Jeremy Fitzhardinge c667d5d6a7 xen: remove xen_load_gdt debug
Don't need the noise.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-04-08 11:51:45 -07:00
Jeremy Fitzhardinge a957fac500 xen: make xen_load_gdt simpler
Remove use of multicall machinery which is unused (gdt loading
is never performance critical).  This removes the implicit use
of percpu variables, which simplifies understanding how
the percpu code's use of load_gdt interacts with this code.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-04-08 11:51:45 -07:00
Jeremy Fitzhardinge c7da8c829b xen: clean up xen_load_gdt
Makes the logic a bit clearer.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-04-08 11:51:45 -07:00
Jeremy Fitzhardinge cdaead6b4e xen: split construction of p2m mfn tables from registration
Build the p2m_mfn_list_list early with the rest of the p2m table, but
register it later when the real shared_info structure is in place.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-04-08 11:51:44 -07:00
Jeremy Fitzhardinge e791ca0fd7 xen: separate p2m allocation from setting
When doing very early p2m setting, we need to separate setting
from allocation, so split things up accordingly.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-04-08 11:51:44 -07:00
Jeremy Fitzhardinge d6382bf77e xen: disable preempt for leave_lazy_mmu
xen_mc_flush() requires preemption to be disabled for its own sanity,
so disable it while we're flushing.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-04-08 11:51:44 -07:00
Hidetoshi Seto 59d138120d x86: smarten /proc/interrupts output for new counters
Now /proc/interrupts of tip tree has new counters:

  PLT: Platform interrupts

Format change of output, as like that by commit:

  commit 7a81d9a7da
  x86: smarten /proc/interrupts output

should be applied to these new counters too.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Jan Beulich <jbeulich@novell.com>
LKML-Reference: <49C98DEA.8060208@jp.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-08 18:06:07 +02:00
Ingo Molnar 280ff388b1 Merge commit 'v2.6.30-rc1' into x86/urgent
Merge reason: fix to be queued up depends on upstream facilities

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-08 18:04:52 +02:00
Alan Cox c5da9a2bb2 x86: DMI match for the Dell DXP061 as it needs BIOS reboot
Closes http://bugzilla.kernel.org/show_bug.cgi?12901

Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
LKML-Reference: <20090326204524.4454.8776.stgit@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-08 17:53:27 +02:00
Yinghai Lu 08d63b10db x86: make 64 bit to use default_inquire_remote_apic
Impact: restore old behavior

for flat and phys_flat

Signed-off-by: Yinhai Lu <yinghai@kernel.org.
LKML-Reference: <49DCBBF1.8080903@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-08 17:36:12 +02:00
Jack Steiner 54c28d294c x86, uv: add Kconfig dependency on NUMA for UV systems
Impact: build fix

Add Kconfig dependency on NUMA for enabling UV. Although it might
be possible to configure non-NUMA UV systems, they are unsupported
and not interesting. Much of the infrastructure for UV requires
NUMA support.

Signed-off-by: Jack Steiner <steiner@sgi.com>
LKML-Reference: <20090403203942.GA20137@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-08 14:58:11 +02:00
Russ Anderson 06aa05b307 x86: prevent /sys/firmware/sgi_uv from being created on non-uv systems
/sys/firmware/sgi_uv should only be created on uv systems.

Signed-off-by: Russ Anderson <rja@sgi.com>
LKML-Reference: <20090403222423.GA28546@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-08 14:58:10 +02:00
Len Brown 8897c18595 Merge branches 'release', 'APERF', 'ARAT', 'misc', 'kelvin', 'device-lock' and 'bjorn.notify' into release 2009-04-07 18:18:42 -04:00
Venkatesh Pallipadi db954b5898 x86 ACPI: Add support for Always Running APIC timer
Add support for Always Running APIC timer, CPUID_0x6_EAX_Bit2.
This bit means the APIC timer continues to run even when CPU is
in deep C-states.

The advantage is that we can use LAPIC timer on these CPUs
always, and there is no need for "slow to read and program"
external timers (HPET/PIT) and the timer broadcast logic
and related code in C-state entry and exit.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-04-07 18:17:51 -04:00
Venkatesh Pallipadi 18b2646fe3 ACPI x86: Make aperf/mperf MSR access in acpi_cpufreq read_only
Do not write zeroes to APERF and MPERF by ondemand governor. With this
change, other users can share these MSRs for reads.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-04-07 18:15:05 -04:00
Venkatesh Pallipadi e4f6937222 ACPI x86: Cleanup acpi_cpufreq structures related to aperf/mperf
Change structure name to make the code cleaner and simpler. No
functionality change in this patch.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-04-07 18:15:05 -04:00
Linus Torvalds c93f216b5b Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  branch tracer, intel-iommu: fix build with CONFIG_BRANCH_TRACER=y
  branch tracer: Fix for enabling branch profiling makes sparse unusable
  ftrace: Correct a text align for event format output
  Update /debug/tracing/README
  tracing/ftrace: alloc the started cpumask for the trace file
  tracing, x86: remove duplicated #include
  ftrace: Add check of sched_stopped for probe_sched_wakeup
  function-graph: add proper initialization for init task
  tracing/ftrace: fix missing include string.h
  tracing: fix incorrect return type of ns2usecs()
  tracing: remove CALLER_ADDR2 from wakeup tracer
  blktrace: fix pdu_len when tracing packet command requests
  blktrace: small cleanup in blk_msg_write()
  blktrace: NUL-terminate user space messages
  tracing: move scripts/trace/power.pl to scripts/tracing/power.pl
2009-04-07 14:10:10 -07:00
H. Peter Anvin 1e274a5827 x86, setup: un-resequence mode setting for VGA 80x34 and 80x60 modes
Impact: Fixes these modes on at least one system

The rewrite of the setup code into C resequenced the font setting and
register reprogramming phases of configuring nonstandard VGA modes
which use 480 scan lines in text mode.  However, there exists at least
one board (Micro-Star MS-7383 version 2.0) on which this resequencing
causes an unusable display.

Revert to the original sequencing: set up 480-line mode, install the
font, and then adjust the vertical end register appropriately.

This failure was masked by the fact that the 480-line setup was broken
until checkin 5f64135612 (therefore this
is not a -stable candidate bug fix.)

Reported-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-04-07 10:59:25 -07:00
Yang Hongyang 2f4f27d42a dma-mapping: replace all DMA_24BIT_MASK macro with DMA_BIT_MASK(24)
Replace all DMA_24BIT_MASK macro with DMA_BIT_MASK(24)

Signed-off-by: Yang Hongyang<yanghy@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-07 08:31:12 -07:00
Yang Hongyang 284901a90a dma-mapping: replace all DMA_32BIT_MASK macro with DMA_BIT_MASK(32)
Replace all DMA_32BIT_MASK macro with DMA_BIT_MASK(32)

Signed-off-by: Yang Hongyang<yanghy@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-07 08:31:11 -07:00
Yang Hongyang 50cf156af7 dma-mapping: replace all DMA_40BIT_MASK macro with DMA_BIT_MASK(40)
Replace all DMA_40BIT_MASK macro with DMA_BIT_MASK(40)

Signed-off-by: Yang Hongyang<yanghy@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-07 08:31:10 -07:00
Huang Weiyi 5ab8026a30 tracing, x86: remove duplicated #include
Remove duplicated #include in arch/x86/kernel/ftrace.c.

Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
LKML-Reference: <1238503291-2532-1-git-send-email-weiyi.huang@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-07 14:01:53 +02:00
David Woodhouse f7d7f866ba x86, intel-iommu: fix X2APIC && !ACPI build failure
This build failure:

| drivers/pci/dmar.c:47: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘dmar_tbl_size’
| drivers/pci/dmar.c:62: warning: ‘struct acpi_dmar_device_scope’ declared inside parameter list
| drivers/pci/dmar.c:62: warning: its scope is only this definition or declaration, which is probably not what you want

Triggers due to this commit:

  d0b03bd: x2apic/intr-remap: decouple interrupt remapping from x2apic

Which exposed a pre-existing but dormant fragility of the 'select X86_X2APIC'
it moved around and turned that fragility into a build failure.

Replace it with a proper 'depends on' construct.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
LKML-Reference: <1239084280.22733.404.camel@macbook.infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-07 08:22:46 +02:00
Huang Weiyi d226169428 ACPI: cpufreq: remove dupilcated #include
Remove dupilicated #include in arch/x86/kernel/cpu/cpufreq/longhaul.c.

Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-04-07 01:39:14 -04:00
Linus Torvalds ffa009c366 Merge git://git.infradead.org/iommu-2.6
* git://git.infradead.org/iommu-2.6:
  drivers/pci/intr_remapping.c: include acpi.h
  intel-iommu: Fix oops in device_to_iommu() when devices not found.
  intel-iommu: Handle PCI domains appropriately.
  intel-iommu: Fix device-to-iommu mapping for PCI-PCI bridges.
  x2apic/intr-remap: decouple interrupt remapping from x2apic
  x86, dmar: check if it's initialized before disable queue invalidation
  intel-iommu: set compatibility format interrupt
  Intel IOMMU Suspend/Resume Support - Interrupt Remapping
  Intel IOMMU Suspend/Resume Support - Queued Invalidation
  Intel IOMMU Suspend/Resume Support - DMAR
  intel-iommu: Add for_each_iommu() and for_each_active_iommu() macros
2009-04-06 14:26:05 -07:00
Linus Torvalds 32fb6c1756 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (140 commits)
  ACPI: processor: use .notify method instead of installing handler directly
  ACPI: button: use .notify method instead of installing handler directly
  ACPI: support acpi_device_ops .notify methods
  toshiba-acpi: remove MAINTAINERS entry
  ACPI: battery: asynchronous init
  acer-wmi: Update copyright notice & documentation
  acer-wmi: Cleanup the failure cleanup handling
  acer-wmi: Blacklist Acer Aspire One
  video: build fix
  thinkpad-acpi: rework brightness support
  thinkpad-acpi: enhanced debugging messages for the fan subdriver
  thinkpad-acpi: enhanced debugging messages for the hotkey subdriver
  thinkpad-acpi: enhanced debugging messages for rfkill subdrivers
  thinkpad-acpi: restrict access to some firmware LEDs
  thinkpad-acpi: remove HKEY disable functionality
  thinkpad-acpi: add new debug helpers and warn of deprecated atts
  thinkpad-acpi: add missing log levels
  thinkpad-acpi: cleanup debug helpers
  thinkpad-acpi: documentation cleanup
  thinkpad-acpi: drop ibm-acpi alias
  ...
2009-04-05 11:16:25 -07:00
Linus Torvalds 714f83d5d9 Merge branch 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (413 commits)
  tracing, net: fix net tree and tracing tree merge interaction
  tracing, powerpc: fix powerpc tree and tracing tree interaction
  ring-buffer: do not remove reader page from list on ring buffer free
  function-graph: allow unregistering twice
  trace: make argument 'mem' of trace_seq_putmem() const
  tracing: add missing 'extern' keywords to trace_output.h
  tracing: provide trace_seq_reserve()
  blktrace: print out BLK_TN_MESSAGE properly
  blktrace: extract duplidate code
  blktrace: fix memory leak when freeing struct blk_io_trace
  blktrace: fix blk_probes_ref chaos
  blktrace: make classic output more classic
  blktrace: fix off-by-one bug
  blktrace: fix the original blktrace
  blktrace: fix a race when creating blk_tree_root in debugfs
  blktrace: fix timestamp in binary output
  tracing, Text Edit Lock: cleanup
  tracing: filter fix for TRACE_EVENT_FORMAT events
  ftrace: Using FTRACE_WARN_ON() to check "freed record" in ftrace_release()
  x86: kretprobe-booster interrupt emulation code fix
  ...

Fix up trivial conflicts in
 arch/parisc/include/asm/ftrace.h
 include/linux/memory.h
 kernel/extable.c
 kernel/module.c
2009-04-05 11:04:19 -07:00
Linus Torvalds 90975ef712 Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-cpumask
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-cpumask: (36 commits)
  cpumask: remove cpumask allocation from idle_balance, fix
  numa, cpumask: move numa_node_id default implementation to topology.h, fix
  cpumask: remove cpumask allocation from idle_balance
  x86: cpumask: x86 mmio-mod.c use cpumask_var_t for downed_cpus
  x86: cpumask: update 32-bit APM not to mug current->cpus_allowed
  x86: microcode: cleanup
  x86: cpumask: use work_on_cpu in arch/x86/kernel/microcode_core.c
  cpumask: fix CONFIG_CPUMASK_OFFSTACK=y cpu hotunplug crash
  numa, cpumask: move numa_node_id default implementation to topology.h
  cpumask: convert node_to_cpumask_map[] to cpumask_var_t
  cpumask: remove x86 cpumask_t uses.
  cpumask: use cpumask_var_t in uv_flush_tlb_others.
  cpumask: remove cpumask_t assignment from vector_allocation_domain()
  cpumask: make Xen use the new operators.
  cpumask: clean up summit's send_IPI functions
  cpumask: use new cpumask functions throughout x86
  x86: unify cpu_callin_mask/cpu_callout_mask/cpu_initialized_mask/cpu_sibling_setup_mask
  cpumask: convert struct cpuinfo_x86's llc_shared_map to cpumask_var_t
  cpumask: convert node_to_cpumask_map[] to cpumask_var_t
  x86: unify 32 and 64-bit node_to_cpumask_map
  ...
2009-04-05 10:33:07 -07:00
Len Brown 478c6a43fc Merge branch 'linus' into release
Conflicts:
	arch/x86/kernel/cpu/cpufreq/longhaul.c

Signed-off-by: Len Brown <len.brown@intel.com>
2009-04-05 02:14:15 -04:00
Len Brown 8a3f257c70 Merge branch 'misc' into release 2009-04-05 01:52:07 -04:00
Len Brown 33526a5360 Merge branch 'x2apic' into release 2009-04-05 01:51:51 -04:00
Han, Weidong d0b03bd1c6 x2apic/intr-remap: decouple interrupt remapping from x2apic
interrupt remapping must be enabled before enabling x2apic, but
interrupt remapping doesn't depend on x2apic, it can be used
separately. Enable interrupt remapping in init_dmars even x2apic
is not supported.

[dwmw2: Update Kconfig accordingly, fix build with INTR_REMAP && !X2APIC]

Signed-off-by: Weidong Han <weidong.han@intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-04-04 10:42:28 +01:00
Linus Torvalds 6bb597507f Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, mtrr: remove debug message
  x86: disable stack-protector for __restore_processor_state()
  x86: fix is_io_mapping_possible() build warning on i386 allnoconfig
  x86, setup: compile with -DDISABLE_BRANCH_PROFILING
  x86/dma: unify definition of pci_unmap_addr* and pci_unmap_len macros
  x86, mm: fix misuse of debug_kmap_atomic
  x86: remove duplicated code with pcpu_need_numa()
  x86,percpu: fix inverted NUMA test in setup_pcpu_remap()
  x86: signal: check sas_ss_size instead of sas_ss_flags()
2009-04-03 17:36:21 -07:00
Suresh Siddha 7237d3de78 x86, ACPI: add support for x2apic ACPI extensions
All logical processors with APIC ID values of 255 and greater will have their
APIC reported through Processor X2APIC structure (type-9 entry type) and all
logical processors with APIC ID less than 255 will have their APIC reported
through legacy Processor Local APIC (type-0 entry type) only. This is the
same case even for NMI structure reporting.
    
The Processor X2APIC Affinity structure provides the association between the
X2APIC ID of a logical processor and the proximity domain to which the logical
processor belongs.
    
For OSPM, Procssor IDs outside the 0-254 range are to be declared as Device()
objects in the ACPI namespace.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-04-03 20:08:12 -04:00
Ingo Molnar c5c67c7cba x86, mtrr: remove debug message
The MTRR code grew a new debug message which triggers commonly:

[   40.142276]   get_mtrr: cpu0 reg00 base=0000000000 size=0000080000 write-back
[   40.142280]   get_mtrr: cpu0 reg01 base=0000080000 size=0000040000 write-back
[   40.142284]   get_mtrr: cpu0 reg02 base=0000100000 size=0000040000 write-back
[   40.142311]   get_mtrr: cpu0 reg00 base=0000000000 size=0000080000 write-back
[   40.142314]   get_mtrr: cpu0 reg01 base=0000080000 size=0000040000 write-back
[   40.142317]   get_mtrr: cpu0 reg02 base=0000100000 size=0000040000 write-back

Remove this annoyance.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-04 00:31:02 +02:00
Linus Torvalds 811158b147 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (28 commits)
  trivial: Update my email address
  trivial: NULL noise: drivers/mtd/tests/mtd_*test.c
  trivial: NULL noise: drivers/media/dvb/frontends/drx397xD_fw.h
  trivial: Fix misspelling of "Celsius".
  trivial: remove unused variable 'path' in alloc_file()
  trivial: fix a pdlfush -> pdflush typo in comment
  trivial: jbd header comment typo fix for JBD_PARANOID_IOFAIL
  trivial: wusb: Storage class should be before const qualifier
  trivial: drivers/char/bsr.c: Storage class should be before const qualifier
  trivial: h8300: Storage class should be before const qualifier
  trivial: fix where cgroup documentation is not correctly referred to
  trivial: Give the right path in Documentation example
  trivial: MTD: remove EOL from MODULE_DESCRIPTION
  trivial: Fix typo in bio_split()'s documentation
  trivial: PWM: fix of #endif comment
  trivial: fix typos/grammar errors in Kconfig texts
  trivial: Fix misspelling of firmware
  trivial: cgroups: documentation typo and spelling corrections
  trivial: Update contact info for Jochen Hein
  trivial: fix typo "resgister" -> "register"
  ...
2009-04-03 15:24:35 -07:00
Suresh Siddha 5a3ae27605 x86, PAT: Remove duplicate memtype reserve in pci mmap
pci mmap code was doing memtype reserve for a while now. Recently we
added memtype tracking in remap_pfn_range, and pci code indirectly calls
remap_pfn_range. So, we don't need seperate tracking in pci code
anymore. Which means a patch that removes ~50 lines of code :-).

Also, recently we found out that the pci tracking is not working as we expect
it to work in some cases. Specifically, userlevel X mmap of pci, with some
recent version of X, is having a problem with vm_page_prot getting reset.
The pci tracking uses vm_page_prot to pass on the protection type from parent
to child during fork.
a) Parent does a pci mmap
b) We look at PAT and get either UC_MINUS or WC mapping for parent
c) Store that mapping type in vma vm_page_prot for future use
d) This thread does a fork
e) Fork results in mmap_ops ->open for the child process
f) We get the vm_page_prot from vma and reserve that type for the child process

But, between c) and e) above, the vma vm_page_prot is getting reset to zero.
This results in PAT reserve failing at the time of fork as in here.
http://marc.info/?l=linux-kernel&m=123858163103240&w=2

This cleanup makes the above problem go away as we do not depend on
vm_page_prot in our PAT code anymore.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-03 14:43:29 -07:00
Fenghua Yu b24696bc55 Intel IOMMU Suspend/Resume Support - Interrupt Remapping
This patch enables suspend/resume for interrupt remapping. During suspend,
interrupt remapping is disabled. When resume, interrupt remapping is enabled
again.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-04-03 21:45:59 +01:00
Jack Steiner 6a891a24e4 x86, UV: Fix for nodes with memory and no cpus
Fix initialization of UV blade information for systems that have
nodes with memory but no cpus.

Signed-off-by: Jack Steiner <steiner@sgi.com>
LKML-Reference: <20090330140111.GA18461@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-03 19:49:58 +02:00
Joseph Cihula 9b7b89efa3 x86: disable stack-protector for __restore_processor_state()
The __restore_processor_state() fn restores %gs on resume from S3.  As
such, it cannot be protected by the stack-protector guard since %gs will
not be correct on function entry.

There are only a few other fns in this file and it should not negatively
impact kernel security that they will also have the stack-protector
guard removed (and so it's not worth moving them to another file).

Without this change, S3 resume on a kernel built with
CONFIG_CC_STACKPROTECTOR_ALL=y will fail.

Signed-off-by: Joseph Cihula <joseph.cihula@intel.com>
Tested-by: Chris Wright <chrisw@sous-sol.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Tejun Heo <tj@kernel.org>
LKML-Reference: <49D13385.5060900@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-03 19:48:41 +02:00
Linus Torvalds ca1ee219c0 Merge git://git.infradead.org/iommu-2.6
* git://git.infradead.org/iommu-2.6:
  intel-iommu: Fix address wrap on 32-bit kernel.
  intel-iommu: Enable DMAR on 32-bit kernel.
  intel-iommu: fix PCI device detach from virtual machine
  intel-iommu: VT-d page table to support snooping control bit
  iommu: Add domain_has_cap iommu_ops
  intel-iommu: Snooping control support

Fixed trivial conflicts in arch/x86/Kconfig and drivers/pci/intel-iommu.c
2009-04-03 10:36:57 -07:00
Russ Anderson 1a544e659c x86, UV: system table in bios accessed after unmap
Use the copy of UV system table in kernel memory, not the one in
bios after unmapping.

Signed-off-by: Russ Anderson <rja@sgi.com>
LKML-Reference: <20090330225240.GA22776@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-03 19:25:57 +02:00
Akinobu Mita a0e0404fb0 mm: fix misuse of debug_kmap_atomic
Commit 7ca43e7564 ("mm: use debug_kmap_atomic")
introduced some debug_kmap_atomic() in wrong places.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-03 09:49:41 -07:00
Andrew Morton 6a491e2e3e x86: fix is_io_mapping_possible() build warning on i386 allnoconfig
i386 allnoconfig:

 arch/x86/mm/iomap_32.c: In function 'is_io_mapping_possible':
 arch/x86/mm/iomap_32.c:27: warning: comparison is always false due to limited range of data type

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-03 18:39:05 +02:00
Cliff Wickman c4c4688f72 x86: UV BAU messaging timeouts
This patch replaces a 'nop' uv_enable_timeouts() in the
UV TLB shootdown code. (somehow, long ago that function got
eviscerated)

If any cpu in the destination node does not get interrupted by the
message and post completion in a reasonable time the hardware
should respond to the sender with an error.  This function
enables such timeouts.

Tested on the UV hardware simulator.

Signed-off-by: Cliff Wickman <cpw@sgi.com>
LKML-Reference: <E1LpjXU-00007e-Qh@eag09.americas.sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-03 18:25:27 +02:00
Cliff Wickman 9674f35b1e x86: UV BAU and nodes with no memory
This patch fixes BAU initialization for systems containing
nodes with no memory and for systems with non-consecutive
node numbers.

Fixes and clarifies situations where pnode should be used instead
of node id.

Tested on the UV hardware simulator.

Signed-off-by: Cliff Wickman <cpw@sgi.com>
LKML-Reference: <E1LpjX3-00007N-12@eag09.americas.sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-03 18:25:26 +02:00
Zhang Rui 5b4c0b6fff ACPI: update comment
update ACPI Development Discussion List

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-04-03 12:05:14 -04:00
Ingo Molnar 484cad34dd Merge branch 'dma-debug' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommu into x86/urgent 2009-04-03 16:35:09 +02:00
H. Peter Anvin 95a38f3463 x86, setup: compile with -DDISABLE_BRANCH_PROFILING
Impact: code size reduction (possibly critical)

The x86 boot and decompression code has no use of the branch profiling
constructs, so disable them.  This would bloat the setup code by as
much as 14K, eating up a fairly large chunk of the 32K area we are
guaranteed to have.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-03 16:34:45 +02:00
Joerg Roedel 67796bf7dc x86/dma: unify definition of pci_unmap_addr* and pci_unmap_len macros
Impact: unification of pci-dma macros and pci_32.h removal

This patch unifies the definition of the pci_unmap_addr*, pci_unmap_len*
and DECLARE_PCI_UNMAP* macros. This makes sense because the pci_unmap
functions are no longer no-ops anymore when the kernel runs with
CONFIG_DMA_API_DEBUG. Without an iommu or DMA_API_DEBUG it is a no-op on 32 bit
because the dma mapping path returns a physical address and therefore the
dma-api implementation has no internal state which needs to be destroyed with
an unmap call.
This unification also simplifies the port of x86_64 iommu drivers to 32 bit x86
and let us get rid of pci_32.h.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
2009-04-03 13:13:45 +02:00
Robin Holt f5f7eac41d Allow rwlocks to re-enable interrupts
Pass the original flags to rwlock arch-code, so that it can re-enable
interrupts if implemented for that architecture.

Initially, make __raw_read_lock_flags and __raw_write_lock_flags stubs
which just do the same thing as non-flags variants.

Signed-off-by: Petr Tesarik <ptesarik@suse.cz>
Signed-off-by: Robin Holt <holt@sgi.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <linux-arch@vger.kernel.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02 19:05:11 -07:00
Gerd Hoffmann f3554f4bc6 preadv/pwritev: Add preadv and pwritev system calls.
This patch adds preadv and pwritev system calls.  These syscalls are a
pretty straightforward combination of pread and readv (same for write).
They are quite useful for doing vectored I/O in threaded applications.
Using lseek+readv instead opens race windows you'll have to plug with
locking.

Other systems have such system calls too, for example NetBSD, check
here: http://www.daemon-systems.org/man/preadv.2.html

The application-visible interface provided by glibc should look like
this to be compatible to the existing implementations in the *BSD family:

  ssize_t preadv(int d, const struct iovec *iov, int iovcnt, off_t offset);
  ssize_t pwritev(int d, const struct iovec *iov, int iovcnt, off_t offset);

This prototype has one problem though: On 32bit archs is the (64bit)
offset argument unaligned, which the syscall ABI of several archs doesn't
allow to do.  At least s390 needs a wrapper in glibc to handle this.  As
we'll need a wrappers in glibc anyway I've decided to push problem to
glibc entriely and use a syscall prototype which works without
arch-specific wrappers inside the kernel: The offset argument is
explicitly splitted into two 32bit values.

The patch sports the actual system call implementation and the windup in
the x86 system call tables.  Other archs follow as separate patches.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <linux-api@vger.kernel.org>
Cc: <linux-arch@vger.kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02 19:05:08 -07:00
Jack Steiner 66666e50fc sgi-gru: add macros for using the UV hub to send interrupts
Add macros for using the UV hub to send interrupts.  Change the IPI code
to use these macros.  These macros will also be used in additional patches
that will follow.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02 19:05:05 -07:00
Jack Steiner a4c3155719 sgi-gru: add definitions of x86_64 GRU MMRs
Add definitions for x86_64 GRU MMRs.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02 19:05:05 -07:00
Jack Steiner bc5d9940e8 sgi-gru: exclude UV definitions on 32-bit x86
Eliminate compile errors on 32-bit X86 caused by UV.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02 19:05:05 -07:00
Oleg Nesterov 43918f2bf4 signals: remove 'handler' parameter to tracehook functions
Container-init must behave like global-init to processes within the
container and hence it must be immune to unhandled fatal signals from
within the container (i.e SIG_DFL signals that terminate the process).

But the same container-init must behave like a normal process to processes
in ancestor namespaces and so if it receives the same fatal signal from a
process in ancestor namespace, the signal must be processed.

Implementing these semantics requires that send_signal() determine pid
namespace of the sender but since signals can originate from workqueues/
interrupt-handlers, determining pid namespace of sender may not always be
possible or safe.

This patchset implements the design/simplified semantics suggested by
Oleg Nesterov.  The simplified semantics for container-init are:

	- container-init must never be terminated by a signal from a
	  descendant process.

	- container-init must never be immune to SIGKILL from an ancestor
	  namespace (so a process in parent namespace must always be able
	  to terminate a descendant container).

	- container-init may be immune to unhandled fatal signals (like
	  SIGUSR1) even if they are from ancestor namespace. SIGKILL/SIGSTOP
	  are the only reliable signals to a container-init from ancestor
	  namespace.

This patch:

Based on an earlier patch submitted by Oleg Nesterov and comments from
Roland McGrath (http://lkml.org/lkml/2008/11/19/258).

The handler parameter is currently unused in the tracehook functions.
Besides, the tracehook functions are called with siglock held, so the
functions can check the handler if they later need to.

Removing the parameter simiplifies changes to sig_ignored() in a follow-on
patch.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Daniel Lezcano <daniel.lezcano@free.fr>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02 19:04:58 -07:00
Alexey Dobriyan 6f2c55b843 Simplify copy_thread()
First argument unused since 2.3.11.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02 19:04:51 -07:00
Akinobu Mita ee3b4290ae generic debug pagealloc: build fix
This fixes a build failure with generic debug pagealloc:

  mm/debug-pagealloc.c: In function 'set_page_poison':
  mm/debug-pagealloc.c:8: error: 'struct page' has no member named 'debug_flags'
  mm/debug-pagealloc.c: In function 'clear_page_poison':
  mm/debug-pagealloc.c:13: error: 'struct page' has no member named 'debug_flags'
  mm/debug-pagealloc.c: In function 'page_poison':
  mm/debug-pagealloc.c:18: error: 'struct page' has no member named 'debug_flags'
  mm/debug-pagealloc.c: At top level:
  mm/debug-pagealloc.c:120: error: redefinition of 'kernel_map_pages'
  include/linux/mm.h:1278: error: previous definition of 'kernel_map_pages' was here
  mm/debug-pagealloc.c: In function 'kernel_map_pages':
  mm/debug-pagealloc.c:122: error: 'debug_pagealloc_enabled' undeclared (first use in this function)

by fixing

 - debug_flags should be in struct page
 - define DEBUG_PAGEALLOC config option for all architectures

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02 19:04:48 -07:00
Akinobu Mita a7f8c50d90 x86, mm: fix misuse of debug_kmap_atomic
Impact: fix CONFIG_DEBUG_HIGHMEM=y breakage

Commit 7ca43e756 ("mm: use debug_kmap_atomic") introduced some
debug_kmap_atomic() calls in the wrong places.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <20090402070126.GA3951@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-02 16:37:04 +02:00
Ingo Molnar 83f2f0ed71 Merge branch 'linus' into x86/urgent
Merge needed to go past commit 7ca43e756 (mm: use debug_kmap_atomic)
and fix it.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-02 16:33:51 +02:00
Yinghai Lu 3de46fda4c x86: remove duplicated code with pcpu_need_numa()
Impact: clean up

those code pcpu_need_numa(), should be removed.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: David Miller <davem@davemloft.net>
LKML-Reference: <49D31770.9090502@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-02 06:08:05 +02:00
Tejun Heo eb12ce60c8 x86,percpu: fix inverted NUMA test in setup_pcpu_remap()
setup_percpu_remap() is for NUMA machines yet it bailed out with
-EINVAL if pcpu_need_numa().  Fix the inverted condition.

This problem was reported by David Miller and verified by Yinhai Lu.

Reported-by: David Miller <davem@davemloft.net>
Reported-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
LKML-Reference: <49D30469.8020006@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-02 06:08:05 +02:00
Ingo Molnar 8302294f43 Merge branch 'tracing/core-v2' into tracing-for-linus
Conflicts:
	include/linux/slub_def.h
	lib/Kconfig.debug
	mm/slob.c
	mm/slub.c
2009-04-02 00:49:02 +02:00
Linus Torvalds ef5ddd3d59 Merge branch 'x86-setup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-setup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, setup: guard against pre-ACPI 3 e820 code not updating %ecx
2009-04-01 12:52:57 -07:00
H. Peter Anvin cd670599b7 x86, setup: guard against pre-ACPI 3 e820 code not updating %ecx
Impact: BIOS bug safety

For pre-ACPI 3 BIOSes, pre-initialize the end of the e820 buffer just
in case the BIOS returns an unchanged %ecx but without actually
touching the ACPI 3 extended flags field.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-04-01 11:35:00 -07:00
Linus Torvalds 9c9cb14387 Merge branch 'x86/setup' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86/setup' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, setup: ACPI 3, BIOS workaround for E820-probing code
  x86, setup: preemptively save/restore edi and ebp around INT 15 E820
  x86, setup: mark %esi as clobbered in E820 BIOS call
2009-04-01 11:13:31 -07:00
Linus Torvalds e76e5b2c66 Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (88 commits)
  PCI: fix HT MSI mapping fix
  PCI: don't enable too much HT MSI mapping
  x86/PCI: make pci=lastbus=255 work when acpi is on
  PCI: save and restore PCIe 2.0 registers
  PCI: update fakephp for bus_id removal
  PCI: fix kernel oops on bridge removal
  PCI: fix conflict between SR-IOV and config space sizing
  powerpc/PCI: include pci.h in powerpc MSI implementation
  PCI Hotplug: schedule fakephp for feature removal
  PCI Hotplug: rename legacy_fakephp to fakephp
  PCI Hotplug: restore fakephp interface with complete reimplementation
  PCI: Introduce /sys/bus/pci/devices/.../rescan
  PCI: Introduce /sys/bus/pci/devices/.../remove
  PCI: Introduce /sys/bus/pci/rescan
  PCI: Introduce pci_rescan_bus()
  PCI: do not enable bridges more than once
  PCI: do not initialize bridges more than once
  PCI: always scan child buses
  PCI: pci_scan_slot() returns newly found devices
  PCI: don't scan existing devices
  ...

Fix trivial append-only conflict in Documentation/feature-removal-schedule.txt
2009-04-01 09:47:12 -07:00
Magnus Damm bf9ed57d35 pm: cleanup includes
Remove unused/duplicate cruft from asm/suspend.h:

 - x86_32: remove unused acpi code
 - powerpc: remove duplicate prototypes, see linux/suspend.h

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Cc: Paul Mundt <lethal@linux-sh.org>
Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-01 08:59:16 -07:00
Magnus Damm a8af78982f pm: rework includes, remove arch ifdefs
Make the following header file changes:

 - remove arch ifdefs and asm/suspend.h from linux/suspend.h
 - add asm/suspend.h to disk.c (for arch_prepare_suspend())
 - add linux/io.h to swsusp.c (for ioremap())
 - x86 32/64 bit compile fixes

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Cc: Paul Mundt <lethal@linux-sh.org>
Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-01 08:59:16 -07:00
Akinobu Mita 7ca43e7564 mm: use debug_kmap_atomic
Use debug_kmap_atomic in kmap_atomic, kmap_atomic_pfn, and
iomap_atomic_prot_pfn.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-01 08:59:14 -07:00
Akinobu Mita f4112de6b6 mm: introduce debug_kmap_atomic
x86 has debug_kmap_atomic_prot() which is error checking function for
kmap_atomic.  It is usefull for the other architectures, although it needs
CONFIG_TRACE_IRQFLAGS_SUPPORT.

This patch exposes it to the other architectures.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-01 08:59:14 -07:00
Akinobu Mita 6a11f75b6a generic debug pagealloc
CONFIG_DEBUG_PAGEALLOC is now supported by x86, powerpc, sparc64, and
s390.  This patch implements it for the rest of the architectures by
filling the pages with poison byte patterns after free_pages() and
verifying the poison patterns before alloc_pages().

This generic one cannot detect invalid page accesses immediately but
invalid read access may cause invalid dereference by poisoned memory and
invalid write access can be detected after a long delay.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-01 08:59:13 -07:00
Hiroshi Shimamoto 0f8f308925 x86: signal: check sas_ss_size instead of sas_ss_flags()
Impact: fix redundant and incorrect check

Oleg Nesterov noticed wrt commit:

  14fc9fb: x86: signal: check signal stack overflow properly

>> No need to check SA_ONSTACK if we're already using alternate signal stack.
>
> Yes, but this also mean that we don't need sas_ss_flags() under
> "if (!onsigstack)",

Checking on_sig_stack() in sas_ss_flags() at get_sigframe() is redundant
and not correct on 64 bit. To check sas_ss_size is enough.

Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Cc: roland@redhat.com
LKML-Reference: <49CBB54C.5080201@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-01 17:13:17 +02:00
Ingo Molnar 8b54e45b00 Merge branches 'tracing/docs', 'tracing/filters', 'tracing/ftrace', 'tracing/kprobes', 'tracing/blktrace-v2' and 'tracing/textedit' into tracing/core-v2 2009-03-31 17:46:40 +02:00
Rusty Russell 558f6ab910 Merge branch 'cpumask-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
Conflicts:

	arch/x86/include/asm/topology.h
	drivers/oprofile/buffer_sync.c
(Both cases: changed in Linus' tree, removed in Ingo's).
2009-03-31 13:33:50 +10:30
Linus Torvalds d17abcd541 Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-cpumask
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-cpumask:
  oprofile: Thou shalt not call __exit functions from __init functions
  cpumask: remove the now-obsoleted pcibus_to_cpumask(): generic
  cpumask: remove cpumask_t from core
  cpumask: convert rcutorture.c
  cpumask: use new cpumask_ functions in core code.
  cpumask: remove references to struct irqaction's mask field.
  cpumask: use mm_cpumask() wrapper: kernel/fork.c
  cpumask: use set_cpu_active in init/main.c
  cpumask: remove node_to_first_cpu
  cpumask: fix seq_bitmap_*() functions.
  cpumask: remove dangerous CPU_MASK_ALL_PTR, &CPU_MASK_ALL
2009-03-30 18:00:26 -07:00
Linus Torvalds db6f204019 Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-lguest-and-virtio
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-lguest-and-virtio:
  lguest: barrier me harder
  lguest: use bool instead of int
  lguest: use KVM hypercalls
  lguest: wire up pte_update/pte_update_defer
  lguest: fix spurious BUG_ON() on invalid guest stack.
  virtio: more neatening of virtio_ring macros.
  virtio: fix BAD_RING, START_US and END_USE macros
2009-03-30 17:57:39 -07:00
Linus Torvalds cf2f7d7c90 Merge branch 'proc-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc
* 'proc-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc:
  Revert "proc: revert /proc/uptime to ->read_proc hook"
  proc 2/2: remove struct proc_dir_entry::owner
  proc 1/2: do PDE usecounting even for ->read_proc, ->write_proc
  proc: fix sparse warnings in pagemap_read()
  proc: move fs/proc/inode-alloc.txt comment into a source file
2009-03-30 16:06:04 -07:00
Linus Torvalds 53d8f67082 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
  PCI PM: Make pci_prepare_to_sleep() disable wake-up if needed
  radeonfb: Use __pci_complete_power_transition()
  PCI PM: Introduce __pci_[start|complete]_power_transition() (rev. 2)
  PCI PM: Restore config spaces of all devices during early resume
  PCI PM: Make pci_set_power_state() handle devices with no PM support
  PCI PM: Put devices into low power states during late suspend (rev. 2)
  PCI PM: Move pci_restore_standard_config to pci-driver.c
  PCI PM: Use pci_set_power_state during early resume
  PCI PM: Consistently use variable name "error" for pm call return values
  kexec: Change kexec jump code ordering
  PM: Change hibernation code ordering
  PM: Change suspend code ordering
  PM: Rework handling of interrupts during suspend-resume
  PM: Introduce functions for suspending and resuming device interrupts
2009-03-30 15:12:14 -07:00
Ingo Molnar 65fb0d23fc Merge branch 'linus' into cpumask-for-linus
Conflicts:
	arch/x86/kernel/cpu/common.c
2009-03-30 23:53:32 +02:00
Alexey Dobriyan 99b7623380 proc 2/2: remove struct proc_dir_entry::owner
Setting ->owner as done currently (pde->owner = THIS_MODULE) is racy
as correctly noted at bug #12454. Someone can lookup entry with NULL
->owner, thus not pinning enything, and release it later resulting
in module refcount underflow.

We can keep ->owner and supply it at registration time like ->proc_fops
and ->data.

But this leaves ->owner as easy-manipulative field (just one C assignment)
and somebody will forget to unpin previous/pin current module when
switching ->owner. ->proc_fops is declared as "const" which should give
some thoughts.

->read_proc/->write_proc were just fixed to not require ->owner for
protection.

rmmod'ed directories will be empty and return "." and ".." -- no harm.
And directories with tricky enough readdir and lookup shouldn't be modular.
We definitely don't want such modular code.

Removing ->owner will also make PDE smaller.

So, let's nuke it.

Kudos to Jeff Layton for reminding about this, let's say, oversight.

http://bugzilla.kernel.org/show_bug.cgi?id=12454

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2009-03-31 01:14:44 +04:00
Linus Torvalds 712b0006bf Merge branch 'iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (60 commits)
  dma-debug: make memory range checks more consistent
  dma-debug: warn of unmapping an invalid dma address
  dma-debug: fix dma_debug_add_bus() definition for !CONFIG_DMA_API_DEBUG
  dma-debug/x86: register pci bus for dma-debug leak detection
  dma-debug: add a check dma memory leaks
  dma-debug: add checks for kernel text and rodata
  dma-debug: print stacktrace of mapping path on unmap error
  dma-debug: Documentation update
  dma-debug: x86 architecture bindings
  dma-debug: add function to dump dma mappings
  dma-debug: add checks for sync_single_sg_*
  dma-debug: add checks for sync_single_range_*
  dma-debug: add checks for sync_single_*
  dma-debug: add checking for [alloc|free]_coherent
  dma-debug: add add checking for map/unmap_sg
  dma-debug: add checking for map/unmap_page/single
  dma-debug: add core checking functions
  dma-debug: add debugfs interface
  dma-debug: add kernel command line parameters
  dma-debug: add initialization code
  ...

Fix trivial conflicts due to whitespace changes in arch/x86/kernel/pci-nommu.c
2009-03-30 13:41:00 -07:00
Rafael J. Wysocki 2ed8d2b3a8 PM: Rework handling of interrupts during suspend-resume
Use the functions introduced in by the previous patch,
suspend_device_irqs(), resume_device_irqs() and check_wakeup_irqs(),
to rework the handling of interrupts during suspend (hibernation) and
resume.  Namely, interrupts will only be disabled on the CPU right
before suspending sysdevs, while device drivers will be prevented
from receiving interrupts, with the help of the new helper function,
before their "late" suspend callbacks run (and analogously during
resume).

In addition, since the device interrups are now disabled before the
CPU has turned all interrupts off and the CPU will ACK the interrupts
setting the IRQ_PENDING bit for them, check in sysdev_suspend() if
any wake-up interrupts are pending and abort suspend if that's the
case.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Ingo Molnar <mingo@elte.hu>
2009-03-30 21:46:54 +02:00
Linus Torvalds 019abbc870 Merge branch 'x86-stage-3-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-stage-3-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (190 commits)
  Revert "cpuacct: reduce one NULL check in fast-path"
  Revert "x86: don't compile vsmp_64 for 32bit"
  x86: Correct behaviour of irq affinity
  x86: early_ioremap_init(), use __fix_to_virt(), because we are sure it's safe
  x86: use default_cpu_mask_to_apicid for 64bit
  x86: fix set_extra_move_desc calling
  x86, PAT, PCI: Change vma prot in pci_mmap to reflect inherited prot
  x86/dmi: fix dmi_alloc() section mismatches
  x86: e820 fix various signedness issues in setup.c and e820.c
  x86: apic/io_apic.c define msi_ir_chip and ir_ioapic_chip all the time
  x86: irq.c keep CONFIG_X86_LOCAL_APIC interrupts together
  x86: irq.c use same path for show_interrupts
  x86: cpu/cpu.h cleanup
  x86: Fix a couple of sparse warnings in arch/x86/kernel/apic/io_apic.c
  Revert "x86: create a non-zero sized bm_pte only when needed"
  x86: pci-nommu.c cleanup
  x86: io_delay.c cleanup
  x86: rtc.c cleanup
  x86: i8253 cleanup
  x86: kdebugfs.c cleanup
  ...
2009-03-30 11:38:31 -07:00