Commit graph

16448 commits

Author SHA1 Message Date
Satyam Sharma 7b38493501 x86: intel_cacheinfo misc section annotation fixes
cache_shared_cpu_map_setup() and cache_remove_shared_cpu_map()
are functions called from another function that is __cpuinit.  But the
!CONFIG_SMP empty-body stubs of these functions are unconditionally
marked __init, which is actively wrong, and will lead to oops.  But we
never saw this oops, because they always managed to get inlined in their
callsites, by virtue of being empty-body stubs!  They should still be
__cpuinit, of course.

assocs[], levels[] and types[] are only referenced from function that is
__cpuinit.  So these are candidates for being marked __cpuinitdata.

[akpm@linux-foundation.org: build fix]
Signed-off-by: Satyam Sharma <satyam@infradead.org>
Cc: Andi Kleen <ak@suse.de>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17 20:15:49 +02:00
Yinghai Lu f6855f7fb2 x86: use dev_to_node() to get node for device in dma_alloc_pages()
use dev_to_node() to get node for device in dma_alloc_pages().

Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Acked-by: Christoph Lameter <clameter@sgi.com>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17 20:15:47 +02:00
Michal Schmidt 8f8ae1a7d4 x86: pci use pci=bfsort for HP DL385 G2 and DL585 G2
HP ProLiant systems DL385 G2 and DL585 G2 need pci=bfsort to enumerate PCI
devices in the expected order.

Matt sayeth:

  biosdevname is a userspace app I wrote to help solve this so we don't need
  to patch the kernel for future systems.  It's not integrated into any
  distributions properly yet, but is included in openSUSE 10.3 and Fedora 8
  for people who want to download and install it there.  It acts as a udev
  helper.

  For the time being, patching the kernel is necessary.  I really hope
  biosdevname eliminates that need in future distributions.

  http://linux.dell.com/biosdevname/

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Cc: <john.cagle@hp.com>
Cc: Matt Domsch <Matt_Domsch@dell.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Cc: Greg KH <greg@kroah.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17 20:15:46 +02:00
Muli Ben-Yehuda d588ba8c09 x86: Calgary: fix disable busnum for CalIOC2
The old check we used based on dev->bus->number is wrong for devices on
CalIOC2.  Instead look whether we have an IOMMU table for that bus - if
not, translation is disabled.

Thanks to Murillo Fernandes Bernardes <bernarde@br.ibm.com> for
spotting, suggesting a fix and testing.

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Acked-by: Murillo Fernandes Bernardes <bernarde@br.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17 20:15:45 +02:00
Huang, Ying 84e0fdb175 x86: NX bit handling in change_page_attr()
This patch fixes a bug of change_page_attr/change_page_attr_addr on
Intel x86_64 CPUs.  After changing page attribute to be executable with
these functions, the page remains un-executable on Intel x86_64 CPU.
Because on Intel x86_64 CPU, only if the "NX" bits of all four level
page tables are cleared, the corresponding page is executable (refer to
section 4.13.2 of Intel 64 and IA-32 Architectures Software Developer's
Manual).  So, the bug is fixed through clearing the "NX" bit of PMD when
splitting the huge PMD.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17 20:15:43 +02:00
James Bottomley 87cde760ab x86: voyager don't try to support uniprocessor builds
A while ago Randy Dunlap and Adrian Bunk suggested we simply prevent UP
voyager building.  I resisted this on the grounds that the nagging was the
only thing that was going to cause me to look at this.  However, now I
think we should probably take this course.

Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17 20:15:42 +02:00
Prarit Bhargava 27eb0b288f x86: stop nmi softlockup warnings in show_mem()
When dumping memory via sysrq-m it is possible to take a bogus NMI
watchdog or softlockup watchdog because the dump can take a long time on
big memory systems.

Occasionally tickle the watchdog when doing the dump.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2007-10-17 20:15:41 +02:00
Ingo Molnar 509a80c49c x86: fix CONFIG_PAGEALLOC related boot hangs/OOMs
if CONFIG_PAGEALLOC is enabled then X86_FEATURE_PSE is disabled and all
the kernel physical RAM pagetables are set up as 4K pages. This is
needed so that CONFIG_PAGEALLOC can do finegrained mapping and unmapping
of pages.

as a side-effect though, the total size of memory allocated as kernel
pagetables increases significantly. All these pagetables are allocated
via alloc_bootmem_low_pages(), straight out of the lowmem DMA pool. If
the system has enough RAM and a large kernel image then almost all of
the 16 MB lowmem DMA pool is allocated to the image and to pagetables -
leaving no space for __GFP_DMA allocations.

this results in drivers failing and the bootup hanging:

 swapper invoked oom-killer: gfp_mask=0x80d1, order=0, oomkilladj=0
  [<4015059f>] out_of_memory+0x17f/0x1c0
  [<40151f3c>] __alloc_pages+0x37c/0x3a0
  [<40168cd7>] slob_new_page+0x37/0x50
  [<40168dff>] slob_alloc+0x10f/0x190
  [<40169010>] __kmalloc_node+0x80/0x90
  [<405a17e3>] scsi_host_alloc+0x33/0x2c0
  [<405a1a82>] scsi_register+0x12/0x60
  [<40d5889e>] aha1542_detect+0x9e/0x940
  [<405c5ba5>] ultrastor_detect+0x265/0x5f0
  [<401352f5>] getnstimeofday+0x35/0xf0
  [<40d58751>] init_this_scsi_driver+0x41/0xf0
  [<40d0b856>] kernel_init+0x136/0x310
  [<40d58710>] init_this_scsi_driver+0x0/0xf0
  [<40d0b720>] kernel_init+0x0/0x310
  [<40105547>] kernel_thread_helper+0x7/0x10
  =======================

the fix is to first allocate from above the DMA pool, and if that fails
(for example due to it being a machine with less than 16 MB of RAM),
allocate from the DMA pool as a fallback.

With this fix applied i was able to boot a PAGEALLOC=y kernel that would
hang before.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17 20:15:39 +02:00
Ingo Molnar 1e3e19723e x86: prepare page allocator for high allocations on PAGEALLOC=y
To preserve the DMA pool in CONFIG_DEBUG_PAGEALLOC=y kernels, we'll
allocate pagetables from above the 16MB DMA limit, so we'll have to set
up boot pagetables to cover 16MB more RAM (worst-case).

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17 20:15:38 +02:00
Ingo Molnar f97586b610 x86: do not crash on non-Geode PCs in TSC probe
with this fix Geode kernels can be booted (and QA-ed) on generic PCs.

otherwise it crashes and burns during early bootup:

Detected 2160.212 MHz processor.
general protection fault: 0000 [#1]
PREEMPT SMP
Modules linked in:
CPU:    0
EIP:    0060:[<c09071f6>]    Not tainted VLI
EFLAGS: 00010002   (2.6.23-rc9 #90)
EIP is at tsc_init+0xa6/0x150
eax: 00000001   ebx: c1dce000   ecx: 00001900   edx: 00000001
esi: 00051000   edi: 00051000   ebp: c08fdfc4   esp: c08fdfa4
ds: 007b   es: 007b   fs: 00d8  gs: 0000  ss: 0068
Process swapper (pid: 0, ti=c08fc000 task=c082a180 task.ti=c08fc000)
Stack: c076b870 00000870 000000d4 0000001d c0831e80 c1dce000 00051000 00051000
       c08fdfcc c09053f8 c08fdff8 c09045ff 000001e2 c09040a0 00051000 00000020
       0004e500 c0932140 00020800 00099800 c08ed000 01409007 00000000
Call Trace:
 [<c010517a>] show_trace_log_lvl+0x1a/0x30
 [<c0105246>] show_stack_log_lvl+0xb6/0x100
 [<c0105732>] show_registers+0x212/0x3a0
 [<c0105aa4>] die+0x104/0x220
 [<c0105f5f>] do_general_protection+0x1ef/0x2b0
 [<c06699f2>] error_code+0x72/0x78
 [<c09053f8>] time_init+0x8/0x20
 [<c09045ff>] start_kernel+0x1af/0x320
 [<00000000>] 0x0
 =======================
Code: 31 d2 b8 00 00 09 3d f7 35 2c 70 9b c0 a3 04 95 8f c0 e8 ce 4e 99 ff b8 e0 45 93 c0 e8 94 b1 c5 ff e8 7f 3d 80 ff b9 00 19 00 00 <0f> 32 f6 c4 01 74 07 83 25 24 ce 82 c0 fd 8b 0d 20 ce 82 c0 b8
EIP: [<c09071f6>] tsc_init+0xa6/0x150 SS:ESP 0068:c08fdfa4
Kernel panic - not syncing: Attempted to kill the idle task!

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17 20:15:37 +02:00
Ingo Molnar 3fb450a327 x86: enable NMI watchdog on nosmp
if nosmp has been passed as a boot option, but nmi_watchdog=2 has also
been enabled then keep minimal local APIC functionality around to make
the watchdog work.

this allowed me to debug a hard hang that would only occur with a nosmp
bootup.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17 20:15:35 +02:00
Andi Kleen 2f62c94176 x86_64: Fix compat emulation of PTRACE_GET/SET_THREAD_AREA
Since the 64bit kernel has different indexes for this TLS segments
the address needs to be adjusted in the ptrace 32bit emulation.

[ tglx: arch/x86 adaptation ]

Reported-by: Amnon Shiloh
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17 20:15:34 +02:00
Fengguang Wu f68fd5f480 x86: call free_init_pages() with irqs enabled in alternative_instructions()
In alternative_instructions(), call free_init_pages() with irqs enabled.

It fixes the warning message in smp_call_function*(), which should not be
called with irqs disabled.

[    0.310000] CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
[    0.310000] CPU: L2 Cache: 512K (64 bytes/line)
[    0.310000] CPU 0/0 -> Node 0
[    0.310000] SMP alternatives: switching to UP code
[    0.310000] Freeing SMP alternatives: 25k freed
[    0.310000] WARNING: at arch/x86_64/kernel/smp.c:397 smp_call_function_mask()
[    0.310000]
[    0.310000] Call Trace:
[    0.310000]  [<ffffffff8100dbde>] dump_trace+0x3ee/0x4a0
[    0.310000]  [<ffffffff8100dcd3>] show_trace+0x43/0x70
[    0.310000]  [<ffffffff8100dd15>] dump_stack+0x15/0x20
[    0.310000]  [<ffffffff8101cd44>] smp_call_function_mask+0x94/0xa0
[    0.310000]  [<ffffffff8101d0b2>] smp_call_function+0x32/0x40
[    0.310000]  [<ffffffff8104277f>] on_each_cpu+0x1f/0x50
[    0.310000]  [<ffffffff81026eac>] global_flush_tlb+0x8c/0x110
[    0.310000]  [<ffffffff81025c85>] free_init_pages+0xe5/0xf0
[    0.310000]  [<ffffffff81549b5e>] alternative_instructions+0x7e/0x150
[    0.310000]  [<ffffffff8154a2ea>] check_bugs+0x1a/0x20
[    0.310000]  [<ffffffff81540c4a>] start_kernel+0x2da/0x380
[    0.310000]  [<ffffffff81540132>] _sinittext+0x132/0x140
[    0.310000]
[    0.320000] ACPI: Core revision 20070126
[    0.560000] Using local APIC timer interrupts.
[    0.590000] Detected 62.496 MHz APIC timer.
[    0.590000] Brought up 1 CPUs

[ tglx: arch/x86 adaptation ]

Cc: Laurent Vivier <Laurent.Vivier@bull.net>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17 20:15:33 +02:00
Andi Kleen f891dd18c1 x86: initialize 64bit registers for a.out executables
Previously the data from before the exec was kept in there. Zero
them instead.

[ tglx: arch/x86 adaptation ]

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17 20:15:30 +02:00
Andrey Mirkin 1c5b5cfd29 x86: return correct error code from child_rip in x86_64 entry.S
Right now register edi is just cleared before calling do_exit.
That is wrong because correct return value will be ignored.
Value from rax should be copied to rdi instead of clearing edi.

AK: changed to 32bit move because it's strictly an int

[ tglx: arch/x86 adaptation ]

Signed-off-by: Andrey Mirkin <major@openvz.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17 20:15:29 +02:00
Jan Beulich aa506dc7b1 i386: avoid temporarily inconsistent pte-s
One more of these issues (which were considered fixed a few releases
back): other than on x86-64, i386 allows set_fixmap() to replace
already present mappings. Consequently, on PAE, care must be taken to
not update the high half of a pte while the low half is still holding
the old value.

[ tglx: arch/x86 adaptation ]

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

 arch/x86/mm/pgtable_32.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)
2007-10-17 20:15:28 +02:00
Sam Ravnborg d72b1b4f41 i386: fix section mismatch warning in intel.c
Fix following section mismatch warning:
WARNING: vmlinux.o(.text+0xc88c): Section mismatch: reference to .init.text:trap_init_f00f_bug (between 'init_intel' and 'cpuid4_cache_lookup')

init_intel are __cpuint where trap_init_f00f_bug is __init.
Fixed by declaring trap_init_f00f_bug __cpuinit.

Moved the defintion of trap_init_f00f_bug to the sole user in init.c
so the ugly prototype in intel.c could get killed.

Frank van Maarseveen <frankvm@frankvm.com> supplied the .config used
to reproduce the warning.

[ tglx: arch/x86 adaptation ]

Cc: Frank van Maarseveen <frankvm@frankvm.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17 20:15:26 +02:00
Satyam Sharma 25d1b51677 i386: Fix section mismatch
Fix bugzilla #8679

WARNING: arch/i386/kernel/built-in.o(.data+0x2148): Section mismatch: reference
to .init.text: (between 'thermal_throttle_cpu_notifier' and 'mtrr_mutex')

comes because struct notifier_block thermal_throttle_cpu_notifier in
arch/i386/kernel/cpu/mcheck/therm_throt.c goes in .data section but the
notifier callback function itself has been marked __cpuinit which becomes
__init == .init.text when HOTPLUG_CPU=n.  The warning is bogus because the
callback will never be called out if HOTPLUG_CPU=n in the first place (as
one can see from kernel/cpu.c, the cpu_chain itself is __cpuinitdata :-)

So, let's mark thermal_throttle_cpu_notifier as __cpuinitdata to fix
the section mismatch warning.

[ tglx: arch/x86 adaptation ]

Signed-off-by: Satyam Sharma <satyam@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17 20:15:25 +02:00
Andrew Hastings 801916c1b3 x86: fix off-by-one in find_next_zero_string
Fix an off-by-one error in find_next_zero_string which prevents
allocating the last bit.

[ tglx: arch/x86 adaptation ]

Signed-off-by: Andrew Hastings <abh@cray.com> on behalf of Cray Inc.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17 20:15:22 +02:00
Laurent Vivier 6442eea937 i386: export i386 smp_call_function_mask() to modules
This patch export i386 smp_call_function_mask() with EXPORT_SYMBOL().

This function is needed by KVM to call a function on a set of CPUs.

[ tglx: arch/x86 adaptation ]

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17 20:15:21 +02:00
Roland McGrath f79eb83b3a x86: Install unstripped copy of 64bit vdso to disk
This keeps an unstripped copy of the 64bit vDSO images built before they are
stripped and embedded in the kernel.  The unstripped copies get installed
in $(MODLIB)/vdso/ by "make install" (or you can explicitly use the
subtarget "make vdso_install").  These files can be useful when they
contain source-level debugging information.

[ tglx: arch/x86 adaptation ]

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17 20:15:20 +02:00
Roland McGrath af7e6a7464 x86_64: install unstripped copies of compat vdso on disk
This keeps an unstripped copy of the vDSO images built before they are
stripped and embedded in the kernel.  The unstripped copies get installed
in $(MODLIB)/vdso/ by "make install" (or you can explicitly use the
subtarget "make vdso_install").  These files can be useful when they
contain source-level debugging information.

[ tglx: arch/x86 adaptation ]

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17 20:15:18 +02:00
Adrian Bunk 8957ecab02 i386: setup_trampoline() must be __cpuinit
WARNING: arch/i386/kernel/built-in.o(.text+0xf201): Section mismatch: reference to .init.data:trampoline_end (between 'setup_trampoline' and 'cpu_coregroup_map')
WARNING: arch/i386/kernel/built-in.o(.text+0xf207): Section mismatch: reference to .init.data:trampoline_data (between 'setup_trampoline' and 'cpu_coregroup_map')
WARNING: arch/i386/kernel/built-in.o(.text+0xf21a): Section mismatch: reference to .init.data:trampoline_data (between 'setup_trampoline' and 'cpu_coregroup_map')

Harmless but annoying warnings present when building an i386 SMP kernel
with CONFIG_HOTPLUG_CPU=n and gcc < 4.0 .

[ tglx: arch/x86 adaptation ]

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17 20:15:17 +02:00
Stephane Eranian 0f8e45a288 i386: make Oprofile call shutdown() only once per session
Oprofile: call model->shutdown() only once to avoid calling release_ev*()
multiple times

[ tglx: arch/x86 adaptation ]

Signed-off-by: Stephane Eranian <eranian@hpl.hp.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17 20:15:14 +02:00
Thomas Gleixner 3dfbc88464 x86: C1E late detection fix. Really switch off lapic timer
Doh, I completely missed that devices marked DUMMY are not running
the set_mode function. So we force broadcasting, but we keep the
local APIC timer running.

Let the clock event layer mark the device _after_ switching it off.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-17 20:15:13 +02:00
Linus Torvalds fb9fc39517 Merge branch 'xen-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen
* 'xen-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen:
  xfs: eagerly remove vmap mappings to avoid upsetting Xen
  xen: add some debug output for failed multicalls
  xen: fix incorrect vcpu_register_vcpu_info hypercall argument
  xen: ask the hypervisor how much space it needs reserved
  xen: lock pte pages while pinning/unpinning
  xen: deal with stale cr3 values when unpinning pagetables
  xen: add batch completion callbacks
  xen: yield to IPI target if necessary
  Clean up duplicate includes in arch/i386/xen/
  remove dead code in pgtable_cache_init
  paravirt: clean up lazy mode handling
  paravirt: refactor struct paravirt_ops into smaller pv_*_ops
2007-10-17 11:10:11 -07:00
Thomas Bogendoerfer 0eafaae84e [MIPS] IP22: Fix hang due to messing with timer interrupt handler
As IP22 is now using do_IRQ for timer interrupt, don't mess with
interrupt handler any longer

Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-10-17 18:28:49 +01:00
Atsushi Nemoto 9ee5389c58 [MIPS] Sibyte: Fix typos in sibyte clockevent drivers
Fix some typo introduced on clockevent conversion.

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-10-17 18:28:49 +01:00
Ralf Baechle 820b2d853b [MIPS] Alchemy: replace last remaining instance of au_ffs with ffs.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-10-17 18:28:48 +01:00
Ralf Baechle c30db2480e [MIPS] Alchemy: Reformat PM code.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-10-17 18:28:48 +01:00
Ralf Baechle f3e8d1da38 [MIPS] Alchemy: Fix build by conversion to irq_cpu.c.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-10-17 18:28:48 +01:00
Ralf Baechle fb8dd01422 [MIPS] MTX1: Enable CONFIG_CROSSCOMPILE in defconfig.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-10-17 18:28:48 +01:00
Ralf Baechle b0d4056dd6 [MIPS] Probe for usability of cp0 compare interrupt.
Some processors offer the option of using the interrupt on which
normally the count / compare interrupt would be signaled as a normal
interupt pin.  Previously this required some ugly hackery for each
system which is much easier done by a quick and simple probe.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-10-17 18:28:47 +01:00
Maciej W. Rozycki 60b0d65541 [MIPS] SYNC emulation for MIPS I processors
Userland, including the C library and the dynamic linker, is keen to use
the SYNC instruction, even for "generic" MIPS I binaries these days.
Which makes it less than useful on MIPS I processors.

This change adds the emulation, but as our do_ri() infrastructure was not
really prepared to take yet another instruction, I have rewritten it and
its callees slightly as follows.

Now there is only a single place a possible signal is thrown from.  The
place is at the end of do_ri().  The instruction word is fetched in
do_ri() and passed down to handlers.  The handlers are called in sequence
and return a result that lets the caller decide upon further processing.
If the result is positive, then the handler has picked the instruction,
but a signal should be thrown and the result is the signal number.  If the
result is zero, then the handler has successfully simulated the
instruction.  If the result is negative, then the handler did not handle
the instruction; to make it more obvious the calls do not follow the usual
0/-Exxx result convention they now return -1 instead of -EFAULT.

The calculation of the return EPC is now at the beginning.  The reason is
it is easier to handle it there as emulation callees may modify a register
and an instruction may be located in delay slot of a branch whose result
depends on the register.  It has to be undone if a signal is to be raised,
but it is not a problem as this is the slow-path case, and both actions
are done in single places now rather than the former being scattered
through emulation handlers.

The part of do_cpu() being covered follows the changes to do_ri().

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

---
2007-10-17 18:28:47 +01:00
Ralf Baechle 396a2ae08e [MIPS] Fix modpost warning in raw binary builds.
MODPOST vmlinux.o
WARNING: vmlinux.o(.text+0x478): Section mismatch: reference to .init.text:start_kernel (between '_stext' and 'run_init_process')

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-10-17 18:28:46 +01:00
Linus Torvalds b6257a9036 Merge branch 'for-linus' of git://git.kernel.dk/data/git/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/data/git/linux-2.6-block:
  [SCSI] Remove full sg table memset()
  [SCSI] ide-scsi: remove usage of sg_last()
  Fix loop terminating conditions in fill_sg().
  [BLOCK] Clear sg entry before filling in blk_rq_map_sg()
  IA64: iommu uses sg_next with an invalid sg element
  cciss: disable DMA refetch on Smart Array P600
  swiotlb: fix map_sg failure handling
  SPARC64: fix iommu sg chaining
  [SCSI] ide-scsi: use scsi_sg_count() instead of ->use_sg
2007-10-17 09:08:13 -07:00
Linus Torvalds c548f08a4f Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (24 commits)
  [POWERPC] Fix vmemmap warning in init_64.c
  [POWERPC] Fix 64 bits vDSO DWARF info for CR register
  [POWERPC] Add 1TB workaround for PA6T
  [POWERPC] Enable NO_HZ and high res timers for pseries and ppc64 configs
  [POWERPC] Quieten cache information at boot
  [POWERPC] Quieten clockevent printk
  [POWERPC] Enable SLUB in *_defconfig
  [POWERPC] Fix 1TB segment detection
  [POWERPC] Fix iSeries_hpte_insert prototype
  [POWERPC] Fix copyright symbol
  [POWERPC] ibmebus: Move to of_device and of_platform_driver, match eHCA and eHEA drivers
  [POWERPC] ibmebus: Add device creation and bus probing based on of_device
  [POWERPC] ibmebus: Remove bus match/probe/remove functions
  [POWERPC] Move of_device allocation into of_device.[ch]
  [POWERPC] mpc52xx: device tree changes for FEC and MDIO
  [POWERPC] bestcomm: GenBD task support
  [POWERPC] bestcomm: FEC task support
  [POWERPC] bestcomm: ATA task support
  [POWERPC] bestcomm: core bestcomm support for Freescale MPC5200
  [POWERPC] mpc52xx: Update mpc52xx_psc structure with B revision changes
  ...
2007-10-17 09:05:55 -07:00
Linus Torvalds 5c8e191e84 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-2.6-x86setup
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-2.6-x86setup:
  Remove magic macros for screen_info structure members
  [x86] remove uses of magic macros for boot_params access
2007-10-17 09:00:30 -07:00
Randy Dunlap f00b51654e Update help text for CONFIG_CRASH_DUMP
Fix typos in CONFIG_RELOCATABLE.  Use tab + 2 spaces for indentation on all
lines.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Bernhard Walle <bwalle@suse.de>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Haren Myneni <hbabu@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:43:06 -07:00
Roel Kluin f7a75f0a40 spin_lock_unlocked cleanups
Replace some SPIN_LOCK_UNLOCKED with DEFINE_SPINLOCK

Signed-off-by: Roel Kluin <12o3l@tiscali.nl>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:43:01 -07:00
Maciej W. Rozycki 44a2db43eb lk201: remove obsolete driver
Remove the old-fashioned lk201 driver under drivers/tc/ that used to be
used by the old dz.c and zs.c drivers, which is now orphan code referred to
from nowhere and does not build anymore.  A modern replacement is available
as drivers/input/keyboard/lkkbd.c.

There are no plans to do anything about this piece of code and it does not
fit anywhere anymore, so it is not just a matter of maintenance or the lack
of.  There are still some bits that might be added to the new lkkbd.c
driver based on the old code, and the embedded hardware documentation which
is otherwise quite hard to get hold of might be useful to keep too.  Both
of these can be done separately though.  RIP.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:57 -07:00
Ralf Baechle 622a9edd91 Remove dma_cache_(wback|inv|wback_inv) functions
dma_cache_(wback|inv|wback_inv) were the earliest attempt on a generalized
cache managment API for I/O purposes.  Originally it was basically the raw
MIPS low level cache API exported to the entire world.  The API has
suffered from a lack of documentation, was not very widely used unlike it's
more modern brothers and can easily be replaced by dma_cache_sync.  So
remove it rsp.  turn the surviving bits back into an arch private API, as
discussed on linux-arch.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Paul Mackerras <paulus@samba.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Kyle McMartin <kyle@parisc-linux.org>
Acked-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:57 -07:00
Adrian Bunk cba4fbbff2 remove include/asm-*/ipc.h
All asm/ipc.h files do only #include <asm-generic/ipc.h>.

This patch therefore removes all include/asm-*/ipc.h files and moves the
contents of include/asm-generic/ipc.h to include/linux/ipc.h.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:55 -07:00
Ken'ichi Ohmichi bcbba6c10e add-vmcore: add a prefix "VMCOREINFO_" to the vmcoreinfo macros
Add a prefix "VMCOREINFO_" to the vmcoreinfo macros.  Old vmcoreinfo macros
were defined as generic names SYMBOL/SIZE/OFFSET /LENGTH/CONFIG, and it is
impossible to grep for them.  So these names should be changed.  This
discussion is the following:
http://www.ussg.iu.edu/hypermail/linux/kernel/0709.1/0415.html

Signed-off-by: Ken'ichi Ohmichi <oomichi@mxs.nes.nec.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:54 -07:00
Ken'ichi Ohmichi 00cab92f9e add-vmcore: use the existing ia64_tpa() instead of asm code
Signed-off-by: Ken'ichi Ohmichi <oomichi@mxs.nes.nec.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:54 -07:00
Ken'ichi Ohmichi fd59d231f8 Add vmcoreinfo
This patch set frees the restriction that makedumpfile users should install a
vmlinux file (including the debugging information) into each system.

makedumpfile command is the dump filtering feature for kdump.  It creates a
small dumpfile by filtering unnecessary pages for the analysis.  To
distinguish unnecessary pages, it needs a vmlinux file including the debugging
information.  These days, the debugging package becomes a huge file, and it is
hard to install it into each system.

To solve the problem, kdump developers discussed it at lkml and kexec-ml.  As
the result, we reached the conclusion that necessary information for dump
filtering (called "vmcoreinfo") should be embedded into the first kernel file
and it should be accessed through /proc/vmcore during the second kernel.
(http://www.uwsg.iu.edu/hypermail/linux/kernel/0707.0/1806.html)

Dan Aloni created the patch set for the above implementation.
(http://www.uwsg.iu.edu/hypermail/linux/kernel/0707.1/1053.html)

And I updated it for multi architectures and memory models.
(http://lists.infradead.org/pipermail/kexec/2007-August/000479.html)

Signed-off-by: Dan Aloni <da-x@monatomic.org>
Signed-off-by: Ken'ichi Ohmichi <oomichi@mxs.nes.nec.co.jp>
Signed-off-by: Bernhard Walle <bwalle@suse.de>
Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:54 -07:00
Roland McGrath 5f149cf0ac powerpc: Use linux/elfcore-compat.h
This makes powerpc64's compat code use the new linux/elfcore-compat.h,
reducing some hand-copied duplication.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Andi Kleen <ak@suse.de>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:51 -07:00
Neil Horman 7dc0b22e3c core_pattern: ignore RLIMIT_CORE if core_pattern is a pipe
For some time /proc/sys/kernel/core_pattern has been able to set its output
destination as a pipe, allowing a user space helper to receive and
intellegently process a core.  This infrastructure however has some
shortcommings which can be enhanced.  Specifically:

1) The coredump code in the kernel should ignore RLIMIT_CORE limitation
   when core_pattern is a pipe, since file system resources are not being
   consumed in this case, unless the user application wishes to save the core,
   at which point the app is restricted by usual file system limits and
   restrictions.

2) The core_pattern code should be able to parse and pass options to the
   user space helper as an argv array.  The real core limit of the uid of the
   crashing proces should also be passable to the user space helper (since it
   is overridden to zero when called).

3) Some miscellaneous bugs need to be cleaned up (specifically the
   recognition of a recursive core dump, should the user mode helper itself
   crash.  Also, the core dump code in the kernel should not wait for the user
   mode helper to exit, since the same context is responsible for writing to
   the pipe, and a read of the pipe by the user mode helper will result in a
   deadlock.

This patch:

Remove the check of RLIMIT_CORE if core_pattern is a pipe.  In the event that
core_pattern is a pipe, the entire core will be fed to the user mode helper.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Cc: <martin.pitt@ubuntu.com>
Cc: <wwoods@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:50 -07:00
Olof Johansson 2b571a066a pcmcia: CompactFlash driver for PA Semi Electra boards
Driver for the CompactFlash slot on the PA Semi Electra eval board. It's
a simple device sitting on localbus, with interrupts and detect/voltage
control over GPIO.

The driver is implemented as an of_platform driver, and adds localbus
as a bus being probed by the of_platform framework.

[akpm@linux-foundation.org: cleanups]
[olof@lixom.net: fix build]
Signed-off-by: Olof Johansson <olof@lixom.net>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Milton Miller <miltonm@bga.com>
Cc: Kumar Gala <galak@gate.crashing.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:48 -07:00
Robert P. J. Day 966fe399cc KCONFIG: Make "Instrumentation support" non-EXPERIMENTAL
It makes more sense to make instrumentation support experimental on a
case-by-case basis.

Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Cc: Andi Kleen <ak@suse.de>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:47 -07:00
Paul E. McKenney 3806204ca9 Remove workaround for unimmunized rcu_dereference from mce_log()
Remove the rmb() from mce_log(), since the immunized version of
rcu_dereference() makes it unnecessary.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:46 -07:00
Alexey Dobriyan f6b450d489 Make unregister_binfmt() return void
list_del() hardly can fail, so checking for return value is pointless
(and current code always return 0).

Nobody really cared that return value anyway.

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:46 -07:00
Alexey Dobriyan e4dc1b14d8 Use list_head in binfmt handling
Switch single-linked binfmt formats list to usual list_head's.  This leads
to one-liners in register_binfmt() and unregister_binfmt().  The downside
is one pointer more in struct linux_binfmt.  This is not a problem, since
the set of registered binfmts on typical box is very small -- (ELF +
something distro enabled for you).

Test-booted, played with executable .txt files, modprobe/rmmod binfmt_misc.

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:46 -07:00
David Rientjes 5a3135c2e7 oom: move prototypes to appropriate header file
Move the OOM killer's extern function prototypes to include/linux/oom.h and
include it where necessary.

[clg@fr.ibm.com: build fix]
Cc: Andrea Arcangeli <andrea@suse.de>
Acked-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:45 -07:00
Christoph Lameter 4ba9b9d0ba Slab API: remove useless ctor parameter and reorder parameters
Slab constructors currently have a flags parameter that is never used.  And
the order of the arguments is opposite to other slab functions.  The object
pointer is placed before the kmem_cache pointer.

Convert

        ctor(void *object, struct kmem_cache *s, unsigned long flags)

to

        ctor(struct kmem_cache *s, void *object)

throughout the kernel

[akpm@linux-foundation.org: coupla fixes]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:45 -07:00
Mark Nelson 1f7d6668c2 powerpc: add Altivec/VMX state to coredumps
Update dump_task_altivec() (which has so far never been put to use) so that
it dumps the Altivec/VMX registers (VR[0] - VR[31], VSCR and VRSAVE) in the
same format as the ptrace get_vrregs(), and add the appropriate glue
typedef and #defines to make it work.

A new note type of NT_PPC_VMX was chosen to be 0x100 (arbitrarily) because
it allows the low range values to be used for more generic purposes and
0x100 seems an adequate starting point for PowerPC extensions.

Signed-off-by: Mark Nelson <markn@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <ak@suse.de>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:44 -07:00
Mark Nelson 5b20cd80b4 x86: replace NT_PRXFPREG with ELF_CORE_XFPREG_TYPE #define
Replace NT_PRXFPREG with ELF_CORE_XFPREG_TYPE in the coredump code which
allows for more flexibility in the note type for the state of 'extended
floating point' implementations in coredumps.  New note types can now be
added with an appropriate #define.

This does #define ELF_CORE_XFPREG_TYPE to be NT_PRXFPREG in all
current users so there's are no change in behaviour.

This will let us use different note types on powerpc for the Altivec/VMX
state that some PowerPC cpus have (G4, PPC970, POWER6) and for the SPE
(signal processing extension) state that some embedded PowerPC cpus from
Freescale have.

Signed-off-by: Mark Nelson <markn@au1.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <ak@suse.de>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:44 -07:00
Paul Mackerras 4acadb965c Merge branch 'for-2.6.24' of git://git.secretlab.ca/git/linux-2.6-mpc52xx into merge 2007-10-17 22:31:13 +10:00
Paul Mackerras 5cae826e9e Merge branch 'fixes-2.6.24' of master.kernel.org:/pub/scm/linux/kernel/git/galak/powerpc into merge 2007-10-17 22:30:43 +10:00
Tony Breeds f6b8076910 [POWERPC] Fix vmemmap warning in init_64.c
Use the right printk format to silence the following warning.

  CC      arch/powerpc/mm/init_64.o
arch/powerpc/mm/init_64.c: In function 'vmemmap_populate':
arch/powerpc/mm/init_64.c:243: warning: format '%p' expects type 'void *', but argument 4 has type 'long unsigned int'

Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-10-17 22:30:09 +10:00
Benjamin Herrenschmidt 081c11a5d0 [POWERPC] Fix 64 bits vDSO DWARF info for CR register
The current DWARF info for CR are incorrect, causing the gcc unwinder to
go to lunch if we take a segfault in the vdso.  This fixes it.

Problem identified by Andrew Haley, and fix provided by Jakub Jelinek
(thanks !).

Unfortunately, a bug in gcc cause it to not quite work either, but that
is being fixed separately with something around the lines of:

linux-unwind.h:

     fs->regs.reg[R_CR2].loc.offset = (long) &regs->ccr - new_cfa;
+    /* CR? regs are just 32-bit and PPC is big-endian.  */
+    fs->regs.reg[R_CR2].loc.offset += sizeof (long) - 4;

(According to Jakub)

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-10-17 22:30:09 +10:00
Olof Johansson f66bce5e6a [POWERPC] Add 1TB workaround for PA6T
PA6T has a bug where the slbie instruction does not honor the large
segment bit.  As a result, we have to always use slbia when switching
context.

We don't have to worry about changing the slbie's during fault processing,
since they should never be replacing one VSID with another using the
same ESID.  I.e. there's no risk for inserting duplicate entries due to a
failed slbie of the old entry.  So as long as we clear it out on context
switch we should be fine.

Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-10-17 22:30:09 +10:00
Anton Blanchard 8129535b6b [POWERPC] Enable NO_HZ and high res timers for pseries and ppc64 configs
Enable NO_HZ and high res timers for the ppc64 and pseries defconfigs.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-10-17 22:30:09 +10:00
Anton Blanchard 9697add0f8 [POWERPC] Quieten cache information at boot
After 6 years the ppc64 kernel still thinks its important to tell me my
cache line size is 0x80 bytes. I think most people who care know that by
now. The rest probably cant even understand the hex output.

Since we might have misconfigured firmware or cpus that have a linesize
that isnt 128 bytes, I still print it out for those cases. If people
would prefer to remove it completely, lets do it.

Also for lpar remove the htab_address printout since its not used.

Anton
ppc64 boot log usability expert

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-10-17 22:30:09 +10:00
Anton Blanchard 1281c8bef8 [POWERPC] Quieten clockevent printk
The clockevent bootup message only needs to be KERN_INFO.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-10-17 22:30:09 +10:00
Anton Blanchard 309a109255 [POWERPC] Enable SLUB in *_defconfig
When checking out the new NO_HZ support in powerpc, I noticed we never
slept for more than 2 seconds.  It turns out SLAB has a 2 second per cpu
timer that causes this.

After switching to SLUB I see some nice 4 second sleeps which is the
limit on this POWER6 box (the decrementer ticks at 512MHz):

slept 4.19 sec
slept 4.19 sec
slept 4.19 sec
slept 4.19 sec
slept 3.96 sec
slept 3.80 sec
slept 2.99 sec

Since SLUB is now the default and some powerpc defconfigs already enable
it, lets enable SLUB across the board for consistency.  While doing this
I also noticed that the maple defconfig has SLAB debugging enabled which
is sure to make your box nice and slow.  Fix that too.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-10-17 22:30:08 +10:00
Olof Johansson f5534004e5 [POWERPC] Fix 1TB segment detection
Buglet in the 1TB detection makes it return after checking the first
property word, even if it's not a match.

Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-10-17 22:30:08 +10:00
Stephen Rothwell dc6adfb33f [POWERPC] Fix iSeries_hpte_insert prototype
Commit 1189be6508 ([POWERPC] Use 1TB
segments) added an argument to hpte_insert.

Also make iSeries_hpte_insert static.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-10-17 22:30:08 +10:00
Stephen Rothwell bfce5c3ce5 [POWERPC] Fix copyright symbol
It seems to have been munged by patchwork.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-10-17 22:30:08 +10:00
Joachim Fenkes 6b08f3ae8e [POWERPC] ibmebus: Move to of_device and of_platform_driver, match eHCA and eHEA drivers
Replace struct ibmebus_dev and struct ibmebus_driver with struct of_device
and struct of_platform_driver, respectively.  Match the external ibmebus
interface and drivers using it.

Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Roland Dreier <rolandd@cisco.com>
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-10-17 22:30:08 +10:00
Joachim Fenkes 55347cc996 [POWERPC] ibmebus: Add device creation and bus probing based on of_device
The devtree root is now searched for devices matching a built-in whitelist
during boot, so these devices appear on the bus from the beginning.  It is
still possible to manually add/remove devices to/from the bus by using the
probe/remove sysfs interface.  Also, when a device driver registers itself,
the devtree is matched against its matchlist.

Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-10-17 22:30:08 +10:00
Joachim Fenkes a988c0a627 [POWERPC] ibmebus: Remove bus match/probe/remove functions
Remove old code that will be replaced by rewritten and shorter functions in
the next patch.  Keep struct ibmebus_dev and struct ibmebus_driver for now,
but replace ibmebus_{,un}register_driver() by dummy functions.  This way, the
kernel will still compile and run during the transition and git bisect will
be happy.

Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-10-17 22:30:08 +10:00
Joachim Fenkes fec738dd48 [POWERPC] Move of_device allocation into of_device.[ch]
Extract generic of_device allocation code from of_platform_device_create()
and move it into of_device.[ch], called of_device_alloc(). Also, there's now
of_device_free() which puts the device node.

This way, bus drivers that build on of_platform (like ibmebus will) can
build upon this code instead of reinventing the wheel.

Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2007-10-17 22:30:07 +10:00
David S. Miller 5804509e65 Fix loop terminating conditions in fill_sg().
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-17 13:11:59 +02:00
FUJITA Tomonori bdb02504f4 IA64: iommu uses sg_next with an invalid sg element
sg list elements might not be continuous.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-17 10:51:20 +02:00
FUJITA Tomonori 24c31eede6 SPARC64: fix iommu sg chaining
Commit 2c941a2040 looks incomplete. The
helper functions like prepare_sg() need to support sg chaining too.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-17 09:22:14 +02:00
H. Peter Anvin 3ea3351000 Remove magic macros for screen_info structure members
Stop using magic macros for screen_info structure members.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2007-10-16 22:57:17 -07:00
H. Peter Anvin 30c826451d [x86] remove uses of magic macros for boot_params access
Instead of using magic macros for boot_params access, simply use the
boot_params structure.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2007-10-16 17:38:31 -07:00
Linus Torvalds f563d53c30 Merge ssh://master.kernel.org/pub/scm/linux/kernel/git/sam/kbuild
* ssh://master.kernel.org/pub/scm/linux/kernel/git/sam/kbuild:
  x86: fix boot error introduced by kbuild
2007-10-16 16:51:11 -07:00
Linus Torvalds ce4b61f5e2 Merge branch 'release' of ssh://master.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6
* 'release' of ssh://master.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
  [IA64] Fix build for CONFIG_SMP=n
2007-10-16 16:48:59 -07:00
Domen Puncer b147d93d62 [POWERPC] mpc52xx: device tree changes for FEC and MDIO
Add device tree entries for lite5200b's FEC's PHY.

Signed-off-by: Domen Puncer <domen.puncer@telargo.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2007-10-16 17:10:32 -06:00
Sylvain Munaut 7acb939130 [POWERPC] bestcomm: GenBD task support
This is the microcode for the GenBD task and the associated
support code. This is a generic task that copy data to/from
a hardware FIFO. This is currently locked to 32bits wide
access but could be extended as needed.

The microcode itself comes directly from the offical
API (v2.2)

Signed-off-by: Sylvain Munaut <tnt@246tNt.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2007-10-16 17:09:56 -06:00
Sylvain Munaut ba11c79aba [POWERPC] bestcomm: FEC task support
This is the microcode for the FEC task and the associated
support code.

The microcode itself comes directly from the offical
API (v2.2)

Signed-off-by: Sylvain Munaut <tnt@246tNt.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2007-10-16 17:09:49 -06:00
Sylvain Munaut 9ea68df515 [POWERPC] bestcomm: ATA task support
This is the microcode for the ATA task and the associated
support code.

The microcode itself comes directly from the offical
API (v2.2)

Signed-off-by: Sylvain Munaut <tnt@246tNt.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2007-10-16 17:09:42 -06:00
Sylvain Munaut 2f9ea1bde0 [POWERPC] bestcomm: core bestcomm support for Freescale MPC5200
This patch adds support for the core of the BestComm API
for the Freescale MPC5200(b). The BestComm engine is a
microcode-controlled / tasks-based DMA used by several
of the onchip devices.

Setting up the tasks / memory allocation and all common
low level functions are handled by this patch.
The specifics details of each tasks and their microcode
are split-out in separate patches.

This is not the official API, but a much cleaner one.
(hopefully)

Signed-off-by: Sylvain Munaut <tnt@246tNt.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2007-10-16 17:09:34 -06:00
Sylvain Munaut 1088a20998 [POWERPC] rheap: Changes config mechanism
Instead of having in the makefile all the option that
requires rheap, we define a configuration symbol
and when needed we make sure it's selected.

Signed-off-by: Sylvain Munaut <tnt@246tNt.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2007-10-16 17:09:21 -06:00
Sylvain Munaut d4697af4f3 [POWERPC] exports rheap symbol to modules
Theses can be useful in modules too. So we export them.

Signed-off-by: Sylvain Munaut <tnt@246tNt.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2007-10-16 17:09:02 -06:00
Sam Ravnborg 7eebb93486 x86: fix boot error introduced by kbuild
x86 uses target specific assignment of EXTRA_AFLAGS,
EXTRA_CFLAGS - this caused troubles with
introducing asflags-y, ccflags-y.

Fixed the target specific assignments in arch/x86/boot/Makefile
and auditted the rest of the kernel for similar usage.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2007-10-16 23:50:33 +02:00
Tony Luck 4d1efed540 [IA64] Fix build for CONFIG_SMP=n
d5a7430ddc missed a spot where we
use cpu_sibling_map and cpu_core_map.  These don't exist on a
uni-processor build.  Wrap #ifdef CONFIG_SMP ... #endif around it.

Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-10-16 13:17:22 -07:00
Anton Blanchard e95206ab2c Update PowerPC vmemmap code for 1TB segments
htab_bolt_mapping takes another argument now the 1TB code has been
merged. Update vmemmap_populate to match.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 13:10:58 -07:00
Jeremy Fitzhardinge a122d6230e xen: add some debug output for failed multicalls
Multicalls are expected to never fail, and the normal response to a
failed multicall is very terse.  In the interests of better
debuggability, add some more verbose output.  It may be worth turning
this off once it all seems more tested.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
2007-10-16 11:51:31 -07:00
Jeremy Fitzhardinge e3d2697669 xen: fix incorrect vcpu_register_vcpu_info hypercall argument
The kernel's copy of struct vcpu_register_vcpu_info was out of date,
at best causing the hypercall to fail and the guest kernel to fall
back to the old mechanism, or worse, causing random memory corruption.

[ Stable folks: applies to 2.6.23 ]

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Stable Kernel <stable@kernel.org>
Cc: Morten =?utf-8?q?B=C3=B8geskov?= <xen-users@morten.bogeskov.dk>
Cc: Mark Williamson <mark.williamson@cl.cam.ac.uk>
2007-10-16 11:51:31 -07:00
Jeremy Fitzhardinge fb1d84043c xen: ask the hypervisor how much space it needs reserved
Ask the hypervisor how much space it needs reserved, since 32-on-64
doesn't need any space, and it may change in future.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
2007-10-16 11:51:31 -07:00
Jeremy Fitzhardinge 74260714c5 xen: lock pte pages while pinning/unpinning
When a pagetable is created, it is made globally visible in the rmap
prio tree before it is pinned via arch_dup_mmap(), and remains in the
rmap tree while it is unpinned with arch_exit_mmap().

This means that other CPUs may race with the pinning/unpinning
process, and see a pte between when it gets marked RO and actually
pinned, causing any pte updates to fail with write-protect faults.

As a result, all pte pages must be properly locked, and only unlocked
once the pinning/unpinning process has finished.

In order to avoid taking spinlocks for the whole pagetable - which may
overflow the PREEMPT_BITS portion of preempt counter - it locks and pins
each pte page individually, and then finally pins the whole pagetable.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickens <hugh@veritas.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andi Kleen <ak@suse.de>
Cc: Keir Fraser <keir@xensource.com>
Cc: Jan Beulich <jbeulich@novell.com>
2007-10-16 11:51:30 -07:00
Jeremy Fitzhardinge 9f79991d41 xen: deal with stale cr3 values when unpinning pagetables
When a pagetable is no longer in use, it must be unpinned so that its
pages can be freed.  However, this is only possible if there are no
stray uses of the pagetable.  The code currently deals with all the
usual cases, but there's a rare case where a vcpu is changing cr3, but
is doing so lazily, and the change hasn't actually happened by the time
the pagetable is unpinned, even though it appears to have been completed.

This change adds a second per-cpu cr3 variable - xen_current_cr3 -
which tracks the actual state of the vcpu cr3.  It is only updated once
the actual hypercall to set cr3 has been completed.  Other processors
wishing to unpin a pagetable can check other vcpu's xen_current_cr3
values to see if any cross-cpu IPIs are needed to clean things up.

[ Stable folks: 2.6.23 bugfix ]

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Stable Kernel <stable@kernel.org>
2007-10-16 11:51:30 -07:00
Jeremy Fitzhardinge 91e0c5f3da xen: add batch completion callbacks
This adds a mechanism to register a callback function to be called once
a batch of hypercalls has been issued.  This is typically used to unlock
things which must remain locked until the hypercall has taken place.

[ Stable folks: pre-req for 2.6.23 bugfix "xen: deal with stale cr3
  values when unpinning pagetables" ]

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Stable Kernel <stable@kernel.org>
2007-10-16 11:51:30 -07:00
Jeremy Fitzhardinge f0d7339427 xen: yield to IPI target if necessary
When sending a call-function IPI to a vcpu, yield if the vcpu isn't
running.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
2007-10-16 11:51:30 -07:00
Jesper Juhl d626a1f1cb Clean up duplicate includes in arch/i386/xen/
This patch cleans up duplicate includes in
	arch/i386/xen/

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
2007-10-16 11:51:29 -07:00
Jeremy Fitzhardinge 4f81784774 remove dead code in pgtable_cache_init
The conversion from using a slab cache to quicklist left some residual
dead code.

I note that in the conversion it now always allocates a whole page for
the pgd, rather than the 32 bytes needed for a PAE pgd.  Was this
intended?

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Andi Kleen <ak@suse.de>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
2007-10-16 11:51:29 -07:00
Jeremy Fitzhardinge 8965c1c095 paravirt: clean up lazy mode handling
Currently, the set_lazy_mode pv_op is overloaded with 5 functions:
 1. enter lazy cpu mode
 2. leave lazy cpu mode
 3. enter lazy mmu mode
 4. leave lazy mmu mode
 5. flush pending batched operations

This complicates each paravirt backend, since it needs to deal with
all the possible state transitions, handling flushing, etc. In
particular, flushing is quite distinct from the other 4 functions, and
seems to just cause complication.

This patch removes the set_lazy_mode operation, and adds "enter" and
"leave" lazy mode operations on mmu_ops and cpu_ops.  All the logic
associated with enter and leaving lazy states is now in common code
(basically BUG_ONs to make sure that no mode is current when entering
a lazy mode, and make sure that the mode is current when leaving).
Also, flush is handled in a common way, by simply leaving and
re-entering the lazy mode.

The result is that the Xen, lguest and VMI lazy mode implementations
are much simpler.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Zach Amsden <zach@vmware.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Avi Kivity <avi@qumranet.com>
Cc: Anthony Liguory <aliguori@us.ibm.com>
Cc: "Glauber de Oliveira Costa" <glommer@gmail.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
2007-10-16 11:51:29 -07:00