Commit graph

1550 commits

Author SHA1 Message Date
Benjamin Herrenschmidt 491b98c315 powerpc/pci: Add a platform hook after probe and before resource survey
Some platforms need to perform resource allocation using a custom algorithm
due to HW constraints, or may want to tweak things globally below a host
bridge. For example OPAL support for IODA will need to perform a
resource allocation pass that applies IODA specific segmentation
constraints to MMIO which cannot be done simply using the kernel generic
resource management code.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-25 14:32:53 +11:00
Geoff Thorpe 09c188c4f6 powerpc: Add pgprot_cached_noncoherent()
This adds a pgprot combination required by some cache-enabled IO device
mappings, such as Freescale datapath (QMan and BMan) portals.

Signed-off-by: Geoff Thorpe <geoff@geoffthorpe.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-25 14:32:52 +11:00
Ravi K. Nittala df17f56d8a powerpc/pseries: Cancel RTAS event scan before firmware flash
The RTAS firmware flash update is conducted using an RTAS call that is
serialized by lock_rtas() which uses spin_lock. While the flash is in
progress, rtasd performs scan for any RTAS events that are generated by
the system. rtasd keeps scanning for the RTAS events generated on the
machine. This is performed via workqueue mechanism. The rtas_event_scan()
also uses an RTAS call to scan the events, eventually trying to acquire
the spin_lock before issuing the request.

The flash update takes a while to complete and during this time, any other
RTAS call has to wait. In this case, rtas_event_scan() waits for a long time
on the spin_lock resulting in a soft lockup.

Fix: Just before the flash update is performed, the queued rtas_event_scan()
work item is cancelled from the work queue so that there is no other RTAS
call issued while the flash is in progress. After the flash completes, the
system reboots and the rtas_event_scan() is rescheduled.

Signed-off-by: Suzuki Poulose <suzuki@in.ibm.com>
Signed-off-by: Ravi Nittala <ravi.nittala@in.ibm.com>
Reported-by: Divya Vikas <divya.vikas@in.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-25 14:11:29 +11:00
Jimi Xenidis fac26ad4f9 powerpc/book3e: Add ICSWX/ACOP support to Book3e cores like A2
ICSWX is also used by the A2 processor to access coprocessors,
although not all "chips" that contain A2s have coprocessors.

Signed-off-by: Jimi Xenidis <jimix@pobox.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-25 14:11:28 +11:00
Milton Miller 8d3d589a79 powerpc/pseries: Software invalidatation of TCEs
Some pseries IOMMUs cache TCEs but don't snoop when the TCEs are changed
in memory, hence we need manually invalidate in software.

This adds code to do the invalidate.  It keys off a device tree property
to say where the to do the MMIO for the invalidate and some information
on what the format of the invalidate including some magic routing info.

it_busno get overloaded with this magic routing info and it_index with
the MMIO address for the invalidate command.

This then gets hooked into the building and freeing of TCEs.

This is only useful on bare metal pseries.  pHyp takes care of this when
virtualised.

Based on patch from Milton with cleanups from Mikey.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-25 14:11:26 +11:00
Anton Blanchard 7df1027542 powerpc/time: Optimise decrementer_check_overflow
decrementer_check_overflow is called from arch_local_irq_restore so
we want to make it as light weight as possible. As such, turn
decrementer_check_overflow into an inline function.

To avoid a circular mess of includes, separate out the two components
of struct decrementer_clock and keep the struct clock_event_device
part local to time.c.

The fast path improves from:

arch_local_irq_restore
     0:       mflr    r0
     4:       std     r0,16(r1)
     8:       stdu    r1,-112(r1)
     c:       stb     r3,578(r13)
    10:       cmpdi   cr7,r3,0
    14:       beq-    cr7,24 <.arch_local_irq_restore+0x24>
...
    24:       addi    r1,r1,112
    28:       ld      r0,16(r1)
    2c:       mtlr    r0
    30:       blr

to:

arch_local_irq_restore
    0:       std     r30,-16(r1)
    4:       ld      r30,0(r2)
    8:       stb     r3,578(r13)
    c:       cmpdi   cr7,r3,0
   10:       beq-    cr7,6c <.arch_local_irq_restore+0x6c>
...
   6c:       ld      r30,-16(r1)
   70:       blr

Unfortunately we still setup a local TOC (due to -mminimal-toc). Yet
another sign we should be moving to -mcmodel=medium.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-25 14:11:26 +11:00
Anton Blanchard 37fb9a0231 powerpc/time: Handle wrapping of decrementer
When re-enabling interrupts we have code to handle edge sensitive
decrementers by resetting the decrementer to 1 whenever it is negative.
If interrupts were disabled long enough that the decrementer wrapped to
positive we do nothing. This means interrupts can be delayed for a long
time until it finally goes negative again.

While we hope interrupts are never be disabled long enough for the
decrementer to go positive, we have a very good test team that can
drive any kernel into the ground. The softlockup data we get back
from these fails could be seconds in the future, completely missing
the cause of the lockup.

We already keep track of the timebase of the next event so use that
to work out if we should trigger a decrementer exception.

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: stable@kernel.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-25 14:09:58 +11:00
Jia Hongtao 09cef8bd07 powerpc/85xx: Add lbc suspend support for PM
Power supply for LBC registers is off when system go to deep-sleep state.
We save the values of registers before suspend and restore to registers
after resume.

We removed the last two reservation arrays from struct fsl_lbc_regs for
allocating less memory and minimizing the memcpy size.

Signed-off-by: Jiang Yutang <b14898@freescale.com>
Signed-off-by: Jia Hongtao <B38951@freescale.com>
Signed-off-by: Li Yang <leoli@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-11-24 02:01:40 -06:00
Tejun Heo d88e4cb671 freezer: remove now unused TIF_FREEZE
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: linux-arch@vger.kernel.org
2011-11-21 12:32:25 -08:00
David S. Miller efd0bf97de Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
The forcedeth changes had a conflict with the conversion over
to atomic u64 statistics in net-next.

The libertas cfg.c code had a conflict with the bss reference
counting fix by John Linville in net-next.

Conflicts:
	drivers/net/ethernet/nvidia/forcedeth.c
	drivers/net/wireless/libertas/cfg.c
2011-11-21 13:50:33 -05:00
Linus Torvalds a4cc3889f7 Merge branch 'kvm-updates/3.2' of git://git.kernel.org/pub/scm/virt/kvm/kvm
* 'kvm-updates/3.2' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM guest: prevent tracing recursion with kvmclock
  Revert "KVM: PPC: Add support for explicit HIOR setting"
  KVM: VMX: Check for automatic switch msr table overflow
  KVM: VMX: Add support for guest/host-only profiling
  KVM: VMX: add support for switching of PERF_GLOBAL_CTRL
  KVM: s390: announce SYNC_MMU
  KVM: s390: Fix tprot locking
  KVM: s390: handle SIGP sense running intercepts
  KVM: s390: Fix RUNNING flag misinterpretation
2011-11-20 14:57:43 -08:00
John W. Linville e11c259f74 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem
Conflicts:
	include/net/bluetooth/bluetooth.h
2011-11-17 13:11:43 -05:00
Alexander Graf bb75c627fb Revert "KVM: PPC: Add support for explicit HIOR setting"
This reverts commit a15bd354f0.

It exceeded the padding on the SREGS struct, rendering the ABI
backwards-incompatible.

Conflicts:

	arch/powerpc/kvm/powerpc.c
	include/linux/kvm.h

Signed-off-by: Avi Kivity <avi@redhat.com>
2011-11-17 16:30:25 +02:00
Benjamin Herrenschmidt b97021f855 powerpc: Fix atomic_xxx_return barrier semantics
The Documentation/memory-barriers.txt document requires that atomic
operations that return a value act as a memory barrier both before
and after the actual atomic operation.

Our current implementation doesn't guarantee this. More specifically,
while a load following the isync can not be issued before stwcx. has
completed, that completion doesn't architecturally means that the
result of stwcx. is visible to other processors (or any previous stores
for that matter) (typically, the other processors L1 caches can still
hold the old value).

This has caused an actual crash in RCU torture testing on Power 7

This fixes it by changing those atomic ops to use new macros instead
of RELEASE/ACQUIRE barriers, called ATOMIC_ENTRY and ATMOIC_EXIT barriers,
which are then defined respectively to lwsync and sync.

I haven't had a chance to measure the performance impact (or rather
what I measured with kernel compiles is in the noise, I yet have to
find a more precise benchmark)

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-11-17 16:26:07 +11:00
Kumar Gala 187b9f2aa7 powerpc/book3e-64: Fix debug support for userspace
With the introduction of CONFIG_PPC_ADV_DEBUG_REGS user space debug is
broken on Book-E 64-bit parts that support delayed debug events.  When
switch_booke_debug_regs() sets DBCR0 we'll start getting debug events as
MSR_DE is also set and we aren't able to handle debug events from kernel
space.

We can remove the hack that always enables MSR_DE and loads up DBCR0 and
just utilize switch_booke_debug_regs() to get user space debug working
again.

We still need to handle critical/debug exception stacks & proper
save/restore of state for those exception levles to support debug events
from kernel space like we have on 32-bit.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-17 16:26:07 +11:00
Anton Blanchard d715e433b7 powerpc: Copy down exception vectors after feature fixups
kdump fails because we try to execute an HV only instruction. Feature
fixups are being applied after we copy the exception vectors down to 0
so they miss out on any updates.

We have always had this issue but it only became critical in v3.0
when we added CFAR support (breaks POWER5) and v3.1 when we added
POWERNV (breaks everyone).

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org> [v3.0+]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-16 14:47:54 +11:00
Johannes Berg 6e3e939f3b net: add wireless TX status socket option
The 802.1X EAPOL handshake hostapd does requires
knowing whether the frame was ack'ed by the peer.
Currently, we fudge this pretty badly by not even
transmitting the frame as a normal data frame but
injecting it with radiotap and getting the status
out of radiotap monitor as well. This is rather
complex, confuses users (mon.wlan0 presence) and
doesn't work with all hardware.

To get rid of that hack, introduce a real wifi TX
status option for data frame transmissions.

This works similar to the existing TX timestamping
in that it reflects the SKB back to the socket's
error queue with a SCM_WIFI_STATUS cmsg that has
an int indicating ACK status (0/1).

Since it is possible that at some point we will
want to have TX timestamping and wifi status in a
single errqueue SKB (there's little point in not
doing that), redefine SO_EE_ORIGIN_TIMESTAMPING
to SO_EE_ORIGIN_TXSTATUS which can collect more
than just the timestamp; keep the old constant
as an alias of course. Currently the internal APIs
don't make that possible, but it wouldn't be hard
to split them up in a way that makes it possible.

Thanks to Neil Horman for helping me figure out
the functions that add the control messages.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-11-09 16:01:02 -05:00
Geoff Levand 9fce85f7ff powerpc/ps3: Fix lv1_gpu_attribute hcall
The lv1_gpu_attribute hcall takes three, not five input
arguments.  Adjust the lv1 hcall table and all calls.

Signed-off-by: Geoff Levand <geoff@infradead.org>
CC: Takashi Iwai <tiwai@suse.de>
Acked-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-08 14:51:59 +11:00
Yong Zhang a3a9f3b47d powerpc/irq: Remove IRQF_DISABLED
Since commit [e58aa3d2: genirq: Run irq handlers with interrupts disabled],
We run all interrupt handlers with interrupts disabled
and we even check and yell when an interrupt handler
returns with interrupts enabled (see commit [b738a50a:
genirq: Warn when handler enables interrupts]).

So now this flag is a NOOP and can be removed.

Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-08 14:51:46 +11:00
Linus Torvalds 32aaeffbd4 Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux
* 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits)
  Revert "tracing: Include module.h in define_trace.h"
  irq: don't put module.h into irq.h for tracking irqgen modules.
  bluetooth: macroize two small inlines to avoid module.h
  ip_vs.h: fix implicit use of module_get/module_put from module.h
  nf_conntrack.h: fix up fallout from implicit moduleparam.h presence
  include: replace linux/module.h with "struct module" wherever possible
  include: convert various register fcns to macros to avoid include chaining
  crypto.h: remove unused crypto_tfm_alg_modname() inline
  uwb.h: fix implicit use of asm/page.h for PAGE_SIZE
  pm_runtime.h: explicitly requires notifier.h
  linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h
  miscdevice.h: fix up implicit use of lists and types
  stop_machine.h: fix implicit use of smp.h for smp_processor_id
  of: fix implicit use of errno.h in include/linux/of.h
  of_platform.h: delete needless include <linux/module.h>
  acpi: remove module.h include from platform/aclinux.h
  miscdevice.h: delete unnecessary inclusion of module.h
  device_cgroup.h: delete needless include <linux/module.h>
  net: sch_generic remove redundant use of <linux/module.h>
  net: inet_timewait_sock doesnt need <linux/module.h>
  ...

Fix up trivial conflicts (other header files, and  removal of the ab3550 mfd driver) in
 - drivers/media/dvb/frontends/dibx000_common.c
 - drivers/media/video/{mt9m111.c,ov6650.c}
 - drivers/mfd/ab3550-core.c
 - include/linux/dmaengine.h
2011-11-06 19:44:47 -08:00
Linus Torvalds 1197ab2942 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (106 commits)
  powerpc/p3060qds: Add support for P3060QDS board
  powerpc/83xx: Add shutdown request support to MCU handling on MPC8349 MITX
  powerpc/85xx: Make kexec to interate over online cpus
  powerpc/fsl_booke: Fix comment in head_fsl_booke.S
  powerpc/85xx: issue 15 EOI after core reset for FSL CoreNet devices
  powerpc/8xxx: Fix interrupt handling in MPC8xxx GPIO driver
  powerpc/85xx: Add 'fsl,pq3-gpio' compatiable for GPIO driver
  powerpc/86xx: Correct Gianfar support for GE boards
  powerpc/cpm: Clear muram before it is in use.
  drivers/virt: add ioctl for 32-bit compat on 64-bit to fsl-hv-manager
  powerpc/fsl_msi: add support for "msi-address-64" property
  powerpc/85xx: Setup secondary cores PIR with hard SMP id
  powerpc/fsl-booke: Fix settlbcam for 64-bit
  powerpc/85xx: Adding DCSR node to dtsi device trees
  powerpc/85xx: clean up FPGA device tree nodes for Freecsale QorIQ boards
  powerpc/85xx: fix PHYS_64BIT selection for P1022DS
  powerpc/fsl-booke: Fix setup_initial_memory_limit to not blindly map
  powerpc: respect mem= setting for early memory limit setup
  powerpc: Update corenet64_smp_defconfig
  powerpc: Update mpc85xx/corenet 32-bit defconfigs
  ...

Fix up trivial conflicts in:
 - arch/powerpc/configs/40x/hcu4_defconfig
	removed stale file, edited elsewhere
 - arch/powerpc/include/asm/udbg.h, arch/powerpc/kernel/udbg.c:
	added opal and gelic drivers vs added ePAPR driver
 - drivers/tty/serial/8250.c
	moved UPIO_TSI to powerpc vs removed UPIO_DWAPB support
2011-11-06 17:12:03 -08:00
Christopher Yeoh fcf634098c Cross Memory Attach
The basic idea behind cross memory attach is to allow MPI programs doing
intra-node communication to do a single copy of the message rather than a
double copy of the message via shared memory.

The following patch attempts to achieve this by allowing a destination
process, given an address and size from a source process, to copy memory
directly from the source process into its own address space via a system
call.  There is also a symmetrical ability to copy from the current
process's address space into a destination process's address space.

- Use of /proc/pid/mem has been considered, but there are issues with
  using it:
  - Does not allow for specifying iovecs for both src and dest, assuming
    preadv or pwritev was implemented either the area read from or
  written to would need to be contiguous.
  - Currently mem_read allows only processes who are currently
  ptrace'ing the target and are still able to ptrace the target to read
  from the target. This check could possibly be moved to the open call,
  but its not clear exactly what race this restriction is stopping
  (reason  appears to have been lost)
  - Having to send the fd of /proc/self/mem via SCM_RIGHTS on unix
  domain socket is a bit ugly from a userspace point of view,
  especially when you may have hundreds if not (eventually) thousands
  of processes  that all need to do this with each other
  - Doesn't allow for some future use of the interface we would like to
  consider adding in the future (see below)
  - Interestingly reading from /proc/pid/mem currently actually
  involves two copies! (But this could be fixed pretty easily)

As mentioned previously use of vmsplice instead was considered, but has
problems.  Since you need the reader and writer working co-operatively if
the pipe is not drained then you block.  Which requires some wrapping to
do non blocking on the send side or polling on the receive.  In all to all
communication it requires ordering otherwise you can deadlock.  And in the
example of many MPI tasks writing to one MPI task vmsplice serialises the
copying.

There are some cases of MPI collectives where even a single copy interface
does not get us the performance gain we could.  For example in an
MPI_Reduce rather than copy the data from the source we would like to
instead use it directly in a mathops (say the reduce is doing a sum) as
this would save us doing a copy.  We don't need to keep a copy of the data
from the source.  I haven't implemented this, but I think this interface
could in the future do all this through the use of the flags - eg could
specify the math operation and type and the kernel rather than just
copying the data would apply the specified operation between the source
and destination and store it in the destination.

Although we don't have a "second user" of the interface (though I've had
some nibbles from people who may be interested in using it for intra
process messaging which is not MPI).  This interface is something which
hardware vendors are already doing for their custom drivers to implement
fast local communication.  And so in addition to this being useful for
OpenMPI it would mean the driver maintainers don't have to fix things up
when the mm changes.

There was some discussion about how much faster a true zero copy would
go. Here's a link back to the email with some testing I did on that:

http://marc.info/?l=linux-mm&m=130105930902915&w=2

There is a basic man page for the proposed interface here:

http://ozlabs.org/~cyeoh/cma/process_vm_readv.txt

This has been implemented for x86 and powerpc, other architecture should
mainly (I think) just need to add syscall numbers for the process_vm_readv
and process_vm_writev. There are 32 bit compatibility versions for
64-bit kernels.

For arch maintainers there are some simple tests to be able to quickly
verify that the syscalls are working correctly here:

http://ozlabs.org/~cyeoh/cma/cma-test-20110718.tgz

Signed-off-by: Chris Yeoh <yeohc@au1.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Cc: <linux-man@vger.kernel.org>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-10-31 17:30:44 -07:00
Paul Gortmaker e415372acc powerpc: fix implicit use of mutex.h by include/asm/spu.h
We've been getting the header implicitly via module.h in the past
but when we clean that up, we'll get this failure:

  CC      arch/powerpc/platforms/cell/beat_spu_priv1.o
In file included from arch/powerpc/platforms/cell/beat_spu_priv1.c:22:
arch/powerpc/include/asm/spu.h:190: error: field 'list_mutex' has incomplete type
make[2]: *** [arch/powerpc/platforms/cell/beat_spu_priv1.o] Error 1

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:30:42 -04:00
Paul Gortmaker 9308794884 powerpc: include export.h for files using EXPORT_SYMBOL/THIS_MODULE
Fix failures in powerpc associated with the previously allowed
implicit module.h presence that now lead to things like this:

arch/powerpc/mm/mmu_context_hash32.c:76:1: error: type defaults to 'int' in declaration of 'EXPORT_SYMBOL_GPL'
arch/powerpc/mm/tlb_hash32.c:48:1: error: type defaults to 'int' in declaration of 'EXPORT_SYMBOL'
arch/powerpc/kernel/pci_32.c:51:1: error: type defaults to 'int' in declaration of 'EXPORT_SYMBOL_GPL'
arch/powerpc/kernel/iomap.c:36:1: error: type defaults to 'int' in declaration of 'EXPORT_SYMBOL'
arch/powerpc/platforms/44x/canyonlands.c:126:1: error: type defaults to 'int' in declaration of 'EXPORT_SYMBOL'
arch/powerpc/kvm/44x.c:168:59: error: 'THIS_MODULE' undeclared (first use in this function)

[with several contibutions from Stephen Rothwell <sfr@canb.auug.org.au>]

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:30:38 -04:00
Paul Gortmaker 66b15db69c powerpc: add export.h to files making use of EXPORT_SYMBOL
With module.h being implicitly everywhere via device.h, the absence
of explicitly including something for EXPORT_SYMBOL went unnoticed.
Since we are heading to fix things up and clean module.h from the
device.h file, we need to explicitly include these files now.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:30:37 -04:00
Linus Torvalds 1bc87b0055 Merge branch 'kvm-updates/3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
* 'kvm-updates/3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm: (75 commits)
  KVM: SVM: Keep intercepting task switching with NPT enabled
  KVM: s390: implement sigp external call
  KVM: s390: fix register setting
  KVM: s390: fix return value of kvm_arch_init_vm
  KVM: s390: check cpu_id prior to using it
  KVM: emulate lapic tsc deadline timer for guest
  x86: TSC deadline definitions
  KVM: Fix simultaneous NMIs
  KVM: x86 emulator: convert push %sreg/pop %sreg to direct decode
  KVM: x86 emulator: switch lds/les/lss/lfs/lgs to direct decode
  KVM: x86 emulator: streamline decode of segment registers
  KVM: x86 emulator: simplify OpMem64 decode
  KVM: x86 emulator: switch src decode to decode_operand()
  KVM: x86 emulator: qualify OpReg inhibit_byte_regs hack
  KVM: x86 emulator: switch OpImmUByte decode to decode_imm()
  KVM: x86 emulator: free up some flag bits near src, dst
  KVM: x86 emulator: switch src2 to generic decode_operand()
  KVM: x86 emulator: expand decode flags to 64 bits
  KVM: x86 emulator: split dst decode to a generic decode_operand()
  KVM: x86 emulator: move memop, memopp into emulation context
  ...
2011-10-30 15:36:45 -07:00
Linus Torvalds f362f98e7c Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/vfs-queue
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/vfs-queue: (21 commits)
  leases: fix write-open/read-lease race
  nfs: drop unnecessary locking in llseek
  ext4: replace cut'n'pasted llseek code with generic_file_llseek_size
  vfs: add generic_file_llseek_size
  vfs: do (nearly) lockless generic_file_llseek
  direct-io: merge direct_io_walker into __blockdev_direct_IO
  direct-io: inline the complete submission path
  direct-io: separate map_bh from dio
  direct-io: use a slab cache for struct dio
  direct-io: rearrange fields in dio/dio_submit to avoid holes
  direct-io: fix a wrong comment
  direct-io: separate fields only used in the submission path from struct dio
  vfs: fix spinning prevention in prune_icache_sb
  vfs: add a comment to inode_permission()
  vfs: pass all mask flags check_acl and posix_acl_permission
  vfs: add hex format for MAY_* flag values
  vfs: indicate that the permission functions take all the MAY_* flags
  compat: sync compat_stats with statfs.
  vfs: add "device" tag to /proc/self/mountstats
  cleanup: vfs: small comment fix for block_invalidatepage
  ...

Fix up trivial conflict in fs/gfs2/file.c (llseek changes)
2011-10-28 10:49:34 -07:00
Eric W. Biederman 1448c721e4 compat: sync compat_stats with statfs.
This was found by inspection while tracking a similar
bug in compat_statfs64, that has been fixed in mainline
since decemeber.

- This fixes a bug where not all of the f_spare fields
  were cleared on mips and s390.
- Add the f_flags field to struct compat_statfs
- Copy f_flags to userspace in case someone cares.
- Use __clear_user to copy the f_spare field to userspace
  to ensure that all of the elements of f_spare are cleared.
  On some architectures f_spare is has 5 ints and on some
  architectures f_spare only has 4 ints.  Which makes
  the previous technique of clearing each int individually
  broken.

I don't expect anyone actually uses the old statfs system
call anymore but if they do let them benefit from having
the compat and the native version working the same.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-28 14:58:53 +02:00
Linus Torvalds efb8d21b2c Merge branch 'tty-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
* 'tty-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (79 commits)
  TTY: serial_core: Fix crash if DCD drop during suspend
  tty/serial: atmel_serial: bootconsole removed from auto-enumerates
  Revert "TTY: call tty_driver_lookup_tty unconditionally"
  tty/serial: atmel_serial: add device tree support
  tty/serial: atmel_serial: auto-enumerate ports
  tty/serial: atmel_serial: whitespace and braces modifications
  tty/serial: atmel_serial: change platform_data variable name
  tty/serial: RS485 bindings for device tree
  TTY: call tty_driver_lookup_tty unconditionally
  TTY: pty, release tty in all ptmx_open fail paths
  TTY: make tty_add_file non-failing
  TTY: drop driver reference in tty_open fail path
  8250_pci: Fix kernel panic when pch_uart is disabled
  h8300: drivers/serial/Kconfig was moved
  parport_pc: release IO region properly if unsupported ITE887x card is found
  tty: Support compat_ioctl get/set termios_locked
  hvc_console: display printk messages on console.
  TTY: snyclinkmp: forever loop in tx_load_dma_buffer()
  tty/n_gsm: avoid fifo overflow in gsm_dlci_data_output
  tty/n_gsm: fix a bug in gsm_dlci_data_output (adaption = 2 case)
  ...

Fix up Conflicts in:
 - drivers/tty/serial/8250_pci.c
	Trivial conflict with removed duplicate device ID
 - drivers/tty/serial/atmel_serial.c
	Annoying silly conflict between "specify the port num via
	platform_data" and other changes to atmel_console_init
2011-10-26 15:11:09 +02:00
Kumar Gala 37caf9f2a1 powerpc/fsl-booke: Handle L1 D-cache parity error correctly on e500mc
If the L1 D-Cache is in write shadow mode the HW will auto-recover the
error.  However we might still log the error and cause a machine check
(if L1CSR0[CPE] - Cache error checking enable).  We should only treat
the non-write shadow case as non-recoverable.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-10-06 23:36:55 -05:00
Paul Bolle 395cf9691d doc: fix broken references
There are numerous broken references to Documentation files (in other
Documentation files, in comments, etc.). These broken references are
caused by typo's in the references, and by renames or removals of the
Documentation files. Some broken references are simply odd.

Fix these broken references, sometimes by dropping the irrelevant text
they were part of.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-09-27 18:08:04 +02:00
Paul Mackerras 19ccb76a19 KVM: PPC: Implement H_CEDE hcall for book3s_hv in real-mode code
With a KVM guest operating in SMT4 mode (i.e. 4 hardware threads per
core), whenever a CPU goes idle, we have to pull all the other
hardware threads in the core out of the guest, because the H_CEDE
hcall is handled in the kernel.  This is inefficient.

This adds code to book3s_hv_rmhandlers.S to handle the H_CEDE hcall
in real mode.  When a guest vcpu does an H_CEDE hcall, we now only
exit to the kernel if all the other vcpus in the same core are also
idle.  Otherwise we mark this vcpu as napping, save state that could
be lost in nap mode (mainly GPRs and FPRs), and execute the nap
instruction.  When the thread wakes up, because of a decrementer or
external interrupt, we come back in at kvm_start_guest (from the
system reset interrupt vector), find the `napping' flag set in the
paca, and go to the resume path.

This has some other ramifications.  First, when starting a core, we
now start all the threads, both those that are immediately runnable and
those that are idle.  This is so that we don't have to pull all the
threads out of the guest when an idle thread gets a decrementer interrupt
and wants to start running.  In fact the idle threads will all start
with the H_CEDE hcall returning; being idle they will just do another
H_CEDE immediately and go to nap mode.

This required some changes to kvmppc_run_core() and kvmppc_run_vcpu().
These functions have been restructured to make them simpler and clearer.
We introduce a level of indirection in the wait queue that gets woken
when external and decrementer interrupts get generated for a vcpu, so
that we can have the 4 vcpus in a vcore using the same wait queue.
We need this because the 4 vcpus are being handled by one thread.

Secondly, when we need to exit from the guest to the kernel, we now
have to generate an IPI for any napping threads, because an HDEC
interrupt doesn't wake up a napping thread.

Thirdly, we now need to be able to handle virtual external interrupts
and decrementer interrupts becoming pending while a thread is napping,
and deliver those interrupts to the guest when the thread wakes.
This is done in kvmppc_cede_reentry, just before fast_guest_return.

Finally, since we are not using the generic kvm_vcpu_block for book3s_hv,
and hence not calling kvm_arch_vcpu_runnable, we can remove the #ifdef
from kvm_arch_vcpu_runnable.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-09-25 19:52:30 +03:00
Paul Mackerras 0214394760 KVM: PPC: book3s_pr: Simplify transitions between virtual and real mode
This simplifies the way that the book3s_pr makes the transition to
real mode when entering the guest.  We now call kvmppc_entry_trampoline
(renamed from kvmppc_rmcall) in the base kernel using a normal function
call instead of doing an indirect call through a pointer in the vcpu.
If kvm is a module, the module loader takes care of generating a
trampoline as it does for other calls to functions outside the module.

kvmppc_entry_trampoline then disables interrupts and jumps to
kvmppc_handler_trampoline_enter in real mode using an rfi[d].
That then uses the link register as the address to return to
(potentially in module space) when the guest exits.

This also simplifies the way that we call the Linux interrupt handler
when we exit the guest due to an external, decrementer or performance
monitor interrupt.  Instead of turning on the MMU, then deciding that
we need to call the Linux handler and turning the MMU back off again,
we now go straight to the handler at the point where we would turn the
MMU on.  The handler will then return to the virtual-mode code
(potentially in the module).

Along the way, this moves the setting and clearing of the HID5 DCBZ32
bit into real-mode interrupts-off code, and also makes sure that
we clear the MSR[RI] bit before loading values into SRR0/1.

The net result is that we no longer need any code addresses to be
stored in vcpu->arch.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-09-25 19:52:29 +03:00
Alexander Graf af8f38b349 KVM: PPC: Add sanity checking to vcpu_run
There are multiple features in PowerPC KVM that can now be enabled
depending on the user's wishes. Some of the combinations don't make
sense or don't work though.

So this patch adds a way to check if the executing environment would
actually be able to run the guest properly. It also adds sanity
checks if PVR is set (should always be true given the current code
flow), if PAPR is only used with book3s_64 where it works and that
HV KVM is only used in PAPR mode.

Signed-off-by: Alexander Graf <agraf@suse.de>
2011-09-25 19:52:27 +03:00
Alexander Graf 0254f07429 KVM: PPC: Add PAPR hypercall code for PR mode
When running a PAPR guest, we need to handle a few hypercalls in kernel space,
most prominently the page table invalidation (to sync the shadows).

So this patch adds handling for a few PAPR hypercalls to PR mode KVM. I tried
to share the code with HV mode, but it ended up being a lot easier this way
around, as the two differ too much in those details.

Signed-off-by: Alexander Graf <agraf@suse.de>

---

v1 -> v2:

  - whitespace fix
2011-09-25 19:52:24 +03:00
Alexander Graf a15bd354f0 KVM: PPC: Add support for explicit HIOR setting
Until now, we always set HIOR based on the PVR, but this is just wrong.
Instead, we should be setting HIOR explicitly, so user space can decide
what the initial HIOR value is - just like on real hardware.

We keep the old PVR based way around for backwards compatibility, but
once user space uses the SREGS based method, we drop the PVR logic.

Signed-off-by: Alexander Graf <agraf@suse.de>
2011-09-25 19:52:23 +03:00
Alexander Graf 9432ba6015 KVM: PPC: Add papr_enabled flag
When running a PAPR guest, some things change. The privilege level drops
from hypervisor to supervisor, SDR1 gets treated differently and we interpret
hypercalls. For bisectability sake, add the flag now, but only enable it when
all the support code is there.

Signed-off-by: Alexander Graf <agraf@suse.de>
2011-09-25 19:52:19 +03:00
Alexander Graf db507c300e KVM: PPC: move compute_tlbie_rb to book3s common header
We need the compute_tlbie_rb in _pr and _hv implementations for papr
soon, so let's move it over to a common header file that both
implementations can leverage.

Signed-off-by: Alexander Graf <agraf@suse.de>
2011-09-25 19:52:18 +03:00
Benjamin Herrenschmidt ed79ba9e15 powerpc/powernv: Machine check and other system interrupts
OPAL can handle various interrupt for us such as Machine Checks (it
performs all sorts of recovery tasks and passes back control to us with
informations about the error), Hardware Management Interrupts and Softpatch
interrupts.

This wires up the mechanisms and prints out specific informations returned
by HAL when a machine check occurs.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20 16:10:03 +10:00
Benjamin Herrenschmidt 5c7c1e9444 powerpc/powernv: Add OPAL ICS backend
OPAL handles HW access to the various ICS or equivalent chips
for us (with the exception of p5ioc2 based HEA which uses a

different backend) similarily to what RTAS does on pSeries.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20 16:09:59 +10:00
Benjamin Herrenschmidt 628daa8d5a powerpc/powernv: Add RTC and NVRAM support plus RTAS fallbacks
Implements OPAL RTC and NVRAM support and wire all that up to
the powernv platform.

We use RTAS for RTC as a fallback if available. Using RTAS for nvram
is not supported yet, pending some rework/cleanup and generalization
of the pSeries & CHRP code. We also use RTAS fallbacks for power off
and reboot

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20 16:09:57 +10:00
Benjamin Herrenschmidt daea1175a9 powerpc/powernv: Support for OPAL console
This adds a udbg and an hvc console backend for supporting a console
using the OPAL console interfaces.

On OPAL v1 we have hvc0 mapped to whatever console the system was
configured for (network or hvsi serial port) via the service
processor.

On OPAL v2 we have hvcN mapped to the Nth console provided by OPAL
which generally corresponds to:

	hvc0 : network console (raw protocol)
	hvc1 : serial port S1 (hvsi)
	hvc2 : serial port S2 (hvsi)

Note: At this point, early debug console only works with OPAL v1
and shouldn't be enabled in a normal kernel.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20 16:09:54 +10:00
Benjamin Herrenschmidt 14a43e69ed powerpc/powernv: Basic support for OPAL
Add definition of OPAL interfaces along with  the wrappers to call
into OPAL runtime and the early device-tree parsing hook to locate
the OPAL runtime firmware.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20 16:09:50 +10:00
Benjamin Herrenschmidt 27f4488872 powerpc/powernv: Add OPAL takeover from PowerVM
On machines supporting the OPAL firmware version 1, the system
is initially booted under pHyp. We then use a special hypercall
to verify if OPAL is available and if it is, we then trigger
a "takeover" which disables pHyp and loads the OPAL runtime
firmware, giving control to the kernel in hypervisor mode.

This patch add the necessary code to detect that the OPAL takeover
capability is present when running under PowerVM (aka pHyp) and
perform said takeover to get hypervisor control of the processor.

To perform the takeover, we must first use RTAS (within Open
Firmware runtime environment) to start all processors & threads,
in order to give control to OPAL on all of them. We then call
the takeover hypercall on everybody, OPAL will re-enter the kernel
main entry point passing it a flat device-tree.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20 16:09:47 +10:00
Benjamin Herrenschmidt fb82b83970 powerpc/smp: More generic support for "soft hotplug"
This adds more generic support for doing CPU hotplug with a simple
idle loop and no actual reset of the processors. The generic
smp_generic_kick_cpu() does the hotplug bringup trick if the PACA
shows that the CPU has already been started at boot and we provide
an accessor for the CPU state.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20 15:53:24 +10:00
Anton Blanchard a11940978b powerpc: Fix oops when echoing bad values to /sys/devices/system/memory/probe
If we echo an address the hypervisor doesn't like to
/sys/devices/system/memory/probe we oops the box:

# echo 0x10000000000 > /sys/devices/system/memory/probe

kernel BUG at arch/powerpc/mm/hash_utils_64.c:541!

The backtrace is:

create_section_mapping
arch_add_memory
add_memory
memory_probe_store
sysdev_class_store
sysfs_write_file
vfs_write
SyS_write

In create_section_mapping we BUG if htab_bolt_mapping returned
an error. A better approach is to return an error which will
propagate back to userspace.

Rerunning the test with this patch applied:

# echo 0x10000000000 > /sys/devices/system/memory/probe
-bash: echo: write error: Invalid argument

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: stable@kernel.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20 15:53:23 +10:00
Anton Blanchard e377bc5d49 powerpc/numa: Remove duplicate RECLAIM_DISTANCE definition
We have two identical definitions of RECLAIM_DISTANCE, looks like
the patch got applied twice. Remove one.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20 15:53:22 +10:00
Anton Blanchard 7bebcf0925 powerpc/numa: Disable NEWIDLE balancing at node level
On big POWER7 boxes we see large amounts of CPU time in system
processes like workqueue and watchdog kernel threads.

We currently rebalance the entire machine each time a task goes
idle and this is very expensive on large machines. Disable newidle
balancing at the node level and rely on the scheduler tick to
rebalance across nodes.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20 15:53:22 +10:00
Anton Blanchard d4761ad2ef powerpc/numa: Increase SD_NODES_PER_DOMAIN to 32.
The largest POWER7 boxes have 32 nodes. SD_NODES_PER_DOMAIN groups
nodes into chunks of 16 and adds a global balancing domain
(SD_ALLNODES) above it.

If we bump SD_NODES_PER_DOMAIN to 32, then we avoid this extra
level of balancing on our largest boxes.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20 15:53:22 +10:00
Anton Blanchard a200d8e446 powerpc/numa: Enable SD_WAKE_AFFINE in node definition
When chasing a performance issue on ppc64, I noticed tasks
communicating via a pipe would often end up on different nodes.

It turns out SD_WAKE_AFFINE is not set in our node defition. Commit
9fcd18c9e6 (sched: re-tune balancing) enabled SD_WAKE_AFFINE
in the node definition for x86 and we need a similar change for
ppc64.

I used lmbench lat_ctx and perf bench pipe to verify this fix. Each
benchmark was run 10 times and the average taken.

lmbench lat_ctx:

before:  66565 ops/sec
after:  204700 ops/sec

3.1x faster

perf bench pipe:

before: 5.6570 usecs
after:  1.3470 usecs

4.2x faster

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20 15:53:21 +10:00
Hector Martin c26afe9e85 powerpc/ps3: Add gelic udbg driver
Add a new udbg driver for the PS3 gelic Ehthernet device.

This driver shares only a few stucture and constant definitions with the
gelic Ethernet device driver, so is implemented as a stand-alone driver
with no dependencies on the gelic Ethernet device driver.

Signed-off-by: Hector Martin <hector@marcansoft.com>
Signed-off-by: Andre Heider <a.heider@gmail.com>
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20 09:20:05 +10:00
Jim Keniston 6c493685f1 powerpc/nvram: Add compression to fit more oops output into NVRAM
Capture more than twice as much text from the printk buffer, and
compress it to fit it in the lnx,oops-log NVRAM partition.  You
can view the compressed text using the new (as of July 20) --unzip
option of the nvram command in the powerpc-utils package.

[BenH: Added select of ZLIB_DEFLATE]

Signed-off-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20 09:19:46 +10:00
Timur Tabi 14b9247019 powerpc/mpic: Add support for discontiguous cores
There is one place in the MPIC driver that assumes that the cores are numbered
from 0 to n-1.  However, this is not true if the CPUs are not numbered
sequentially.  This can happen on a eight-core SOC where cores two and three
are removed in the device tree.  So instead of blindly looping, we iterate
over the discovered CPUs and use the SMP ID as the index.

This means that we no longer ask the MPIC how many CPUs there are, so
we also delete mpic->num_cpus.

We also catch if the number of CPUs in the SOC exceeds the number that the
MPIC supports.  This should never happen, of course, but it's good to be
sure.

Signed-off-by: Timur Tabi <timur@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20 09:19:42 +10:00
Becky Bruce 41151e77a4 powerpc: Hugetlb for BookE
Enable hugepages on Freescale BookE processors.  This allows the kernel to
use huge TLB entries to map pages, which can greatly reduce the number of
TLB misses and the amount of TLB thrashing experienced by applications with
large memory footprints.  Care should be taken when using this on FSL
processors, as the number of large TLB entries supported by the core is low
(16-64) on current processors.

The supported set of hugepage sizes include 4m, 16m, 64m, 256m, and 1g.
Page sizes larger than the max zone size are called "gigantic" pages and
must be allocated on the command line (and cannot be deallocated).

This is currently only fully implemented for Freescale 32-bit BookE
processors, but there is some infrastructure in the code for
64-bit BooKE.

Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20 09:19:40 +10:00
Milton Miller d24f9c6999 powerpc: Use the newly added get_required_mask dma_map_ops hook
Now that the generic code has dma_map_ops set, instead of having a
messy ifdef & if block in the base dma_get_required_mask hook push
the computation into the dma ops.

If the ops fails to set the get_required_mask hook default to the
width of dma_addr_t.

This also corrects ibmbus ibmebus_dma_supported to require a 64
bit mask.  I doubt anything is checking or setting the dma mask on
that bus.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-kernel@vger.kernel.org
Cc: benh@kernel.crashing.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20 09:19:35 +10:00
Milton Miller 6a5c7be5e4 powerpc: Override dma_get_required_mask by platform hook and ops
The hook dma_get_required_mask is supposed to return the mask required
by the platform to operate efficently.  The generic version of
dma_get_required_mask in driver/base/platform.c returns a mask based
only on max_pfn.  However, this is likely too big for iommu systems
and could be too small for platforms that require a dma offset or have
a secondary window at a high offset.

Override the default, provide a hook in ppc_md used by pseries lpar and
cell, and provide the default answer based on memblock_end_of_DRAM(),
with hooks for get_dma_offset, and provide an implementation for iommu
that looks at the defined table size.  Coverting from the end address
to the required bit mask is based on the generic implementation.

The need for this was discovered when the qla2xxx driver switched to
64 bit dma then reverted to 32 bit when dma_get_required_mask said
32 bits was sufficient.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-kernel@vger.kernel.org
Cc: benh@kernel.crashing.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-01 16:00:19 +10:00
Benjamin Herrenschmidt 7bfb40b048 Merge remote-tracking branch 'origin/master' into next
(Pickup Stephen's fix d4d7b2a11c)
2011-09-01 15:57:32 +10:00
Benjamin Herrenschmidt 9bb7361d99 Merge remote-tracking branch 'jwb/next' into next 2011-08-30 15:14:46 +10:00
Stephen Rothwell d4d7b2a11c remove remaining references to nfsservctl
These were missed in commit f5b9409973 "All Arch: remove linkage
for sys_nfsservctl system call" due to them having no sys_ prefix
(presumably).

Cc: NeilBrown <neilb@suse.de>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-parisc@vger.kernel.org
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: James Bottomley <James.Bottomley@hansenpartnership.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-29 16:31:59 -07:00
Timur Tabi dcd83aaff1 tty/powerpc: introduce the ePAPR embedded hypervisor byte channel driver
The ePAPR embedded hypervisor specification provides an API for "byte
channels", which are serial-like virtual devices for sending and receiving
streams of bytes.  This driver provides Linux kernel support for byte
channels via three distinct interfaces:

1) An early-console (udbg) driver.  This provides early console output
through a byte channel.  The byte channel handle must be specified in a
Kconfig option.

2) A normal console driver.  Output is sent to the byte channel designated
for stdout in the device tree.  The console driver is for handling kernel
printk calls.

3) A tty driver, which is used to handle user-space input and output.  The
byte channel used for the console is designated as the default tty.

Signed-off-by: Timur Tabi <timur@freescale.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-23 10:32:56 -07:00
Suzuki Poulose 674bfa4855 powerpc/44x: Kexec support for PPC440X chipsets
This patch adds kexec support for PPC440 based chipsets.  This work is based
on the KEXEC patches for FSL BookE.

The FSL BookE patch and the code flow could be found at the link below:

	http://patchwork.ozlabs.org/patch/49359/

Steps:

1) Invalidate all the TLB entries except the one this code is run from
2) Create a tmp mapping for our code in the other address space and jump to it
3) Invalidate the entry we used
4) Create a 1:1 mapping for 0-2GiB in blocks of 256M
5) Jump to the new 1:1 mapping and invalidate the tmp mapping

I have tested this patches on Ebony, Sequoia boards and Virtex on QEMU.

You need kexec-tools commit e8b7939b1e or newer for ppc440x support, 
available at:

 git://git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git

Signed-off-by: 	Suzuki Poulose <suzuki@in.ibm.com>
Cc:	Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Josh Boyer <jwboyer@gmail.com>
2011-08-11 13:50:37 -04:00
Anton Blanchard 8aa6d35929 powerpc: Move kdump default base address to half RMO size on 64bit
We are seeing boot failures on some very large boxes even with
commit b5416ca9f8 (powerpc: Move kdump default base address to
64MB on 64bit).

This patch halves the RMO so both kernels get about the same
amount of RMO memory. On large machines this region will be
at least 256MB, so each kernel will get 128MB.

We cap it at 256MB (small SLB size) since some early allocations need
to be in the bolted SLB region. We could relax this on machines with
1TB SLBs in a future patch.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-08-05 14:47:56 +10:00
Peter Zijlstra 501d238633 ppc: Remove duplicate definition of PV_POWER7
One definition of PV_POWER7 seems enough to me.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-08-05 14:47:55 +10:00
Anton Blanchard c113a3aee2 powerpc: Jump label misalignment causes oops at boot
I hit an oops at boot on the first instruction of timer_cpu_notify:

NIP [c000000000722f88] .timer_cpu_notify+0x0/0x388

The code should look like:

c000000000722f78:       eb e9 00 30     ld      r31,48(r9)
c000000000722f7c:       2f bf 00 00     cmpdi   cr7,r31,0
c000000000722f80:       40 9e ff 44     bne+    cr7,c000000000722ec4
c000000000722f84:       4b ff ff 74     b       c000000000722ef8

c000000000722f88 <.timer_cpu_notify>:
c000000000722f88:       7c 08 02 a6     mflr    r0
c000000000722f8c:       2f a4 00 07     cmpdi   cr7,r4,7
c000000000722f90:       fb c1 ff f0     std     r30,-16(r1)
c000000000722f94:       fb 61 ff d8     std     r27,-40(r1)

But the oops output shows:

eb61ffd8 eb81ffe0 eba1ffe8 ebc1fff0 7c0803a6 ebe1fff8 4e800020
00000000 ebe90030 c0000000 00ad0a28 00000000 2fa40007 fbc1fff0 fb61ffd8

So we scribbled over our instructions with c000000000ad0a28, which
is an address inside the jump_table ELF section.

It turns out the jump_table section is only aligned to 8 bytes but
we are aligning our entries within the section to 16 bytes. This
means our entries are offset from the table:

c000000000acd4a8 <__start___jump_table>:
        ...
c000000000ad0a10:       c0 00 00 00     lfs     f0,0(0)
c000000000ad0a14:       00 70 cd 5c     .long 0x70cd5c
c000000000ad0a18:       c0 00 00 00     lfs     f0,0(0)
c000000000ad0a1c:       00 70 cd 90     .long 0x70cd90
c000000000ad0a20:       c0 00 00 00     lfs     f0,0(0)
c000000000ad0a24:       00 ac a4 20     .long 0xaca420

And the jump table sort code gets very confused and writes into the
wrong spot. Remove the alignment, and also remove the padding since
we it saves some space and we shouldn't need it.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-08-05 14:47:55 +10:00
Scott Wood 326ed6a9bc powerpc: mtspr/mtmsr should take an unsigned long
Add a cast in case the caller passes in a different type, as it would
if mtspr/mtmsr were functions.

Previously, if a 64-bit type was passed in on 32-bit, GCC would bind the
constraint to a pair of registers, and would substitute the first register
in the pair in the asm code.  This corresponds to the upper half of the
64-bit register, which is generally not the desired behavior.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-08-05 14:47:54 +10:00
Linus Torvalds 3960ef326a Merge branch 'next/cross-platform' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/linux-arm-soc
* 'next/cross-platform' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/linux-arm-soc:
  ARM: Consolidate the clkdev header files
  ARM: set vga memory base at run-time
  ARM: convert PCI defines to variables
  ARM: pci: make pcibios_assign_all_busses use pci_has_flag
  ARM: remove unnecessary mach/hardware.h includes
  pci: move microblaze and powerpc pci flag functions into asm-generic
  powerpc: rename ppc_pci_*_flags to pci_*_flags

Fix up conflicts in arch/microblaze/include/asm/pci-bridge.h
2011-07-26 17:12:10 -07:00
Arun Sharma 7847777a45 atomic: cleanup asm-generic atomic*.h inclusion
After changing all consumers of atomics to include <linux/atomic.h>, we
ran into some compile time errors due to this dependency chain:

linux/atomic.h
  -> asm/atomic.h
    -> asm-generic/atomic-long.h

where atomic-long.h could use funcs defined later in linux/atomic.h
without a prototype.  This patches moves the code that includes
asm-generic/atomic*.h to linux/atomic.h.

Archs that need <asm-generic/atomic64.h> need to select
CONFIG_GENERIC_ATOMIC64 from now on (some of them used to include it
unconditionally).

Compile tested on i386 and x86_64 with allnoconfig.

Signed-off-by: Arun Sharma <asharma@fb.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-26 16:49:47 -07:00
Arun Sharma f24219b4e9 atomic: move atomic_add_unless to generic code
This is in preparation for more generic atomic primitives based on
__atomic_add_unless.

Signed-off-by: Arun Sharma <asharma@fb.com>
Signed-off-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-26 16:49:47 -07:00
Arun Sharma 60063497a9 atomic: use <linux/atomic.h>
This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>

Signed-off-by: Arun Sharma <asharma@fb.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-26 16:49:47 -07:00
Akinobu Mita 148817ba09 asm-generic: add another generic ext2 atomic bitops
The majority of architectures implement ext2 atomic bitops as
test_and_{set,clear}_bit() without spinlock.

This adds this type of generic implementation in ext2-atomic-setbit.h and
use it wherever possible.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Suggested-by: Andreas Dilger <adilger@dilger.ca>
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-26 16:49:46 -07:00
Mike Frysinger 0e9a6cb5e6 ptrace: unify show_regs() prototype
[ poleg@redhat.com: no need to declare show_regs() in ptrace.h, sched.h does this ]
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-26 16:49:43 -07:00
Linus Torvalds 184475029a Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (99 commits)
  drivers/virt: add missing linux/interrupt.h to fsl_hypervisor.c
  powerpc/85xx: fix mpic configuration in CAMP mode
  powerpc: Copy back TIF flags on return from softirq stack
  powerpc/64: Make server perfmon only built on ppc64 server devices
  powerpc/pseries: Fix hvc_vio.c build due to recent changes
  powerpc: Exporting boot_cpuid_phys
  powerpc: Add CFAR to oops output
  hvc_console: Add kdb support
  powerpc/pseries: Fix hvterm_raw_get_chars to accept < 16 chars, fixing xmon
  powerpc/irq: Quieten irq mapping printks
  powerpc: Enable lockup and hung task detectors in pseries and ppc64 defeconfigs
  powerpc: Add mpt2sas driver to pseries and ppc64 defconfig
  powerpc: Disable IRQs off tracer in ppc64 defconfig
  powerpc: Sync pseries and ppc64 defconfigs
  powerpc/pseries/hvconsole: Fix dropped console output
  hvc_console: Improve tty/console put_chars handling
  powerpc/kdump: Fix timeout in crash_kexec_wait_realmode
  powerpc/mm: Fix output of total_ram.
  powerpc/cpufreq: Add cpufreq driver for Momentum Maple boards
  powerpc: Correct annotations of pmu registration functions
  ...

Fix up trivial Kconfig/Makefile conflicts in arch/powerpc, drivers, and
drivers/cpufreq
2011-07-25 22:59:39 -07:00
Linus Torvalds d3ec4844d4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits)
  fs: Merge split strings
  treewide: fix potentially dangerous trailing ';' in #defined values/expressions
  uwb: Fix misspelling of neighbourhood in comment
  net, netfilter: Remove redundant goto in ebt_ulog_packet
  trivial: don't touch files that are removed in the staging tree
  lib/vsprintf: replace link to Draft by final RFC number
  doc: Kconfig: `to be' -> `be'
  doc: Kconfig: Typo: square -> squared
  doc: Konfig: Documentation/power/{pm => apm-acpi}.txt
  drivers/net: static should be at beginning of declaration
  drivers/media: static should be at beginning of declaration
  drivers/i2c: static should be at beginning of declaration
  XTENSA: static should be at beginning of declaration
  SH: static should be at beginning of declaration
  MIPS: static should be at beginning of declaration
  ARM: static should be at beginning of declaration
  rcu: treewide: Do not use rcu_read_lock_held when calling rcu_dereference_check
  Update my e-mail address
  PCIe ASPM: forcedly -> forcibly
  gma500: push through device driver tree
  ...

Fix up trivial conflicts:
 - arch/arm/mach-ep93xx/dma-m2p.c (deleted)
 - drivers/gpio/gpio-ep93xx.c (renamed and context nearby)
 - drivers/net/r8169.c (just context changes)
2011-07-25 13:56:39 -07:00
Linus Torvalds 5fabc487c9 Merge branch 'kvm-updates/3.1' of git://git.kernel.org/pub/scm/virt/kvm/kvm
* 'kvm-updates/3.1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (143 commits)
  KVM: IOMMU: Disable device assignment without interrupt remapping
  KVM: MMU: trace mmio page fault
  KVM: MMU: mmio page fault support
  KVM: MMU: reorganize struct kvm_shadow_walk_iterator
  KVM: MMU: lockless walking shadow page table
  KVM: MMU: do not need atomicly to set/clear spte
  KVM: MMU: introduce the rules to modify shadow page table
  KVM: MMU: abstract some functions to handle fault pfn
  KVM: MMU: filter out the mmio pfn from the fault pfn
  KVM: MMU: remove bypass_guest_pf
  KVM: MMU: split kvm_mmu_free_page
  KVM: MMU: count used shadow pages on prepareing path
  KVM: MMU: rename 'pt_write' to 'emulate'
  KVM: MMU: cleanup for FNAME(fetch)
  KVM: MMU: optimize to handle dirty bit
  KVM: MMU: cache mmio info on page fault path
  KVM: x86: introduce vcpu_mmio_gva_to_gpa to cleanup the code
  KVM: MMU: do not update slot bitmap if spte is nonpresent
  KVM: MMU: fix walking shadow page table
  KVM guest: KVM Steal time registration
  ...
2011-07-24 09:07:03 -07:00
Linus Torvalds a99a7d1436 Merge branch 'timers-cleanup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-cleanup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  mips: Fix i8253 clockevent fallout
  i8253: Cleanup outb/inb magic
  arm: Footbridge: Use common i8253 clockevent
  mips: Use common i8253 clockevent
  x86: Use common i8253 clockevent
  i8253: Create common clockevent implementation
  i8253: Export i8253_lock unconditionally
  pcpskr: MIPS: Make config dependencies finer grained
  pcspkr: Cleanup Kconfig dependencies
  i8253: Move remaining content and delete asm/i8253.h
  i8253: Consolidate definitions of PIT_LATCH
  x86: i8253: Consolidate definitions of global_clock_event
  i8253: Alpha, PowerPC: Remove unused asm/8253pit.h
  alpha: i8253: Cleanup remaining users of i8253pit.h
  i8253: Remove I8253_LOCK config
  i8253: Make pcsp sound driver use the shared i8253_lock
  i8253: Make pcspkr input driver use the shared i8253_lock
  i8253: Consolidate all kernel definitions of i8253_lock
  i8253: Unify all kernel declarations of i8253_lock
  i8253: Create linux/i8253.h and use it in all 8253 related files
2011-07-22 16:51:56 -07:00
Linus Torvalds 4d4abdcb1d Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (123 commits)
  perf: Remove the nmi parameter from the oprofile_perf backend
  x86, perf: Make copy_from_user_nmi() a library function
  perf: Remove perf_event_attr::type check
  x86, perf: P4 PMU - Fix typos in comments and style cleanup
  perf tools: Make test use the preset debugfs path
  perf tools: Add automated tests for events parsing
  perf tools: De-opt the parse_events function
  perf script: Fix display of IP address for non-callchain path
  perf tools: Fix endian conversion reading event attr from file header
  perf tools: Add missing 'node' alias to the hw_cache[] array
  perf probe: Support adding probes on offline kernel modules
  perf probe: Add probed module in front of function
  perf probe: Introduce debuginfo to encapsulate dwarf information
  perf-probe: Move dwarf library routines to dwarf-aux.{c, h}
  perf probe: Remove redundant dwarf functions
  perf probe: Move strtailcmp to string.c
  perf probe: Rename DIE_FIND_CB_FOUND to DIE_FIND_CB_END
  tracing/kprobe: Update symbol reference when loading module
  tracing/kprobes: Support module init function probing
  kprobes: Return -ENOENT if probe point doesn't exist
  ...
2011-07-22 16:44:39 -07:00
Linus Torvalds acb41c0f92 Merge branch 'of-pci' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'of-pci' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  pci/of: Consolidate pci_bus_to_OF_node()
  pci/of: Consolidate pci_device_to_OF_node()
  x86/devicetree: Use generic PCI <-> OF matching
  microblaze/pci: Move the remains of pci_32.c to pci-common.c
  microblaze/pci: Remove powermac originated cruft
  pci/of: Match PCI devices to OF nodes dynamically
2011-07-22 14:54:02 -07:00
Benjamin Herrenschmidt 4b575f3e8a Merge remote-tracking branch 'jwb/next' into next 2011-07-22 13:16:41 +10:00
Matt Evans 0ca87f05ba net: filter: BPF 'JIT' compiler for PPC64
An implementation of a code generator for BPF programs to speed up packet
filtering on PPC64, inspired by Eric Dumazet's x86-64 version.

Filter code is generated as an ABI-compliant function in module_alloc()'d mem
with stackframe & prologue/epilogue generated if required (simple filters don't
need anything more than an li/blr).  The filter's local variables, M[], live in
registers.  Supports all BPF opcodes, although "complicated" loads from negative
packet offsets (e.g. SKF_LL_OFF) are not yet supported.

There are a couple of further optimisations left for future work; many-pass
assembly with branch-reach reduction and a register allocator to push M[]
variables into volatile registers would improve the code quality further.

This currently supports big-endian 64-bit PowerPC only (but is fairly simple
to port to PPC32 or LE!).

Enabled in the same way as x86-64:

	echo 1 > /proc/sys/net/core/bpf_jit_enable

Or, enabled with extra debug output:

	echo 2 > /proc/sys/net/core/bpf_jit_enable

Signed-off-by: Matt Evans <matt@ozlabs.org>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-21 12:38:32 -07:00
Phil Carmody 497888cf69 treewide: fix potentially dangerous trailing ';' in #defined values/expressions
All these are instances of
  #define NAME value;
or
  #define NAME(params_opt) value;

These of course fail to build when used in contexts like
  if(foo $OP NAME)
  while(bar $OP NAME)
and may silently generate the wrong code in contexts such as
  foo = NAME + 1;    /* foo = value; + 1; */
  bar = NAME - 1;    /* bar = value; - 1; */
  baz = NAME & quux; /* baz = value; & quux; */

Reported on comp.lang.c,
Message-ID: <ab0d55fe-25e5-482b-811e-c475aa6065c3@c29g2000yqd.googlegroups.com>
Initial analysis of the dangers provided by Keith Thompson in that thread.

There are many more instances of more complicated macros having unnecessary
trailing semicolons, but this pile seems to be all of the cases of simple
values suffering from the problem. (Thus things that are likely to be found
in one of the contexts above, more complicated ones aren't.)

Signed-off-by: Phil Carmody <ext-phil.2.carmody@nokia.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-07-21 14:10:00 +02:00
Rob Herring f4ffd5e5df pci: move microblaze and powerpc pci flag functions into asm-generic
Move separate microblaze and powerpc pci flag functions pci_set_flags,
pci_add_flags, and pci_has_flag into asm-generic/pci-bridge.h so other
archs can use them.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Michal Simek <monstr@monstr.eu>
2011-07-12 11:13:07 -05:00
Rob Herring 0e47ff1ce6 powerpc: rename ppc_pci_*_flags to pci_*_flags
This renames pci flags functions and enums in preparation for creating
generic version in asm-generic/pci-bridge.h. The following search and
replace is done:

s/ppc_pci_/pci_/
s/PPC_PCI_/PCI_/

Direct accesses to ppc_pci_flag variable are replaced with helper
functions.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
2011-07-12 09:28:04 -05:00
Dave Kleikamp 91b191c71e powerpc/44x: don't use tlbivax on AMP systems
Since other OS's may be running on the other cores don't use tlbivax

Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2011-07-12 09:21:55 -04:00
Alexander Graf 29d03158f9 KVM: PPC: Remove prog_flags
Commit c8f729d408 (KVM: PPC: Deliver program interrupts right away instead
of queueing them) made away with all users of prog_flags, so we can just
remove it from the headers.

Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:17:00 +03:00
Paul Mackerras 9e368f2915 KVM: PPC: book3s_hv: Add support for PPC970-family processors
This adds support for running KVM guests in supervisor mode on those
PPC970 processors that have a usable hypervisor mode.  Unfortunately,
Apple G5 machines have supervisor mode disabled (MSR[HV] is forced to
1), but the YDL PowerStation does have a usable hypervisor mode.

There are several differences between the PPC970 and POWER7 in how
guests are managed.  These differences are accommodated using the
CPU_FTR_ARCH_201 (PPC970) and CPU_FTR_ARCH_206 (POWER7) CPU feature
bits.  Notably, on PPC970:

* The LPCR, LPID or RMOR registers don't exist, and the functions of
  those registers are provided by bits in HID4 and one bit in HID0.

* External interrupts can be directed to the hypervisor, but unlike
  POWER7 they are masked by MSR[EE] in non-hypervisor modes and use
  SRR0/1 not HSRR0/1.

* There is no virtual RMA (VRMA) mode; the guest must use an RMO
  (real mode offset) area.

* The TLB entries are not tagged with the LPID, so it is necessary to
  flush the whole TLB on partition switch.  Furthermore, when switching
  partitions we have to ensure that no other CPU is executing the tlbie
  or tlbsync instructions in either the old or the new partition,
  otherwise undefined behaviour can occur.

* The PMU has 8 counters (PMC registers) rather than 6.

* The DSCR, PURR, SPURR, AMR, AMOR, UAMOR registers don't exist.

* The SLB has 64 entries rather than 32.

* There is no mediated external interrupt facility, so if we switch to
  a guest that has a virtual external interrupt pending but the guest
  has MSR[EE] = 0, we have to arrange to have an interrupt pending for
  it so that we can get control back once it re-enables interrupts.  We
  do that by sending ourselves an IPI with smp_send_reschedule after
  hard-disabling interrupts.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:59 +03:00
Paul Mackerras 969391c58a powerpc, KVM: Split HVMODE_206 cpu feature bit into separate HV and architecture bits
This replaces the single CPU_FTR_HVMODE_206 bit with two bits, one to
indicate that we have a usable hypervisor mode, and another to indicate
that the processor conforms to PowerISA version 2.06.  We also add
another bit to indicate that the processor conforms to ISA version 2.01
and set that for PPC970 and derivatives.

Some PPC970 chips (specifically those in Apple machines) have a
hypervisor mode in that MSR[HV] is always 1, but the hypervisor mode
is not useful in the sense that there is no way to run any code in
supervisor mode (HV=0 PR=0).  On these processors, the LPES0 and LPES1
bits in HID4 are always 0, and we use that as a way of detecting that
hypervisor mode is not useful.

Where we have a feature section in assembly code around code that
only applies on POWER7 in hypervisor mode, we use a construct like

END_FTR_SECTION_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206)

The definition of END_FTR_SECTION_IFSET is such that the code will
be enabled (not overwritten with nops) only if all bits in the
provided mask are set.

Note that the CPU feature check in __tlbie() only needs to check the
ARCH_206 bit, not the HVMODE bit, because __tlbie() can only get called
if we are running bare-metal, i.e. in hypervisor mode.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:58 +03:00
Paul Mackerras aa04b4cc5b KVM: PPC: Allocate RMAs (Real Mode Areas) at boot for use by guests
This adds infrastructure which will be needed to allow book3s_hv KVM to
run on older POWER processors, including PPC970, which don't support
the Virtual Real Mode Area (VRMA) facility, but only the Real Mode
Offset (RMO) facility.  These processors require a physically
contiguous, aligned area of memory for each guest.  When the guest does
an access in real mode (MMU off), the address is compared against a
limit value, and if it is lower, the address is ORed with an offset
value (from the Real Mode Offset Register (RMOR)) and the result becomes
the real address for the access.  The size of the RMA has to be one of
a set of supported values, which usually includes 64MB, 128MB, 256MB
and some larger powers of 2.

Since we are unlikely to be able to allocate 64MB or more of physically
contiguous memory after the kernel has been running for a while, we
allocate a pool of RMAs at boot time using the bootmem allocator.  The
size and number of the RMAs can be set using the kvm_rma_size=xx and
kvm_rma_count=xx kernel command line options.

KVM exports a new capability, KVM_CAP_PPC_RMA, to signal the availability
of the pool of preallocated RMAs.  The capability value is 1 if the
processor can use an RMA but doesn't require one (because it supports
the VRMA facility), or 2 if the processor requires an RMA for each guest.

This adds a new ioctl, KVM_ALLOCATE_RMA, which allocates an RMA from the
pool and returns a file descriptor which can be used to map the RMA.  It
also returns the size of the RMA in the argument structure.

Having an RMA means we will get multiple KMV_SET_USER_MEMORY_REGION
ioctl calls from userspace.  To cope with this, we now preallocate the
kvm->arch.ram_pginfo array when the VM is created with a size sufficient
for up to 64GB of guest memory.  Subsequently we will get rid of this
array and use memory associated with each memslot instead.

This moves most of the code that translates the user addresses into
host pfns (page frame numbers) out of kvmppc_prepare_vrma up one level
to kvmppc_core_prepare_memory_region.  Also, instead of having to look
up the VMA for each page in order to check the page size, we now check
that the pages we get are compound pages of 16MB.  However, if we are
adding memory that is mapped to an RMA, we don't bother with calling
get_user_pages_fast and instead just offset from the base pfn for the
RMA.

Typically the RMA gets added after vcpus are created, which makes it
inconvenient to have the LPCR (logical partition control register) value
in the vcpu->arch struct, since the LPCR controls whether the processor
uses RMA or VRMA for the guest.  This moves the LPCR value into the
kvm->arch struct and arranges for the MER (mediated external request)
bit, which is the only bit that varies between vcpus, to be set in
assembly code when going into the guest if there is a pending external
interrupt request.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:57 +03:00
Paul Mackerras 371fefd6f2 KVM: PPC: Allow book3s_hv guests to use SMT processor modes
This lifts the restriction that book3s_hv guests can only run one
hardware thread per core, and allows them to use up to 4 threads
per core on POWER7.  The host still has to run single-threaded.

This capability is advertised to qemu through a new KVM_CAP_PPC_SMT
capability.  The return value of the ioctl querying this capability
is the number of vcpus per virtual CPU core (vcore), currently 4.

To use this, the host kernel should be booted with all threads
active, and then all the secondary threads should be offlined.
This will put the secondary threads into nap mode.  KVM will then
wake them from nap mode and use them for running guest code (while
they are still offline).  To wake the secondary threads, we send
them an IPI using a new xics_wake_cpu() function, implemented in
arch/powerpc/sysdev/xics/icp-native.c.  In other words, at this stage
we assume that the platform has a XICS interrupt controller and
we are using icp-native.c to drive it.  Since the woken thread will
need to acknowledge and clear the IPI, we also export the base
physical address of the XICS registers using kvmppc_set_xics_phys()
for use in the low-level KVM book3s code.

When a vcpu is created, it is assigned to a virtual CPU core.
The vcore number is obtained by dividing the vcpu number by the
number of threads per core in the host.  This number is exported
to userspace via the KVM_CAP_PPC_SMT capability.  If qemu wishes
to run the guest in single-threaded mode, it should make all vcpu
numbers be multiples of the number of threads per core.

We distinguish three states of a vcpu: runnable (i.e., ready to execute
the guest), blocked (that is, idle), and busy in host.  We currently
implement a policy that the vcore can run only when all its threads
are runnable or blocked.  This way, if a vcpu needs to execute elsewhere
in the kernel or in qemu, it can do so without being starved of CPU
by the other vcpus.

When a vcore starts to run, it executes in the context of one of the
vcpu threads.  The other vcpu threads all go to sleep and stay asleep
until something happens requiring the vcpu thread to return to qemu,
or to wake up to run the vcore (this can happen when another vcpu
thread goes from busy in host state to blocked).

It can happen that a vcpu goes from blocked to runnable state (e.g.
because of an interrupt), and the vcore it belongs to is already
running.  In that case it can start to run immediately as long as
the none of the vcpus in the vcore have started to exit the guest.
We send the next free thread in the vcore an IPI to get it to start
to execute the guest.  It synchronizes with the other threads via
the vcore->entry_exit_count field to make sure that it doesn't go
into the guest if the other vcpus are exiting by the time that it
is ready to actually enter the guest.

Note that there is no fixed relationship between the hardware thread
number and the vcpu number.  Hardware threads are assigned to vcpus
as they become runnable, so we will always use the lower-numbered
hardware threads in preference to higher-numbered threads if not all
the vcpus in the vcore are runnable, regardless of which vcpus are
runnable.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:57 +03:00
David Gibson 54738c0971 KVM: PPC: Accelerate H_PUT_TCE by implementing it in real mode
This improves I/O performance for guests using the PAPR
paravirtualization interface by making the H_PUT_TCE hcall faster, by
implementing it in real mode.  H_PUT_TCE is used for updating virtual
IOMMU tables, and is used both for virtual I/O and for real I/O in the
PAPR interface.

Since this moves the IOMMU tables into the kernel, we define a new
KVM_CREATE_SPAPR_TCE ioctl to allow qemu to create the tables.  The
ioctl returns a file descriptor which can be used to mmap the newly
created table.  The qemu driver models use them in the same way as
userspace managed tables, but they can be updated directly by the
guest with a real-mode H_PUT_TCE implementation, reducing the number
of host/guest context switches during guest IO.

There are certain circumstances where it is useful for userland qemu
to write to the TCE table even if the kernel H_PUT_TCE path is used
most of the time.  Specifically, allowing this will avoid awkwardness
when we need to reset the table.  More importantly, we will in the
future need to write the table in order to restore its state after a
checkpoint resume or migration.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:56 +03:00
Paul Mackerras a8606e20e4 KVM: PPC: Handle some PAPR hcalls in the kernel
This adds the infrastructure for handling PAPR hcalls in the kernel,
either early in the guest exit path while we are still in real mode,
or later once the MMU has been turned back on and we are in the full
kernel context.  The advantage of handling hcalls in real mode if
possible is that we avoid two partition switches -- and this will
become more important when we support SMT4 guests, since a partition
switch means we have to pull all of the threads in the core out of
the guest.  The disadvantage is that we can only access the kernel
linear mapping, not anything vmalloced or ioremapped, since the MMU
is off.

This also adds code to handle the following hcalls in real mode:

H_ENTER       Add an HPTE to the hashed page table
H_REMOVE      Remove an HPTE from the hashed page table
H_READ        Read HPTEs from the hashed page table
H_PROTECT     Change the protection bits in an HPTE
H_BULK_REMOVE Remove up to 4 HPTEs from the hashed page table
H_SET_DABR    Set the data address breakpoint register

Plus code to handle the following hcalls in the kernel:

H_CEDE        Idle the vcpu until an interrupt or H_PROD hcall arrives
H_PROD        Wake up a ceded vcpu
H_REGISTER_VPA Register a virtual processor area (VPA)

The code that runs in real mode has to be in the base kernel, not in
the module, if KVM is compiled as a module.  The real-mode code can
only access the kernel linear mapping, not vmalloc or ioremap space.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:55 +03:00
Paul Mackerras de56a948b9 KVM: PPC: Add support for Book3S processors in hypervisor mode
This adds support for KVM running on 64-bit Book 3S processors,
specifically POWER7, in hypervisor mode.  Using hypervisor mode means
that the guest can use the processor's supervisor mode.  That means
that the guest can execute privileged instructions and access privileged
registers itself without trapping to the host.  This gives excellent
performance, but does mean that KVM cannot emulate a processor
architecture other than the one that the hardware implements.

This code assumes that the guest is running paravirtualized using the
PAPR (Power Architecture Platform Requirements) interface, which is the
interface that IBM's PowerVM hypervisor uses.  That means that existing
Linux distributions that run on IBM pSeries machines will also run
under KVM without modification.  In order to communicate the PAPR
hypercalls to qemu, this adds a new KVM_EXIT_PAPR_HCALL exit code
to include/linux/kvm.h.

Currently the choice between book3s_hv support and book3s_pr support
(i.e. the existing code, which runs the guest in user mode) has to be
made at kernel configuration time, so a given kernel binary can only
do one or the other.

This new book3s_hv code doesn't support MMIO emulation at present.
Since we are running paravirtualized guests, this isn't a serious
restriction.

With the guest running in supervisor mode, most exceptions go straight
to the guest.  We will never get data or instruction storage or segment
interrupts, alignment interrupts, decrementer interrupts, program
interrupts, single-step interrupts, etc., coming to the hypervisor from
the guest.  Therefore this introduces a new KVMTEST_NONHV macro for the
exception entry path so that we don't have to do the KVM test on entry
to those exception handlers.

We do however get hypervisor decrementer, hypervisor data storage,
hypervisor instruction storage, and hypervisor emulation assist
interrupts, so we have to handle those.

In hypervisor mode, real-mode accesses can access all of RAM, not just
a limited amount.  Therefore we put all the guest state in the vcpu.arch
and use the shadow_vcpu in the PACA only for temporary scratch space.
We allocate the vcpu with kzalloc rather than vzalloc, and we don't use
anything in the kvmppc_vcpu_book3s struct, so we don't allocate it.
We don't have a shared page with the guest, but we still need a
kvm_vcpu_arch_shared struct to store the values of various registers,
so we include one in the vcpu_arch struct.

The POWER7 processor has a restriction that all threads in a core have
to be in the same partition.  MMU-on kernel code counts as a partition
(partition 0), so we have to do a partition switch on every entry to and
exit from the guest.  At present we require the host and guest to run
in single-thread mode because of this hardware restriction.

This code allocates a hashed page table for the guest and initializes
it with HPTEs for the guest's Virtual Real Memory Area (VRMA).  We
require that the guest memory is allocated using 16MB huge pages, in
order to simplify the low-level memory management.  This also means that
we can get away without tracking paging activity in the host for now,
since huge pages can't be paged or swapped.

This also adds a few new exports needed by the book3s_hv code.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:54 +03:00
Paul Mackerras 3c42bf8a71 KVM: PPC: Split host-state fields out of kvmppc_book3s_shadow_vcpu
There are several fields in struct kvmppc_book3s_shadow_vcpu that
temporarily store bits of host state while a guest is running,
rather than anything relating to the particular guest or vcpu.
This splits them out into a new kvmppc_host_state structure and
modifies the definitions in asm-offsets.c to suit.

On 32-bit, we have a kvmppc_host_state structure inside the
kvmppc_book3s_shadow_vcpu since the assembly code needs to be able
to get to them both with one pointer.  On 64-bit they are separate
fields in the PACA.  This means that on 64-bit we don't need to
copy the kvmppc_host_state in and out on vcpu load/unload, and
in future will mean that the book3s_hv code doesn't need a
shadow_vcpu struct in the PACA at all.  That does mean that we
have to be careful not to rely on any values persisting in the
hstate field of the paca across any point where we could block
or get preempted.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:53 +03:00
Paul Mackerras 923c53caea powerpc: Set up LPCR for running guest partitions
In hypervisor mode, the LPCR controls several aspects of guest
partitions, including virtual partition memory mode, and also controls
whether the hypervisor decrementer interrupts are enabled.  This sets
up LPCR at boot time so that guest partitions will use a virtual real
memory area (VRMA) composed of 16MB large pages, and hypervisor
decrementer interrupts are disabled.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:52 +03:00
Paul Mackerras df6909e5d5 KVM: PPC: Move guest enter/exit down into subarch-specific code
Instead of doing the kvm_guest_enter/exit() and local_irq_dis/enable()
calls in powerpc.c, this moves them down into the subarch-specific
book3s_pr.c and booke.c.  This eliminates an extra local_irq_enable()
call in book3s_pr.c, and will be needed for when we do SMT4 guest
support in the book3s hypervisor mode code.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:51 +03:00
Paul Mackerras f9e0554dec KVM: PPC: Pass init/destroy vm and prepare/commit memory region ops down
This arranges for the top-level arch/powerpc/kvm/powerpc.c file to
pass down some of the calls it gets to the lower-level subarchitecture
specific code.  The lower-level implementations (in booke.c and book3s.c)
are no-ops.  The coming book3s_hv.c will need this.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:50 +03:00
Paul Mackerras b01c8b54a1 powerpc, KVM: Rework KVM checks in first-level interrupt handlers
Instead of branching out-of-line with the DO_KVM macro to check if we
are in a KVM guest at the time of an interrupt, this moves the KVM
check inline in the first-level interrupt handlers.  This speeds up
the non-KVM case and makes sure that none of the interrupt handlers
are missing the check.

Because the first-level interrupt handlers are now larger, some things
had to be move out of line in exceptions-64s.S.

This all necessitated some minor changes to the interrupt entry code
in KVM.  This also streamlines the book3s_32 KVM test.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:48 +03:00
Paul Mackerras f05ed4d56e KVM: PPC: Split out code from book3s.c into book3s_pr.c
In preparation for adding code to enable KVM to use hypervisor mode
on 64-bit Book 3S processors, this splits book3s.c into two files,
book3s.c and book3s_pr.c, where book3s_pr.c contains the code that is
specific to running the guest in problem state (user mode) and book3s.c
contains code which should apply to all Book 3S processors.

In doing this, we abstract some details, namely the interrupt offset,
updating the interrupt pending flag, and detecting if the guest is
in a critical section.  These are all things that will be different
when we use hypervisor mode.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:47 +03:00
Paul Mackerras c4befc58a0 KVM: PPC: Move fields between struct kvm_vcpu_arch and kvmppc_vcpu_book3s
This moves the slb field, which represents the state of the emulated
SLB, from the kvmppc_vcpu_book3s struct to the kvm_vcpu_arch, and the
hpte_hash_[v]pte[_long] fields from kvm_vcpu_arch to kvmppc_vcpu_book3s.
This is in accord with the principle that the kvm_vcpu_arch struct
represents the state of the emulated CPU, and the kvmppc_vcpu_book3s
struct holds the auxiliary data structures used in the emulation.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:46 +03:00
Liu Yu dd9ebf1f94 KVM: PPC: e500: Add shadow PID support
Dynamically assign host PIDs to guest PIDs, splitting each guest PID into
multiple host (shadow) PIDs based on kernel/user and MSR[IS/DS].  Use
both PID0 and PID1 so that the shadow PIDs for the right mode can be
selected, that correspond both to guest TID = zero and guest TID = guest
PID.

This allows us to significantly reduce the frequency of needing to
invalidate the entire TLB.  When the guest mode or PID changes, we just
update the host PID0/PID1.  And since the allocation of shadow PIDs is
global, multiple guests can share the TLB without conflict.

Note that KVM does not yet support the guest setting PID1 or PID2 to
a value other than zero.  This will need to be fixed for nested KVM
to work.  Until then, we enforce the requirement for guest PID1/PID2
to stay zero by failing the emulation if the guest tries to set them
to something else.

Signed-off-by: Liu Yu <yu.liu@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:39 +03:00
Liu Yu 08b7fa92b9 KVM: PPC: e500: Stop keeping shadow TLB
Instead of a fully separate set of TLB entries, keep just the
pfn and dirty status.

Signed-off-by: Liu Yu <yu.liu@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:38 +03:00
Scott Wood a4cd8b23ac KVM: PPC: e500: enable magic page
This is a shared page used for paravirtualization.  It is always present
in the guest kernel's effective address space at the address indicated
by the hypercall that enables it.

The physical address specified by the hypercall is not used, as
e500 does not have real mode.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:37 +03:00
Scott Wood 59c1f4e35c KVM: PPC: e500: Eliminate shadow_pages[], and use pfns instead.
This is in line with what other architectures do, and will allow us to
map things other than ordinary, unreserved kernel pages -- such as
dedicated devices, or large contiguous reserved regions.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:35 +03:00
Scott Wood 4cd35f675b KVM: PPC: e500: Save/restore SPE state
This is done lazily.  The SPE save will be done only if the guest has
used SPE since the last preemption or heavyweight exit.  Restore will be
done only on demand, when enabling MSR_SPE in the shadow MSR, in response
to an SPE fault or mtmsr emulation.

For SPEFSCR, Linux already switches it on context switch (non-lazily), so
the only remaining bit is to save it between qemu and the guest.

Signed-off-by: Liu Yu <yu.liu@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:32 +03:00
Scott Wood ecee273fc4 KVM: PPC: booke: use shadow_msr
Keep the guest MSR and the guest-mode true MSR separate, rather than
modifying the guest MSR on each guest entry to produce a true MSR.

Any bits which should be modified based on guest MSR must be explicitly
propagated from vcpu->arch.shared->msr to vcpu->arch.shadow_msr in
kvmppc_set_msr().

While we're modifying the guest entry code, reorder a few instructions
to bury some load latencies.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:32 +03:00
Scott Wood c51584d52e powerpc/e500: SPE register saving: take arbitrary struct offset
Previously, these macros hardcoded THREAD_EVR0 as the base of the save
area, relative to the base register passed.  This base offset is now
passed as a separate macro parameter, allowing reuse with other SPE
save areas, such as used by KVM.

Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:31 +03:00
Alexander Graf a22a2daccf KVM: PPC: Resolve real-mode handlers through function exports
Up until now, Book3S KVM had variables stored in the kernel that a kernel module
or the kvm code in the kernel could read from to figure out where some real mode
helper functions are located.

This is all unnecessary. The high bits of the EA get ignore in real mode, so we
can just use the pointer as is. Also, it's a lot easier on relocations when we
use the normal way of resolving the address to a function, instead of jumping
through hoops.

This patch fixes compilation with CONFIG_RELOCATABLE=y.

Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:29 +03:00
Jiri Kosina b7e9c223be Merge branch 'master' into for-next
Sync with Linus' tree to be able to apply pending patches that
are based on newer code already present upstream.
2011-07-11 14:15:55 +02:00
Becky Bruce 3160b09796 powerpc: Create next_tlbcam_idx percpu variable for FSL_BOOKE
This is used to round-robin TLBCAM entries.

Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-07-08 00:21:34 -05:00
Peter Zijlstra 29dfc4fd7e perf, powerpc: Fix build borkage
The patch a8b0ca17b8 ("perf: Remove the nmi parameter from the swevent
and overflow interface") missed a spot in the ppc hw_breakpoint code,
fix this up so things compile again.

Reported-by: Ingo Molnar <mingo@elte.hu>
Cc: Anton Blanchard <anton@samba.org>
Cc: Eric B Munson <emunson@mgebm.net>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-09pfip95g88s70iwkxu6nnbt@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-01 16:20:04 +02:00
Peter Zijlstra a8b0ca17b8 perf: Remove the nmi parameter from the swevent and overflow interface
The nmi parameter indicated if we could do wakeups from the current
context, if not, we would set some state and self-IPI and let the
resulting interrupt do the wakeup.

For the various event classes:

  - hardware: nmi=0; PMI is in fact an NMI or we run irq_work_run from
    the PMI-tail (ARM etc.)
  - tracepoint: nmi=0; since tracepoint could be from NMI context.
  - software: nmi=[0,1]; some, like the schedule thing cannot
    perform wakeups, and hence need 0.

As one can see, there is very little nmi=1 usage, and the down-side of
not using it is that on some platforms some software events can have a
jiffy delay in wakeup (when arch_irq_work_raise isn't implemented).

The up-side however is that we can remove the nmi parameter and save a
bunch of conditionals in fast paths.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Michael Cree <mcree@orcon.net.nz>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Eric B Munson <emunson@mgebm.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Don Zickus <dzickus@redhat.com>
Link: http://lkml.kernel.org/n/tip-agjev8eu666tvknpb3iaj0fg@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-01 11:06:35 +02:00
Michael Ellerman ac5f89c7d8 powerpc: Add jump label support
This patch adds support for the new "jump label" feature.

Unlike x86 and sparc we just merrily patch the code with no locks etc,
as far as I know this is safe, but I'm not really sure what the x86/sparc
code is protecting against so maybe it's not.

I also don't see any reason for us to implement the poke_early() routine,
even though sparc does.

[BenH: Updated the patch to upstream generic changes]

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-07-01 13:48:55 +10:00
Benjamin Herrenschmidt 87fa35dd88 powerpc/hvsi: Fix conflict with old HVSI driver
A mix of think & mismerge on my side caused a problem where both the
new hvsi_lib and the old hvsi driver gets compiled and try to define
symbols with the same name.

This fixes it by renaming the hvsi_lib exported symbols.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-07-01 13:10:21 +10:00
Benjamin Herrenschmidt 9def247a70 powerpc: Fix build problem with default ppc_md.progress commit
a9c0f41b3a breaks the build
on some platforms. The extern declaration must be shielded
against assembly.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-07-01 13:00:21 +10:00
Dave Carroll a9c0f41b3a powerpc: Add printk companion for ppc_md.progress
This patch adds a printk companion to replace the udbg progress function
when initmem is freed.

Suggested-by: Milton Miller <miltonm@bga.com>
Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Dave Carroll <dcarroll@astekcorp.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-06-30 15:28:05 +10:00
Benjamin Herrenschmidt 6da49a2925 Merge remote branch 'origin/master' into next 2011-06-30 15:23:59 +10:00
Benjamin Herrenschmidt 17bdc6c0e9 powerpc/pseries: Move hvsi support into a library
This will allow a different backend to share it

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-06-29 17:48:37 +10:00
Benjamin Herrenschmidt 4d2bb3f500 powerpc/pseries: Re-implement HVSI as part of hvc_vio
On pseries machines, consoles are provided by the hypervisor using
a low level get_chars/put_chars type interface. However, this is
really just a transport to the service processor which implements
them either as "raw" console (networked consoles, HMC, ...) or as
"hvsi" serial ports.

The later is a simple packet protocol on top of the raw character
interface that is supposed to convey additional "serial port" style
semantics. In practice however, all it does is provide a way to
read the CD line and set/clear our DTR line, that's it.

We currently implement the "raw" protocol as an hvc console backend
(/dev/hvcN) and the "hvsi" protocol using a separate tty driver
(/dev/hvsi0).

However this is quite impractical. The arbitrary difference between
the two type of devices has been a major source of user (and distro)
confusion. Additionally, there's an additional mini -hvsi implementation
in the pseries platform code for our low level debug console and early
boot kernel messages, which means code duplication, though that low
level variant is impractical as it's incapable of doing the initial
protocol negociation to establish the link to the FSP.

This essentially replaces the dedicated hvsi driver and the platform
udbg code completely by extending the existing hvc_vio backend used
in "raw" mode so that:

 - It now supports HVSI as well
 - We add support for hvc backend providing tiocm{get,set}
 - It also provides a udbg interface for early debug and boot console

This is overall less code, though this will only be obvious once we
remove the old "hvsi" driver, which is still available for now. When
the old driver is enabled, the new code still kicks in for the low
level udbg console, replacing the old mini implementation in the platform
code, it just doesn't provide the higher level "hvc" interface.

In addition to producing generally simler code, this has several benefits
over our current situation:

 - The user/distro only has to deal with /dev/hvcN for the hypervisor
console, avoiding all sort of confusion that has plagued us in the past

 - The tty, kernel and low level debug console all use the same code
base which supports the full protocol establishment process, thus the
console is now available much earlier than it used to be with the
old HVSI driver. The kernel console works much earlier and udbg is
available much earlier too. Hackers can enable a hard coded very-early
debug console as well that works with HVSI (previously that was only
supported for the "raw" mode).

I've tried to keep the same semantics as hvsi relative to how I react
to things like CD changes, with some subtle differences though:

 - I clear DTR on close if HUPCL is set

 - Current hvsi triggers a hangup if it detects a up->down transition
   on CD (you can still open a console with CD down). My new implementation
   triggers a hangup if the link to the FSP is severed, and severs it upon
   detecting a up->down transition on CD.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-06-29 17:48:35 +10:00
Benjamin Herrenschmidt 048bee7718 powerpc/pseries: Factor HVSI header struct in packet definitions
Embed the struct hvsi_header in the various packet definitions
rather than open coding it multiple times. Will help provide
stronger type checking.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-06-29 17:48:30 +10:00
Benjamin Herrenschmidt 725e789f22 powerpc/hvsi: Move HVSI protocol definitions to a header file
This moves various HVSI protocol definitions from the hvsi.c
driver to a header file that can be used later on by a udbg
implementation

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-06-29 17:48:28 +10:00
Akinobu Mita 3aef19f0a1 powerpc/pseries: Introduce pSeries_reconfig_notify()
This introduces pSeries_reconfig_notify() as a just wrapper of
blocking_notifier_call_chain() for pSeries_reconfig_chain.

This is a preparation to improvement of error code on reconfiguration
notifier failure.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-06-29 17:48:22 +10:00
Becky Bruce 72632ce5a4 powerpc: Whitespace fix to include/asm/pgtable-ppc64.h
Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-06-29 17:48:16 +10:00
Scott Wood f67f4ef5fc powerpc/book3e-64: use a separate TLB handler when linear map is bolted
On MMUs such as FSL where we can guarantee the entire linear mapping is
bolted, we don't need to worry about linear TLB misses.  If on top of
that we do a full table walk, we get rid of all recursive TLB faults, and
can dispense with some state saving.  This gains a few percent on
TLB-miss-heavy workloads, and around 50% on a benchmark that had a high
rate of virtual page table faults under the normal handler.

While touching the EX_TLB layout, remove EX_TLB_MMUCR0, EX_TLB_SRR0, and
EX_TLB_SRR1 as they're not used.

[BenH: Fixed build with 64K pages (wsp config)]

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-06-29 17:47:48 +10:00
Scott Wood 3d97a619ac powerpc/book3e-64: Reraise doorbell when masked by soft-irq-disable
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-06-29 16:40:59 +10:00
KAMEZAWA Hiroyuki c6830c2260 Fix node_start/end_pfn() definition for mm/page_cgroup.c
commit 21a3c96 uses node_start/end_pfn(nid) for detection start/end
of nodes. But, it's not defined in linux/mmzone.h but defined in
/arch/???/include/mmzone.h which is included only under
CONFIG_NEED_MULTIPLE_NODES=y.

Then, we see
  mm/page_cgroup.c: In function 'page_cgroup_init':
  mm/page_cgroup.c:308: error: implicit declaration of function 'node_start_pfn'
  mm/page_cgroup.c:309: error: implicit declaration of function 'node_end_pfn'

So, fixiing page_cgroup.c is an idea...

But node_start_pfn()/node_end_pfn() is a very generic macro and
should be implemented in the same manner for all archs.
(m32r has different implementation...)

This patch removes definitions of node_start/end_pfn() in each archs
and defines a unified one in linux/mmzone.h. It's not under
CONFIG_NEED_MULTIPLE_NODES, now.

A result of macro expansion is here (mm/page_cgroup.c)

for !NUMA
 start_pfn = ((&contig_page_data)->node_start_pfn);
  end_pfn = ({ pg_data_t *__pgdat = (&contig_page_data); __pgdat->node_start_pfn + __pgdat->node_spanned_pages;});

for NUMA (x86-64)
  start_pfn = ((node_data[nid])->node_start_pfn);
  end_pfn = ({ pg_data_t *__pgdat = (node_data[nid]); __pgdat->node_start_pfn + __pgdat->node_spanned_pages;});

Changelog:
 - fixed to avoid using "nid" twice in node_end_pfn() macro.

Reported-and-acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Reported-and-tested-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-27 14:13:09 -07:00
Ashish Kalra 3a93261f70 powerpc: introduce the ePAPR embedded hypervisor vmpic driver
The Freescale ePAPR reference hypervisor provides interrupt controller
services via a hypercall interface, instead of emulating the MPIC
controller.  This is called the VMPIC.

The ePAPR "virtual interrupt controller" provides interrupt controller
services for external interrupts.  External interrupts received by a
partition can come from two sources:

  - Hardware interrupts - hardware interrupts come from external
    interrupt lines or on-chip I/O devices.
  - Virtual interrupts - virtual interrupts are generated by the hypervisor
    as part of some hypervisor service or hypervisor-created virtual device.

Both types of interrupts are processed using the same programming model and
same set of hypercalls.

Signed-off-by: Ashish Kalra <ashish.kalra@freescale.com>
Signed-off-by: Timur Tabi <timur@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-06-27 08:30:26 -05:00
Timur Tabi bd497fc978 powerpc: introduce ePAPR embedded hypervisor hcall interface
ePAPR hypervisors provide operating system services via a "hypercall"
interface.  The following steps need to be performed to make an hcall:

1. Load r11 with the hcall number
2. Load specific other registers with parameters
3. Issue instrucion "sc 1"
4. The return code is in r3
5. Other returned parameters are in other registers.

To provide this service to the kernel, these steps are wrapped in inline
assembly functions.  Standard ePAPR hcalls are in epapr_hcalls.h, and
Freescale extensions are in fsl_hcalls.h.

Signed-off-by: Timur Tabi <timur@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-06-27 08:30:19 -05:00
Stuart Yoder 6ec36b5848 powerpc: make irq_choose_cpu() available to all PIC drivers
Move irq_choose_cpu() into arch/powerpc/kernel/irq.c so that it can be used
by other PIC drivers.  The function is not MPIC-specific.

Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com>
Signed-off-by: Timur Tabi <timur@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-06-22 21:44:59 -05:00
Ashish Kalra 1325a684b5 powerpc/85xx: Save scratch registers to thread info instead of using SPRGs.
We expect this is actually faster, and we end up needing more space than we
can get from the SPRGs in some instances.  This is also useful when running
as a guest OS - SPRGs4-7 do not have guest versions.

8 slots are allocated in thread_info for this even though we only actually
use 4 of them - this allows space for future code to have more scratch
space (and we know we'll need it for things like hugetlb).

Signed-off-by: Ashish Kalra <Ashish.Kalra@freescale.com>
Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-06-22 21:44:55 -05:00
Michael Neuling dc28518f7d powerpc: Fix doorbell type shift
doorbell type is defined as bits 32:36 so should be shifted by 63-36 =
27 rather than 28.

We never noticed this bug as we've only every used type PPC_DBELL = 0.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-06-20 11:21:48 +10:00
Matt Evans 7ac87abb81 powerpc: Fix early boot accounting of CPUs
smp_release_cpus() waits for all cpus (including the bootcpu) due to an
off-by-one count on boot_cpu_count (which is all CPUs).  This patch replaces
that with spinning_secondaries (which is all secondary CPUs).

Signed-off-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-06-17 16:19:51 +10:00
Joe Perches 28f65c11f2 treewide: Convert uses of struct resource to resource_size(ptr)
Several fixes as well where the +1 was missing.

Done via coccinelle scripts like:

@@
struct resource *ptr;
@@

- ptr->end - ptr->start + 1
+ resource_size(ptr)

and some grep and typing.

Mostly uncompiled, no cross-compilers.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-06-10 14:55:36 +02:00
Ralf Baechle 75bba01434 i8253: Alpha, PowerPC: Remove unused asm/8253pit.h
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/20110601180610.684557757@duck.linux-mips.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-06-09 15:01:39 +02:00
Benjamin Herrenschmidt d660474e84 Merge remote branch 'kumar/merge' into merge 2011-06-09 14:46:32 +10:00
Benjamin Herrenschmidt ef3b4f8cc2 pci/of: Consolidate pci_bus_to_OF_node()
The generic code always get the device-node in the right place now
so a single implementation will work for all archs

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Michal Simek <monstr@monstr.eu>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-06-08 09:08:57 +10:00
Benjamin Herrenschmidt 64099d981c pci/of: Consolidate pci_device_to_OF_node()
All archs do more or less the same thing now, move it into
a single generic place.

I chose pci.h rather than of_pci.h to avoid having to change
all call-sites to include the later.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Michal Simek <monstr@monstr.eu>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-06-08 09:08:43 +10:00
Benjamin Herrenschmidt 98d9f30c82 pci/of: Match PCI devices to OF nodes dynamically
powerpc has two different ways of matching PCI devices to their
corresponding OF node (if any) for historical reasons. The ppc64 one
does a scan looking for matching bus/dev/fn, while the ppc32 one does a
scan looking only for matching dev/fn on each level in order to be
agnostic to busses being renumbered (which Linux does on some
platforms).

This removes both and instead moves the matching code to the PCI core
itself. It's the most logical place to do it: when a pci_dev is created,
we know the parent and thus can do a single level scan for the matching
device_node (if any).

The benefit is that all archs now get the matching for free. There's one
hook the arch might want to provide to match a PHB bus to its device
node. A default weak implementation is provided that looks for the
parent device device node, but it's not entirely reliable on powerpc for
various reasons so powerpc provides its own.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Michal Simek <monstr@monstr.eu>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-06-08 09:08:17 +10:00
Kumar Gala a3623239ad powerpc/fsl_rio: Fix compile error when CONFIG_FSL_RIO not set
arch/powerpc/kernel/built-in.o: In function `machine_check_e500mc':
arch/powerpc/kernel/traps.c:429: undefined reference to `fsl_rio_mcheck_exception'
arch/powerpc/kernel/built-in.o: In function `machine_check_e500':
arch/powerpc/kernel/traps.c:519: undefined reference to `fsl_rio_mcheck_exception'
make: *** [.tmp_vmlinux1] Error 1

Reported-by: Timur Tabi <timur@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-06-02 15:29:08 -05:00
Linus Torvalds 571503e100 Merge branch 'setns'
* setns:
  ns: Wire up the setns system call

Done as a merge to make it easier to fix up conflicts in arm due to
addition of sendmmsg system call
2011-05-28 10:51:01 -07:00
Eric W. Biederman 7b21fddd08 ns: Wire up the setns system call
32bit and 64bit on x86 are tested and working.  The rest I have looked
at closely and I can't find any problems.

setns is an easy system call to wire up.  It just takes two ints so I
don't expect any weird architecture porting problems.

While doing this I have noticed that we have some architectures that are
very slow to get new system calls.  cris seems to be the slowest where
the last system calls wired up were preadv and pwritev.  avr32 is weird
in that recvmmsg was wired up but never declared in unistd.h.  frv is
behind with perf_event_open being the last syscall wired up.  On h8300
the last system call wired up was epoll_wait.  On m32r the last system
call wired up was fallocate.  mn10300 has recvmmsg as the last system
call wired up.  The rest seem to at least have syncfs wired up which was
new in the 2.6.39.

v2: Most of the architecture support added by Daniel Lezcano <dlezcano@fr.ibm.com>
v3: ported to v2.6.36-rc4 by: Eric W. Biederman <ebiederm@xmission.com>
v4: Moved wiring up of the system call to another patch
v5: ported to v2.6.39-rc6
v6: rebased onto parisc-next and net-next to avoid syscall  conflicts.
v7: ported to Linus's latest post 2.6.39 tree.

>  arch/blackfin/include/asm/unistd.h     |    3 ++-
>  arch/blackfin/mach-common/entry.S      |    1 +
Acked-by: Mike Frysinger <vapier@gentoo.org>

Oh - ia64 wiring looks good.
Acked-by: Tony Luck <tony.luck@intel.com>

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-28 10:48:39 -07:00
Linus Torvalds f23a5e1405 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
  PM: Fix PM QOS's user mode interface to work with ASCII input
  PM / Hibernate: Update kerneldoc comments in hibernate.c
  PM / Hibernate: Remove arch_prepare_suspend()
  PM / Hibernate: Update some comments in core hibernate code
2011-05-27 14:27:34 -07:00
Benjamin Herrenschmidt 9693ebd481 Merge remote branch 'kumar/merge' into merge 2011-05-27 09:58:22 +10:00
Milton Miller 7ef71d753e powerpc/cell: Use common smp ipi actions
The cell iic interrupt controller has enough software caused interrupts
to use a unique interrupt for each of the 4 messages powerpc uses.
This means each interrupt gets its own irq action/data combination.

Use the seperate, optimized, arch common ipi action functions
registered via the helper smp_request_message_ipi instead passing the
message as action data to a single action that then demultipexes to
the required acton via a switch statement.

smp_request_message_ipi will register the action as IRQF_PER_CPU
and IRQF_DISABLED, and WARN if the allocation fails for some reason,
so no need to print on that failure.  It will return positive if
the message will not be used by the kernel, in which case we can
free the virq.

In addition to elimiating inefficient code, this also corrects the
error that a kernel built with kexec but without a debugger would
not register the ipi for kdump to notify the other cpus of a crash.

This also restores the debugger action to be static to kernel/smp.c.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-26 13:38:58 +10:00
Brian King ca1931507f powerpc/pseries: Update MAX_HCALL_OPCODE to reflect page coalescing
When page coalescing support was added recently, the MAX_HCALL_OPCODE
define was not updated for the newly added H_GET_MPP_X hcall.

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-26 13:38:57 +10:00
Ian Munsie 02424d8966 powerpc/ftrace: Implement raw syscall tracepoints on PowerPC
This patch implements the raw syscall tracepoints on PowerPC and exports
them for ftrace syscalls to use.

To minimise reworking existing code, I slightly re-ordered the thread
info flags such that the new TIF_SYSCALL_TRACEPOINT bit would still fit
within the 16 bits of the andi. instruction's UI field. The instructions
in question are in /arch/powerpc/kernel/entry_{32,64}.S to and the
_TIF_SYSCALL_T_OR_A with the thread flags to see if system call tracing
is enabled.

In the case of 64bit PowerPC, arch_syscall_addr and
arch_syscall_match_sym_name are overridden to allow ftrace syscalls to
work given the unusual system call table structure and symbol names that
start with a period.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-26 13:38:57 +10:00
Peter Zijlstra 2672391169 mm, powerpc: move the RCU page-table freeing into generic code
In case other architectures require RCU freed page-tables to implement
gup_fast() and software filled hashes and similar things, provide the
means to do so by moving the logic into generic code.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Requested-by: David Miller <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Tony Luck <tony.luck@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:16 -07:00
Peter Zijlstra d6bf29b44d powerpc: mmu_gather rework
Fix up powerpc to the new mmu_gather stuff.

PPC has an extra batching queue to RCU free the actual pagetable
allocations, use the ARCH extentions for that for now.

For the ppc64_tlb_batch, which tracks the vaddrs to unhash from the
hardware hash-table, keep using per-cpu arrays but flush on context switch
and use a TLF bit to track the lazy_mmu state.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Miller <davem@davemloft.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Tony Luck <tony.luck@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:13 -07:00
Rafael J. Wysocki 354258011e PM / Hibernate: Remove arch_prepare_suspend()
All architectures supporting hibernation define
arch_prepare_suspend() as an empty function, so remove it.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-05-24 23:35:55 +02:00
Linus Torvalds 57d19e80f4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
  b43: fix comment typo reqest -> request
  Haavard Skinnemoen has left Atmel
  cris: typo in mach-fs Makefile
  Kconfig: fix copy/paste-ism for dell-wmi-aio driver
  doc: timers-howto: fix a typo ("unsgined")
  perf: Only include annotate.h once in tools/perf/util/ui/browsers/annotate.c
  md, raid5: Fix spelling error in comment ('Ofcourse' --> 'Of course').
  treewide: fix a few typos in comments
  regulator: change debug statement be consistent with the style of the rest
  Revert "arm: mach-u300/gpio: Fix mem_region resource size miscalculations"
  audit: acquire creds selectively to reduce atomic op overhead
  rtlwifi: don't touch with treewide double semicolon removal
  treewide: cleanup continuations and remove logging message whitespace
  ath9k_hw: don't touch with treewide double semicolon removal
  include/linux/leds-regulator.h: fix syntax in example code
  tty: fix typo in descripton of tty_termios_encode_baud_rate
  xtensa: remove obsolete BKL kernel option from defconfig
  m68k: fix comment typo 'occcured'
  arch:Kconfig.locks Remove unused config option.
  treewide: remove extra semicolons
  ...
2011-05-23 09:12:26 -07:00
Linus Torvalds f4b10bc60a Merge branch 'kvm-updates/2.6.40' of git://git.kernel.org/pub/scm/virt/kvm/kvm
* 'kvm-updates/2.6.40' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (131 commits)
  KVM: MMU: Use ptep_user for cmpxchg_gpte()
  KVM: Fix kvm mmu_notifier initialization order
  KVM: Add documentation for KVM_CAP_NR_VCPUS
  KVM: make guest mode entry to be rcu quiescent state
  KVM: x86 emulator: Make jmp far emulation into a separate function
  KVM: x86 emulator: Rename emulate_grpX() to em_grpX()
  KVM: x86 emulator: Remove unused arg from emulate_pop()
  KVM: x86 emulator: Remove unused arg from writeback()
  KVM: x86 emulator: Remove unused arg from read_descriptor()
  KVM: x86 emulator: Remove unused arg from seg_override()
  KVM: Validate userspace_addr of memslot when registered
  KVM: MMU: Clean up gpte reading with copy_from_user()
  KVM: PPC: booke: add sregs support
  KVM: PPC: booke: save/restore VRSAVE (a.k.a. USPRG0)
  KVM: PPC: use ticks, not usecs, for exit timing
  KVM: PPC: fix exit accounting for SPRs, tlbwe, tlbsx
  KVM: PPC: e500: emulate SVR
  KVM: VMX: Cache vmcs segment fields
  KVM: x86 emulator: consolidate segment accessors
  KVM: VMX: Avoid reading %rip unnecessarily when handling exceptions
  ...
2011-05-23 08:42:08 -07:00
Scott Wood 5ce941ee42 KVM: PPC: booke: add sregs support
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-05-22 08:47:53 -04:00
Scott Wood eab176722f KVM: PPC: booke: save/restore VRSAVE (a.k.a. USPRG0)
Linux doesn't use USPRG0 (now renamed VRSAVE in the architecture, even
when Altivec isn't involved), but a guest might.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-05-22 08:47:50 -04:00
Scott Wood 90d34b0e45 KVM: PPC: e500: emulate SVR
Return the actual host SVR for now, as we already do for PVR.  Eventually
we may support Qemu overriding PVR/SVR if the situation is appropriate,
once we implement KVM_SET_SREGS on e500.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-05-22 08:47:46 -04:00
Linus Torvalds 06f4e926d2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1446 commits)
  macvlan: fix panic if lowerdev in a bond
  tg3: Add braces around 5906 workaround.
  tg3: Fix NETIF_F_LOOPBACK error
  macvlan: remove one synchronize_rcu() call
  networking: NET_CLS_ROUTE4 depends on INET
  irda: Fix error propagation in ircomm_lmp_connect_response()
  irda: Kill set but unused variable 'bytes' in irlan_check_command_param()
  irda: Kill set but unused variable 'clen' in ircomm_connect_indication()
  rxrpc: Fix set but unused variable 'usage' in rxrpc_get_transport()
  be2net: Kill set but unused variable 'req' in lancer_fw_download()
  irda: Kill set but unused vars 'saddr' and 'daddr' in irlan_provider_connect_indication()
  atl1c: atl1c_resume() is only used when CONFIG_PM_SLEEP is defined.
  rxrpc: Fix set but unused variable 'usage' in rxrpc_get_peer().
  rxrpc: Kill set but unused variable 'local' in rxrpc_UDP_error_handler()
  rxrpc: Kill set but unused variable 'sp' in rxrpc_process_connection()
  rxrpc: Kill set but unused variable 'sp' in rxrpc_rotate_tx_window()
  pkt_sched: Kill set but unused variable 'protocol' in tc_classify()
  isdn: capi: Use pr_debug() instead of ifdefs.
  tg3: Update version to 3.119
  tg3: Apply rx_discards fix to 5719/5720
  ...

Fix up trivial conflicts in arch/x86/Kconfig and net/mac80211/agg-tx.c
as per Davem.
2011-05-20 13:43:21 -07:00
Shaohui Xie cce1f106c6 powerpc/fsl_rio: move machine_check handler
Add support for machine_check support into machine_check_e500 and
machine_check_e500mc.

Signed-off-by: Shaohui Xie <b21989@freescale.com>
Cc: Li Yang <leoli@freescale.com>
Cc: Roy Zang <tie-fei.zang@freescale.com>
Cc: Alexandre Bounine <alexandre.bounine@idt.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-05-20 08:46:57 -05:00
Shengzhou Liu d08e44570e powerpc/fsl_lbc: Add workaround for ELBC-A001 erratum
Simultaneous FCM and GPCM or UPM operation may erroneously trigger
bus monitor timeout.

Set the local bus monitor timeout value to the maximum by setting
LBCR[BMT] = 0 and LBCR[BMTPS] = 0xF.

Signed-off-by: Shengzhou Liu <Shengzhou.Liu@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-05-20 08:46:49 -05:00
Benjamin Herrenschmidt 880102e785 Merge remote branch 'origin/master' into merge
Manual merge of arch/powerpc/kernel/smp.c and add missing scheduler_ipi()
call to arch/powerpc/platforms/cell/interrupt.c

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-20 15:36:52 +10:00
Benjamin Herrenschmidt 3d07f0e83d Merge remote branch 'kumar/next' into next 2011-05-20 13:43:47 +10:00
Paul Mackerras 593adf317c powerpc/kvm: Fix the build for 32-bit Book 3S (classic) processors
Commits a5d4f3ad3a ("powerpc: Base support for exceptions using
HSRR0/1") and 673b189a2e ("powerpc: Always use SPRN_SPRG_HSCRATCH0
when running in HV mode") cause compile and link errors for 32-bit
classic Book 3S processors when KVM is enabled.  This fixes these
errors.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-20 13:43:41 +10:00
Benjamin Herrenschmidt 4c8440666b Merge branch 'merge' into next 2011-05-19 17:00:06 +10:00
Scott Wood ea94187fac powerpc/mpic: add the mpic global timer support
Add support for MPIC timers as requestable interrupt sources.

Based on http://patchwork.ozlabs.org/patch/20941/ by Dave Liu.

Signed-off-by: Dave Liu <daveliu@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-05-19 01:14:28 -05:00
Scott Wood 22d168ce60 powerpc/mpic: parse 4-cell intspec types other than zero
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-05-19 01:14:27 -05:00
Scott Wood 3a6e9bd7f6 powerpc/e5500: set non-base IVORs
Without this, we attempt to use doorbells for IPIs, and end up
branching to some bad address.  Plus, even for the exceptions
we don't implement, it's good to handle it and get a message out.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-05-19 00:36:43 -05:00
Kumar Gala d36b4c4f3c powerpc/fsl-booke64: Add support for Debug Level exception handler
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-05-19 00:36:42 -05:00
Kumar Gala 134c428e5a Merge remote branch 'benh/merge' into benh-next 2011-05-19 00:36:21 -05:00
Milton Miller 1e8c23013e powerpc: Remove virq_to_host
The only references to the irq_map[].host field are internal to
arch/powerpc/kernel/irq.c

Signed-off-by: Milton Miller <miltonm@bga.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:32:01 +10:00
Milton Miller 3ee62d365b powerpc: Add virq_is_host to reduce virq_to_host usage
Some irq_host implementations are using virq_to_host to check if
they are the irq_host for a virtual irq.  To allow us to make space
versus time tradeoffs, replace this usage with an assertive
virq_is_host that confirms or denies the irq is associated with the
given irq_host.

Signed-off-by: Milton Miller <miltonm@bga.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:31:59 +10:00
Milton Miller da05198002 powerpc: Remove irq_host_ops->remap hook
It was called from irq_create_mapping if that was called for a host
and hwirq that was previously mapped, "to update the flags".  But the
only implementation was in beat_interrupt and all it did was repeat a
hypervisor call without error checking that was performed with error
checking at the beginning of the map hook.  In addition, the comment on
the beat remap hook says it will only called once for a given mapping,
which would apply to map not remap.

All flags should be known by the time the match hook is called, before
we call the map hook.  Removing this mostly unused hook will simpify
the requirements of irq_domain concept.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:31:53 +10:00
Milton Miller 1ece355b68 powerpc: Add kconfig for muxed smp ipi support
Compile the new smp ipi mux and demux code only if a platform
will make use of it.  The new config is selected as required.

The new cause_ipi smp op is only available conditionally to point out
configs where the select is required; this makes setting the op an
immediate fail instead of a deferred unresolved symbol at link.

This also creates a new config for power surge powermac upgrade support
that can be disabled in expert mode but is default on.

I also removed the depends / default y on CONFIG_XICS since it is selected
by PSERIES.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:31:05 +10:00
Milton Miller 23d72bfd8f powerpc: Consolidate ipi message mux and demux
Consolidate the mux and demux of ipi messages into smp.c and call
a new smp_ops callback to actually trigger the ipi.

The powerpc architecture code is optimised for having 4 distinct
ipi triggers, which are mapped to 4 distinct messages (ipi many, ipi
single, scheduler ipi, and enter debugger).  However, several interrupt
controllers only provide a single software triggered interrupt that
can be delivered to each cpu.  To resolve this limitation, each smp_ops
implementation created a per-cpu variable that is manipulated with atomic
bitops.  Since these lines will be contended they are optimialy marked as
shared_aligned and take a full cache line for each cpu.  Distro kernels
may have 2 or 3 of these in their config, each taking per-cpu space
even though at most one will be in use.

This consolidation removes smp_message_recv and replaces the single call
actions cases with direct calls from the common message recognition loop.
The complicated debugger ipi case with its muxed crash handling code is
moved to debug_ipi_action which is now called from the demux code (instead
of the multi-message action calling smp_message_recv).

I put a call to reschedule_action to increase the likelyhood of correctly
merging the anticipated scheduler_ipi() hook coming from the scheduler
tree; that single required call can be inlined later.

The actual message decode is a copy of the old pseries xics code with its
memory barriers and cache line spacing, augmented with a per-cpu unsigned
long based on the book-e doorbell code.  The optional data is set via a
callback from the implementation and is passed to the new cause-ipi hook
along with the logical cpu number.  While currently only the doorbell
implemntation uses this data it should be almost zero cost to retrieve and
pass it -- it adds a single register load for the argument from the same
cache line to which we just completed a store and the register is dead
on return from the call.  I extended the data element from unsigned int
to unsigned long in case some other code wanted to associate a pointer.

The doorbell check_self is replaced by a call to smp_muxed_ipi_resend,
conditioned on the CPU_DBELL feature.  The ifdef guard could be relaxed
to CONFIG_SMP but I left it with BOOKE for now.

Also, the doorbell interrupt vector for book-e was not calling irq_enter
and irq_exit, which throws off cpu accounting and causes code to not
realize it is running in interrupt context.  Add the missing calls.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:31:03 +10:00
Milton Miller 17f9c8a73b powerpc: Move smp_ops_t from machdep.h to smp.h
I can't see any reason these functions are needed by machdep.h
and they are all hidden by CONFIG_SMP with no UP alternative.

Also move the declarations for the fallback timebase ops, which
are used to fill in the smp ops.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:31:01 +10:00
Milton Miller a56555e573 powerpc: Remove alloc_maybe_bootmem for zalloc version
Replace all remaining callers of alloc_maybe_bootmem with
zalloc_maybe_bootmem.   The callsite in pci_dn is followed with a
memset to clear the memory, and not zeroing at the other callsites
in the celleb fake pci code could lead to following uninitialized
memory as pointers or even freeing said pointers on error paths.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:30:57 +10:00
Milton Miller f1072939b6 powerpc: Remove checks for MSG_ALL and MSG_ALL_BUT_SELF
Now that smp_ops->smp_message_pass is always called with an (online) cpu
number for the target remove the checks for MSG_ALL and MSG_ALL_BUT_SELF.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:46 +10:00
Milton Miller e047637132 powerpc: Remove call sites of MSG_ALL_BUT_SELF
The only user of MSG_ALL_BUT_SELF in the whole kernel tree is powerpc,
and it only uses it to start the debugger. Both debuggers always call
smp_send_debugger_break with MSG_ALL_BUT_SELF, and only mpic can do
anything more optimal than a loop over all online cpus, but all message
passing implementations have to code for this special delivery target.

Convert smp_send_debugger_break to take void and loop calling the smp_ops
message_pass function for each of the other cpus in the online cpumask.

Use raw_smp_processor_id() because we are either entering the debugger
or trying to start kdump and the additional warning it not useful were
it to trigger.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:46 +10:00
Anton Blanchard 40f1ce7fb7 powerpc: Remove ioremap_flags
We have a confusing number of ioremap functions. Make things just a
bit simpler by merging ioremap_flags and ioremap_prot.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:43 +10:00
Anton Blanchard be135f4089 powerpc: Add ioremap_wc
Add ioremap_wc so drivers can request write combining on kernel
mappings.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:42 +10:00
Anton Blanchard d988f0e3f8 powerpc: Simplify 4k/64k copy_page logic
To make it easier to add optimised versions of copy_page, remove
the 4kB loop for 64kB pages and just do all the work in copy_page.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:42 +10:00
Stratos Psomadakis 2a2c29c1a5 powerpc/mm: Fix compiler warning in pgtable-ppc64.h [-Wunused-but-set-variable]
The variable 'old' is set but not used in the wrprotect functions in
arch/powerpc/include/asm/pgtable-ppc64.h, which can trigger a compiler warning.

Remove the variable, since it's not used anyway.

Signed-off-by: Stratos Psomadakis <psomas@ece.ntua.gr>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:41 +10:00
Nishanth Aravamudan af442a1baa powerpc: Ensure dtl buffers do not cross 4k boundary
Future releases of fimrware will enforce a requirement that DTL buffers
do not cross a 4k boundary. Commit
127493d5dc satisfies this requirement for
CONFIG_VIRT_CPU_ACCOUNTING=y kernels, but if !CONFIG_VIRT_CPU_ACCOUNTING
&& CONFIG_DTL=y, the current code will fail at dtl registration time.
Fix this by making the kmem cache from
127493d5dc visible outside of setup.c and
using the same cache in both dtl.c and setup.c. This requires a bit of
reorganization to ensure ordering of the kmem cache and buffer
allocations.

Note: Since firmware now limits the size of the buffer, I made
dtl_buf_entries read-only in debugfs.

Tested with upcoming firmware with the 4 combinations of
CONFIG_VIRT_CPU_ACCOUNTING and CONFIG_DTL.

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:41 +10:00
Rafael J. Wysocki 2d2a9163bd Merge branch 'syscore' into for-linus
* syscore:
  PM: Remove sysdev suspend, resume and shutdown operations
  PM / PowerPC: Use struct syscore_ops instead of sysdevs for PM
  PM / UNICORE32: Use struct syscore_ops instead of sysdevs for PM
  PM / AVR32: Use struct syscore_ops instead of sysdevs for PM
  PM / Blackfin: Use struct syscore_ops instead of sysdevs for PM
  ARM / Samsung: Use struct syscore_ops for "core" power management
  ARM / PXA: Use struct syscore_ops for "core" power management
  ARM / SA1100: Use struct syscore_ops for "core" power management
  ARM / Integrator: Use struct syscore_ops for core PM
  ARM / OMAP: Use struct syscore_ops for "core" power management
  ARM: Use struct syscore_ops instead of sysdevs for PM in common code
2011-05-17 23:23:40 +02:00
Rafael J. Wysocki f5a592f7d7 PM / PowerPC: Use struct syscore_ops instead of sysdevs for PM
Make some PowerPC architecture's code use struct syscore_ops
objects for power management instead of sysdev classes and sysdevs.

This simplifies the code and reduces the kernel's memory footprint.
It also is necessary for removing sysdevs from the kernel entirely in
the future.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-05-11 21:37:15 +02:00
Bharat Bhushan 09000adb86 KVM: PPC: Fix issue clearing exit timing counters
Following dump is observed on host when clearing the exit timing counters

[root@p1021mds kvm]# echo -n 'c' > vm1200_vcpu0_timing
INFO: task echo:1276 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
echo          D 0ff5bf94     0  1276   1190 0x00000000
Call Trace:
[c2157e40] [c0007908] __switch_to+0x9c/0xc4
[c2157e50] [c040293c] schedule+0x1b4/0x3bc
[c2157e90] [c04032dc] __mutex_lock_slowpath+0x74/0xc0
[c2157ec0] [c00369e4] kvmppc_init_timing_stats+0x20/0xb8
[c2157ed0] [c0036b00] kvmppc_exit_timing_write+0x84/0x98
[c2157ef0] [c00b9f90] vfs_write+0xc0/0x16c
[c2157f10] [c00ba284] sys_write+0x4c/0x90
[c2157f40] [c000e320] ret_from_syscall+0x0/0x3c

        The vcpu->mutex is used by kvm_ioctl_* (KVM_RUN etc) and same was
used when clearing the stats (in kvmppc_init_timing_stats()). What happens
is that when the guest is idle then it held the vcpu->mutx. While the
exiting timing process waits for guest to release the vcpu->mutex and
a hang state is reached.

        Now using seprate lock for exit timing stats.

Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-11 07:57:04 -04:00
Justin P. Mattock 70f23fd66b treewide: fix a few typos in comments
- kenrel -> kernel
- whetehr -> whether
- ttt -> tt
- sss -> ss

Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-05-10 10:16:21 +02:00
Jack Miller a0496d450a powerpc: Add early debug for WSP platforms
Signed-off-by: Jack Miller <jack@codezen.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-06 13:32:41 +10:00
David Gibson a1d0d98daf powerpc: Add WSP platform
Add a platform for the Wire Speed Processor, based on the PPC A2.

This includes code for the ICS & OPB interrupt controllers, as well
as a SCOM backend, and SCOM based cpu bringup.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Jack Miller <jack@codezen.org>
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-06 13:32:35 +10:00
Benjamin Herrenschmidt 40bd587a88 powerpc: Rename slb0_limit() to safe_stack_limit() and add Book3E support
slb0_limit() wasn't a very descriptive name. This changes it along with
a comment explaining what it's used for, and provides a 64-bit BookE
implementation.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-06 13:32:24 +10:00
Tseng-Hui (Frank) Lin 77eafe101a powerpc/pseries: Add support for IO event interrupts
This patch adds support for handling IO Event interrupts which come
through at the /event-sources/ibm,io-events device tree node.

The interrupts come through ibm,io-events device tree node are generated
by the firmware to report IO events. The firmware uses the same interrupt
to report multiple types of events for multiple devices. Each device may
have its own event handler. This patch implements a plateform interrupt
handler that is triggered by the IO event interrupts come through
ibm,io-events device tree node, pull in the IO events from RTAS and call
device event handlers registered in the notifier list.

Device event handlers are expected to use atomic_notifier_chain_register()
and atomic_notifier_chain_unregister() to register/unregister their
event handler in pseries_ioei_notifier_list list with IO event interrupt.
Device event handlers are responsible to identify if the event belongs
to the device event handler. The device event handle should return NOTIFY_OK
after the event is handled if the event belongs to the device event handler,
or NOTIFY_DONE otherwise.

Signed-off-by: Tseng-Hui (Frank) Lin <thlin@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-06 13:19:01 +10:00
Tseng-Hui (Frank) Lin 4cb4638079 powerpc/pseries: Add RTAS event log v6 definition
This patch adds definitions of non-IBM specific v6 extended log
definitions to rtas.h.

Signed-off-by: Tseng-Hui (Frank) Lin <tsenglin@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-06 13:18:59 +10:00
Anton Blanchard 228e548e60 net: Add sendmmsg socket system call
This patch adds a multiple message send syscall and is the send
version of the existing recvmmsg syscall. This is heavily
based on the patch by Arnaldo that added recvmmsg.

I wrote a microbenchmark to test the performance gains of using
this new syscall:

http://ozlabs.org/~anton/junkcode/sendmmsg_test.c

The test was run on a ppc64 box with a 10 Gbit network card. The
benchmark can send both UDP and RAW ethernet packets.

64B UDP

batch   pkts/sec
1       804570
2       872800 (+ 8 %)
4       916556 (+14 %)
8       939712 (+17 %)
16      952688 (+18 %)
32      956448 (+19 %)
64      964800 (+20 %)

64B raw socket

batch   pkts/sec
1       1201449
2       1350028 (+12 %)
4       1461416 (+22 %)
8       1513080 (+26 %)
16      1541216 (+28 %)
32      1553440 (+29 %)
64      1557888 (+30 %)

We see a 20% improvement in throughput on UDP send and 30%
on raw socket send.

[ Add sparc syscall entries. -DaveM ]

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-05 11:10:14 -07:00
Brian King 9ee820fa00 powerpc/pseries: Add page coalescing support
Adds support for page coalescing, which is a feature on IBM Power servers
which allows for coalescing identical pages between logical partitions.
Hint text pages as coalesce candidates, since they are the most likely
pages to be able to be coalesced between partitions. This patch also
exports some page coalescing statistics available from firmware via
lparcfg.

[BenH: Moved a couple of things around to fix compile problems]

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-04 16:02:21 +10:00
KOSAKI Motohiro 104699c0ab powerpc: Convert old cpumask API into new one
Adapt new API.

Almost change is trivial. Most important change is the below line
because we plan to change task->cpus_allowed implementation.

-       ctx->cpus_allowed = current->cpus_allowed;

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-04 15:22:59 +10:00
Paul Mackerras 48404f2e95 powerpc: Save Come-From Address Register (CFAR) in exception frame
Recent 64-bit server processors (POWER6 and POWER7) have a "Come-From
Address Register" (CFAR), that records the address of the most recent
branch or rfid (return from interrupt) instruction for debugging purposes.

This saves the value of the CFAR in the exception entry code and stores
it in the exception frame.  We also make xmon print the CFAR value in
its register dump code.

Rather than extend the pt_regs struct at this time, we steal the orig_gpr3
field, which is only used for system calls, and use it for the CFAR value
for all exceptions/interrupts other than system calls.  This means we
don't save the CFAR on system calls, which is not a great problem since
system calls tend not to happen unexpectedly, and also avoids adding the
overhead of reading the CFAR to the system call entry path.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-04 15:22:09 +10:00
Paul Mackerras 1977b50212 powerpc: Save register r9-r13 values accurately on interrupt with bad stack
When we take an interrupt or exception from kernel mode and the stack
pointer is obviously not a kernel address (i.e. the top bit is 0), we
switch to an emergency stack, save register values and panic.  However,
on 64-bit server machines, we don't actually save the values of r9 - r13
at the time of the interrupt, but rather values corrupted by the
exception entry code for r12-r13, and nothing at all for r9-r11.

This fixes it by passing a pointer to the register save area in the paca
through to the bad_stack code in r3.  The register values are saved in
one of the paca register save areas (depending on which exception this
is).  Using the pointer in r3, the bad_stack code now retrieves the
saved values of r9 - r13 and stores them in the exception frame on the
emergency stack.  This also stores the normal exception frame marker
("regshere") in the exception frame.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-04 15:19:27 +10:00
Tseng-Hui (Frank) Lin 851d2e2fe8 powerpc: Add Initiate Coprocessor Store Word (icswx) support
Icswx is a PowerPC instruction to send data to a co-processor. On Book-S
processors the LPAR_ID and process ID (PID) of the owning process are
registered in the window context of the co-processor at initialization
time. When the icswx instruction is executed the L2 generates a cop-reg
transaction on PowerBus. The transaction has no address and the
processor does not perform an MMU access to authenticate the transaction.
The co-processor compares the LPAR_ID and the PID included in the
transaction and the LPAR_ID and PID held in the window context to
determine if the process is authorized to generate the transaction.

The OS needs to assign a 16-bit PID for the process. This cop-PID needs
to be updated during context switch. The cop-PID needs to be destroyed
when the context is destroyed.

Signed-off-by: Sonny Rao <sonnyrao@linux.vnet.ibm.com>
Signed-off-by: Tseng-Hui (Frank) Lin <thlin@linux.vnet.ibm.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-04 15:19:26 +10:00
Michael Neuling a32e252f7c powerpc: Use new CPU feature bit to select 2.06 tlbie
This removes MMU_FTR_TLBIE_206 as we can now use CPU_FTR_HVMODE_206.  It
also changes the logic to select which tlbie to use to be based on this
new CPU feature bit.

This also duplicates the ASM_FTR_IF/SET/CLR defines for CPU features
(copied from MMU features).

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-04 15:19:26 +10:00
Grant Likely 476eb49126 powerpc/irq: Stop exporting irq_map
First step in eliminating irq_map[] table entirely

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-04 15:02:15 +10:00
Linus Torvalds 5933f2ae35 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (47 commits)
  sysctl: net: call unregister_net_sysctl_table where needed
  Revert: veth: remove unneeded ifname code from veth_newlink()
  smsc95xx: fix reset check
  tg3: Fix failure to enable WoL by default when possible
  networking: inappropriate ioctl operation should return ENOTTY
  amd8111e: trivial typo spelling: Negotitate -> Negotiate
  ipv4: don't spam dmesg with "Using LC-trie" messages
  af_unix: Only allow recv on connected seqpacket sockets.
  mii: add support of pause frames in mii_get_an
  net: ftmac100: fix scheduling while atomic during PHY link status change
  usbnet: Transfer of maintainership
  usbnet: add support for some Huawei modems with cdc-ether ports
  bnx2: cancel timer on device removal
  iwl4965: fix "Received BA when not expected"
  iwlagn: fix "Received BA when not expected"
  dsa/mv88e6131: fix unknown multicast/broadcast forwarding on mv88e6085
  usbnet: Resubmit interrupt URB if device is open
  iwl4965: fix "TX Power requested while scanning"
  iwlegacy: led stay solid on when no traffic
  b43: trivial: update module info about ucode16_mimo firmware
  ...
2011-05-02 18:00:43 -07:00
Lucas De Marchi e9c549998d Revert wrong fixes for common misspellings
These changes were incorrectly fixed by codespell. They were now
manually corrected.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-04-26 23:31:11 -07:00
Matt Evans 44ae3ab335 powerpc: Free up some CPU feature bits by moving out MMU-related features
Some of the 64bit PPC CPU features are MMU-related, so this patch moves
them to MMU_FTR_ bits.  All cpu_has_feature()-style tests are moved to
mmu_has_feature(), and seven feature bits are freed as a result.

Signed-off-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 14:18:52 +10:00
Michael Ellerman a7b8ad4058 powerpc/book3e: Fix extlb size
The calculation of the size for the exception save area of the TLB
miss handler is wrong, luckily it's too big not too small.

Rework it to make it a bit clearer, and also correct. We want 3 save
areas, each EX_TLB_SIZE _bytes_.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 14:18:48 +10:00
Michael Ellerman 9d4a2925c2 powerpc: Add MSR_64BIT
The MSR bit which indicates 64-bit-ness is different between server and
booke, so add a #define which gives you the right mask regardless.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 14:18:43 +10:00
Michael Ellerman d1109b7529 powerpc/pci: Make IO workarounds init implicit when first bus is registered
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 14:18:31 +10:00
Michael Ellerman 3cc30d0726 powerpc/pci: Move IO workarounds to the common kernel dir
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 14:18:29 +10:00
Michael Ellerman 21176fed25 powerpc/pci: Split IO vs MMIO indirect access hooks
The goal is to avoid adding overhead to MMIO when only PIO is needed

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 14:18:27 +10:00
Alexey Kardashevskiy efcac6589a powerpc: Per process DSCR + some fixes (try#4)
The DSCR (aka Data Stream Control Register) is supported on some
server PowerPC chips and allow some control over the prefetch
of data streams.

This patch allows the value to be specified per thread by emulating
the corresponding mfspr and mtspr instructions. Children of such
threads inherit the value. Other threads use a default value that
can be specified in sysfs - /sys/devices/system/cpu/dscr_default.

If a thread starts with non default value in the sysfs entry,
all children threads inherit this non default value even if
the sysfs value is changed later.

Signed-off-by: Alexey Kardashevskiy <aik@au1.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 14:18:19 +10:00
Jack Miller f0aae3238f powerpc/book3e: Flush IPROT protected TLB entries leftover by firmware
When we set up the TLB for ourselves on Book3E, we need to flush out any
old mappings established by the firmware or bootloader.  At present we
attempt this with a tlbilx to flush everything, but this will leave behind
any entries with the IPROT bit set.

There are several good reason firmware might establish mappings with IPROT,
and in fact ePAPR compliant firmwares are required to establish their
initial mapped area with IPROT.

This patch, therefore adds more complex code to scan through the TLB upon
entry and flush away any entries that are not our own.

Signed-off-by: Jack Miller <jack@codezen.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 13:02:16 +10:00
Benjamin Herrenschmidt bd49178109 powerpc: Add TLB size detection for TYPE_3E MMUs
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 13:02:10 +10:00
Benjamin Herrenschmidt 76b4eda866 powerpc: Add A2 cpu support
Add the cputable entry, regs and setup & restore entries for
the PowerPC A2 core.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 13:02:02 +10:00
Andrea Galbusera e2a85aeceb powerpc: Fix multicast problem in fs_enet driver
mac-fec.c was setting individual UDP address registers instead of multicast
group address registers when joining a multicast group.
This prevented from correctly receiving UDP multicast packets.
According to datasheet, replaced hash_table_high and hash_table_low
with grp_hash_table_high and grp_hash_table_low respectively.
Also renamed hash_table_* with grp_hash_table_* in struct fec declaration
for 8xx: these registers are used only for multicast there.

Tested on a MPC5121 based board.
Build tested also against mpc866_ads_defconfig.

Signed-off-by: Andrea Galbusera <gizero@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-21 16:59:30 -07:00
Michael Ellerman 5ca1237601 powerpc/xics: Move irq_host matching into the ics backend
An upcoming new ics backend will need to implement different matching
semantics to the current ones, which are essentially the RTAS ics
backends. So move the current match into the RTAS backend, and allow
other ics backends to override.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 17:01:20 +10:00
Benjamin Herrenschmidt ab814b938d powerpc: Add SCOM infrastructure
SCOM is a side-band configuration bus implemented on some processors.
This code provides a way for code to map and operate on devices via
SCOM, while the details of how that is implemented is left up to a
SCOM "controller" in the platform code.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 17:01:19 +10:00
Michael Ellerman cd85257905 powerpc/xics: xics.h relies on linux/interrupt.h
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 17:01:19 +10:00
Benjamin Herrenschmidt 931e1241a2 powerpc/a2: Add some #defines for A2 specific instructions
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 17:01:19 +10:00
Michael Ellerman de30097476 powerpc/smp: smp_ops->kick_cpu() should be able to fail
When we start a cpu we use smp_ops->kick_cpu(), which currently
returns void, it should be able to fail. Convert it to return
int, and update all uses.

Convert all the current error cases to return -ENOENT, which is
what would eventually be returned by __cpu_up() currently when
it doesn't detect the cpu as coming up in time.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 17:01:18 +10:00
Michael Ellerman ee7a2aa3d3 powerpc/mm: Fix slice state initialization for Book3E
On Book3E, MMU_NO_CONTEXT != 0, but the slice_mm_new_context()
macro assumes that it is.  This means that the map of the
page sizes for each slice is always initialized to zeroes
(which happens to be 4k pages), rather than to the correct
default base page size value - which might be 64k.

This patch corrects the problem.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 16:59:20 +10:00
Michael Ellerman 5e8e7b404a powerpc/mm: Standardise on MMU_NO_CONTEXT
Use MMU_NO_CONTEXT as the initialiser for mm_context.id on
nohash and hash64.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 16:59:20 +10:00
Benjamin Herrenschmidt 948cf67c47 powerpc: Add NAP mode support on Power7 in HV mode
Wakeup comes from the system reset handler with a potential loss of
the non-hypervisor CPU state. We save the non-volatile state on the
stack and a pointer to it in the PACA, which the system reset handler
uses to restore things

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 11:03:24 +10:00
Benjamin Herrenschmidt 9d07bc841c powerpc: Properly handshake CPUs going out of boot spin loop
We need to wait a bit for them to have done their CPU setup
or we might end up with translation and EE on with different
LPCR values between threads

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 11:03:24 +10:00
Paul Mackerras 673b189a2e powerpc: Always use SPRN_SPRG_HSCRATCH0 when running in HV mode
This uses feature sections to arrange that we always use HSPRG1
as the scratch register in the interrupt entry code rather than
SPRG2 when we're running in hypervisor mode on POWER7.  This will
ensure that we don't trash the guest's SPRG2 when we are running
KVM guests.  To simplify the code, we define GET_SCRATCH0() and
SET_SCRATCH0() macros like the GET_PACA/SET_PACA macros.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 11:03:23 +10:00
Benjamin Herrenschmidt b3e6b5dfcf powerpc: More work to support HV exceptions
Rework exception macros a bit to split offset from vector and add
some basic support for HDEC, HDSI, HISI and a few more.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 11:03:23 +10:00
Benjamin Herrenschmidt a5d4f3ad3a powerpc: Base support for exceptions using HSRR0/1
Pass the register type to the prolog, also provides alternate "HV"
version of hardware interrupt (0x500) and adjust LPES accordingly

We tag those interrupts by setting bit 0x2 in the trap number

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 11:03:22 +10:00
Benjamin Herrenschmidt 2dd60d79e0 powerpc: In HV mode, use HSPRG0 for PACA
When running in Hypervisor mode (arch 2.06 or later), we store the PACA
in HSPRG0 instead of SPRG1. The architecture specifies that SPRGs may be
lost during a "nap" power management operation (though they aren't
currently on POWER7) and this enables use of SPRG1 by KVM guests.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 11:03:22 +10:00
Benjamin Herrenschmidt 24cc67de62 powerpc: Define CPU feature for Architected 2.06 HV mode
This bit indicates that we are operating in hypervisor mode on a CPU
compliant to architecture 2.06 or later (currently server only).

We set it on POWER7 and have a boot-time CPU setup function that
clears it if MSR:HV isn't set (booting under a hypervisor).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 11:03:22 +10:00
Benjamin Herrenschmidt 50fb8ebe7c powerpc: Add more Power7 specific definitions
This adds more SPR definitions used on newer processors when running
in hypervisor mode. Along with some other P7 specific bits and pieces

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 11:03:21 +10:00
Benjamin Herrenschmidt 0b05ac6e24 powerpc/xics: Rewrite XICS driver
This is a significant rework of the XICS driver, too significant to
conveniently break it up into a series of smaller patches to be honest.

The driver is moved to a more generic location to allow new platforms
to use it, and is broken up into separate ICP and ICS "backends". For
now we have the native and "hypervisor" ICP backends and one common
RTAS ICS backend.

The driver supports one ICP backend instanciation, and many ICS ones,
in order to accomodate future platforms with multiple possibly different
interrupt "sources" mechanisms.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 11:02:35 +10:00
Stefan Roese 09597cfe93 powerpc: Don't write protect kernel text with CONFIG_DYNAMIC_FTRACE enabled
This problem was noticed on an MPC855T platform. Ftrace did oops
when trying to write to the kernel text segment.

Many thanks to Joakim for finding the root cause of this problem.

Signed-off-by: Stefan Roese <sr@denx.de>
Cc: Joakim Tjernlund <joakim.tjernlund@transmode.se>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-18 13:08:21 +10:00
Benjamin Herrenschmidt 8f3dda75cb Merge remote branch 'kumar/merge' into merge 2011-04-18 12:09:37 +10:00
Scott Wood d51ad91535 powerpc/e500mc: Remove CPU_FTR_MAYBE_CAN_NAP/CPU_FTR_MAYBE_CAN_DOZE
e500mc does not support the HID0/MSR mechanism that is used by e500_idle
(and there are also issues with waking on certain types of interrupts).

Further, even if napping is never actually enabled, just having
CPU_FTR_CAN_NAP will cause machine_init() to overwrite the board's supplied
ppc_md.power_save().

We drop CPU_FTR_MAYBE_CAN_DOZE becuase we should use 'wait' instead on
e500mc.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-04-12 06:29:21 -05:00
Kumar Gala 11ed0db9f6 powerpc/book3e: Fix CPU feature handling on 64-bit e5500
The CPU_FTRS_POSSIBLE and CPU_FTRS_ALWAYS defines did not encompass
e5500 CPU features when built for 64-bit.  This causes issues with
cpu_has_feature() as it utilizes the POSSIBLE & ALWAYS defines as part
of its check.

Create a unique CPU_FTRS_E5500 (as its different from CPU_FTRS_E500MC),
created a new group for 64-bit Book3e based CPUs and add CPU_FTRS_E5500
to that group.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-04-12 06:29:21 -05:00
Linus Torvalds 42933bac11 Merge branch 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6
* 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6:
  Fix common misspellings
2011-04-07 11:14:49 -07:00
Benjamin Herrenschmidt 105765f451 powerpc/smp: Don't expose per-cpu "cpu_state" array
Instead, keep it static, expose an accessor and use that from
the PowerMac code. Avoids easy namespace collisions and will
make it easier to consolidate with other implementations.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-01 15:37:33 +11:00
Benjamin Herrenschmidt d72944457b powerpc/smp: Add a smp_ops->bringup_up() done callback
This allows us to stop abusing smp_ops->setup_cpu() for cleanup
tasks that have to take place after the initial boot time CPU
bringup.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-01 15:37:29 +11:00
Benjamin Herrenschmidt 62cc67b9df powerpc/pmac/smp: Properly NAP offlined CPU on G5
The current code soft-disables, and then goes to NAP mode which
turns interrupts on. That means that if an interrupt occurs, we
will hit the masked interrupt code path which isn't what we want,
as it will return with EE off, which will either get us out of
NAP mode, or fail to enter it (according to spec).

Instead, let's just rely on the fact that it is safe to take
decrementer interrupts on an offline CPU and leave interrupts
enabled. We can also get rid of the special case in asm for
power4_cpu_offline_powersave() and just use power4_idle().

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-01 15:37:25 +11:00
Benjamin Herrenschmidt 1c91cc5705 powerpc/pmac/smp: Rename fixup_irqs() to migrate_irqs() and use it on ppc32
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-01 15:37:18 +11:00
Benjamin Herrenschmidt fb49f864c3 powerpc/pmac/smp: Fix 32-bit PowerMac cpu_die
Use generic cpu_state, call idle_task_exit() properly, and
remove smp_core99_cpu_die() which isn't useful, the generic
function does the job just fine.
2011-04-01 15:37:16 +11:00
Benjamin Herrenschmidt 7a53a4fe70 powerpc/smp: Remove unused smp_ops->cpu_enable()
Remove the last remnants of cpu_enable(), everybody uses the normal
__cpu_up() path now

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-01 15:37:14 +11:00
Benjamin Herrenschmidt b527d07114 powerpc/smp: Remove unused generic_cpu_enable()
Nobody uses it, besides we should always use the normal __cpu_up
path anyways

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-01 15:37:12 +11:00
Benjamin Herrenschmidt fa3f82c8bb powerpc/smp: soft-replugged CPUs must go back to start_secondary
Various thing are torn down when a CPU is hot-unplugged. That CPU
is expected to go back to start_secondary when re-plugged to re
initialize everything, such as clock sources, maps, ...

Some implementations just return from cpu_die() callback
in the idle loop when the CPU is "re-plugged". This is not enough.

We fix it using a little asm trampoline which resets the stack
and calls back into start_secondary as if we were all fresh from
boot. The trampoline already existed on ppc64, but we add it for
ppc32

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-01 15:37:09 +11:00
Lucas De Marchi 25985edced Fix common misspellings
Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-31 11:26:23 -03:00
Linus Torvalds 6aba74f279 Merge branch 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  avr32: Fix missing irq namespace conversion
  powerpc: qe_ic: Rename get_irq_desc_data and get_irq_desc_chip
  genirq: Remove the now obsolete config options and select statements
  arm: versatile : Fix typo introduced in irq namespace cleanup
  sound: Fixup the last user of the old irq functions
  genirq: Remove obsolete comment
  genirq: Remove now obsolete set_irq_wake()
  sh: Fix irq cleanup fallout
  x86: apb_timer: Fixup genirq fallout
  genirq: Fix misnamed label in handle_edge_eoi_irq

Fix up crazy conflict in arch/powerpc/include/asm/qe_ic.h:

 - commit eead4d5c63 ("powerpc: qe_ic: Rename get_irq_desc_data and
   get_irq_desc_chip") made the helper functions use
   irq_desc_get_handler_data() instead of the legacy (and no longer
   existing) get_irq_desc_data.

 - commit d4db35e8dc ("powerpc/qe_ic: Fix another breakage from the
   irq_data conversion") used irq_desc_get_chip_data() instead.

According to Thomas, the former is the correct direct conversion, but it
does look like both should work (arch/powerpc/sysdev/qe_lib/qe_ic.c
seems to initialize both to the same thing), and the chip data in some
ways is the more logical.  Somebody should really decide on one of the
other.

This merge picks irq_desc_get_handler_data() as the straightforward pure
conversion to new names, as per Thomas.
2011-03-30 09:35:52 -07:00
Richard Cochran eead4d5c63 powerpc: qe_ic: Rename get_irq_desc_data and get_irq_desc_chip
These two functions disappeared in commit

    0c6f8a8b91
    "genirq: Remove compat code"

but they still exist in qe_ic.h.
This patch renames the function to their new names.

Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Lennert Buytenhek <buytenh@secretlab.ca>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
LKML-Reference: <20110330132504.GA31832@riccoc20.at.omicron.at>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-03-30 15:38:02 +02:00
Benjamin Herrenschmidt d4db35e8dc powerpc/qe_ic: Fix another breakage from the irq_data conversion
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-30 11:17:15 +11:00
Benjamin Herrenschmidt 84493804bb powerpc/mm: Move the STAB0 location to 0x8000 to make room in low memory
Recent upstream builds with allmodconfig fail due to lack of space
between 0x3000 and 0x6000. We have a hard block at 0x7000 but we can
spare a page by moving the STAB0 from 0x6000 to 0x8000.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-30 10:44:20 +11:00
Stephen Rothwell 834796a849 powerpc: Wire up new syscalls
These syscalls have been added recently:
	name_to_handle_at
	open_by_handle_at
	clock_adjtime
	syncfs

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-30 10:44:11 +11:00
Varun Sethi 05e02d7f88 powerpc/booke: Correct the SPRN_MAS5 definition.
339 is the SPR number for MAS5 documented by Power ISA 2.06, and
implemented by e500mc.  It is not yet used anywhere in the kernel,
so nothing should be relying on the wrong number.

Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-30 10:44:09 +11:00
Scott Wood 67eb54944b powerpc: ARCH_PFN_OFFSET should be unsigned long
pfns are unsigned long, but MEMORY_START is phys_addr_t.  This leads
to page_to_pfn() returning phys_addr_t, and thus type mismatches in a few
print statements.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-30 10:44:07 +11:00
Benjamin Herrenschmidt 6090912c4a powerpc: Implement dma_mmap_coherent()
This is used by Alsa to mmap buffers allocated with dma_alloc_coherent()
into userspace. We need a special variant to handle machines with
non-coherent DMAs as those buffers have "special" virt addresses and
require non-cachable mappings

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-30 10:44:00 +11:00
FUJITA Tomonori 8547727756 remove dma64_addr_t
There is no user now.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: David Miller <davem@davemloft.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23 19:47:18 -07:00
Akinobu Mita 61f2e7b0f4 bitops: remove minix bitops from asm/bitops.h
minix bit operations are only used by minix filesystem and useless by
other modules.  Because byte order of inode and block bitmaps is different
on each architecture like below:

m68k:
	big-endian 16bit indexed bitmaps

h8300, microblaze, s390, sparc, m68knommu:
	big-endian 32 or 64bit indexed bitmaps

m32r, mips, sh, xtensa:
	big-endian 32 or 64bit indexed bitmaps for big-endian mode
	little-endian bitmaps for little-endian mode

Others:
	little-endian bitmaps

In order to move minix bit operations from asm/bitops.h to architecture
independent code in minix filesystem, this provides two config options.

CONFIG_MINIX_FS_BIG_ENDIAN_16BIT_INDEXED is only selected by m68k.
CONFIG_MINIX_FS_NATIVE_ENDIAN is selected by the architectures which use
native byte order bitmaps (h8300, microblaze, s390, sparc, m68knommu,
m32r, mips, sh, xtensa).  The architectures which always use little-endian
bitmaps do not select these options.

Finally, we can remove minix bit operations from asm/bitops.h for all
architectures.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Michal Simek <monstr@monstr.eu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Cc: Chris Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23 19:46:22 -07:00
Akinobu Mita f312eff816 bitops: remove ext2 non-atomic bitops from asm/bitops.h
As the result of conversions, there are no users of ext2 non-atomic bit
operations except for ext2 filesystem itself.  Now we can put them into
architecture independent code in ext2 filesystem, and remove from
asm/bitops.h for all architectures.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23 19:46:21 -07:00
Akinobu Mita f57d7ff1b8 powerpc: introduce little-endian bitops
Introduce little-endian bit operations by renaming existing powerpc native
little-endian bit operations and changing them to take any pointer types.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23 19:46:12 -07:00
Akinobu Mita a56560b3b2 asm-generic: change little-endian bitops to take any pointer types
This makes the little-endian bitops take any pointer types by changing the
prototypes and adding casts in the preprocessor macros.

That would seem to at least make all the filesystem code happier, and they
can continue to do just something like

  #define ext2_set_bit __test_and_set_bit_le

(or whatever the exact sequence ends up being).

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Mikael Starvik <starvik@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Matthew Wilcox <willy@debian.org>
Cc: Grant Grundler <grundler@parisc-linux.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Zankel <chris@zankel.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23 19:46:12 -07:00
Akinobu Mita c4945b9ed4 asm-generic: rename generic little-endian bitops functions
As a preparation for providing little-endian bitops for all architectures,
This renames generic implementation of little-endian bitops.  (remove
"generic_" prefix and postfix "_le")

s/generic_find_next_le_bit/find_next_bit_le/
s/generic_find_next_zero_le_bit/find_next_zero_bit_le/
s/generic_find_first_zero_le_bit/find_first_zero_bit_le/
s/generic___test_and_set_le_bit/__test_and_set_bit_le/
s/generic___test_and_clear_le_bit/__test_and_clear_bit_le/
s/generic_test_le_bit/test_bit_le/
s/generic___set_le_bit/__set_bit_le/
s/generic___clear_le_bit/__clear_bit_le/
s/generic_test_and_set_le_bit/test_and_set_bit_le/
s/generic_test_and_clear_le_bit/test_and_clear_bit_le/

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Cc: Greg Ungerer <gerg@uclinux.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23 19:46:11 -07:00
FUJITA Tomonori 3e50594e8e add the common dma_addr_t typedef to include/linux/types.h
All architectures can use the common dma_addr_t typedef now. We can
remove the arch specific dma_addr_t.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Matt Turner <mattst88@gmail.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-22 17:44:09 -07:00
Eric Dumazet b6a84016bd mm: NUMA aware alloc_thread_info_node()
Add a node parameter to alloc_thread_info(), and change its name to
alloc_thread_info_node()

This change is needed to allow NUMA aware kthread_create_on_cpu()

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tejun Heo <tj@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: David Howells <dhowells@redhat.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-22 17:44:01 -07:00
Mike Wolf a71f5d5d27 powerpc/ptrace: Remove BUG_ON when full register set not available
In some cases during a threaded core dump not all the threads will have
a full register set. This happens when the signal causing the core dump
races with a thread exiting.  The race happens when the exiting thread
has entered the kernel for the last time before the signal arrives, but
doesn't get far enough through the exit code to avoid being included
in the core dump.

So we get a thread included in the core dump which is never going to go
out to userspace again and only has a partial register set recorded

Normally we would catch each thread as it is about to go into userspace
and capture the full register set then.

However, this exiting thread is never going to go out to userspace
again, so we have no way to capture its full register set.  It doesn't
really matter, though, as this is a thread which is effectively
already dead.

So instead of hitting a BUG() in this case (a really bad choice of
action in the first place), we use a poison value for the register
values.

[BenH]: Some cosmetic/stylistic changes and fix build on ppc32

Signed-off-by: Mike Wolf <mjw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-21 11:18:14 +11:00
Meador Inge dfec220272 powerpc: Make MPIC honor the "pic-no-reset" device tree property
This property, defined in the Open PIC binding, tells the kernel not to use
the reset bit in the global configuration register.  Additionally, its
presence mandates that only sources which are actually used (i.e. appear in
the device tree) should have their VECPRI bits initialized.

Although, "pic-no-reset" can be used for the same use cases that
"protected-sources" is covering, the "protected-sources" implementation was
left completely intact.  This is a more pragmatic approach as there are
already several existing systems which use protected sources.  If
"pic-no-reset" *and* "protected-sources" are both used, however, then
"pic-no-reset" takes precedence in terms of the init behavior and the
sanity checks done by protected sources will still take place.

Signed-off-by: Meador Inge <meador_inge@mentor.com>
Cc: Hollis Blanchard <hollis_blanchard@mentor.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-21 11:01:32 +11:00
Linus Torvalds 619297855a Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (30 commits)
  trace, filters: Initialize the match variable in process_ops() properly
  trace, documentation: Fix branch profiling location in debugfs
  oprofile, s390: Cleanups
  oprofile, s390: Remove hwsampler_files.c and merge it into init.c
  perf: Fix tear-down of inherited group events
  perf: Reorder & optimize perf_event_context to remove alignment padding on 64 bit builds
  perf: Handle stopped state with tracepoints
  perf: Fix the software events state check
  perf, powerpc: Handle events that raise an exception without overflowing
  perf, x86: Use INTEL_*_CONSTRAINT() for all PEBS event constraints
  perf, x86: Clean up SandyBridge PEBS events
  perf lock: Fix sorting by wait_min
  perf tools: Version incorrect with some versions of grep
  perf evlist: New command to list the names of events present in a perf.data file
  perf script: Add support for H/W and S/W events
  perf script: Add support for dumping symbols
  perf script: Support custom field selection for output
  perf script: Move printing of 'common' data from print_event and rename
  perf tracing: Remove print_graph_cpu and print_graph_proc from trace-event-parse
  perf script: Change process_event prototype
  ...
2011-03-18 10:38:34 -07:00
Linus Torvalds 0a95d92c00 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (62 commits)
  powerpc/85xx: Fix signedness bug in cache-sram
  powerpc/fsl: 85xx: document cache sram bindings
  powerpc/fsl: define binding for fsl mpic interrupt controllers
  powerpc/fsl_msi: Handle msi-available-ranges better
  drivers/serial/ucc_uart.c: Add of_node_put to avoid memory leak
  powerpc/85xx: Fix SPE float to integer conversion failure
  powerpc/85xx: Update sata controller compatible for p1022ds board
  ATA: Add FSL sata v2 controller support
  powerpc/mpc8xxx_gpio: simplify searching for 'fsl, qoriq-gpio' compatiable
  powerpc/8xx: remove obsolete mgsuvd board
  powerpc/82xx: rename and update mgcoge board support
  powerpc/83xx: rename and update kmeter1
  powerpc/85xx: Workaroudn e500 CPU erratum A005
  powerpc/fsl_pci: Add support for FSL PCIe controllers v2.x
  powerpc/85xx: Fix writing to spin table 'cpu-release-addr' on ppc64e
  powerpc/pseries: Disable MSI using new interface if possible
  powerpc: Enable GENERIC_HARDIRQS_NO_DEPRECATED.
  powerpc: core irq_data conversion.
  powerpc: sysdev/xilinx_intc irq_data conversion.
  powerpc: sysdev/uic irq_data conversion.
  ...

Fix up conflicts in arch/powerpc/sysdev/fsl_msi.c (due to getting rid of
of_platform_driver in arch/powerpc)
2011-03-18 06:31:43 -07:00
Benjamin Herrenschmidt 831532035b Merge remote branch 'jwb/next' into next 2011-03-17 17:59:01 +11:00
Linus Torvalds 4c5811bf46 Merge branch 'devicetree/next' of git://git.secretlab.ca/git/linux-2.6
* 'devicetree/next' of git://git.secretlab.ca/git/linux-2.6: (21 commits)
  tty: serial: altera_jtaguart: Add device tree support
  tty: serial: altera_uart: Add devicetree support
  dt: eliminate of_platform_driver shim code
  dt: Eliminate of_platform_{,un}register_driver
  dt/serial: Eliminate users of of_platform_{,un}register_driver
  dt/usb: Eliminate users of of_platform_{,un}register_driver
  dt/video: Eliminate users of of_platform_{,un}register_driver
  dt/net: Eliminate users of of_platform_{,un}register_driver
  dt/sound: Eliminate users of of_platform_{,un}register_driver
  dt/spi: Eliminate users of of_platform_{,un}register_driver
  dt: uartlite: merge platform and of_platform driver bindings
  dt: xilinx_hwicap: merge platform and of_platform driver bindings
  ipmi: convert OF driver to platform driver
  leds/leds-gpio: merge platform_driver with of_platform_driver
  dt/sparc: Eliminate users of of_platform_{,un}register_driver
  dt/powerpc: Eliminate users of of_platform_{,un}register_driver
  dt/powerpc: move of_bus_type infrastructure to ibmebus
  drivercore/dt: add a match table pointer to struct device
  dt: Typo fix.
  altera_ps2: Add devicetree support
  ...
2011-03-16 17:28:10 -07:00
Linus Torvalds e6bee325e4 Merge branch 'tty-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6
* 'tty-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6: (76 commits)
  pch_uart: reference clock on CM-iTC
  pch_phub: add new device ML7213
  n_gsm: fix UIH control byte : P bit should be 0
  n_gsm: add a documentation
  serial: msm_serial_hs: Add MSM high speed UART driver
  tty_audit: fix tty_audit_add_data live lock on audit disabled
  tty: move cd1865.h to drivers/staging/tty/
  Staging: tty: fix build with epca.c driver
  pcmcia: synclink_cs: fix prototype for mgslpc_ioctl()
  Staging: generic_serial: fix double locking bug
  nozomi: don't use flush_scheduled_work()
  tty/serial: Relax the device_type restriction from of_serial
  MAINTAINERS: Update HVC file patterns
  tty: phase out of ioctl file pointer for tty3270 as well
  tty: forgot to remove ipwireless from drivers/char/pcmcia/Makefile
  pch_uart: Fix DMA channel miss-setting issue.
  pch_uart: fix exclusive access issue
  pch_uart: fix auto flow control miss-setting issue
  pch_uart: fix uart clock setting issue
  pch_uart : Use dev_xxx not pr_xxx
  ...

Fix up trivial conflicts in drivers/misc/pch_phub.c (same patch applied
twice, then changes to the same area in one branch)
2011-03-16 15:11:04 -07:00
Anton Blanchard 0837e3242c perf, powerpc: Handle events that raise an exception without overflowing
Events on POWER7 can roll back if a speculative event doesn't
eventually complete. Unfortunately in some rare cases they will
raise a performance monitor exception. We need to catch this to
ensure we reset the PMC. In all cases the PMC will be 256 or less
cycles from overflow.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@kernel.org> # as far back as it applies cleanly
LKML-Reference: <20110309143842.6c22845e@kryten>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-16 14:04:13 +01:00
Linus Torvalds d10902812c Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (27 commits)
  x86: Clean up apic.c and apic.h
  x86: Remove superflous goal definition of tsc_sync
  x86: dt: Correct local apic documentation in device tree bindings
  x86: dt: Cleanup local apic setup
  x86: dt: Fix OLPC=y/INTEL_CE=n build
  rtc: cmos: Add OF bindings
  x86: ce4100: Use OF to setup devices
  x86: ioapic: Add OF bindings for IO_APIC
  x86: dtb: Add generic bus probe
  x86: dtb: Add support for PCI devices backed by dtb nodes
  x86: dtb: Add device tree support for HPET
  x86: dtb: Add early parsing of IO_APIC
  x86: dtb: Add irq domain abstraction
  x86: dtb: Add a device tree for CE4100
  x86: Add device tree support
  x86: e820: Remove conditional early mapping in parse_e820_ext
  x86: OLPC: Make OLPC=n build again
  x86: OLPC: Remove extra OLPC_OPENFIRMWARE_DT indirection
  x86: OLPC: Cleanup config maze completely
  x86: OLPC: Hide OLPC_OPENFIRMWARE config switch
  ...

Fix up conflicts in arch/x86/platform/ce4100/ce4100.c
2011-03-15 20:01:36 -07:00
Linus Torvalds 0586bed3e8 Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  rtmutex: tester: Remove the remaining BKL leftovers
  lockdep/timers: Explain in detail the locking problems del_timer_sync() may cause
  rtmutex: Simplify PI algorithm and make highest prio task get lock
  rwsem: Remove redundant asmregparm annotation
  rwsem: Move duplicate function prototypes to linux/rwsem.h
  rwsem: Unify the duplicate rwsem_is_locked() inlines
  rwsem: Move duplicate init macros and functions to linux/rwsem.h
  rwsem: Move duplicate struct rwsem declaration to linux/rwsem.h
  x86: Cleanup rwsem_count_t typedef
  rwsem: Cleanup includes
  locking: Remove deprecated lock initializers
  cred: Replace deprecated spinlock initialization
  kthread: Replace deprecated spinlock initialization
  xtensa: Replace deprecated spinlock initialization
  um: Replace deprecated spinlock initialization
  sparc: Replace deprecated spinlock initialization
  mips: Replace deprecated spinlock initialization
  cris: Replace deprecated spinlock initialization
  alpha: Replace deprecated spinlock initialization
  rtmutex-tester: Remove BKL tests
2011-03-15 18:28:30 -07:00
Linus Torvalds b80cd62b7d Merge branch 'core-futexes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-futexes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  arm: Remove bogus comment in futex_atomic_cmpxchg_inatomic()
  futex: Deobfuscate handle_futex_death()
  plist: Add priority list test
  plist: Shrink struct plist_head
  futex,plist: Remove debug lock assignment from plist_node
  futex,plist: Pass the real head of the priority list to plist_del()
  futex: Sanitize futex ops argument types
  futex: Sanitize cmpxchg_futex_value_locked API
  futex: Remove redundant pagefault_disable in futex_atomic_cmpxchg_inatomic()
  futex: Avoid redudant evaluation of task_pid_vnr()
  futex: Update futex_wait_setup comments about locking
2011-03-15 18:23:52 -07:00
Liu Yu ac6f120369 powerpc/85xx: Workaroudn e500 CPU erratum A005
This erratum can occur if a single-precision floating-point,
double-precision floating-point or vector floating-point instruction on a
mispredicted branch path signals one of the floating-point data interrupts
which are enabled by the SPEFSCR (FINVE, FDBZE, FUNFE or FOVFE bits).  This
interrupt must be recorded in a one-cycle window when the misprediction is
resolved.  If this extremely rare event should occur, the result could be:

The SPE Data Exception from the mispredicted path may be reported
erroneously if a single-precision floating-point, double-precision
floating-point or vector floating-point instruction is the second
instruction on the correct branch path.

According to errata description, some efp instructions which are not
supposed to trigger SPE exceptions can trigger the exceptions in this case.
However, as we haven't emulated these instructions here, a signal will
send to userspace, and userspace application would exit.

This patch re-issue the efp instruction that we haven't emulated,
so that hardware can properly execute it again if this case happen.

Signed-off-by: Liu Yu <yu.liu@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-03-15 10:05:06 -05:00
Michel Lespinasse 8d7718aa08 futex: Sanitize futex ops argument types
Change futex_atomic_op_inuser and futex_atomic_cmpxchg_inatomic
prototypes to use u32 types for the futex as this is the data type the
futex core code uses all over the place.

Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Darren Hart <darren@dvhart.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: David Howells <dhowells@redhat.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <20110311025058.GD26122@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-03-11 12:23:31 +01:00
Michel Lespinasse 37a9d912b2 futex: Sanitize cmpxchg_futex_value_locked API
The cmpxchg_futex_value_locked API was funny in that it returned either
the original, user-exposed futex value OR an error code such as -EFAULT.
This was confusing at best, and could be a source of livelocks in places
that retry the cmpxchg_futex_value_locked after trying to fix the issue
by running fault_in_user_writeable().
    
This change makes the cmpxchg_futex_value_locked API more similar to the
get_futex_value_locked one, returning an error code and updating the
original value through a reference argument.
    
Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>  [tile]
Acked-by: Tony Luck <tony.luck@intel.com>  [ia64]
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michal Simek <monstr@monstr.eu>  [microblaze]
Acked-by: David Howells <dhowells@redhat.com> [frv]
Cc: Darren Hart <darren@dvhart.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <20110311024851.GC26122@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-03-11 12:23:08 +01:00
Lennert Buytenhek 3a0adfab4a powerpc: sysdev/qe_lib/qe_ic irq_data conversion.
Signed-off-by: Lennert Buytenhek <buytenh@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-10 11:04:03 +11:00
Lennert Buytenhek 835c0553eb powerpc: mpic irq_data conversion.
Signed-off-by: Lennert Buytenhek <buytenh@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-10 11:03:56 +11:00
Benjamin Herrenschmidt f2f6dad6ca powerpc/iseries: Fix early init access to lppaca
The combination of commit

8154c5d22d and
93c22703ef

Broke boot on iSeries.

The problem is that iSeries very early boot code, which generates
the device-tree and runs before our normal early initializations
does need access the lppaca's very early, before the PACA array is
initialized, and in fact even before the boot PACA has been
initialized (it contains all 0's at this stage).

However, the first patch above makes that code use the new
llpaca_of(cpu) accessor, which itself is changed by the second patch to
use the PACA array.

We fix that by reverting iSeries to directly dereferencing the array. In
addition, we fix all iterators in the iSeries code to always skip CPU
whose number is above 63 which is the maximum size of that array and
the maximum number of supported CPUs on these machines.

Additionally, we make sure the boot_paca is properly initialized
in our early startup code.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-10 10:06:02 +11:00
Tseng-Hui (Frank) Lin 6edc642ebe powerpc: Cleanup definition of the PID register
Move SPRN_PID declearations in various locations into one place.

Signed-off-by: Tseng-Hui (Frank) Lin <thlin@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-04 18:19:05 +11:00
Jim Keniston 0f4ac13236 powerpc/nvram: Generalize code for OS partitions in NVRAM
Adapt the functions used to create and write to the RTAS-log partition
to work with any OS-type partition.

Signed-off-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-04 18:19:04 +11:00
Anton Blanchard fe3cc0d99d powerpc: Add pgprot_writecombine
A number of drivers are using pgprot_writecombine() to enable write
combining on userspace mappings. Implement it on powerpc.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-02 16:50:24 +11:00
Thomas Gleixner 089fb442f3 powerpc: Use ARCH_IRQ_INIT_FLAGS
Define the ARCH_IRQ_INIT_FLAGS instead of fixing it up in a loop.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-02 16:50:24 +11:00
Anton Blanchard 357574c482 powerpc/kexec: Restore ppc_md.machine_kexec
Kyle Moffett points out that mpc85xx has started using the
ppc_md.machine_kexec hook. As such, revert patch c94868788c
(powerpc/kexec: Remove ppc_md.machine_kexec).

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-02 14:56:48 +11:00
Grant Likely 38a5d6736e Merge commit 'v2.6.38-rc6' into devicetree/next
Conflicts:
	drivers/spi/pxa2xx_spi_pci.c
2011-02-28 01:36:21 -07:00
Greg Kroah-Hartman f227e08b71 Merge 2.6.38-rc6 into tty-next
This was to resolve a merge issue with drivers/char/Makefile and
drivers/tty/serial/68328serial.c

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-02-24 11:36:31 -08:00
Thomas Gleixner 7acdbb3f35 Merge branch 'linus' into x86/platform
Reason: Import mainline device tree changes on which further patches
        depend on or conflict.

Trivial conflict in: drivers/spi/pxa2xx_spi_pci.c

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-23 09:21:41 +01:00
Kay Sievers 3c95c985fa tty: add TIOCVHANGUP to allow clean tty shutdown of all ttys
This is useful for system management software so that it can kick
off things like gettys and everything that's started from a tty,
before we reuse it from/for something else or shut it down.

Without this ioctl it would have to temporarily become the owner of
the tty, then call vhangup() and then give it up again.

Cc: Lennart Poettering <lennart@poettering.net>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-02-17 14:16:30 -08:00
Ingo Molnar a3ec4a603f Merge commit 'v2.6.38-rc5' into core/locking
Merge reason: pick up upstream fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-16 13:33:41 +01:00
Scott Wood b51cbd41a3 powerpc/book3e: Protect complex macro args in mmu-book3e.h
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-02-07 12:47:56 +11:00
Scott Wood 81c386cc7f powerpc: Fix pfn_valid() when memory starts at a non-zero address
max_mapnr is a pfn, not an index innto mem_map[].  So don't add
ARCH_PFN_OFFSET a second time.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-02-07 12:47:56 +11:00
Grant Likely b5d937de03 powerpc/pci: Make both ppc32 and ppc64 use sysdata for pci_controller
Currently, ppc32 uses sysdata for the pci_controller pointer, and
ppc64 uses it to hold the device_node pointer.  This patch moves the
of_node pointer into (struct pci_bus*)->dev.of_node and
(struct pci_dev*)->dev.of_node so that sysdata can be converted to always
use the pci_controller pointer instead.  It also fixes up the
allocating of pci devices so that the of_node pointer gets assigned
consistently and increments the ref count.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-02-04 11:46:51 -07:00
Sebastian Andrzej Siewior 04bea68b2f of/pci: move of_irq_map_pci() into generic code
There is a tiny difference between PPC32 and PPC64. Microblaze uses the
PPC32 variant.

Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
[grant.likely@secretlab.ca: Added comment to #endif, moved documentation
	block to function implementation, fixed for non ppc and microblaze
	compiles]
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-02-04 11:46:50 -07:00
Dave Kleikamp c48d0dbaac powerpc/476: define specific cpu table entry DD2 core
The DD2 core still has some unstability.  Define CPU_FTR_476_DD2 to
enable workarounds in later patches.

This is based on an earlier, unreleased patch for DD1 by Ben Herrenschmidt.

Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2011-02-02 06:58:53 -05:00
Thomas Gleixner aac72277fd rwsem: Move duplicate function prototypes to linux/rwsem.h
All architecture specific rwsem headers carry the same function
prototypes. Just x86 adds asmregparm, which is an empty define on all
other architectures. S390 has a stale rwsem_downgrade_write()
prototype.

Remove the duplicates and add the prototypes to linux/rwsem.h

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Richard Henderson <rth@twiddle.net>
Acked-by: Tony Luck <tony.luck@intel.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Acked-by: David Miller <davem@davemloft.net>
Cc: Chris Zankel <chris@zankel.net>
LKML-Reference: <20110126195833.970840140@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-01-27 12:30:39 +01:00
Thomas Gleixner 41e5887fa3 rwsem: Unify the duplicate rwsem_is_locked() inlines
Instead of having the same implementation in each architecture, move
it to linux/rwsem.h and remove the duplicates. It's unlikely that an
arch will ever implement something different, but we can deal with
that when it happens.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Matt Turner <mattst88@gmail.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Acked-by: David Miller <davem@davemloft.net>
Cc: Chris Zankel <chris@zankel.net>
LKML-Reference: <20110126195833.876773757@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-01-27 12:30:39 +01:00
Thomas Gleixner 12249b3441 rwsem: Move duplicate init macros and functions to linux/rwsem.h
The rwsem initializers and related macros and functions are mostly the
same. Some of them lack the lockdep initializer, but having it in
place does not matter for architectures which do not support lockdep.

powerpc, sparc, x86: No functional change

sh, s390: Removes the duplicate init_rwsem (inline and #define)

alpha, ia64, xtensa: Use the lockdep capable init function in
       	     	     lib/rwsem.c which is just uninlining the init
       	     	     function for the LOCKDEP=n case

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Matt Turner <mattst88@gmail.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Acked-by: David Miller <davem@davemloft.net>
Cc: Chris Zankel <chris@zankel.net>
LKML-Reference: <20110126195833.771812729@linutronix.de>
2011-01-27 12:30:39 +01:00
Thomas Gleixner 1c8ed640d9 rwsem: Move duplicate struct rwsem declaration to linux/rwsem.h
The difference between these declarations is the data type of the
count member and the lack of lockdep in some architectures/

long is equivivalent to signed long and the #ifdef guarded dep_map
member does not hurt anyone.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Matt Turner <mattst88@gmail.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Acked-by: David Miller <davem@davemloft.net>
Cc: Chris Zankel <chris@zankel.net>
LKML-Reference: <20110126195833.679641914@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-01-27 12:30:39 +01:00
Thomas Gleixner c16a87ce06 rwsem: Cleanup includes
All rwsem implementations include the same headers. Include them from
include/linux/rwsem.h

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Matt Turner <mattst88@gmail.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Acked-by: David Miller <davem@davemloft.net>
Cc: Chris Zankel <chris@zankel.net>
LKML-Reference: <20110126195833.483520950@linutronix.de>
2011-01-27 12:30:38 +01:00
Anton Blanchard 158d5b5e36 powerpc/kdump: Move crash_kexec_stop_spus to kdump crash handler
Use the crash handler hooks to run the SPU stop code, just like we do for
ehea and cell RAS code.

While I'm here I noticed "CPUSs reliabally"

so fix the spelling MISTAKESs reliabally.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-01-21 14:08:36 +11:00
Anton Blanchard c1f784e553 powerpc/kdump: Remove ppc_md.machine_crash_shutdown
No one uses ppc_md.machine_crash_shutdown, so remove it.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-01-21 14:08:35 +11:00
Anton Blanchard c94868788c powerpc/kexec: Remove ppc_md.machine_kexec
No one uses ppc_md.machine_kexec, so remove it.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-01-21 14:08:35 +11:00
Anton Blanchard 619b267724 powerpc/kexec: Remove ppc_md.machine_kexec_cleanup
No one uses ppc_md.machine_kexec_cleanup, so remove it.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-01-21 14:08:35 +11:00
Anton Blanchard 50266a1f8a powerpc/kexec: Move all ppc_md kexec function pointers together
Move all the kexec handlers together.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-01-21 14:08:34 +11:00
Steven Rostedt 3cb5f1a3e5 powerpc/ppc64/tracing: Add stack frame to calls of trace_hardirqs_on/off
When an interrupt occurs in userspace, we can call trace_hardirqs_on/off()
    With one level stack. But if we have irqsoff tracing enabled,
    it checks both CALLER_ADDR0 and CALLER_ADDR1. The second call
    goes two stack frames up. If this is from user space, then there may
    not exist a second stack.

    Add a second stack when calling trace_hardirqs_on/off() otherwise
    the following oops might occur:

    Oops: Kernel access of bad area, sig: 11 [#1]
    PREEMPT SMP NR_CPUS=2 PA Semi PWRficient
    last sysfs file: /sys/block/sda/size
    Modules linked in: ohci_hcd ehci_hcd usbcore
    NIP: c0000000000e1c00 LR: c0000000000034d4 CTR: 000000011012c440
    REGS: c00000003e2f3af0 TRAP: 0300   Not tainted  (2.6.37-rc6+)
    MSR: 9000000000001032 <ME,IR,DR>  CR: 48044444  XER: 20000000
    DAR: 00000001ffb9db50, DSISR: 0000000040000000
    TASK = c00000003e1a00a0[2088] 'emacs' THREAD: c00000003e2f0000 CPU: 1
    GPR00: 0000000000000001 c00000003e2f3d70 c00000000084e0d0 c0000000008816e8
    GPR04: 000000001034c678 000000001032e8f9 0000000010336540 0000000040020000
    GPR08: 0000000040020000 00000001ffb9db40 c00000003e2f3e30 0000000060000000
    GPR12: 100000000000f032 c00000000fff0280 000000001032e8c9 0000000000000008
    GPR16: 00000000105be9c0 00000000105be950 00000000105be9b0 00000000105be950
    GPR20: 00000000ffb9dc50 00000000ffb9dbf0 00000000102f0000 00000000102f0000
    GPR24: 00000000102e0000 00000000102f0000 0000000010336540 c0000000009ded38
    GPR28: 00000000102e0000 c0000000000034d4 c0000000007ccb10 c00000003e2f3d70
    NIP [c0000000000e1c00] .trace_hardirqs_off+0xb0/0x1d0
    LR [c0000000000034d4] decrementer_common+0xd4/0x100
    Call Trace:
    [c00000003e2f3d70] [c00000003e2f3e30] 0xc00000003e2f3e30 (unreliable)
    [c00000003e2f3e30] [c0000000000034d4] decrementer_common+0xd4/0x100
    Instruction dump:
    81690000 7f8b0000 419e0018 f84a0028 60000000 60000000 60000000 e95f0000
    80030000 e92a0000 eb6301f8 2f800000 <eb890010> 41fe00dc a06d000a eb1e8050
    ---[ end trace 4ec7fd2be9240928 ]---

    Reported-by: Joerg Sommer <joerg@alea.gnuu.de>
    Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-01-21 14:08:33 +11:00
Michael Ellerman c0337288ab powerpc: Ensure the else case of feature sections will fit
When we create an alternative feature section, the else case must be the
same size or smaller than the body. This is because when we patch the
else case in we just overwrite the body, so there must be room.

Up to now we just did this by inspection, but it's quite easy to enforce
it in the assembler, so we should.

The only change is to add the ifgt block, but that effects the alignment
of the tabs and so the whole macro is modified.

Also add a test, but #if 0 it because we don't want to break the build.
Anyone who's modifying the feature macros should enable the test.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-01-21 14:08:33 +11:00
Benjamin Herrenschmidt 50f4df4e6a Merge remote branch 'kumar/next' into merge 2011-01-21 11:00:44 +11:00
Linus Torvalds 008d23e485 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits)
  Documentation/trace/events.txt: Remove obsolete sched_signal_send.
  writeback: fix global_dirty_limits comment runtime -> real-time
  ppc: fix comment typo singal -> signal
  drivers: fix comment typo diable -> disable.
  m68k: fix comment typo diable -> disable.
  wireless: comment typo fix diable -> disable.
  media: comment typo fix diable -> disable.
  remove doc for obsolete dynamic-printk kernel-parameter
  remove extraneous 'is' from Documentation/iostats.txt
  Fix spelling milisec -> ms in snd_ps3 module parameter description
  Fix spelling mistakes in comments
  Revert conflicting V4L changes
  i7core_edac: fix typos in comments
  mm/rmap.c: fix comment
  sound, ca0106: Fix assignment to 'channel'.
  hrtimer: fix a typo in comment
  init/Kconfig: fix typo
  anon_inodes: fix wrong function name in comment
  fix comment typos concerning "consistent"
  poll: fix a typo in comment
  ...

Fix up trivial conflicts in:
 - drivers/net/wireless/iwlwifi/iwl-core.c (moved to iwl-legacy.c)
 - fs/ext4/ext4.h

Also fix missed 'diabled' typo in drivers/net/bnx2x/bnx2x.h while at it.
2011-01-13 10:05:56 -08:00
Timur Tabi b49d81ded4 powerpc: fix warning when compiling immap_qe.h
Fix the warnings genereted by arch/powerpc/include/asm/immap_qe.h when
CONFIG_PHYS_ADDR_T_64BIT is defined:

immap_qe.h: In function 'immrbar_virt_to_phys':
immap_qe.h:472:8: warning: cast from pointer to integer of different size
immap_qe.h:472:24: warning: cast from pointer to integer of different size
immap_qe.h:473:5: warning: cast from pointer to integer of different size
immap_qe.h:473:21: warning: cast from pointer to integer of different size
immap_qe.h:474:36: warning: cast from pointer to integer of different size

Note that the QE does not support 36-bit physical addresses, so even when
CONFIG_PHYS_ADDR_T_64BIT is defined, the QE MURAM must be located below the
4GB boundary.

Signed-off-by: Timur Tabi <timur@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-01-12 18:02:46 -06:00
Li Yang 86985db66e powerpc/85xx: add e500 HID1 bit definition
Also make 74xx HID1 definition conditional.

Signed-off-by: Li Yang <leoli@freescale.com>
Signed-off-by: Shaohui Xie <b21989@freescale.com>
Cc: Roy Zang <tie-fei.zang@freescale.com>
Cc: Alexandre Bounine <alexandre.bounine@idt.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-01-12 18:00:29 -06:00
Linus Torvalds 5a62f99544 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (72 commits)
  powerpc/pseries: Fix build of topology stuff without CONFIG_NUMA
  powerpc/pseries: Fix VPHN build errors on non-SMP systems
  powerpc/83xx: add mpc8308_p1m DMA controller device-tree node
  powerpc/83xx: add DMA controller to mpc8308 device-tree node
  powerpc/512x: try to free dma descriptors in case of allocation failure
  powerpc/512x: add MPC8308 dma support
  powerpc/512x: fix the hanged dma transfer issue
  powerpc/512x: scatter/gather dma fix
  powerpc/powermac: Make auto-loading of therm_pm72 possible
  of/address: Use propper endianess in get_flags
  powerpc/pci: Use printf extension %pR for struct resource
  powerpc: Remove unnecessary casts of void ptr
  powerpc: Disable VPHN polling during a suspend operation
  powerpc/pseries: Poll VPA for topology changes and update NUMA maps
  powerpc: iommu: Add device name to iommu error printks
  powerpc: Record vma->phys_addr in ioremap()
  powerpc: Update compat_arch_ptrace
  powerpc: Fix PPC_PTRACE_SETHWDEBUG on PPC_BOOK3S
  powerpc/time: printk time stamp init not correct
  powerpc: Minor cleanups for machdep.h
  ...
2011-01-11 16:31:41 -08:00
Benjamin Herrenschmidt 5d7d8072ed powerpc/pseries: Fix build of topology stuff without CONFIG_NUMA
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-01-12 10:56:29 +11:00
Jesse Larrew 39bf990ead powerpc/pseries: Fix VPHN build errors on non-SMP systems
The header asm/hvcall.h was previously included indirectly via
smp.h. On non-SMP systems, however, these declarations are excluded
and the build breaks. This is easily fixed by including asm/hvcall.h
directly.

The VPHN feature is only meaningful on NUMA systems that implement
the SPLPAR option, so exclude the VPHN code on systems without
SPLPAR enabled.

Also, expose unmap_cpu_from_node() on systems with SPLPAR enabled,
even if CONFIG_HOTPLUG_CPU is disabled.

Lastly, map_cpu_to_node() is now needed by VPHN to manipulate the
node masks after boot time, so remove the __cpuinit annotation to
fix a section mismatch.

Signed-off-by: Jesse Larrew <jlarrew@linux.vnet.ibm.com>
2011-01-11 16:06:16 +11:00
Linus Torvalds 0bd2cbcdfa Merge branch 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6
* 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6: (29 commits)
  of/flattree: forward declare struct device_node in of_fdt.h
  ipmi: explicitly include of_address.h and of_irq.h
  sparc: explicitly cast negative phandle checks to s32
  powerpc/405: Fix missing #{address,size}-cells in i2c node
  powerpc/5200: dts: refactor dts files
  powerpc/5200: dts: Change combatible strings on localbus
  powerpc/5200: dts: remove unused properties
  powerpc/5200: dts: rename nodes to prepare for refactoring dts files
  of/flattree: Update dtc to current mainline.
  of/device: Don't register disabled devices
  powerpc/dts: fix syntax bugs in bluestone.dts
  of: Fixes for OF probing on little endian systems
  of: make drivers depend on CONFIG_OF instead of CONFIG_PPC_OF
  of/flattree: Add of_flat_dt_match() helper function
  of_serial: explicitly include of_irq.h
  of/flattree: Refactor unflatten_device_tree and add fdt_unflatten_tree
  of/flattree: Reorder unflatten_dt_node
  of/flattree: Refactor unflatten_dt_node
  of/flattree: Add non-boottime device tree functions
  of/flattree: Add Kconfig for EARLY_FLATTREE
  ...

Fix up trivial conflict in arch/sparc/prom/tree_32.c as per Grant.
2011-01-10 08:57:03 -08:00
Sebastian Andrzej Siewior 0131d8973c of/address: use proper endianess in get_flags
This patch changes u32 to __be32 for all "ranges", "prop" and "addr" and
such. Those variables are pointing to the device tree which contains
integers in big endian format.

Most functions are doing it right because of_read_number() is doing the
right thing for them. of_bus_isa_get_flags(), of_bus_pci_get_flags() and
of_bus_isa_map() were accessing the data directly and were doing it wrong.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-12-23 15:57:48 -07:00
Werner Fink b7b8de0873 TTY: Add tty ioctl to figure device node of the system console.
This has been in the SuSE kernels for a very long time.

Signed-off-by: Werner Fink <werner@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-12-16 16:18:28 -08:00
Sebastian Siewior 982cf00412 of/address: Use propper endianess in get_flags
This patch changes u32 to __be32 for all "ranges", "prop" and "addr" and
such. Those variables are pointing to the device tree which containts
intergers in big endian format.
Most functions are doing it right because of_read_number() is doing the
right thing for them. of_bus_isa_get_flags(), of_bus_pci_get_flags() and
of_bus_isa_map() were accessing the data directly and were doing it wrong.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-12-09 15:36:30 +11:00
Jesse Larrew 3b7a27db3b powerpc: Disable VPHN polling during a suspend operation
Tie the polling mechanism into the ibm,suspend-me rtas call to
stop/restart polling before/after a suspend, hibernate, migrate,
or checkpoint restart operation. This ensures that the system has a
chance to disable the polling if the partition is migrated to a system
that does not support VPHN (and vice versa).

Signed-off-by: Jesse Larrew <jlarrew@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-12-09 15:36:30 +11:00
Jesse Larrew 9eff1a3840 powerpc/pseries: Poll VPA for topology changes and update NUMA maps
This patch sets a timer during boot that will periodically poll the
associativity change counters in the VPA. When a change in
associativity is detected, it retrieves the new associativity domain
information via the H_HOME_NODE_ASSOCIATIVITY hcall and updates the
NUMA node maps and sysfs entries accordingly. Note that since the
ibm,associativity device tree property does not exist on configurations
with both NUMA and SPLPAR enabled, no device tree updates are necessary.

Signed-off-by: Jesse Larrew <jlarrew@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-12-09 15:36:29 +11:00
Sonny Rao bee376ff4c powerpc: Minor cleanups for machdep.h
Remove stale declaration of setup_pci_ptrs, aparently from ppc before 2.4.0

Remove #ifdef around struct existance delcaration

Fix spelling of "linear"

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Sonny Rao <sonnyrao@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-12-09 15:35:31 +11:00
Anton Blanchard b5f9b6665b powerpc: Hardcode popcnt instructions for old assemblers
The popcnt instructions went into binutils relatively recently. As with a
number of other instructions, create macros and hardcode them.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-12-09 15:35:30 +11:00
Benjamin Herrenschmidt f4b9841595 Merge branch 'nvram' into next 2010-12-09 14:36:38 +11:00
Benjamin Herrenschmidt edc79a2f3e powerpc/nvram: Move the log partition stuff to pseries
The nvram log partition stuff currently in nvram_64.c is really
pseries specific. It isn't actually used on anything else (despite
the fact that we ran the code to setup the partition on anything
except powermac) and the log format is specific to pseries RTAS
implementation. So move it where it belongs

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-30 15:37:45 +11:00
Benjamin Herrenschmidt 74d51d0298 powerpc/nvram: Move things out of asm/nvram.h
This moves a bunch of definitions out of asm/nvram.h to the files
that use them or just outright remove completely unused stuff.

We leave the partition signatures definitions, they will be useful

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-30 15:09:19 +11:00
Jesse Larrew 36f567b429 powerpc: Add VPHN firmware feature
This simple patch adds the firmware feature for VPHN to the firmware
features bitmask.

Signed-off-by: Jesse Larrew <jlarrew@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-29 15:48:22 +11:00
Nishanth Aravamudan cd34206e94 powerpc: Add memory_hotplug_max()
Add a function to get the maximum address that can be hotplug added.
This is needed to calculate the size of the tce table needed to cover
all memory in 1:1 mode.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-29 15:48:21 +11:00
Nishanth Aravamudan f6aedd8606 powerpc/macio: Ensure all dma routines get copied over
Also add a comment to dev_archdata, indicating that changes there need
to be verified against the driver code.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-29 15:48:21 +11:00
Vaidyanathan Srinivasan 5742bd8595 powerpc: Add support for new hcall H_BEST_ENERGY
Create sysfs interface to export data from H_BEST_ENERGY hcall
that can be used by administrative tools on supported pseries
platforms for energy management	optimizations.

sys/device/system/cpu/pseries_(de)activate_hint_list and
sys/device/system/cpu/cpuN/pseries_(de)activate_hint will provide
hints for activation and deactivation of cpus respectively.

These hints are abstract number given by the hypervisor based
on the extended knowledge the hypervisor has regarding the
system topology and resource mappings.

The activate and the deactivate sysfs entry is for the two
distinct operations that we could do for energy savings.  When
we have more capacity than required, we could deactivate few
core to save energy.  The choice of the core to deactivate
will be based on /sys/devices/system/cpu/deactivate_hint_list.
The comma separated list of cpus (cores) will be the preferred
choice.  If we have to activate some of the deactivated cores,
then /sys/devices/system/cpu/activate_hint_list will be used.

The per-cpu file
/sys/device/system/cpu/cpuN/pseries_(de)activate_hint further
provide more fine grain information by exporting the value of
the hint itself.

Added new driver module
	arch/powerpc/platforms/pseries/pseries_energy.c
under new config option CONFIG_PSERIES_ENERGY

Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-29 15:48:19 +11:00
Vaidyanathan Srinivasan 99d8670525 powerpc: Cleanup APIs for cpu/thread/core mappings
These APIs take logical cpu number as input
Change cpu_first_thread_in_core() to cpu_first_thread_sibling()
Change cpu_last_thread_in_core() to cpu_last_thread_sibling()

These APIs convert core number (index) to logical cpu/thread numbers
Add cpu_first_thread_of_core(int core)
Changed cpu_thread_to_core() to cpu_core_index_of_thread(int cpu)

The goal is to make 'threads_per_core' accessible to the
pseries_energy module.  Instead of making an API to read
threads_per_core, this is a higher level wrapper function to
convert from logical cpu number to core number.

The current APIs cpu_first_thread_in_core() and
cpu_last_thread_in_core() returns logical CPU number while
cpu_thread_to_core() returns core number or index which is
not a logical CPU number.  The new APIs are now clearly named to
distinguish 'core number' versus first and last 'logical cpu
number' in that core.

The new APIs cpu_{first,last}_thread_sibling() work on
logical cpu numbers.  While cpu_first_thread_of_core() and
cpu_core_index_of_thread() work on core index.

Example usage:  (4 threads per core system)

cpu_first_thread_sibling(5) = 4
cpu_last_thread_sibling(5) = 7
cpu_core_index_of_thread(5) = 1
cpu_first_thread_of_core(1) = 4

cpu_core_index_of_thread() is used in cpu_to_drc_index() in the
module and cpu_first_thread_of_core() is used in
drc_index_to_cpu() in the module.

Make API changes to few callers.  Export symbols for use in modules.

Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-29 15:48:19 +11:00
Christian Dietrich 56e640de12 powerpc: Removing undead ifdef __KERNEL__
The __KERNEL__ ifdef isn't necessary at this point, because it is
checked in an outer ifdef level already and has no effect here.

Signed-off-by: Christian Dietrich <qy03fugy@stud.informatik.uni-erlangen.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-29 15:48:18 +11:00
Anton Blanchard 64ff312876 powerpc: Add support for popcnt instructions
POWER5 added popcntb, and POWER7 added popcntw and popcntd. As a first step
this patch does all the work out of line, but it would be nice to implement
them as inlines with an out of line fallback.

The performance issue with hweight was noticed when disabling SMT on a large
(192 thread) POWER7 box. The patch improves that testcase by about 8%.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-29 15:48:17 +11:00
Uwe Kleine-König b595076a18 tree-wide: fix comment/printk typos
"gadget", "through", "command", "maintain", "maintain", "controller", "address",
"between", "initiali[zs]e", "instead", "function", "select", "already",
"equal", "access", "management", "hierarchy", "registration", "interest",
"relative", "memory", "offset", "already",

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-11-01 15:38:34 -04:00
David Daney 4b6ba8aacb of/net: Move of_get_mac_address() to a common source file.
There are two identical implementations of of_get_mac_address(), one
each in arch/powerpc/kernel/prom_parse.c and
arch/microblaze/kernel/prom_parse.c.  Move this function to a new
common file of_net.{c,h} and adjust all the callers to include the new
header.

Signed-off-by: David Daney <ddaney@caviumnetworks.com>
[grant.likely@secretlab.ca: protect header with #ifdef]
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-11-01 01:08:14 -04:00
Linus Torvalds 79346507ad Merge git://git.infradead.org/mtd-2.6
* git://git.infradead.org/mtd-2.6: (82 commits)
  mtd: fix build error in m25p80.c
  mtd: Remove redundant mutex from mtd_blkdevs.c
  MTD: Fix wrong check register_blkdev return value
  Revert "mtd: cleanup Kconfig dependencies"
  mtd: cfi_cmdset_0002: make sector erase command variable
  mtd: cfi_cmdset_0002: add CFI detection for SST 38VF640x chips
  mtd: cfi_util: add support for switching SST 39VF640xB chips into QRY mode
  mtd: cfi_cmdset_0001: use defined value of P_ID_INTEL_PERFORMANCE instead of hardcoded one
  block2mtd: dubious assignment
  P4080/mtd: Fix the freescale lbc issue with 36bit mode
  P4080/eLBC: Make Freescale elbc interrupt common to elbc devices
  mtd: phram: use KBUILD_MODNAME
  mtd: OneNAND: S5PC110: Fix double call suspend & resume function
  mtd: nand: fix MTD_MODE_RAW writes
  jffs2: use kmemdup
  mtd: sm_ftl: cosmetic, use bool when possible
  mtd: r852: remove useless pci powerup/down from suspend/resume routines
  mtd: blktrans: fix a race vs kthread_stop
  mtd: blktrans: kill BKL
  mtd: allow to unload the mtdtrans module if its block devices aren't open
  ...

Fix up trivial whitespace-introduced conflict in drivers/mtd/mtdchar.c
2010-10-30 08:31:35 -07:00
David Woodhouse 67577927e8 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
Conflicts:
	drivers/mtd/mtd_blkdevs.c

Merge Grant's device-tree bits so that we can apply the subsequent fixes.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2010-10-30 12:35:11 +01:00
Linus Torvalds 1e431a9d64 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb:
  kgdb,ppc: Individual register get/set for ppc
  kgdbts: prevent re-entry to kgdbts before it unregisters
  debug_core,x86,blackfin: Clean up hw debug disable API
  kdb: Fix early debugging crash regression
  kgdb,arm: fix register dump
  kdb: fix per_cpu command to remove supress mask
  kdb: Add kdb kernel module sample
2010-10-29 11:49:38 -07:00
Dongdong Deng ff10b88b5a kgdb,ppc: Individual register get/set for ppc
commit 534af1082329392bc29f6badf815e69ae2ae0f4c(kgdb,kdb: individual
register set and and get API) introduce dbg_get_reg/dbg_set_reg API
for individual register get and set.

This patch implement those APIs for ppc.

Signed-off-by: Dongdong Deng <dongdong.deng@windriver.com>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2010-10-29 13:14:42 -05:00
Linus Torvalds e3e1288e86 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx: (48 commits)
  DMAENGINE: move COH901318 to arch_initcall
  dma: imx-dma: fix signedness bug
  dma/timberdale: simplify conditional
  ste_dma40: remove channel_type
  ste_dma40: remove enum for endianess
  ste_dma40: remove TIM_FOR_LINK option
  ste_dma40: move mode_opt to separate config
  ste_dma40: move channel mode to a separate field
  ste_dma40: move priority to separate field
  ste_dma40: add variable to indicate valid dma_cfg
  async_tx: make async_tx channel switching opt-in
  move async raid6 test to lib/Kconfig.debug
  dmaengine: Add Freescale i.MX1/21/27 DMA driver
  intel_mid_dma: change the slave interface
  intel_mid_dma: fix the WARN_ONs
  intel_mid_dma: Add sg list support to DMA driver
  intel_mid_dma: Allow DMAC2 to share interrupt
  intel_mid_dma: Allow IRQ sharing
  intel_mid_dma: Add runtime PM support
  DMAENGINE: define a dummy filter function for ste_dma40
  ...
2010-10-27 19:04:36 -07:00
Michael Holzheu d57af9b214 taskstats: use real microsecond granularity for CPU times
The taskstats interface uses microsecond granularity for the user and
system time values.  The conversion from cputime to the taskstats values
uses the cputime_to_msecs primitive which effectively limits the
granularity to milliseconds.  Add the cputime_to_usecs primitive for
architectures that have better, more precise CPU time values.  Remove
cputime_to_msecs primitive because there are no more users left.

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Luck Tony <tony.luck@intel.com>
Cc: Shailabh Nagar <nagar1234@in.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Shailabh Nagar <nagar@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-27 18:03:17 -07:00
Peter Zijlstra ece0e2b640 mm: remove pte_*map_nested()
Since we no longer need to provide KM_type, the whole pte_*map_nested()
API is now redundant, remove it.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Miller <davem@davemloft.net>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26 16:52:08 -07:00
Peter Zijlstra 3e4d3af501 mm: stack based kmap_atomic()
Keep the current interface but ignore the KM_type and use a stack based
approach.

The advantage is that we get rid of crappy code like:

	#define __KM_PTE			\
		(in_nmi() ? KM_NMI_PTE : 	\
		 in_irq() ? KM_IRQ_PTE :	\
		 KM_PTE0)

and in general can stop worrying about what context we're in and what kmap
slots might be appropriate for that.

The downside is that FRV kmap_atomic() gets more expensive.

For now we use a CPP trick suggested by Andrew:

  #define kmap_atomic(page, args...) __kmap_atomic(page)

to avoid having to touch all kmap_atomic() users in a single patch.

[ not compiled on:
  - mn10300: the arch doesn't actually build with highmem to begin with ]

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix up drivers/gpu/drm/i915/intel_overlay.c]
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Miller <davem@davemloft.net>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Dave Airlie <airlied@linux.ie>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26 16:52:08 -07:00
Linus Torvalds 33081adf8b Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6: (365 commits)
  ALSA: hda - Disable sticky PCM stream assignment for AD codecs
  ALSA: usb - Creative USB X-Fi volume knob support
  ALSA: ca0106: Use card specific dac id for mute controls.
  ALSA: ca0106: Allow different sound cards to use different SPI channel mappings.
  ALSA: ca0106: Create a nice spot for mapping channels to dacs.
  ALSA: ca0106: Move enabling of front dac out of hardcoded setup sequence.
  ALSA: ca0106: Pull out dac powering routine into separate function.
  ALSA: ca0106 - add Sound Blaster 5.1vx info.
  ASoC: tlv320dac33: Use usleep_range for delays
  ALSA: usb-audio: add Novation Launchpad support
  ALSA: hda - Add workarounds for CT-IBG controllers
  ALSA: hda - Fix wrong TLV mute bit for STAC/IDT codecs
  ASoC: tpa6130a2: Error handling for broken chip
  ASoC: max98088: Staticise m98088_eq_band
  ASoC: soc-core: Fix codec->name memory leak
  ALSA: hda - Apply ideapad quirk to Acer laptops with Cxt5066
  ALSA: hda - Add some workarounds for Creative IBG
  ALSA: hda - Fix wrong SPDIF NID assignment for CA0110
  ALSA: hda - Fix codec rename rules for ALC662-compatible codecs
  ALSA: hda - Add alc_init_jacks() call to other codecs
  ...
2010-10-25 08:32:05 -07:00
Lan Chunhe-B25806 0b824d2b10 P4080/mtd: Fix the freescale lbc issue with 36bit mode
When system uses 36bit physical address, res.start is 36bit
physical address. But the function of in_be32 returns 32bit
physical address. Then both of them compared each other is
wrong. So by converting the address of res.start into
the right format fixes this issue.

Signed-off-by: Lan Chunhe-B25806 <b25806@freescale.com>
Signed-off-by: Roy Zang <tie-fei.zang@freescale.com>
Reviewed-by: Anton Vorontsov <cbouatmailru@gmail.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2010-10-25 15:41:04 +01:00
Roy Zang 3ab8f2a2e7 P4080/eLBC: Make Freescale elbc interrupt common to elbc devices
Move Freescale elbc interrupt from nand driver to elbc driver.
Then all elbc devices can use the interrupt instead of ONLY nand.

For former nand driver, it had the two functions:

1. detecting nand flash partitions;
2. registering elbc interrupt.

Now, second function is removed to fsl_lbc.c.

Signed-off-by: Lan Chunhe-B25806 <b25806@freescale.com>
Signed-off-by: Roy Zang <tie-fei.zang@freescale.com>
Reviewed-by: Anton Vorontsov <cbouatmailru@gmail.com>
Cc: Wood Scott-B07421 <B07421@freescale.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2010-10-25 15:40:54 +01:00
Takashi Iwai aa5c14d5c0 Merge branch 'topic/asoc' into for-linus
Conflicts:
	arch/powerpc/platforms/85xx/p1022_ds.c
2010-10-25 10:00:30 +02:00
Linus Torvalds 229aebb873 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
  Update broken web addresses in arch directory.
  Update broken web addresses in the kernel.
  Revert "drivers/usb: Remove unnecessary return's from void functions" for musb gadget
  Revert "Fix typo: configuation => configuration" partially
  ida: document IDA_BITMAP_LONGS calculation
  ext2: fix a typo on comment in ext2/inode.c
  drivers/scsi: Remove unnecessary casts of private_data
  drivers/s390: Remove unnecessary casts of private_data
  net/sunrpc/rpc_pipe.c: Remove unnecessary casts of private_data
  drivers/infiniband: Remove unnecessary casts of private_data
  drivers/gpu/drm: Remove unnecessary casts of private_data
  kernel/pm_qos_params.c: Remove unnecessary casts of private_data
  fs/ecryptfs: Remove unnecessary casts of private_data
  fs/seq_file.c: Remove unnecessary casts of private_data
  arm: uengine.c: remove C99 comments
  arm: scoop.c: remove C99 comments
  Fix typo configue => configure in comments
  Fix typo: configuation => configuration
  Fix typo interrest[ing|ed] => interest[ing|ed]
  Fix various typos of valid in comments
  ...

Fix up trivial conflicts in:
	drivers/char/ipmi/ipmi_si_intf.c
	drivers/usb/gadget/rndis.c
	net/irda/irnet/irnet_ppp.c
2010-10-24 13:41:39 -07:00
Linus Torvalds 1765a1fe5d Merge branch 'kvm-updates/2.6.37' of git://git.kernel.org/pub/scm/virt/kvm/kvm
* 'kvm-updates/2.6.37' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (321 commits)
  KVM: Drop CONFIG_DMAR dependency around kvm_iommu_map_pages
  KVM: Fix signature of kvm_iommu_map_pages stub
  KVM: MCE: Send SRAR SIGBUS directly
  KVM: MCE: Add MCG_SER_P into KVM_MCE_CAP_SUPPORTED
  KVM: fix typo in copyright notice
  KVM: Disable interrupts around get_kernel_ns()
  KVM: MMU: Avoid sign extension in mmu_alloc_direct_roots() pae root address
  KVM: MMU: move access code parsing to FNAME(walk_addr) function
  KVM: MMU: audit: check whether have unsync sps after root sync
  KVM: MMU: audit: introduce audit_printk to cleanup audit code
  KVM: MMU: audit: unregister audit tracepoints before module unloaded
  KVM: MMU: audit: fix vcpu's spte walking
  KVM: MMU: set access bit for direct mapping
  KVM: MMU: cleanup for error mask set while walk guest page table
  KVM: MMU: update 'root_hpa' out of loop in PAE shadow path
  KVM: x86 emulator: Eliminate compilation warning in x86_decode_insn()
  KVM: x86: Fix constant type in kvm_get_time_scale
  KVM: VMX: Add AX to list of registers clobbered by guest switch
  KVM guest: Move a printk that's using the clock before it's ready
  KVM: x86: TSC catchup mode
  ...
2010-10-24 12:47:25 -07:00
Alexander Graf 26e673c300 KVM: PPC: Move of include to __KERNEL__ section
We have to protect the include for linux/of.h by __KERNEL__ so it doesn't
accidently get referenced outside.

This patch fixes this and makes the tree compile again.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2010-10-24 10:52:23 +02:00
Alexander Graf 17bd158006 KVM: PPC: Implement Level interrupts on Book3S
The current interrupt logic is just completely broken. We get a notification
from user space, telling us that an interrupt is there. But then user space
expects us that we just acknowledge an interrupt once we deliver it to the
guest.

This is not how real hardware works though. On real hardware, the interrupt
controller pulls the external interrupt line until it gets notified that the
interrupt was received.

So in reality we have two events: pulling and letting go of the interrupt line.

To maintain backwards compatibility, I added a new request for the pulling
part. The letting go part was implemented earlier already.

With this in place, we can now finally start guests that do not randomly stall
and stop to work at random times.

This patch implements above logic for Book3S.

Signed-off-by: Alexander Graf <agraf@suse.de>
2010-10-24 10:52:19 +02:00
Alexander Graf 8b6db3bc96 KVM: PPC: Implement correct SID mapping on Book3s_32
Up until now we were doing segment mappings wrong on Book3s_32. For Book3s_64
we were using a trick where we know that a single mmu_context gives us 16 bits
of context ids.

The mm system on Book3s_32 instead uses a clever algorithm to distribute VSIDs
across the available range, so a context id really only gives us 16 available
VSIDs.

To keep at least a few guest processes in the SID shadow, let's map a number of
contexts that we can use as VSID pool. This makes the code be actually correct
and shouldn't hurt performance too much.

Signed-off-by: Alexander Graf <agraf@suse.de>
2010-10-24 10:52:15 +02:00
Alexander Graf df1bfa25d8 KVM: PPC: Put segment registers in shared page
Now that the actual mtsr doesn't do anything anymore, we can move the sr
contents over to the shared page, so a guest can directly read and write
its sr contents from guest context.

Signed-off-by: Alexander Graf <agraf@suse.de>
2010-10-24 10:52:11 +02:00
Alexander Graf 8e8651783f KVM: PPC: Interpret SR registers on demand
Right now we're examining the contents of Book3s_32's segment registers when
the register is written and put the interpreted contents into a struct.

There are two reasons this is bad. For starters, the struct has worse real-time
performance, as it occupies more ram. But the more important part is that with
segment registers being interpreted from their raw values, we can put them in
the shared page, allowing guests to mess with them directly.

This patch makes the internal representation of SRs be u32s.

Signed-off-by: Alexander Graf <agraf@suse.de>
2010-10-24 10:52:11 +02:00
Alexander Graf 7508e16c9f KVM: PPC: Add feature bitmap for magic page
We will soon add SR PV support to the shared page, so we need some
infrastructure that allows the guest to query for features KVM exports.

This patch adds a second return value to the magic mapping that
indicated to the guest which features are available.

Signed-off-by: Alexander Graf <agraf@suse.de>
2010-10-24 10:52:09 +02:00
Alexander Graf 2b05d71fef KVM: PPC: Make long relocations be ulong
On Book3S KVM we directly expose some asm pointers to C code as
variables. These need to be relocated and thus break on relocatable
kernels.

To make sure we can at least build, let's mark them as long instead
of u32 where 64bit relocations don't work.

This fixes the following build error:

WARNING: 2 bad relocations^M
> c000000000008590 R_PPC64_ADDR32    .text+0x4000000000008460^M
> c000000000008594 R_PPC64_ADDR32    .text+0x4000000000008598^M

Please keep in mind that actually using KVM on a relocated kernel
might still break. This only fixes the compile problem.

Reported-by: Subrata Modak <subrata@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:59 +02:00
Alexander Graf 2d27fc5eac KVM: PPC: Add book3s_32 tlbie flush acceleration
On Book3s_32 the tlbie instruction flushed effective addresses by the mask
0x0ffff000. This is pretty hard to reflect with a hash that hashes ~0xfff, so
to speed up that target we should also keep a special hash around for it.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:58 +02:00
Alexander Graf 2e0908afaf KVM: PPC: RCU'ify the Book3s MMU
So far we've been running all code without locking of any sort. This wasn't
really an issue because I didn't see any parallel access to the shadow MMU
code coming.

But then I started to implement dirty bitmapping to MOL which has the video
code in its own thread, so suddenly we had the dirty bitmap code run in
parallel to the shadow mmu code. And with that came trouble.

So I went ahead and made the MMU modifying functions as parallelizable as
I could think of. I hope I didn't screw up too much RCU logic :-). If you
know your way around RCU and locking and what needs to be done when, please
take a look at this patch.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:58 +02:00
Alexander Graf 5fc87407b5 KVM: PPC: Expose magic page support to guest
Now that we have the shared page in place and the MMU code knows about
the magic page, we can expose that capability to the guest!

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:49 +02:00
Alexander Graf e8508940a8 KVM: PPC: Magic Page Book3s support
We need to override EA as well as PA lookups for the magic page. When the guest
tells us to project it, the magic page overrides any guest mappings.

In order to reflect that, we need to hook into all the MMU layers of KVM to
force map the magic page if necessary.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:48 +02:00