Allocate a distinct inode for every vcpu in a VM. This has the following
benefits:
- the filp cachelines are no longer bounced when f_count is incremented on
every ioctl()
- the API and internal code are distinctly clearer; for example, on the
KVM_GET_REGS ioctl, there is no need to copy the vcpu number from
userspace and then copy the registers back; the vcpu identity is derived
from the fd used to make the call
Right now the performance benefits are completely theoretical since (a) we
don't support more than one vcpu per VM and (b) virtualization hardware
inefficiencies completely everwhelm any cacheline bouncing effects. But
both of these will change, and we need to prepare the API today.
Signed-off-by: Avi Kivity <avi@qumranet.com>
This reflects the changed scope, from device-wide to single vm (previously
every device open created a virtual machine).
Signed-off-by: Avi Kivity <avi@qumranet.com>
This avoids having filp->f_op and the corresponding inode->i_fop different,
which is a little unorthodox.
The ioctl list is split into two: global kvm ioctls and per-vm ioctls. A new
ioctl, KVM_CREATE_VM, is used to create VMs and return the VM fd.
Signed-off-by: Avi Kivity <avi@qumranet.com>
The kvmfs inodes will represent virtual machines and vcpus, as necessary,
reducing cacheline bouncing due to inodes and filps being shared.
Signed-off-by: Avi Kivity <avi@qumranet.com>
This patch changes the SVM code to intercept SMIs and handle it
outside the guest.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
This adds a special MSR based hypercall API to KVM. This is to be
used by paravirtual kernels and virtual drivers.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Besides using an established api, this allows using kvm in older kernels.
Signed-off-by: Markus Rechberger <markus.rechberger@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
The whole thing is rotten, but this allows vmx to boot with the guest reboot
fix.
Signed-off-by: Markus Rechberger <markus.rechberger@amd.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
We fail to mark a page dirty in three cases:
- setting the accessed bit in a pte
- setting the dirty bit in a pte
- emulating a write into a pagetable
This fix adds the missing cases.
Signed-off-by: Avi Kivity <avi@qumranet.com>
The information contained within platform_data should be self-contained.
Replace the pointer to a MAC address with the actual MAC address in
struct mv643xx_eth_platform_data.
Signed-off-by: Dale Farnsworth <dale@farnsworth.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Commit 908b637fe7 removed ETH_DMA_ALIGN
but missed a usage of it in a macro, which broke the build.
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This missing line caused transmit errors on the Qlogic 4032 chip.
Signed-off-by: Ron Mercer <ron.mercer@qlogic.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Fix copyright and license ("regents" should not have ever been used).
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
No need to stop tc35815 before resetting the board. This fixes the
build of tc35815 as a module. This also means there is no caller of
tc35815_killall left, so remove that function also.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Conditionalize all PM related stuff in libata core layer using
CONFIG_PM.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Add missing #ifdef CONFIG_PM conditionals around all PM related parts
in libata LLDs.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Some LLDs were missing scsi device PM callbacks while having host/port
suspend support. Add missing ones.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
If you'll recall, over a year ago, I pointed out that the current
Radeon driver erroneously returns -EINVAL for valid blanking codes,
here is a link to that thread:
http://lkml.org/lkml/2006/1/28/6
No other driver does this, and it confuses the X server into thinking
that the device does not support blanking properly.
I looked again and there is simply no reason for the Radeon driver to
return -EINVAL for FB_BLANK_NORMAL. It claims it wants to do this in
order to convince fbcon to blank in software, right here:
if (fb_blank(info, blank))
fbcon_generic_blank(vc, info, blank);
to software blank the screen. But it only causes that to happen
in the FB_BLANK_NORMAL case.
That makes no sense because the Radeon code does this:
val |= CRTC_DISPLAY_DIS;
in the FB_BLANK_NORMAL case so should be blanking the hardware, and
there is therefore no reason to SW blank by returning -EINVAL.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Antonino Daplas <adaplas@gmail.com>
The QDI init code contains some bugs which mean it only works if you have
a test setup that causes both a successful and failed probe. Fix this
Found by Philip Guo
(Who found it working on code analysis tools not running VLB IDE
controllers)
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Alan Cox noticed several hooks in pata_* drivers were missing, when
he authored his ->cable_detect hook patches. This patch extracts
just those fixes from Alan's patches, adding the necessary hooks
(usually ->freeze, ->thaw, and ->post_internal_cmd) to the drivers.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2.6.21-rc has horrible problems with libata and PATA cable types (and
thus speeds). This occurs because Tejun fixed a pile of other bugs and
we now do cable detect enforcement for drive side detection properly.
Unfortunately we don't do the process around cable detection right. Tejun
identified the problem and pointed to the right Annex in the spec, this patch
implements the needed changes.
The basic requirement is that we have to identify the slave before the
master.
The patch switches the identify order so that we can do the drive side
detection correctly.
[NOTE: patch and description extracted from a larger work written
and signed-off-by Alan Cox]
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The initial simplex handling code is fooled if you suspend and resume.
This also causes problems with some single channel controllers which
claim to be simplex.
The fix is fairly simple, instead of keeping a flag to remember if we
gave away the simplex channel we remember the actual owner. As the owner
is always part of the host_set we don't even need a refcount.
Knowing the owner also means we can reassign simplex DMA channels in
future hotplug code etc if we need to
Signed-off-by: Alan Cox <alan@redhat.com>
(and a signed-off for the patch I sent before while I remember)
Signed-off-by: Jeff Garzik <jeff@garzik.org>
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev:
ahci: improve spurious SDB FIS handling
ahci/pata_jmicron: match class not function number
jmicron ATA: reimplement jmicron ATA quirk
pata_jmicron: drop unnecessary device programming in [re]init
libata: blacklist FUJITSU MHT2060BH for NCQ
sata_sil24: kill unused local variable idx in sil24_fill_sg()
libata: clear drvdata in ata_host_release(), take#2
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jikos/hid:
HID: fix Logitech DiNovo Edge touchwheel and Logic3 /SpectraVideo middle button
HID: add git tree information to MAINTAINERS
HID: fix broken Logitech S510 keyboard report descriptor; make extra keys work
HID: fix possible double-free on error path in hid parser
HID: hid-debug.c should #include <linux/hid-debug.h>
HID: fix bug in zeroing the last field byte in output reports
USB HID: use CONFIG_HID_DEBUG for outputting report descriptor
USB HID: Fix USB vendor and product IDs endianness for USB HID devices
Spurious SDB FIS during NCQ might not contain spurious completions.
It could be spurious TF update or invalid async notification. Treat
as HSM violation iff a spurious SDB FIS contains spurious completions;
otherwise, just whine once about it.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Make jmiron_ata quirk update pdev->class after programming the device
and update ahci and pata_jmicron such that they match class code
instead of checking function number manually. For ahci, it matches
for vendor and class. For pata_jmicron, it matches vendor, device and
class as IDE class isn't as well defined as AHCI class.
This makes jmicron device matching more conventional and script
friendly.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Reimplement jmicron ATA quirk.
* renamed to quirk_jmicron_ata()
* quirk is invoked only for the affected controllers
* programming is stricter. e.g. conf5 bit24 is cleared if
unnecessary.
* code factored for readability
* JMB360 and JMB368 are programmed into proper mode
Verified on JMB360, 363 and 368.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Channel redirect and AHCI mode enable programmings are done via PCI
quirk for both probe and resume paths. Drop duplicate and possibly
unsafe device programming from pata_jmicron().
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Blacklist FUJITSU MHT2060BH for NCQ. On this drive, NCQ works iff
queue depth is equal to or less than 4. Just turn it off.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Mike Accetta <maccetta@laurelnetworks.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Kill unused local variable idx in sil24_fill_sg().
Spotted by Jeff Garzik.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Clearing drvdata in ->remove_one causes NULL pointer deference. Clear
drvdata only in ata_host_release() after all resources are freed.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This patch fixes a possible race that leads to double freeing an idr index.
When the master begin to close, release_dev() is called and then
pty_close() is called:
if (tty->driver->close)
tty->driver->close(tty, filp);
This is done without helding any locks other than BKL. Inside pty_close(),
being a master close, the devpts entry will be removed:
#ifdef CONFIG_UNIX98_PTYS
if (tty->driver == ptm_driver)
devpts_pty_kill(tty->index);
#endif
But devpts_pty_kill() will call get_node() that may sleep while waiting for
&devpts_root->d_inode->i_sem. When this happens and the slave is being
opened, tty_open() just found the driver and index:
driver = get_tty_driver(device, &index);
if (!driver) {
mutex_unlock(&tty_mutex);
return -ENODEV;
}
This part of the code is already protected under tty_mute. The problem is
that the slave close already got an index. Then init_dev() is called and
blocks waiting for the same &devpts_root->d_inode->i_sem.
When the master close resumes, it removes the devpts entry, and the
relation between idr index and the tty is gone. The master then sleeps
waiting for the tty_mutex on release_dev().
Slave open resumes and found no tty for that index. As result, a NULL tty
is returned and init_dev() doesn't flow to fast_track:
/* check whether we're reopening an existing tty */
if (driver->flags & TTY_DRIVER_DEVPTS_MEM) {
tty = devpts_get_tty(idx);
if (tty && driver->subtype == PTY_TYPE_MASTER)
tty = tty->link;
} else {
tty = driver->ttys[idx];
}
if (tty) goto fast_track;
The result of this, is that a new tty will be created and init_dev() returns
sucessfull. After returning, tty_mutex is dropped and master close may resume.
Master close finds it's the only use and both sides are closing, then releases
the tty and the index. At this point, the idr index is free, but slave still
has it.
Slave open then calls pty_open() and finds that tty->link->count is 0,
because there's no master and returns error. Then tty_open() calls
release_dev() which executes without any warning, as it was a case of last
slave close when the master is already closed (master->count == 0,
slave->count == 1). The tty is then released with the already released idr
index.
This normally would only issue a warning on idr_remove() but in case of a
customer's critical application, it's never too simple:
thread1: opens master, gets index X
thread1: begin closing master
thread2: begin opening slave with index X
thread1: finishes closing master, index X released
thread3: opens master, gets index X, just released
thread2: fails opening slave, releases index X <----
thread4: opens master, gets index X, init_dev() then find an already in use
and healthy tty and fails
If no more indexes are released, ptmx_open() will keep failing, as the
first free index available is X, and it will make init_dev() fail because
you're trying to "reopen a master" which isn't valid.
The patch notices when this race happens and make init_dev() fail
imediately. The init_dev() function is called with tty_mutex held, so it's
safe to continue with tty till the end of function because release_dev()
won't make any further changes without grabbing the tty_mutex.
Without the patch, on some machines it's possible get easily idr warnings
like this one:
idr_remove called for id=15 which is not allocated.
[<c02555b9>] idr_remove+0x139/0x170
[<c02a1b62>] release_mem+0x182/0x230
[<c02a28e7>] release_dev+0x4b7/0x700
[<c02a0ea7>] tty_ldisc_enable+0x27/0x30
[<c02a1e64>] init_dev+0x254/0x580
[<c02a0d64>] check_tty_count+0x14/0xb0
[<c02a4f05>] tty_open+0x1c5/0x340
[<c02a4d40>] tty_open+0x0/0x340
[<c017388f>] chrdev_open+0xaf/0x180
[<c017c2ac>] open_namei+0x8c/0x760
[<c01737e0>] chrdev_open+0x0/0x180
[<c0167bc9>] __dentry_open+0xc9/0x210
[<c0167e2c>] do_filp_open+0x5c/0x70
[<c0167a91>] get_unused_fd+0x61/0xd0
[<c0167e93>] do_sys_open+0x53/0x100
[<c0167f97>] sys_open+0x27/0x30
[<c010303b>] syscall_call+0x7/0xb
using this test application available on:
http://www.ruivo.org/~aris/pty_sodomizer.c
Signed-off-by: Aristeu Sergio Rozanski Filho <aris@ruivo.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>