Commit graph

28880 commits

Author SHA1 Message Date
NeilBrown 4044ba58dd md: don't retry recovery of raid1 that fails due to error on source drive.
If a raid1 has only one working drive and it has a sector which
gives an error on read, then an attempt to recover onto a spare will
fail, but as the single remaining drive is not removed from the
array, the recovery will be immediately re-attempted, resulting
in an infinite recovery loop.

So detect this situation and don't retry recovery once an error
on the lone remaining drive is detected.

Allow recovery to be retried once every time a spare is added
in case the problem wasn't actually a media error.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-01-09 08:31:11 +11:00
NeilBrown efeb53c0e5 md: Allow md devices to be created by name.
Using sequential numbers to identify md devices is somewhat artificial.
Using names can be a lot more user-friendly.

Also, creating md devices by opening the device special file is a bit
awkward.

So this patch provides a new option for creating and naming devices.

Writing a name such as "md_home" to
    /sys/modules/md_mod/parameters/new_array
will cause an array with that name to be created.  It will appear in
/sys/block/ /proc/partitions and /proc/mdstat as 'md_home'.
It will have an arbitrary minor number allocated.

md devices that a created by an open are destroyed on the last
close when the device is inactive.
For named md devices, they will not be destroyed until the array
is explicitly stopped, either with the STOP_ARRAY ioctl or by
writing 'clear' to /sys/block/md_XXXX/md/array_state.

The name of the array must start 'md_' to avoid conflict with
other devices.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-01-09 08:31:10 +11:00
NeilBrown d3374825ce md: make devices disappear when they are no longer needed.
Currently md devices, once created, never disappear until the module
is unloaded.  This is essentially because the gendisk holds a
reference to the mddev, and the mddev holds a reference to the
gendisk, this a circular reference.

If we drop the reference from mddev to gendisk, then we need to ensure
that the mddev is destroyed when the gendisk is destroyed.  However it
is not possible to hook into the gendisk destruction process to enable
this.

So we drop the reference from the gendisk to the mddev and destroy the
gendisk when the mddev gets destroyed.  However this has a
complication.
Between the call
   __blkdev_get->get_gendisk->kobj_lookup->md_probe
and the call
   __blkdev_get->md_open

there is no obvious way to hold a reference on the mddev any more, so
unless something is done, it will disappear and gendisk will be
destroyed prematurely.

Also, once we decide to destroy the mddev, there will be an unlockable
moment before the gendisk is unlinked (blk_unregister_region) during
which a new reference to the gendisk can be created.  We need to
ensure that this reference can not be used.  i.e. the ->open must
fail.

So:
 1/  in md_probe we set a flag in the mddev (hold_active) which
     indicates that the array should be treated as active, even
     though there are no references, and no appearance of activity.
     This is cleared by md_release when the device is closed if it
     is no longer needed.
     This ensures that the gendisk will survive between md_probe and
     md_open.

 2/  In md_open we check if the mddev we expect to open matches
     the gendisk that we did open.
     If there is a mismatch we return -ERESTARTSYS and modify
     __blkdev_get to retry from the top in that case.
     In the -ERESTARTSYS sys case we make sure to wait until
     the old gendisk (that we succeeded in opening) is really gone so
     we loop at most once.

Some udev configurations will always open an md device when it first
appears.   If we allow an md device that was just created by an open
to disappear on an immediate close, then this can race with such udev
configurations and result in an infinite loop the device being opened
and closed, then re-open due to the 'ADD' even from the first open,
and then close and so on.
So we make sure an md device, once created by an open, remains active
at least until some md 'ioctl' has been made on it.  This means that
all normal usage of md devices will allow them to disappear promptly
when not needed, but the worst that an incorrect usage will do it
cause an inactive md device to be left in existence (it can easily be
removed).

As an array can be stopped by writing to a sysfs attribute
  echo clear > /sys/block/mdXXX/md/array_state
we need to use scheduled work for deleting the gendisk and other
kobjects.  This allows us to wait for any pending gendisk deletion to
complete by simply calling flush_scheduled_work().



Signed-off-by: NeilBrown <neilb@suse.de>
2009-01-09 08:31:10 +11:00
Cheng Renquan cd2ac9321c md: need another print_sb for mdp_superblock_1
md_print_devices is called in two code path: MD_BUG(...), and md_ioctl
with PRINT_RAID_DEBUG.  it will dump out all in use md devices
information;

However, it wrongly processed two types of superblock in one:

The header file <linux/raid/md_p.h> has defined two types of superblock,
struct mdp_superblock_s (typedefed with mdp_super_t) according to md with
metadata 0.90, and struct mdp_superblock_1 according to md with metadata
1.0 and later,

These two types of superblock are very different,

The md_print_devices code processed them both in mdp_super_t, that would
lead to wrong informaton dump like:

	[ 6742.345877]
	[ 6742.345887] md:	**********************************
	[ 6742.345890] md:	* <COMPLETE RAID STATE PRINTOUT> *
	[ 6742.345892] md:	**********************************
	[ 6742.345896] md1: <ram7><ram6><ram5><ram4>
	[ 6742.345907] md: rdev ram7, SZ:00065472 F:0 S:1 DN:3
	[ 6742.345909] md: rdev superblock:
	[ 6742.345914] md:  SB: (V:0.90.0) ID:<42ef13c7.598c059a.5f9f1645.801e9ee6> CT:4919856d
	[ 6742.345918] md:     L5 S00065472 ND:4 RD:4 md1 LO:2 CS:65536
	[ 6742.345922] md:     UT:4919856d ST:1 AD:4 WD:4 FD:0 SD:0 CSUM:b7992907 E:00000001
	[ 6742.345924]      D  0:  DISK<N:0,(1,8),R:0,S:6>
	[ 6742.345930]      D  1:  DISK<N:1,(1,10),R:1,S:6>
	[ 6742.345933]      D  2:  DISK<N:2,(1,12),R:2,S:6>
	[ 6742.345937]      D  3:  DISK<N:3,(1,14),R:3,S:6>
	[ 6742.345942] md:     THIS:  DISK<N:3,(1,14),R:3,S:6>
	...
	[ 6742.346058] md0: <ram3><ram2><ram1><ram0>
	[ 6742.346067] md: rdev ram3, SZ:00065472 F:0 S:1 DN:3
	[ 6742.346070] md: rdev superblock:
	[ 6742.346073] md:  SB: (V:1.0.0) ID:<369aad81.00000000.00000000.00000000> CT:9a322a9c
	[ 6742.346077] md:     L-1507699579 S976570180 ND:48 RD:0 md0 LO:65536 CS:196610
	[ 6742.346081] md:     UT:00000018 ST:0 AD:131048 WD:0 FD:8 SD:0 CSUM:00000000 E:00000000
	[ 6742.346084]      D  0:  DISK<N:-1,(-1,-1),R:-1,S:-1>
	[ 6742.346089]      D  1:  DISK<N:-1,(-1,-1),R:-1,S:-1>
	[ 6742.346092]      D  2:  DISK<N:-1,(-1,-1),R:-1,S:-1>
	[ 6742.346096]      D  3:  DISK<N:-1,(-1,-1),R:-1,S:-1>
	[ 6742.346102] md:     THIS:  DISK<N:0,(0,0),R:0,S:0>
	...
	[ 6742.346219] md:	**********************************
	[ 6742.346221]

Here md1 is metadata 0.90.0, and md0 is metadata 1.2

After some more code to distinguish these two types of superblock, in this patch,

it will generate dump information like:

	[ 7906.755790]
	[ 7906.755799] md:	**********************************
	[ 7906.755802] md:	* <COMPLETE RAID STATE PRINTOUT> *
	[ 7906.755804] md:	**********************************
	[ 7906.755808] md1: <ram7><ram6><ram5><ram4>
	[ 7906.755819] md: rdev ram7, SZ:00065472 F:0 S:1 DN:3
	[ 7906.755821] md: rdev superblock (MJ:0):
	[ 7906.755826] md:  SB: (V:0.90.0) ID:<3fca7a0d.a612bfed.5f9f1645.801e9ee6> CT:491989f3
	[ 7906.755830] md:     L5 S00065472 ND:4 RD:4 md1 LO:2 CS:65536
	[ 7906.755834] md:     UT:491989f3 ST:1 AD:4 WD:4 FD:0 SD:0 CSUM:00fb52ad E:00000001
	[ 7906.755836]      D  0:  DISK<N:0,(1,8),R:0,S:6>
	[ 7906.755842]      D  1:  DISK<N:1,(1,10),R:1,S:6>
	[ 7906.755845]      D  2:  DISK<N:2,(1,12),R:2,S:6>
	[ 7906.755849]      D  3:  DISK<N:3,(1,14),R:3,S:6>
	[ 7906.755855] md:     THIS:  DISK<N:3,(1,14),R:3,S:6>
	...
	[ 7906.755972] md0: <ram3><ram2><ram1><ram0>
	[ 7906.755981] md: rdev ram3, SZ:00065472 F:0 S:1 DN:3
	[ 7906.755984] md: rdev superblock (MJ:1):
	[ 7906.755989] md:  SB: (V:1) (F:0) Array-ID:<5fbcf158:55aa:5fbe:9a79:1e939880dcbd>
	[ 7906.755990] md:    Name: "DG5:0" CT:1226410480
	[ 7906.755998] md:       L5 SZ130944 RD:4 LO:2 CS:128 DO:24 DS:131048 SO:8 RO:0
	[ 7906.755999] md:     Dev:00000003 UUID: 9194d744:87f7:a448:85f2:7497b84ce30a
	[ 7906.756001] md:       (F:0) UT:1226410480 Events:0 ResyncOffset:-1 CSUM:0dbcd829
	[ 7906.756003] md:         (MaxDev:384)
	...
	[ 7906.756113] md:	**********************************
	[ 7906.756116]

this md0 (metadata 1.2) information dumping is exactly according to struct
mdp_superblock_1.

Signed-off-by: Cheng Renquan <crquan@gmail.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Dan Williams <dan.j.williams@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: NeilBrown <neilb@suse.de>
2009-01-09 08:31:08 +11:00
Cheng Renquan 159ec1fc06 md: use list_for_each_entry macro directly
The rdev_for_each macro defined in <linux/raid/md_k.h> is identical to
list_for_each_entry_safe, from <linux/list.h>, it should be defined to
use list_for_each_entry_safe, instead of reinventing the wheel.

But some calls to each_entry_safe don't really need a safe version,
just a direct list_for_each_entry is enough, this could save a temp
variable (tmp) in every function that used rdev_for_each.

In this patch, most rdev_for_each loops are replaced by list_for_each_entry,
totally save many tmp vars; and only in the other situations that will call
list_del to delete an entry, the safe version is used.

Signed-off-by: Cheng Renquan <crquan@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2009-01-09 08:31:08 +11:00
Andre Noll ccacc7d2cf md: raid0: make hash_spacing and preshift sector-based.
This patch renames the hash_spacing and preshift members of struct
raid0_private_data to spacing and sector_shift respectively and
changes the semantics as follows:

We always have spacing = 2 * hash_spacing. In case
sizeof(sector_t) > sizeof(u32) we also have sector_shift = preshift + 1
while sector_shift = preshift = 0 otherwise.

Note that the values of nb_zone and zone are unaffected by these changes
because in the sector_div() preceeding the assignement of these two
variables both arguments double.

Signed-off-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: NeilBrown <neilb@suse.de>
2009-01-09 08:31:08 +11:00
Andre Noll 83838ed878 md: raid0: Represent the size of strip zones in sectors.
This completes the block -> sector conversion of struct strip_zone.

Signed-off-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: NeilBrown <neilb@suse.de>
2009-01-09 08:31:07 +11:00
Andre Noll 6199d3db0f md: raid0: Represent zone->zone_offset in sectors.
For the same reason as in the previous patch, rename it from zone_offset
to zone_start.

Signed-off-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: NeilBrown <neilb@suse.de>
2009-01-09 08:31:07 +11:00
Andre Noll 019c4e2f3e md: raid0: Represent device offset in sectors.
Rename zone->dev_offset to zone->dev_start to make sure all users
have been converted.

Signed-off-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: NeilBrown <neilb@suse.de>
2009-01-09 08:31:06 +11:00
NeilBrown 0c3573f19d md: use sysfs_notify_dirent to notify changes to md/sync_action.
There is no compelling need for this, but sysfs_notify_dirent is a
nicer interface and the change is good for consistency.

Signed-off-by: NeilBrown <neilb@suse.de>
2009-01-09 08:31:05 +11:00
Linus Torvalds 713404d608 Merge branch 'for-2.6.29' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.29' of git://linux-nfs.org/~bfields/linux: (67 commits)
  nfsd: get rid of NFSD_VERSION
  nfsd: last_byte_offset
  nfsd: delete wrong file comment from nfsd/nfs4xdr.c
  nfsd: git rid of nfs4_cb_null_ops declaration
  nfsd: dprint each op status in nfsd4_proc_compound
  nfsd: add etoosmall to nfserrno
  NFSD: FIDs need to take precedence over UUIDs
  SUNRPC: The sunrpc server code should not be used by out-of-tree modules
  svc: Clean up deferred requests on transport destruction
  nfsd: fix double-locks of directory mutex
  svc: Move kfree of deferral record to common code
  CRED: Fix NFSD regression
  NLM: Clean up flow of control in make_socks() function
  NLM: Refactor make_socks() function
  nfsd: Ensure nfsv4 calls the underlying filesystem on LOCKT
  SUNRPC: Ensure the server closes sockets in a timely fashion
  NFSD: Add documenting comments for nfsctl interface
  NFSD: Replace open-coded integer with macro
  NFSD: Fix a handful of coding style issues in write_filehandle()
  NFSD: clean up failover sysctl function naming
  ...
2009-01-07 17:21:24 -08:00
Linus Torvalds b424e8d3b4 Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (98 commits)
  PCI PM: Put PM callbacks in the order of execution
  PCI PM: Run default PM callbacks for all devices using new framework
  PCI PM: Register power state of devices during initialization
  PCI PM: Call pci_fixup_device from legacy routines
  PCI PM: Rearrange code in pci-driver.c
  PCI PM: Avoid touching devices behind bridges in unknown state
  PCI PM: Move pci_has_legacy_pm_support
  PCI PM: Power-manage devices without drivers during suspend-resume
  PCI PM: Add suspend counterpart of pci_reenable_device
  PCI PM: Fix poweroff and restore callbacks
  PCI: Use msleep instead of cpu_relax during ASPM link retraining
  PCI: PCIe portdrv: Add kerneldoc comments to remining core funtions
  PCI: PCIe portdrv: Rearrange code so that related things are together
  PCI: PCIe portdrv: Fix suspend and resume of PCI Express port services
  PCI: PCIe portdrv: Add kerneldoc comments to some core functions
  x86/PCI: Do not use interrupt links for devices using MSI-X
  net: sfc: Use pci_clear_master() to disable bus mastering
  PCI: Add pci_clear_master() as opposite of pci_set_master()
  PCI hotplug: remove redundant test in cpq hotplug
  PCI: pciehp: cleanup register and field definitions
  ...
2009-01-07 15:41:01 -08:00
Linus Torvalds 7c7758f99d Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (123 commits)
  wimax/i2400m: add CREDITS and MAINTAINERS entries
  wimax: export linux/wimax.h and linux/wimax/i2400m.h with headers_install
  i2400m: Makefile and Kconfig
  i2400m/SDIO: TX and RX path backends
  i2400m/SDIO: firmware upload backend
  i2400m/SDIO: probe/disconnect, dev init/shutdown and reset backends
  i2400m/SDIO: header for the SDIO subdriver
  i2400m/USB: TX and RX path backends
  i2400m/USB: firmware upload backend
  i2400m/USB: probe/disconnect, dev init/shutdown and reset backends
  i2400m/USB: header for the USB bus driver
  i2400m: debugfs controls
  i2400m: various functions for device management
  i2400m: RX and TX data/control paths
  i2400m: firmware loading and bootrom initialization
  i2400m: linkage to the networking stack
  i2400m: Generic probe/disconnect, reset and message passing
  i2400m: host/device procotol and core driver definitions
  i2400m: documentation and instructions for usage
  wimax: Makefile, Kconfig and docbook linkage for the stack
  ...
2009-01-07 15:37:24 -08:00
Linus Torvalds 67acd8b4b7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/arjan/linux-2.6-async
* git://git.kernel.org/pub/scm/linux/kernel/git/arjan/linux-2.6-async:
  async: don't do the initcall stuff post boot
  bootchart: improve output based on Dave Jones' feedback
  async: make the final inode deletion an asynchronous event
  fastboot: Make libata initialization even more async
  fastboot: make the libata port scan asynchronous
  fastboot: make scsi probes asynchronous
  async: Asynchronous function calls to speed up kernel boot
2009-01-07 15:35:47 -08:00
Benny Halevy db43910cb4 nfsd: get rid of NFSD_VERSION
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-01-07 17:38:32 -05:00
Benny Halevy 87df4de807 nfsd: last_byte_offset
refactor the nfs4 server lock code to use last_byte_offset
to compute the last byte covered by the lock.  Check for overflow
so that the last byte is set to NFS4_MAX_UINT64 if offset + len
wraps around.

Also, use NFS4_MAX_UINT64 for ~(u64)0 where appropriate.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-01-07 17:38:31 -05:00
Linus Torvalds c6906a2cb7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fixes
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fixes:
  kbuild: fix typos (s/bin_shipped/bin.o_shipped/) in Documentation
  kbuild: add a symlink to the source for separate objdirs
  kconfig: add script to manipulate .config files on the command line
  kbuild: reintroduce ALLSOURCE_ARCHS support for tags/cscope
  bootchart: improve output based on Dave Jones' feedback
  fix modules_install via NFS
  qnx: include <linux/types.h> for definitions of __[us]{8,16,32,64} types
2009-01-07 13:11:28 -08:00
Anders Larsen 8d1a0a13ed qnx: include <linux/types.h> for definitions of __[us]{8,16,32,64} types
On 2008-12-30 11:32:33, Sam Ravnborg wrote:
> We have added a few additional validation checks of the userspace headers:
...
> 3) We should include <linux/types.h> and not <asm/types.h>
> 4) If we use a __[us]{8,16,32,64} type then we must include <linux/types.h>

Satisfy these requirements for the linux/qnx*.h headers.

Signed-off-by: Anders Larsen <al@alarsen.net>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2009-01-07 21:44:20 +01:00
Linus Torvalds fa7b906e7f Merge branch 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6
* 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6:
  i2c: Use snprintf to set adapter names
  Input: apanel - convert to new i2c binding
  i2c: Drop I2C_CLASS_CAM_DIGITAL
  i2c: Drop I2C_CLASS_CAM_ANALOG and I2C_CLASS_SOUND
  i2c: Drop I2C_CLASS_ALL
  i2c: Get rid of remaining bus_id access
  i2c: Replace bus_id with dev_name(), dev_set_name()
2009-01-07 11:59:27 -08:00
Linus Torvalds 08249903ea Merge git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/avr32-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/avr32-2.6:
  avr32: Move syscalls.h under arch/avr32/include/asm/
  avr32: Define DIE_OOPS
  avr32: Remove DMATEST from defconfigs
  arch/avr32: Eliminate NULL test and memset after alloc_bootmem
  avr32: data param to at32_add_device_mci() must be non-NULL
  atmel-mci: move atmel-mci.h file to include/linux
  avr32: Hammerhead board support
  avr32: Allow reserving multiple pins at once
  favr-32: Remove deprecated call
  MIMC200: Remove deprecated call
  avr: struct device - replace bus_id with dev_name(), dev_set_name()
  avr32: Introducing asm/syscalls.h
2009-01-07 11:58:30 -08:00
Linus Torvalds 52fefcec97 Merge git://git.kernel.org/pub/scm/linux/kernel/git/czankel/xtensa-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/czankel/xtensa-2.6:
  xtensa: Update platform files to reflect new location of the header files.
  xtensa: switch to packed struct unaligned access implementation
  xtensa: Add xt2000 support files.
  xtensa: move headers files to arch/xtensa/include
  xtensa: use the new byteorder headers
2009-01-07 11:56:29 -08:00
Linus Torvalds 57c44c5f6f Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (24 commits)
  trivial: chack -> check typo fix in main Makefile
  trivial: Add a space (and a comma) to a printk in 8250 driver
  trivial: Fix misspelling of "firmware" in docs for ncr53c8xx/sym53c8xx
  trivial: Fix misspelling of "firmware" in powerpc Makefile
  trivial: Fix misspelling of "firmware" in usb.c
  trivial: Fix misspelling of "firmware" in qla1280.c
  trivial: Fix misspelling of "firmware" in a100u2w.c
  trivial: Fix misspelling of "firmware" in megaraid.c
  trivial: Fix misspelling of "firmware" in ql4_mbx.c
  trivial: Fix misspelling of "firmware" in acpi_memhotplug.c
  trivial: Fix misspelling of "firmware" in ipw2100.c
  trivial: Fix misspelling of "firmware" in atmel.c
  trivial: Fix misspelled firmware in Kconfig
  trivial: fix an -> a typos in documentation and comments
  trivial: fix then -> than typos in comments and documentation
  trivial: update Jesper Juhl CREDITS entry with new email
  trivial: fix singal -> signal typo
  trivial: Fix incorrect use of "loose" in event.c
  trivial: printk: fix indentation of new_text_line declaration
  trivial: rtc-stk17ta8: fix sparse warning
  ...
2009-01-07 11:31:52 -08:00
Detlef Riekenberg 940fbf411e linux/types.h: Don't depend on __GNUC__ for __le64/__be64
The typedefs for __u64 and __s64 where fixed to be available for other
compiler on May 2 2008 by H.  Peter Anvin (in commit edfa5cfa3d)

Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Detlef Riekenberg <wine.dev@web.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-07 11:27:12 -08:00
Rafael J. Wysocki 16cf0ebc35 x86/PCI: Do not use interrupt links for devices using MSI-X
pcibios_enable_device() and pcibios_disable_device() don't handle
IRQs for devices that have MSI enabled and it should treat the
devices with MSI-X enabled in the same way.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-07 11:13:25 -08:00
Ben Hutchings 6a479079c0 PCI: Add pci_clear_master() as opposite of pci_set_master()
During an online device reset it may be useful to disable bus-mastering.
pci_disable_device() does that, and far more besides, so is not suitable
for an online reset.

Add pci_clear_master() which does just this.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-07 11:13:23 -08:00
Kenji Kaneshige 322162a71b PCI: pciehp: cleanup register and field definitions
Clean up register definitions related to PCI Express Hot plug.

  - Add register definitions into include/linux/pci_regs.h, and use
    them instead of pciehp's locally definied register definitions.
  - Remove pciehp's locally defined register definitions
  - Remove unused register definitions in pciehp.
  - Some minor cleanups.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-07 11:13:22 -08:00
Stephen Hemminger db5679437a PCI: add interface to set visible size of VPD
The VPD on all devices may not be 32K. Unfortunately, there is no
generic way to find the size, so this adds a simple API hook
to reset it.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-07 11:13:18 -08:00
Stephen Hemminger 287d19ce2e PCI: revise VPD access interface
Change PCI VPD API which was only used by sysfs to something usable
in drivers.
   * move iteration over multiple words to the low level
   * use conventional types for arguments
   * add exportable wrapper

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-07 11:13:17 -08:00
Bjorn Helgaas 68feac87de PCI: add pci_common_swizzle() for INTx swizzling
This patch adds pci_common_swizzle(), which swizzles INTx values all the
way up to a root bridge.

This common implementation can replace several architecture-specific
ones.  This should someday be combined with pci_get_interrupt_pin(),
but I left it separate for now to make reviewing easier.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-07 11:13:12 -08:00
Kenji Kaneshige e8c331e963 PCI hotplug: introduce functions for ACPI slot detection
Some ACPI related PCI hotplug code can be shared among PCI hotplug
drivers. This patch introduces the following functions in
drivers/pci/hotplug/acpi_pcihp.c to share the code, and changes
acpiphp and pciehp to use them.

- int acpi_pci_detect_ejectable(struct pci_bus *pbus)
  This checks if the specified PCI bus has ejectable slots.

- int acpi_pci_check_ejectable(struct pci_bus *pbus, acpi_handle handle)
  This checks if the specified handle is ejectable ACPI PCI slot. The
  'pbus' parameter is needed to check if 'handle' is PCI related ACPI
  object.

This patch also introduces the following inline function in
include/linux/pci-acpi.h, which is useful to get ACPI handle of the
PCI bridge from struct pci_bus of the bridge's secondary bus.

- static inline acpi_handle acpi_pci_get_bridge_handle(struct pci_bus *pbus)
  This returns ACPI handle of the PCI bridge which generates PCI bus
  specified by 'pbus'.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-07 11:13:11 -08:00
Yu Zhao fde09c6d8f PCI: define PCI resource names in an 'enum'
This patch moves all definitions of the PCI resource names to an 'enum',
and also replaces some hard-coded resource variables with symbol
names. This change eases introduction of device specific resources.

Reviewed-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-07 11:13:01 -08:00
Yu Zhao 14add80b51 PCI: remove unnecessary arg of pci_update_resource()
This cleanup removes unnecessary argument 'struct resource *res' in
pci_update_resource(), so it takes same arguments as other companion
functions (pci_assign_resource(), etc.).

Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-07 11:13:00 -08:00
Andrew Morton 1684f5ddd4 PCI: uninline pci_ioremap_bar()
It's too large to be inlined.

Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-07 11:12:57 -08:00
Bjorn Helgaas 57c2cf71c1 PCI: add pci_swizzle_interrupt_pin()
This patch adds pci_swizzle_interrupt_pin(), which implements the
INTx swizzling algorithm specified in Table 9-1 of the "PCI-to-PCI
Bridge Architecture Specification," revision 1.2.

There are many architecture-specific implementations of this
swizzle that can be replaced by this common one.

Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-07 11:12:50 -08:00
Arjan van de Ven e8de1481fd resource: allow MMIO exclusivity for device drivers
Device drivers that use pci_request_regions() (and similar APIs) have a
reasonable expectation that they are the only ones accessing their device.
As part of the e1000e hunt, we were afraid that some userland (X or some
bootsplash stuff) was mapping the MMIO region that the driver thought it
had exclusively via /dev/mem or via various sysfs resource mappings.

This patch adds the option for device drivers to cause their reserved
regions to the "banned from /dev/mem use" list, so now both kernel memory
and device-exclusive MMIO regions are banned.
NOTE: This is only active when CONFIG_STRICT_DEVMEM is set.

In addition to the config option, a kernel parameter iomem=relaxed is
provided for the cases where developers want to diagnose, in the field,
drivers issues from userspace.

Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-07 11:12:32 -08:00
Andrew Patterson 2361694191 ACPI/PCI: remove obsolete _OSC capability support functions
The acpi_query_osc, __pci_osc_support_set, pci_osc_support_set, and
pcie_osc_support_set functions have been obsoleted in favor of setting
these capabilities during root bridge discovery with
pci_acpi_osc_support.  There are no longer any callers of these
functions, so remove them.

Signed-off-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-07 11:12:32 -08:00
Andrew Patterson 07ae95f988 ACPI/PCI: PCI MSI _OSC support capabilities called when root bridge added
The _OSC capability OSC_MSI_SUPPORT is set when the root bridge is added
with pci_acpi_osc_support(), so we no longer need to do it in the PCI
MSI driver.  Also adds the function pci_msi_enabled, which returns true
if pci=nomsi is not on the kernel command-line.

Signed-off-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-07 11:12:31 -08:00
Andrew Patterson 3e1b16002a ACPI/PCI: PCIe ASPM _OSC support capabilities called when root bridge added
The _OSC capabilities OSC_ACTIVE_STATE_PWR_SUPPORT and
OSC_CLOCK_PWR_CAPABILITY_SUPPORT are set when the root bridge is added
with pci_acpi_osc_support(), so we no longer need to do it in the ASPM
driver.  Also add the function pcie_aspm_enabled, which returns true if
pcie_aspm=off is not on the kernel command-line.

Signed-off-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-07 11:12:29 -08:00
Andrew Patterson 0ef5f8f615 ACPI/PCI: PCI extended config _OSC support called when root bridge added
The _OSC capability OSC_EXT_PCI_CONFIG_SUPPORT is set when the root
bridge is added with pci_acpi_osc_support() if we can access PCI
extended config space.

This adds the function pci_ext_cfg_avail which returns true if we can
access PCI extended config space (offset greater than 0xff). It
currently only returns false if arch=x86 and raw_pci_ext_ops is not set
(which might happen if pci=nommcfg is set on the kernel command-line).

Signed-off-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-07 11:12:28 -08:00
Andrew Patterson 990a7ac564 ACPI/PCI: call _OSC support during root bridge discovery
Add pci_acpi_osc_support() and call it when a PCI bridge is added.  This
allows us to avoid having every individual PCI root bridge driver call
_OSC support for every root bridge in their probe functions, a
significant savings in boot time.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-07 11:12:27 -08:00
Andrew Patterson 8b62091e20 ACPI/PCI: include missing acpi.h file in pci-acpi.h.
The pci-acpi.h file will not compile without including linux/acpi.h.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-07 11:12:26 -08:00
Sheng Yang f7b7baae6b PCI: add PCI Advanced Feature Capability defines
PCI Advanced Features Capability is introduced by "Conventional PCI
Advanced Caps ECN" (can be downloaded in pcisig.com).  Add defines for
the various AF capabilities, including function level reset (FLR).

Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-07 11:12:24 -08:00
Inaky Perez-Gonzalez e306987434 wimax: export linux/wimax.h and linux/wimax/i2400m.h with headers_install
These two files are what user space can use to establish communication
with the WiMAX kernel API and to speak the Intel 2400m Wireless WiMAX
connection's control protocol.

Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-01-07 10:00:22 -08:00
Inaky Perez-Gonzalez ea24652d25 i2400m: host/device procotol and core driver definitions
The wimax/i2400m.h defines the structures and constants for the
host-device protocols:

 - boot / firmware upload protocol

 - general data transport protocol

 - control protocol

It is done in such a way that can also be used verbatim by user space.

drivers/net/wimax/i2400m.h defines all the APIs used by the core,
bus-generic driver (i2400m) and the bus specific drivers
(i2400m-BUSNAME). It also gives a roadmap to the driver
implementation.

debug-levels.h adds the core driver's debug settings.

Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-01-07 10:00:18 -08:00
Inaky Perez-Gonzalez ea912f4e7f wimax: debug macros and debug settings for the WiMAX stack
This file contains a simple debug framework that is used in the stack;
it allows the debug level to be controlled at compile-time (so the
debug code is optimized out) and at run-time (for what wasn't compiled
out).

This is eventually going to be moved to use dynamic_printk(). Just
need to find time to do it.

Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-01-07 10:00:17 -08:00
Inaky Perez-Gonzalez ace22f0881 wimax: headers for kernel API and user space interaction
Definitions for the user/kernel API protocol through generic
netlink. User space can copy it verbatim and use it.

Kernel API definition declares the main data types and calls for the
drivers to integrate into the WiMAX stack. Provides usage
documentation.

Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-01-07 10:00:16 -08:00
Inaky Perez-Gonzalez 5e07878787 debugfs: add helpers for exporting a size_t simple value
In the same spirit as debugfs_create_*(), introduce helpers for
exporting size_t values over debugfs.

The only trick done is that the format verifier is kept at %llu
instead of %zu; otherwise type warnings would pop up:

format ‘%zu’ expects type ‘size_t’, but argument 2 has type ‘long long unsigned int’

There is no real way to fix this one--however, we can consider %llu
and %zu to be compatible if we consider that we are using the same for
validating in debugfs_create_{x,u}{8,16,32}().

Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-01-07 10:00:16 -08:00
Greg Kroah-Hartman 34c65d82e0 USB: remove info() macro from usb.h
USB should not be having it's own printk macros, so remove info() and
use the system-wide standard of dev_info() wherever possible.

No one in the tree is using the macro, so it can now be removed.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-01-07 10:00:14 -08:00
Greg Kroah-Hartman 338b67b0c1 USB: remove warn() macro from usb.h
USB should not be having it's own printk macros, so remove warn() and
use the system-wide standard of dev_warn() wherever possible.  In the
few places that will not work out, use a basic printk().

Now that all in-tree users are gone, remove the macro.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-01-07 10:00:14 -08:00
Alan Stern 25ff1c316f USB: storage: add last-sector hacks
This patch (as1189b) adds some hacks to usb-storage for dealing with
the growing problems involving bad capacity values and last-sector
accesses:

	A new flag, US_FL_CAPACITY_OK, is created to indicate that
	the device is known to report its capacity correctly.  An
	unusual_devs entry for Linux's own File-backed Storage Gadget
	is added with this flag set, since g_file_storage always
	reports the correct capacity and since the capacity need
	not be even (it is determined by the size of the backing
	file).

	An entry in unusual_devs.h which has only the CAPACITY_OK
	flag set shouldn't prejudice libusual, since the device will
	work perfectly well with either usb-storage or ub.  So a
	new macro, COMPLIANT_DEV, is added to let libusual know
	about these entries.

	When a last-sector access succeeds and the total number of
	sectors is odd (the unexpected case, in which guessing that
	the number is even might cause trouble), a WARN is triggered.
	The kerneloops.org project will collect these warnings,
	allowing us to add CAPACITY_OK flags for the devices in
	question before implementing the default-to-even heuristic.
	If users want to prevent the stack dump produced by the WARN,
	they can disable the hack by adding an unusual_devs entry
	for their device with the CAPACITY_OK flag.

	When a last-sector access fails three times in a row and
	neither the FIX_CAPACITY nor the CAPACITY_OK flag is set,
	we assume the last-sector bug is present.  We replace the
	existing status and sense data with values that will cause
	the SCSI core to fail the access immediately rather than
	retry indefinitely.  This should fix the difficulties
	people have been having with Nokia phones.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-01-07 10:00:11 -08:00