Commit graph

12838 commits

Author SHA1 Message Date
Ingo Molnar 0bf52b9817 net: Fix spinlock use in alloc_netdev_mq()
-tip testing found this lockdep warning:

[    2.272010] calling  net_dev_init+0x0/0x164 @ 1
[    2.276033] device class 'net': registering
[    2.280191] INFO: trying to register non-static key.
[    2.284005] the code is fine but needs lockdep annotation.
[    2.284005] turning off the locking correctness validator.
[    2.284005] Pid: 1, comm: swapper Not tainted 2.6.31-rc5-tip #1145
[    2.284005] Call Trace:
[    2.284005]  [<7958eb4e>] ? printk+0xf/0x11
[    2.284005]  [<7904f83c>] __lock_acquire+0x11b/0x622
[    2.284005]  [<7908c9b7>] ? alloc_debug_processing+0xf9/0x144
[    2.284005]  [<7904e2be>] ? mark_held_locks+0x3a/0x52
[    2.284005]  [<7908dbc4>] ? kmem_cache_alloc+0xa8/0x13f
[    2.284005]  [<7904e475>] ? trace_hardirqs_on_caller+0xa2/0xc3
[    2.284005]  [<7904fdf6>] lock_acquire+0xb3/0xd0
[    2.284005]  [<79489678>] ? alloc_netdev_mq+0xf5/0x1ad
[    2.284005]  [<79591514>] _spin_lock_bh+0x2d/0x5d
[    2.284005]  [<79489678>] ? alloc_netdev_mq+0xf5/0x1ad
[    2.284005]  [<79489678>] alloc_netdev_mq+0xf5/0x1ad
[    2.284005]  [<793a38f2>] ? loopback_setup+0x0/0x74
[    2.284005]  [<798eecd0>] loopback_net_init+0x20/0x5d
[    2.284005]  [<79483efb>] register_pernet_device+0x23/0x4b
[    2.284005]  [<798f5c9f>] net_dev_init+0x115/0x164
[    2.284005]  [<7900104f>] do_one_initcall+0x4a/0x11a
[    2.284005]  [<798f5b8a>] ? net_dev_init+0x0/0x164
[    2.284005]  [<79066f6d>] ? register_irq_proc+0x8c/0xa8
[    2.284005]  [<798cc29a>] do_basic_setup+0x42/0x52
[    2.284005]  [<798cc30a>] kernel_init+0x60/0xa1
[    2.284005]  [<798cc2aa>] ? kernel_init+0x0/0xa1
[    2.284005]  [<79003e03>] kernel_thread_helper+0x7/0x10
[    2.284078] device: 'lo': device_add
[    2.288248] initcall net_dev_init+0x0/0x164 returned 0 after 11718 usecs
[    2.292010] calling  neigh_init+0x0/0x66 @ 1
[    2.296010] initcall neigh_init+0x0/0x66 returned 0 after 0 usecs

it's using an zero-initialized spinlock. This is a side-effect of:

        dev_unicast_init(dev);

in alloc_netdev_mq() making use of dev->addr_list_lock.

The device has just been allocated freshly, it's not accessible
anywhere yet so no locking is needed at all - in fact it's wrong
to lock it here (the lock isnt initialized yet).

This bug was introduced via:

| commit a6ac65db23
| Date:   Thu Jul 30 01:06:12 2009 +0000
|
|     net: restore the original spinlock to protect unicast list

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Jiri Pirko <jpirko@redhat.com>
Tested-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-05 08:35:11 -07:00
David S. Miller eca4c3d2dd Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2009-08-03 19:05:50 -07:00
Luis R. Rodriguez 371842448c cfg80211: fix regression on beacon world roaming feature
A regression was added through patch a4ed90d6:

"cfg80211: respect API on orig_flags on channel for beacon hint"

We did indeed respect _orig flags but the intention was not clearly
stated in the commit log. This patch fixes firmware issues picked
up by iwlwifi when we lift passive scan of beaconing restrictions
on channels its EEPROM has been configured to always enable.

By doing so though we also disallowed beacon hints on devices
registering their wiphy with custom world regulatory domains
enabled, this happens to be currently ath5k, ath9k and ar9170.
The passive scan and beacon restrictions on those devices would
never be lifted even if we did find a beacon and the hardware did
support such enhancements when world roaming.

Since Johannes indicates iwlwifi firmware cannot be changed to
allow beacon hinting we set up a flag now to specifically allow
drivers to disable beacon hints for devices which cannot use them.

We enable the flag on iwlwifi to disable beacon hints and by default
enable it for all other drivers. It should be noted beacon hints lift
passive scan flags and beacon restrictions when we receive a beacon from
an AP on any 5 GHz non-DFS channels, and channels 12-14 on the 2.4 GHz
band. We don't bother with channels 1-11 as those channels are allowed
world wide.

This should fix world roaming for ath5k, ath9k and ar9170, thereby
improving scan time when we receive the first beacon from any AP,
and also enabling beaconing operation (AP/IBSS/Mesh) on cards which
would otherwise not be allowed to do so. Drivers not using custom
regulatory stuff (wiphy_apply_custom_regulatory()) were not affected
by this as the orig_flags for the channels would have been cleared
upon wiphy registration.

I tested this with a world roaming ath5k card.

Cc: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-08-03 16:31:21 -04:00
Johannes Berg cd3468bad9 cfg80211: add two missing NULL pointer checks
These pointers can be NULL, the is_mesh() case isn't
ever hit in the current kernel, but cmp_ies() can be
hit under certain conditions.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: stable@kernel.org [2.6.29, 2.6.30]
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-08-03 16:31:21 -04:00
Dave Young af0d3b103b bluetooth: rfcomm_init bug fix
rfcomm tty may be used before rfcomm_tty_driver initilized,
The problem is that now socket layer init before tty layer, if userspace
program do socket callback right here then oops will happen.

reporting in:
http://marc.info/?l=linux-bluetooth&m=124404919324542&w=2

make 3 changes:
1. remove #ifdef in rfcomm/core.c,
make it blank function when rfcomm tty not selected in rfcomm.h

2. tune the rfcomm_init error patch to ensure
tty driver initilized before rfcomm socket usage.

3. remove __exit for rfcomm_cleanup_sockets
because above change need call it in a __init function.

Reported-by: Oliver Hartkopp <oliver@hartkopp.net>
Tested-by: Oliver Hartkopp <oliver@hartkopp.net>
Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-03 13:24:39 -07:00
Jiri Pirko a6ac65db23 net: restore the original spinlock to protect unicast list
There is a path when an assetion in dev_unicast_sync() appears.

igmp6_group_added -> dev_mc_add -> __dev_set_rx_mode ->
-> vlan_dev_set_rx_mode -> dev_unicast_sync

Therefore we cannot protect this list with rtnl. This patch restores the
original protecting this list with spinlock.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Tested-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-02 12:20:46 -07:00
Eric Dumazet 144586301f net: net_assign_generic() fix
memcpy() should take into account size of pointers,
not only number of pointers to copy.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-02 12:20:36 -07:00
roel kluin a3e8ee6820 ipv4: ARP neigh procfs buffer overflow
If arp_format_neigh_entry() can be called with n->dev->addr_len == 0, then a
write to hbuffer[-1] occurs.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-30 13:27:29 -07:00
Julia Lawall ca7daea612 net/netlabel: Add kmalloc NULL tests
The test on map4 should be a test on map6.

The semantic match that finds this problem is as follows:
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@@
expression *x;
identifier f;
constant char *C;
@@

x = \(kmalloc\|kcalloc\|kzalloc\)(...);
... when != x == NULL
    when != x != NULL
    when != (x || ...)
(
kfree(x)
|
f(...,C,...,x,...)
|
*f(...,x,...)
|
*x->f
)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-30 10:58:28 -07:00
David S. Miller a1b97440ee Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2009-07-30 10:35:45 -07:00
Johannes Berg 89c3a8aca2 mac80211: fix suspend
Jan reported that his b43-based laptop hangs during suspend.
The problem turned out to be mac80211 asking the driver to
stop the hardware before removing interfaces, and interface
removal caused b43 to touch the hardware (while down, which
causes the hang).

This patch fixes mac80211 to do reorder these operations to
have them in the correct order -- first remove interfaces
and then stop the hardware. Some more code is necessary to
be able to do so in a race-free manner, in particular it is
necessary to not process frames received during quiescing.

Fixes http://bugzilla.kernel.org/show_bug.cgi?id=13337.

Reported-by: Jan Scholz <scholz@fias.uni-frankfurt.de>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-29 14:52:01 -04:00
Luis R. Rodriguez 78f1a8b758 mac80211: do not queue work after suspend in the dynamic ps timer
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-27 15:19:38 -04:00
Deepak Saxena 0cbb0a781a net: irda: init spinlock after memcpy
irttp_dup() copies a tsap_cb struct, but does not initialize the
spinlock in the new structure, which confuses lockdep.

Signed-off-by: Deepak Saxena <dsaxena@mvista.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-27 10:49:44 -07:00
Xiaotian Feng c587aea951 net/bridge: use kobject_put to release kobject in br_add_if error path
kobject_init_and_add will alloc memory for kobj->name, so in br_add_if
error path, simply use kobject_del will not free memory for kobj->name.
Fix by using kobject_put instead, kobject_put will internally calls
kobject_del and frees memory for kobj->name.

Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-26 19:20:51 -07:00
Ralf Baechle dcf777f6ed NET: ROSE: Don't use static buffer.
The use of a static buffer in rose2asc() to return its result is not
threadproof and can result in corruption if multiple threads are trying
to use one of the procfs files based on rose2asc().

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-26 19:11:14 -07:00
Christian Lamparter 9e81eccf19 cfg80211: double free in __cfg80211_scan_done
This patch fixes a double free corruption in __cfg80211_scan_done:

 ================================================
 BUG kmalloc-512: Object already free
 ------------------------------------------------

 INFO: Allocated in load_elf_binary+0x18b/0x19af age=6
 INFO: Freed in load_elf_binary+0x104e/0x19af age=5
 INFO: Slab 0xffffea0001bae4c0 objects=14 used=7
 INFO: Object 0xffff88007e8a9918 @offset=6424 fp=0xffff88007e8a9488

 Bytes b4 0xffff88007e8a9908:  00 00 00 00 00 00 00 00 5a 5a
 [...]
 Pid: 28705, comm: rmmod Tainted: P         C 2.6.31-rc2-wl #1
 Call Trace:
  [<ffffffff810da9f4>] print_trailer+0x14e/0x16e
  [<ffffffff810daa56>] object_err+0x42/0x61
  [<ffffffff810dbcd9>] __slab_free+0x2af/0x396
  [<ffffffffa0ec9694>] ? wiphy_unregister+0x92/0x142 [cfg80211]
  [<ffffffff810dd5e3>] kfree+0x13c/0x17a
  [<ffffffffa0ec9694>] ? wiphy_unregister+0x92/0x142 [cfg80211]
  [<ffffffffa0ec9694>] wiphy_unregister+0x92/0x142 [cfg80211]
  [<ffffffffa0eed163>] ieee80211_unregister_hw+0xc8/0xff [mac80211]
  [<ffffffffa0f3fbc8>] p54_unregister_common+0x31/0x66 [p54common]
  [...]
 FIX kmalloc-512: Object at 0xffff88007e8a9918 not freed

The code path which leads to the *funny* double free:

       request = rdev->scan_req;
       dev = dev_get_by_index(&init_net, request->ifidx);
	/*
	 * the driver was unloaded recently and
	 * therefore dev_get_by_index will return NULL!
	 */
        if (!dev)
                goto out;
	[...]
	rdev->scan_req = NULL; /* not executed... */
	[...]
 out:
        kfree(request);

Signed-off-by: Christian Lamparter <chunkeey@web.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-21 12:07:44 -04:00
Niko Jokinen 6c95e2a2f0 nl80211: Memory leak fixed
Potential memory leak via msg pointer in nl80211_get_key() function.

Signed-off-by: Niko Jokinen <ext-niko.k.jokinen@nokia.com>
Signed-off-by: Luciano Coelho <luciano.coelho@nokia.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-21 12:07:42 -04:00
Javier Cardona 35946a5710 mac80211: use correct address for mesh Path Error
For forwarded frames, we save the precursor address in addr1 in case it
needs to be used to send a Path Error.  mesh_path_discard_frame,
however, was using addr2 instead of addr1 to send Path Error frames, so
correct that and also make the comment regarding this more clear.

Signed-off-by: Andrey Yurovsky <andrey@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-21 12:07:40 -04:00
Alan Jenkins 48ab3578a6 rfkill: fix rfkill_set_states() to set the hw state
The point of this function is to set the software and hardware state at
the same time.  When I tried to use it, I found it was only setting the
software state.

Signed-off-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-21 12:07:38 -04:00
Pavel Roskin 8ef86c7bfa mac80211: fix injection in monitor mode
The location of the 802.11 header is calculated incorrectly due to a
wrong placement of parentheses.  Found by kmemcheck.

Signed-off-by: Pavel Roskin <proski@gnu.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-21 12:07:38 -04:00
Johannes Berg f54c142725 rfkill: allow toggling soft state in sysfs again
Apparently there actually _are_ tools that try to set
this in sysfs even though it wasn't supposed to be used
this way without claiming first. Guess what: now that
I've cleaned it all up it doesn't matter and we can
simply allow setting the soft-block state in sysfs.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Tested-By: Darren Salt <linux@youmustbejoking.demon.co.uk>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-21 12:07:37 -04:00
Johannes Berg e2e414d923 mac80211: disable mesh
My kvm instance was complaining a lot about sleeping
in atomic contexts in the mesh code, and it turns out
that both mesh_path_add() and mpp_path_add() need to
be able to sleep (they even use synchronize_rcu()!).
I put in a might_sleep() to annotate that, but I see
no way, at least right now, of actually making sure
those functions are only called from process context
since they are both called during TX and RX and the
mesh code itself even calls them with rcu_read_lock()
"held".

Therefore, let's disable it completely for now.

It's possible that I'm only seeing this because the
hwsim's beaconing is broken and thus the peers aren't
discovered right away, but it is possible that this
happens even if beaconing is working, for a peer that
doesn't exist or so.

It should be possible to solve this by deferring the
freeing of the tables to call_rcu() instead of using
synchronize_rcu(), and also using atomic allocations,
but maybe it makes more sense to rework the code to
not call these from atomic contexts and defer more of
the work to the workqueue. Right now, I can't work on
either of those solutions though.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-21 12:07:35 -04:00
Rémi Denis-Courmont f249fb7830 Fix error return for setsockopt(SO_TIMESTAMPING)
I guess it should be -EINVAL rather than EINVAL. I have not checked
when the bug came in. Perhaps a candidate for -stable?

Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-20 08:23:36 -07:00
John Dykstra e547bc1ecc tcp: Use correct peer adr when copying MD5 keys
When the TCP connection handshake completes on the passive
side, a variety of state must be set up in the "child" sock,
including the key if MD5 authentication is being used.  Fix TCP
for both address families to label the key with the peer's
destination address, rather than the address from the listening
sock, which is usually the wildcard.

Reported-by:   Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: John Dykstra <john.dykstra1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-20 07:49:08 -07:00
John Dykstra e3afe7b75e tcp: Fix MD5 signature checking on IPv4 mapped sockets
Fix MD5 signature checking so that an IPv4 active open
to an IPv6 socket can succeed.  In particular, use the
correct address family's signature generation function
for the SYN/ACK.

Reported-by:   Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: John Dykstra <john.dykstra1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-20 07:49:07 -07:00
Eric Dumazet 4dc6dc7162 net: sock_copy() fixes
Commit e912b1142b
(net: sk_prot_alloc() should not blindly overwrite memory)
took care of not zeroing whole new socket at allocation time.

sock_copy() is another spot where we should be very careful.
We should not set refcnt to a non null value, until
we are sure other fields are correctly setup, or
a lockless reader could catch this socket by mistake,
while not fully (re)initialized.

This patch puts sk_node & sk_refcnt to the very beginning
of struct sock to ease sock_copy() & sk_prot_alloc() job.

We add appropriate smp_wmb() before sk_refcnt initializations
to match our RCU requirements (changes to sock keys should
be committed to memory before sk_refcnt setting)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-16 18:05:26 -07:00
David S. Miller 95aa1fe4ab Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6 2009-07-16 17:34:50 -07:00
Eric Dumazet 941297f443 netfilter: nf_conntrack: nf_conntrack_alloc() fixes
When a slab cache uses SLAB_DESTROY_BY_RCU, we must be careful when allocating
objects, since slab allocator could give a freed object still used by lockless
readers.

In particular, nf_conntrack RCU lookups rely on ct->tuplehash[xxx].hnnode.next
being always valid (ie containing a valid 'nulls' value, or a valid pointer to next
object in hash chain.)

kmem_cache_zalloc() setups object with NULL values, but a NULL value is not valid
for ct->tuplehash[xxx].hnnode.next.

Fix is to call kmem_cache_alloc() and do the zeroing ourself.

As spotted by Patrick, we also need to make sure lookup keys are committed to
memory before setting refcount to 1, or a lockless reader could get a reference
on the old version of the object. Its key re-check could then pass the barrier.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-07-16 14:03:40 +02:00
Patrick McHardy aa6a03eb0a netfilter: xt_osf: fix nf_log_packet() arguments
The first argument is the address family, the second one the hook
number.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-07-16 14:01:54 +02:00
Lothar Waßmann b13bb2e993 net/can: add module alias to can protocol drivers
Add appropriate MODULE_ALIAS() to facilitate autoloading of can protocol drivers

Signed-off-by: Lothar Wassmann <LW@KARO-electronics.de>
Acked-by: Oliver Hartkopp <oliver@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-15 11:20:38 -07:00
Lothar Waßmann f7e5cc0c40 net/can bugfix: use after free bug in can protocol drivers
Fix a use after free bug in can protocol drivers

The release functions of the can protocol drivers lack a call to
sock_orphan() which leads to referencing freed memory under certain
circumstances.

This patch fixes a bug reported here:
https://lists.berlios.de/pipermail/socketcan-users/2009-July/000985.html

Signed-off-by: Lothar Wassmann <LW@KARO-electronics.de>
Acked-by: Oliver Hartkopp <oliver@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-15 11:20:37 -07:00
Andreas Jaggi ee686ca919 gre: fix ToS/DiffServ inherit bug
Fixes two bugs:
- ToS/DiffServ inheritance was unintentionally activated when using impair fixed ToS values
- ECN bit was lost during ToS/DiffServ inheritance

Signed-off-by: Andreas Jaggi <aj@open.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-14 09:35:59 -07:00
Sascha Hlusiak f2ba025b20 sit: fix regression: do not release skb->dst before xmit
The sit module makes use of skb->dst in it's xmit function, so since
93f154b594 ("net: release dst entry in dev_hard_start_xmit()") sit
tunnels are broken, because the flag IFF_XMIT_DST_RELEASE is not
unset.

This patch unsets that flag for sit devices to fix this
regression.

Signed-off-by: Sascha Hlusiak <contact@saschahlusiak.de>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-11 20:30:52 -07:00
Eric Dumazet e51a67a9c8 net: ip_push_pending_frames() fix
After commit 2b85a34e91
(net: No more expensive sock_hold()/sock_put() on each tx)
we do not take any more references on sk->sk_refcnt on outgoing packets.

I forgot to delete two __sock_put() from ip_push_pending_frames()
and ip6_push_pending_frames().

Reported-by: Emil S Tantilov <emils.tantilov@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Tested-by: Emil S Tantilov <emils.tantilov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-11 20:26:21 -07:00
Eric Dumazet e912b1142b net: sk_prot_alloc() should not blindly overwrite memory
Some sockets use SLAB_DESTROY_BY_RCU, and our RCU code correctness
depends on sk->sk_nulls_node.next being always valid. A NULL
value is not allowed as it might fault a lockless reader.

Current sk_prot_alloc() implementation doesnt respect this hypothesis,
calling kmem_cache_alloc() with __GFP_ZERO. Just call memset() around
the forbidden field.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-11 20:26:19 -07:00
Jiri Olsa a57de0b433 net: adding memory barrier to the poll and receive callbacks
Adding memory barrier after the poll_wait function, paired with
receive callbacks. Adding fuctions sock_poll_wait and sk_has_sleeper
to wrap the memory barrier.

Without the memory barrier, following race can happen.
The race fires, when following code paths meet, and the tp->rcv_nxt
and __add_wait_queue updates stay in CPU caches.

CPU1                         CPU2

sys_select                   receive packet
  ...                        ...
  __add_wait_queue           update tp->rcv_nxt
  ...                        ...
  tp->rcv_nxt check          sock_def_readable
  ...                        {
  schedule                      ...
                                if (sk->sk_sleep && waitqueue_active(sk->sk_sleep))
                                        wake_up_interruptible(sk->sk_sleep)
                                ...
                             }

If there was no cache the code would work ok, since the wait_queue and
rcv_nxt are opposit to each other.

Meaning that once tp->rcv_nxt is updated by CPU2, the CPU1 either already
passed the tp->rcv_nxt check and sleeps, or will get the new value for
tp->rcv_nxt and will return with new data mask.
In both cases the process (CPU1) is being added to the wait queue, so the
waitqueue_active (CPU2) call cannot miss and will wake up CPU1.

The bad case is when the __add_wait_queue changes done by CPU1 stay in its
cache, and so does the tp->rcv_nxt update on CPU2 side.  The CPU1 will then
endup calling schedule and sleep forever if there are no more data on the
socket.

Calls to poll_wait in following modules were ommited:
	net/bluetooth/af_bluetooth.c
	net/irda/af_irda.c
	net/irda/irnet/irnet_ppp.c
	net/mac80211/rc80211_pid_debugfs.c
	net/phonet/socket.c
	net/rds/af_rds.c
	net/rfkill/core.c
	net/sunrpc/cache.c
	net/sunrpc/rpc_pipe.c
	net/tipc/socket.c

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-09 17:06:57 -07:00
Anton Vorontsov 1b614fb9a0 netpoll: Fix carrier detection for drivers that are using phylib
Using early netconsole and gianfar driver this error pops up:

  netconsole: timeout waiting for carrier

It appears that net/core/netpoll.c:netpoll_setup() is using
cond_resched() in a loop waiting for a carrier.

The thing is that cond_resched() is a no-op when system_state !=
SYSTEM_RUNNING, and so drivers/net/phy/phy.c's state_queue is never
scheduled, therefore link detection doesn't work.

I belive that the main problem is in cond_resched()[1], but despite
how the cond_resched() story ends, it might be a good idea to call
msleep(1) instead of cond_resched(), as suggested by Andrew Morton.

[1] http://lkml.org/lkml/2009/7/7/463

Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-08 20:09:44 -07:00
David S. Miller d2daeabf62 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2009-07-08 18:13:13 -07:00
Jarek Poplawski 345aa03120 ipv4: Fix fib_trie rebalancing, part 4 (root thresholds)
Pawel Staszewski wrote:
<blockquote>
Some time ago i report this:
http://bugzilla.kernel.org/show_bug.cgi?id=6648

and now with 2.6.29 / 2.6.29.1 / 2.6.29.3 and 2.6.30 it back
dmesg output:
oprofile: using NMI interrupt.
Fix inflate_threshold_root. Now=15 size=11 bits
...
Fix inflate_threshold_root. Now=15 size=11 bits

cat /proc/net/fib_triestat
Basic info: size of leaf: 40 bytes, size of tnode: 56 bytes.
Main:
        Aver depth:     2.28
        Max depth:      6
        Leaves:         276539
        Prefixes:       289922
        Internal nodes: 66762
          1: 35046  2: 13824  3: 9508  4: 4897  5: 2331  6: 1149  7: 5
9: 1  18: 1
        Pointers: 691228
Null ptrs: 347928
Total size: 35709  kB
</blockquote>

It seems, the current threshold for root resizing is too aggressive,
and it causes misleading warnings during big updates, but it might be
also responsible for memory problems, especially with non-preempt
configs, when RCU freeing is delayed long after call_rcu.

It should be also mentioned that because of non-atomic changes during
resizing/rebalancing the current lookup algorithm can miss valid leaves
so it's additional argument to shorten these activities even at a cost
of a minimally longer searching.

This patch restores values before the patch "[IPV4]: fib_trie root
node settings", commit: 965ffea43d from
v2.6.22.

Pawel's report:
<blockquote>
I dont see any big change of (cpu load or faster/slower
routing/propagating routes from bgpd or something else) - in avg there
is from 2% to 3% more of CPU load i dont know why but it is - i change
from "preempt" to "no preempt" 3 times and check this my "mpstat -P ALL
1 30"
always avg cpu load was from 2 to 3% more compared to "no preempt"
[...]
cat /proc/net/fib_triestat
Basic info: size of leaf: 20 bytes, size of tnode: 36 bytes.
Main:
        Aver depth:     2.44
        Max depth:      6
        Leaves:         277814
        Prefixes:       291306
        Internal nodes: 66420
          1: 32737  2: 14850  3: 10332  4: 4871  5: 2313  6: 942  7: 371  8: 3  17: 1
        Pointers: 599098
Null ptrs: 254865
Total size: 18067  kB
</blockquote>

According to this and other similar reports average depth is slightly
increased (~0.2), and root nodes are shorter (log 17 vs. 18), but
there is no visible performance decrease. So, until memory handling is
improved or added parameters for changing this individually, this
patch resets to safer defaults.

Reported-by: Pawel Staszewski <pstaszewski@itcare.pl>
Reported-by: Jorge Boncompte [DTI2] <jorge@dti2.net>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Tested-by: Pawel Staszewski <pstaszewski@itcare.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-08 10:46:45 -07:00
Luciano Coelho 3938b45c1c mac80211: minstrel: avoid accessing negative indices in rix_to_ndx()
If rix is not found in mi->r[], i will become -1 after the loop.  This value
is eventually used to access arrays, so we were accessing arrays with a
negative index, which is obviously not what we want to do.  This patch fixes
this potential problem.

Signed-off-by: Luciano Coelho <luciano.coelho@nokia.com>
Acked-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-07 12:55:28 -04:00
Johannes Berg 2dce4c2b5f cfg80211: fix refcount leak
The code in cfg80211's cfg80211_bss_update erroneously
grabs a reference to the BSS, which means that it will
never be freed.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: stable@kernel.org [2.6.29, 2.6.30]
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-07 12:55:28 -04:00
Andrey Yurovsky 59615b5f9d mac80211: fix allocation in mesh_queue_preq
We allocate a PREQ queue node in mesh_queue_preq, however the allocation
may cause us to sleep.  Use GFP_ATOMIC to prevent this.

[ 1869.126498] BUG: scheduling while atomic: ping/1859/0x10000100
[ 1869.127164] Modules linked in: ath5k mac80211 ath
[ 1869.128310] Pid: 1859, comm: ping Not tainted 2.6.30-wl #1
[ 1869.128754] Call Trace:
[ 1869.129293]  [<c1023a2b>] __schedule_bug+0x48/0x4d
[ 1869.129866]  [<c13b5533>] __schedule+0x77/0x67a
[ 1869.130544]  [<c1026f2e>] ? release_console_sem+0x17d/0x185
[ 1869.131568]  [<c807cf47>] ? mesh_queue_preq+0x2b/0x165 [mac80211]
[ 1869.132318]  [<c13b5b3e>] schedule+0x8/0x1f
[ 1869.132807]  [<c1023c12>] __cond_resched+0x16/0x2f
[ 1869.133478]  [<c13b5bf0>] _cond_resched+0x27/0x32
[ 1869.134191]  [<c108a370>] kmem_cache_alloc+0x1c/0xcf
[ 1869.134714]  [<c10273ae>] ? printk+0x15/0x17
[ 1869.135670]  [<c807cf47>] mesh_queue_preq+0x2b/0x165 [mac80211]
[ 1869.136731]  [<c807d1f8>] mesh_nexthop_lookup+0xee/0x12d [mac80211]
[ 1869.138130]  [<c807417e>] ieee80211_xmit+0xe6/0x2b2 [mac80211]
[ 1869.138935]  [<c80be46d>] ? ath5k_hw_setup_rx_desc+0x0/0x66 [ath5k]
[ 1869.139831]  [<c80c97bc>] ? ath5k_tasklet_rx+0xba/0x506 [ath5k]
[ 1869.140863]  [<c8075191>] ieee80211_subif_start_xmit+0x6c9/0x6e4
[mac80211]
[ 1869.141665]  [<c105cf1c>] ? handle_level_irq+0x78/0x9d
[ 1869.142390]  [<c12e3f93>] dev_hard_start_xmit+0x168/0x1c7
[ 1869.143092]  [<c12f1f17>] __qdisc_run+0xe1/0x1b7
[ 1869.143612]  [<c12e25ff>] qdisc_run+0x18/0x1a
[ 1869.144248]  [<c12e62f4>] dev_queue_xmit+0x16a/0x25a
[ 1869.144785]  [<c13b6dcc>] ? _read_unlock_bh+0xe/0x10
[ 1869.145465]  [<c12eacdb>] neigh_resolve_output+0x19c/0x1c7
[ 1869.146182]  [<c130e2da>] ? ip_finish_output+0x0/0x51
[ 1869.146697]  [<c130e2a0>] ip_finish_output2+0x182/0x1bc
[ 1869.147358]  [<c130e327>] ip_finish_output+0x4d/0x51
[ 1869.147863]  [<c130e9d5>] ip_output+0x80/0x85
[ 1869.148515]  [<c130cc49>] dst_output+0x9/0xb
[ 1869.149141]  [<c130dec6>] ip_local_out+0x17/0x1a
[ 1869.149632]  [<c130e0bc>] ip_push_pending_frames+0x1f3/0x255
[ 1869.150343]  [<c13247ff>] raw_sendmsg+0x5e6/0x667
[ 1869.150883]  [<c1033c55>] ? insert_work+0x6a/0x73
[ 1869.151834]  [<c8071e00>] ?
ieee80211_invoke_rx_handlers+0x17da/0x1ae8 [mac80211]
[ 1869.152630]  [<c132bd68>] inet_sendmsg+0x3b/0x48
[ 1869.153232]  [<c12d7deb>] __sock_sendmsg+0x45/0x4e
[ 1869.153740]  [<c12d8537>] sock_sendmsg+0xb8/0xce
[ 1869.154519]  [<c80be46d>] ? ath5k_hw_setup_rx_desc+0x0/0x66 [ath5k]
[ 1869.155289]  [<c1036b25>] ? autoremove_wake_function+0x0/0x30
[ 1869.155859]  [<c115992b>] ? __copy_from_user_ll+0x11/0xce
[ 1869.156573]  [<c1159d99>] ? copy_from_user+0x31/0x54
[ 1869.157235]  [<c12df646>] ? verify_iovec+0x40/0x6e
[ 1869.157778]  [<c12d869a>] sys_sendmsg+0x14d/0x1a5
[ 1869.158714]  [<c8072c40>] ? __ieee80211_rx+0x49e/0x4ee [mac80211]
[ 1869.159641]  [<c80c83fe>] ? ath5k_rxbuf_setup+0x6d/0x8d [ath5k]
[ 1869.160543]  [<c80be46d>] ? ath5k_hw_setup_rx_desc+0x0/0x66 [ath5k]
[ 1869.161434]  [<c80beba4>] ? ath5k_hw_get_rxdp+0xe/0x10 [ath5k]
[ 1869.162319]  [<c80c97bc>] ? ath5k_tasklet_rx+0xba/0x506 [ath5k]
[ 1869.163063]  [<c1005627>] ? enable_8259A_irq+0x40/0x43
[ 1869.163594]  [<c101edb8>] ? __dequeue_entity+0x23/0x27
[ 1869.164793]  [<c100187a>] ? __switch_to+0x2b/0x105
[ 1869.165442]  [<c1021d5f>] ? finish_task_switch+0x5b/0x74
[ 1869.166129]  [<c12d963a>] sys_socketcall+0x14b/0x17b
[ 1869.166612]  [<c1002b95>] syscall_call+0x7/0xb

Signed-off-by: Andrey Yurovsky <andrey@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-07 12:55:27 -04:00
Jiri Slaby 1f5fc70a25 Wireless: nl80211, fix lock imbalance
Don't forget to unlock cfg80211_mutex in one fail path of
nl80211_set_wiphy.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-07-07 12:55:25 -04:00
Wei Yongjun 1bc4ee4088 sctp: fix warning at inet_sock_destruct() while release sctp socket
Commit 'net: Move rx skb_orphan call to where needed' broken sctp protocol
with warning at inet_sock_destruct(). Actually, sctp can do this right with
sctp_sock_rfree_frag() and sctp_skb_set_owner_r_frag() pair.

    sctp_sock_rfree_frag(skb);
    sctp_skb_set_owner_r_frag(skb, newsk);

This patch not revert the commit d55d87fdff,
instead remove the sctp_sock_rfree_frag() function.

------------[ cut here ]------------
WARNING: at net/ipv4/af_inet.c:151 inet_sock_destruct+0xe0/0x142()
Modules linked in: sctp ipv6 dm_mirror dm_region_hash dm_log dm_multipath
scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: scsi_wait_scan]
Pid: 1808, comm: sctp_test Not tainted 2.6.31-rc2 #40
Call Trace:
 [<c042dd06>] warn_slowpath_common+0x6a/0x81
 [<c064a39a>] ? inet_sock_destruct+0xe0/0x142
 [<c042dd2f>] warn_slowpath_null+0x12/0x15
 [<c064a39a>] inet_sock_destruct+0xe0/0x142
 [<c05fde44>] __sk_free+0x19/0xcc
 [<c05fdf50>] sk_free+0x18/0x1a
 [<ca0d14ad>] sctp_close+0x192/0x1a1 [sctp]
 [<c0649f7f>] inet_release+0x47/0x4d
 [<c05fba4d>] sock_release+0x19/0x5e
 [<c05fbab3>] sock_close+0x21/0x25
 [<c049c31b>] __fput+0xde/0x189
 [<c049c3de>] fput+0x18/0x1a
 [<c049988f>] filp_close+0x56/0x60
 [<c042f422>] put_files_struct+0x5d/0xa1
 [<c042f49f>] exit_files+0x39/0x3d
 [<c043086a>] do_exit+0x1a5/0x5dd
 [<c04a86c2>] ? d_kill+0x35/0x3b
 [<c0438fa4>] ? dequeue_signal+0xa6/0x115
 [<c0430d05>] do_group_exit+0x63/0x8a
 [<c0439504>] get_signal_to_deliver+0x2e1/0x2f9
 [<c0401d9e>] do_notify_resume+0x7c/0x6b5
 [<c043f601>] ? autoremove_wake_function+0x0/0x34
 [<c04a864e>] ? __d_free+0x3d/0x40
 [<c04a867b>] ? d_free+0x2a/0x3c
 [<c049ba7e>] ? vfs_write+0x103/0x117
 [<c05fc8fa>] ? sys_socketcall+0x178/0x182
 [<c0402a56>] work_notifysig+0x13/0x19
---[ end trace 9db92c463e789fba ]---

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-06 12:47:08 -07:00
Stephane Contri 1ded3f59f3 dsa: fix 88e6xxx statistics counter snapshotting
The bit that tells us whether a statistics counter snapshot operation
has completed is located in the GLOBAL register block, not in the
GLOBAL2 register block, so fix up mv88e6xxx_stats_wait() to poll the
right register address.

Signed-off-by: Stephane Contri <Stephane.Contri@grassvalley.com>
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Cc: stable@kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-05 18:03:35 -07:00
Brian Haley a1ed05263b IPv6: preferred lifetime of address not getting updated
There's a bug in addrconf_prefix_rcv() where it won't update the
preferred lifetime of an IPv6 address if the current valid lifetime
of the address is less than 2 hours (the minimum value in the RA).

For example, If I send a router advertisement with a prefix that
has valid lifetime = preferred lifetime = 2 hours we'll build
this address:

3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
    inet6 2001:1890:1109:a20:217:8ff:fe7d:4718/64 scope global dynamic
       valid_lft 7175sec preferred_lft 7175sec

If I then send the same prefix with valid lifetime = preferred
lifetime = 0 it will be ignored since the minimum valid lifetime
is 2 hours:

3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
    inet6 2001:1890:1109:a20:217:8ff:fe7d:4718/64 scope global dynamic
       valid_lft 7161sec preferred_lft 7161sec

But according to RFC 4862 we should always reset the preferred lifetime
even if the valid lifetime is invalid, which would cause the address
to immediately get deprecated.  So with this patch we'd see this:

5: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
    inet6 2001:1890:1109:a20:21f:29ff:fe5a:ef04/64 scope global deprecated dynamic
       valid_lft 7163sec preferred_lft 0sec

The comment winds-up being 5x the size of the code to fix the problem.

Update the preferred lifetime of IPv6 addresses derived from a prefix
info option in a router advertisement even if the valid lifetime in
the option is invalid, as specified in RFC 4862 Section 5.5.3e.  Fixes
an issue where an address will not immediately become deprecated.
Reported by Jens Rosenboom.

Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-03 19:10:13 -07:00
Wei Yongjun 59cae0092e xfrm6: fix the proto and ports decode of sctp protocol
The SCTP pushed the skb above the sctp chunk header, so the
check of pskb_may_pull(skb, nh + offset + 1 - skb->data) in
_decode_session6() will never return 0 and the ports decode
of sctp will always fail. (nh + offset + 1 - skb->data < 0)

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-03 19:10:10 -07:00
Wei Yongjun c615c9f3f3 xfrm4: fix the ports decode of sctp protocol
The SCTP pushed the skb data above the sctp chunk header, so the check
of pskb_may_pull(skb, xprth + 4 - skb->data) in _decode_session4() will
never return 0 because xprth + 4 - skb->data < 0, the ports decode of
sctp will always fail.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-03 19:10:06 -07:00
Abhishek Kulkarni 15da4b1612 net/9p: Fix crash due to bad mount parameters.
It is not safe to use match_int without checking the token type returned
by match_token (especially when the token type returned is Opt_err and
args is empty). Fix it.

Signed-off-by: Abhishek Kulkarni <adkulkar@umail.iu.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-02 13:17:01 -07:00
Eric W. Biederman f8a68e752b Revert "ipv4: arp announce, arp_proxy and windows ip conflict verification"
This reverts commit 73ce7b01b4.

After discovering that we don't listen to gratuitious arps in 2.6.30
I tracked the failure down to this commit.

The patch makes absolutely no sense.  RFC2131 RFC3927 and RFC5227.
are all in agreement that an arp request with sip == 0 should be used
for the probe (to prevent learning) and an arp request with sip == tip
should be used for the gratitous announcement that people can learn
from.

It appears the author of the broken patch got those two cases confused
and modified the code to drop all gratuitous arp traffic.  Ouch!

Cc: stable@kernel.org
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-30 19:47:08 -07:00