Commit graph

29853 commits

Author SHA1 Message Date
Alexander Aring b236b954de 6lowpan: remove skb->dev assignment
This patch removes the assignment of skb->dev. We don't need it here because
we use the netdev_alloc_skb_ip_align function which already sets the
skb->dev.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Reviewed-by: Werner Almesberger <werner@almesberger.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-28 19:47:52 -04:00
Alexander Aring b614442f34 6lowpan: use netdev_alloc_skb instead dev_alloc_skb
This patch uses the netdev_alloc_skb function instead of dev_alloc_skb and
drops the separate assignment to skb->dev.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Reviewed-by: Werner Almesberger <werner@almesberger.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-28 19:47:51 -04:00
Alexander Aring 53cb5717b4 6lowpan: remove unnecessary check on err >= 0
The err variable can only be zero in this case.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Reviewed-by: Werner Almesberger <werner@almesberger.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-28 19:47:51 -04:00
Alexander Aring 545f3613a8 6lowpan: remove unnecessary ret variable
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Reviewed-by: Werner Almesberger <werner@almesberger.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-28 19:47:51 -04:00
wangweidong 747edc0f9e sctp: merge two if statements to one
Two if statements do the same work, so we can merge them into
one. Also fix some typos. This is just code simplification,
with no functional changes.

Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-28 01:02:34 -04:00
wangweidong 3dc0a548a0 sctp: remove the repeat initialize with 0
kmem_cache_zalloc has already set the allocated memory to zero, so there is
no need to initialize with 0 again. Also move the comments to the beginning
of the function.

Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-28 01:02:34 -04:00
wangweidong 2bccbadf20 sctp: fix some comments in chunk.c and associola.c
fix some typos

Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-28 01:02:34 -04:00
Eric Dumazet 8c3a897bfa inet: restore gso for vxlan
Alexei reported a performance regression on vxlan, caused
by commit 3347c96029 "ipv4: gso: make inet_gso_segment() stackable"

GSO vxlan packets were not properly segmented, adding IP fragments
while they were not expected.

Rename 'bool tunnel' to 'bool encap', and add a new boolean
to express the fact that UDP should be fragmented.
This fragmentation is triggered by skb->encapsulation being set.

Remove a "skb->encapsulation = 1" added in above commit,
as it's not needed, as frags inherit skb->frag from original
GSO skb.

Reported-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-28 00:23:06 -04:00
Alexei Starovoitov 7f29405403 net: fix rtnl notification in atomic context
commit 991fb3f74c "dev: always advertise rx_flags changes via netlink"
introduced rtnl notification from __dev_set_promiscuity(),
which can be called in atomic context.

Steps to reproduce:
ip tuntap add dev tap1 mode tap
ifconfig tap1 up
tcpdump -nei tap1 &
ip tuntap del dev tap1 mode tap

[  271.627994] device tap1 left promiscuous mode
[  271.639897] BUG: sleeping function called from invalid context at mm/slub.c:940
[  271.664491] in_atomic(): 1, irqs_disabled(): 0, pid: 3394, name: ip
[  271.677525] INFO: lockdep is turned off.
[  271.690503] CPU: 0 PID: 3394 Comm: ip Tainted: G        W    3.12.0-rc3+ #73
[  271.703996] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012
[  271.731254]  ffffffff81a58506 ffff8807f0d57a58 ffffffff817544e5 ffff88082fa0f428
[  271.760261]  ffff8808071f5f40 ffff8807f0d57a88 ffffffff8108bad1 ffffffff81110ff8
[  271.790683]  0000000000000010 00000000000000d0 00000000000000d0 ffff8807f0d57af8
[  271.822332] Call Trace:
[  271.838234]  [<ffffffff817544e5>] dump_stack+0x55/0x76
[  271.854446]  [<ffffffff8108bad1>] __might_sleep+0x181/0x240
[  271.870836]  [<ffffffff81110ff8>] ? rcu_irq_exit+0x68/0xb0
[  271.887076]  [<ffffffff811a80be>] kmem_cache_alloc_node+0x4e/0x2a0
[  271.903368]  [<ffffffff810b4ddc>] ? vprintk_emit+0x1dc/0x5a0
[  271.919716]  [<ffffffff81614d67>] ? __alloc_skb+0x57/0x2a0
[  271.936088]  [<ffffffff810b4de0>] ? vprintk_emit+0x1e0/0x5a0
[  271.952504]  [<ffffffff81614d67>] __alloc_skb+0x57/0x2a0
[  271.968902]  [<ffffffff8163a0b2>] rtmsg_ifinfo+0x52/0x100
[  271.985302]  [<ffffffff8162ac6d>] __dev_notify_flags+0xad/0xc0
[  272.001642]  [<ffffffff8162ad0c>] __dev_set_promiscuity+0x8c/0x1c0
[  272.017917]  [<ffffffff81731ea5>] ? packet_notifier+0x5/0x380
[  272.033961]  [<ffffffff8162b109>] dev_set_promiscuity+0x29/0x50
[  272.049855]  [<ffffffff8172e937>] packet_dev_mc+0x87/0xc0
[  272.065494]  [<ffffffff81732052>] packet_notifier+0x1b2/0x380
[  272.080915]  [<ffffffff81731ea5>] ? packet_notifier+0x5/0x380
[  272.096009]  [<ffffffff81761c66>] notifier_call_chain+0x66/0x150
[  272.110803]  [<ffffffff8108503e>] __raw_notifier_call_chain+0xe/0x10
[  272.125468]  [<ffffffff81085056>] raw_notifier_call_chain+0x16/0x20
[  272.139984]  [<ffffffff81620190>] call_netdevice_notifiers_info+0x40/0x70
[  272.154523]  [<ffffffff816201d6>] call_netdevice_notifiers+0x16/0x20
[  272.168552]  [<ffffffff816224c5>] rollback_registered_many+0x145/0x240
[  272.182263]  [<ffffffff81622641>] rollback_registered+0x31/0x40
[  272.195369]  [<ffffffff816229c8>] unregister_netdevice_queue+0x58/0x90
[  272.208230]  [<ffffffff81547ca0>] __tun_detach+0x140/0x340
[  272.220686]  [<ffffffff81547ed6>] tun_chr_close+0x36/0x60

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-25 19:03:45 -04:00
Hannes Frederic Sowa 66415cf8a1 net: initialize hashrnd in flow_dissector with net_get_random_once
We can also defer the initialization of hashrnd in flow_dissector
to its first use. Since net_get_random_once is irq safe now, we don't
have to audit the call paths to see whether one of these functions gets
called by an interrupt handler.

Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-25 19:03:39 -04:00
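The deferred seeding described in commit 66415cf8a1 above can be pictured with a small user-space sketch. This is only an illustration of "generate the secret on first use, exactly once"; it is not how net_get_random_once is implemented in the kernel. The names OnceSecret and flow_hash are hypothetical, and os.urandom stands in for get_random_bytes.

```python
import os
import threading


class OnceSecret:
    """Random secret that is generated on first use and never again (hypothetical helper)."""

    def __init__(self, nbytes=4):
        self._nbytes = nbytes
        self._lock = threading.Lock()
        self._value = None
        self.init_calls = 0  # how many times the secret was actually generated

    def get(self):
        # Fast path: the secret already exists, no locking needed.
        if self._value is not None:
            return self._value
        # Slow path: take the lock and re-check so only one caller seeds.
        with self._lock:
            if self._value is None:
                self._value = os.urandom(self._nbytes)
                self.init_calls += 1
        return self._value


def flow_hash(secret, src, dst):
    """Toy keyed hash: the secret is fetched when hashing, not at start-up."""
    return hash((secret.get(), src, dst)) & 0xFFFFFFFF
```

Because the secret is fetched inside the hashing function, callers do not need a separate initialization step, which is the property the commit message relies on.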
Hannes Frederic Sowa f84be2bd96 net: make net_get_random_once irq safe
I initially built a non-irq-safe version of net_get_random_once because I
would have liked to have the freedom to defer even the extraction process of
get_random_bytes until the nonblocking pool is fully seeded.

I don't think this is a good idea anymore and thus this patch makes
net_get_random_once irq safe. Now someone using net_get_random_once does
not need to care from where it is called.

Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-25 19:03:39 -04:00
Nikolay Aleksandrov 974daef7f8 net: add missing dev_put() in __netdev_adjacent_dev_insert
I think that a dev_put() is needed in the error path to preserve the
proper dev refcount.

CC: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Acked-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-25 19:03:39 -04:00
Hagen Paul Pfeifer 4a3ad7b3ea netem: markov loss model transition fix
The transition from markov state "3 => lost packets within a burst
period" to "1 => successfully transmitted packets within a gap period"
has no *additional* loss event. The loss already happens for the transition
from 1 -> 3; this additional loss will make things go wild.

E.g. transition probabilities:

p13:   10%
p31:  100%

Expected:

Ploss = p13 / (p13 + p31)
Ploss = ~9.09%

... but it isn't. Even worse: we get a double loss - each time.
So simply don't return true to indicate loss; rather, break and return
false.

Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Stefano Salsano <stefano.salsano@uniroma2.it>
Cc: Fabio Ludovici <fabio.ludovici@yahoo.it>
Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-25 19:03:39 -04:00
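The numbers quoted in commit 4a3ad7b3ea above can be checked with a short calculation. The sketch below assumes a simplified two-state reading of the model (state 1 = gap, state 3 = burst) in which a packet is lost on the 1 -> 3 transition and while the chain stays in state 3. The function loss_rate and its flag are hypothetical and do not mirror the netem source.

```python
def loss_rate(p13, p31, count_loss_on_3_to_1):
    """Long-run fraction of lost packets for a two-state (1 = gap, 3 = burst) chain."""
    # Stationary probability of finding the chain in each state.
    pi1 = p31 / (p13 + p31)
    pi3 = p13 / (p13 + p31)
    # Losses on entering the burst state and on staying in it.
    lost = pi1 * p13 + pi3 * (1.0 - p31)
    if count_loss_on_3_to_1:
        # Behaviour the commit describes as wrong: 3 -> 1 also reports a loss.
        lost += pi3 * p31
    return lost


fixed = loss_rate(0.10, 1.00, count_loss_on_3_to_1=False)
buggy = loss_rate(0.10, 1.00, count_loss_on_3_to_1=True)
print(f"fixed: {fixed:.4f}, buggy: {buggy:.4f}")  # fixed: 0.0909, buggy: 0.1818
```

Under these assumptions the corrected behaviour reproduces Ploss = p13 / (p13 + p31), while counting the 3 -> 1 transition as a loss doubles the rate when p31 is 100%.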
David S. Miller b45bd46dec Included changes:
- data structure reshaping to accommodate multiple routing protocol
  implementations
- routing protocol API enhancement
- send to userspace the event "batman-adv Gateway loss" in case of soft-iface
  destruction and a "batman-adv Gateway" was configured
- improve the TT component to support and advertise runtime flag changes
- minor code refactoring
- make the ICMP kernel-to-userspace communication more generic

Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge

Antonio Quartulli says:

====================
this is another set of changes intended for net-next/linux-3.13.
(probably our last pull request for this cycle)

Patches 1 and 2 reshape two of our main data structures in a way that they can
easily be extended in the future to accommodate new routing protocols.

Patches from 3 to 9 improve our routing protocol API and its users so that all
the protocol-related code is not mixed up with the other components anymore.

Patch 10 limits the local Translation Table maximum size to a value such that it
can be fully transferred over the air if needed. This value depends on
fragmentation being enabled or not and on the mtu values.

Patch 11 makes batman-adv send a uevent in case of soft-interface destruction
while a "bat-Gateway" was configured (this informs userspace about the GW not
being available anymore).

Patches 13 and 14 enable the TT component to detect non-mesh client flag
changes at runtime (till now those flags were set upon client detection and
were not changed anymore).

Patch 16 is a generalisation of our user-to-kernel space communication (and
vice versa) used to exchange ICMP packets sent to and received from the mesh
network. Now it can easily accommodate new ICMP packet types without breaking
the existing userspace API anymore.

Remaining patches are minor changes and cleanups.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-23 17:12:33 -04:00
Hannes Frederic Sowa 7088ad74e6 inet: remove old fragmentation hash initializing
All fragmentation hash secrets now get initialized by their
corresponding hash function with net_get_random_once. Thus we can
eliminate the initial seeding.

Also provide a comment that hash secret seeding happens at the first
call to the corresponding hashing function.

Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-23 17:01:41 -04:00
Hannes Frederic Sowa b1190570b4 ipv6: split inet6_hash_frag for netfilter and initialize secrets with net_get_random_once
Defer the fragmentation hash secret initialization for IPv6 like the
previous patch did for IPv4.

Because the netfilter logic reuses the hash secret we have to split it
first. Thus introduce a new nf_hash_frag function which takes care to
seed the hash secret.

Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-23 17:01:40 -04:00
Hannes Frederic Sowa e7b519ba55 ipv4: initialize ip4_frags hash secret as late as possible
Defer the generation of the first hash secret for the ipv4 fragmentation
cache as late as possible.

ip4_frags.rnd gets initially seeded by inet_frags_init and regularly
reseeded by inet_frag_secret_rebuild. Either we call ipqhashfn directly
from ip_fragment.c in which case we initialize the secret directly.

If we first get called by inet_frag_secret_rebuild we install a new secret
by a manual call to get_random_bytes. This secret will be overwritten
as soon as the first call to ipqhashfn happens. This is safe because we
won't race while publishing the new secrets with anyone else.

Cc: Eric Dumazet <edumazet@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-23 17:01:40 -04:00
David S. Miller c3fa32b976 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/usb/qmi_wwan.c
	include/net/dst.h

Trivial merge conflicts, both were overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-23 16:49:34 -04:00
Hannes Frederic Sowa 34d92d5315 net: always inline net_secret_init
Currently net_secret_init does not get inlined, so we always have a call
to net_secret_init even in the fast path.

Let's specify net_secret_init as __always_inline so we have the nop in
the fast-path without the call to net_secret_init and the unlikely path
at the epilogue of the function.

jump_labels handle the inlining correctly.

Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-23 16:26:46 -04:00
Simon Wunderlich da6b8c20a5 batman-adv: generalize batman-adv icmp packet handling
Instead of handling icmp packets only up to length of icmp_packet_rr,
the code should handle any icmp length size. Therefore the length
truncating is moved to when the packet is actually sent to userspace
(this does not support lengths longer than icmp_packet_rr yet). Longer
packets are forwarded without truncating.

This patch also cleans up some parts where the icmp header struct could
be used instead of other icmp_packet(_rr) structs to make the code more
readable.

Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
2013-10-23 17:03:47 +02:00
Simon Wunderlich 15c33da6e8 batman-adv: Start new development cycle
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
2013-10-23 17:03:46 +02:00
Antonio Quartulli 0eb01568f0 batman-adv: include the sync-flags when compute the global/local table CRC
Flags covered by TT_SYNC_MASK are kept in sync among the
nodes in the network and therefore they have to be
considered while computing the global/local table CRC.

In this way a generic originator is able to understand if
its table contains the correct flags or not.

Bits from 4 to 7 in the TT flags fields are now reserved for
"synchronized" flags only.

This allows future developers to add more flags of this type
without breaking compatibility.

It's important to note that not all the remote TT flags are
synchronised. This comes from the fact that some flags are
used to inject a piece of information only once.

Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2013-10-23 17:03:46 +02:00
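To illustrate the idea in commit 0eb01568f0 above, the sketch below folds only the synchronized flag bits (bits 4 to 7, as stated in the message) into a table checksum. The mask value follows from that bit range. Everything else is an assumption made for illustration: zlib.crc32 and the XOR combination are stand-ins, since the page does not show which CRC or combining step batman-adv uses, and table_checksum is a hypothetical name.

```python
import zlib

TT_SYNC_MASK = 0x00F0  # bits 4..7 of the flags field, per the commit message


def table_checksum(entries):
    """XOR of per-entry CRCs over (client address, synchronized flag bits)."""
    total = 0
    for mac, flags in entries:
        # Only flags covered by the sync mask influence the checksum.
        sync_flags = flags & TT_SYNC_MASK
        payload = bytes.fromhex(mac.replace(":", "")) + sync_flags.to_bytes(2, "big")
        total ^= zlib.crc32(payload)
    return total
```

With this layout, flipping a flag outside the mask leaves the checksum unchanged, while flipping a synchronized flag changes it, so two nodes can detect that their view of the synchronized flags differs.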
Antonio Quartulli 3c4f7ab60c batman-adv: improve the TT component to support runtime flag changes
Some flags (e.g. the WIFI flag) may change after the
related client has already been announced. However, it is
useful to inform the rest of the network about this change.

Add a runtime-flag-switch detection mechanism and
re-announce the related TT entry to advertise the new flag
value.

This mechanism can be easily exploited by future flags that
may need the same treatment.

Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2013-10-23 17:03:45 +02:00
Antonio Quartulli 0c69aecc5b batman-adv: invoke dev_get_by_index() outside of is_wifi_iface()
Upcoming changes need to perform other checks on the
incoming net_device struct.

To avoid performing dev_get_by_index() for each and every
check, it is better to move it outside of is_wifi_iface()
and search the netdev object once only.

Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2013-10-23 17:03:44 +02:00
Antonio Quartulli 8257f55ae2 batman-adv: send GW_DEL event in case of soft-iface destruction
In case of soft_iface destruction send a GW DEL event to
userspace so that applications which are listening for GW
events are informed about the loss of connectivity and can
react accordingly.

Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
2013-10-23 17:03:44 +02:00
Marek Lindner a19d3d85e1 batman-adv: limit local translation table max size
The local translation table size is limited by what can be
transferred from one node to another via a full table request.

The number of entries fitting into a full table request depends
on whether the fragmentation is enabled or not. Therefore this
patch introduces a max table size check and refuses to add
more local clients when that size is reached. Moreover, if the
max full table packet size changes (MTU change or fragmentation
is disabled) the local table is downsized instantaneously.

Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Acked-by: Antonio Quartulli <ordex@autistici.org>
2013-10-23 17:03:43 +02:00
Antonio Quartulli 4627456a77 batman-adv: adapt the TT component to use the new API functions
Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2013-10-23 17:03:42 +02:00
Antonio Quartulli d0015fdd3d batman-adv: provide orig_node routing API
Some operations executed on an orig_node depend on the
current routing algorithm being used. To easily make this
mechanism routing algorithm agnostic, add an orig_node
specific API that each algorithm can populate with its own
routines.

Such routines are then invoked by the code when needed,
without knowing which routing algorithm is currently in use.

With this patch 3 API functions are added:
- orig_free (to free routing depending internal structs)
- orig_add_if (to change the inner state of an orig_node
  when a new hard interface is added)
- orig_del_if (to change the inner state of an orig_node
  when a hard interface is removed)

Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2013-10-23 17:03:21 +02:00
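The callback arrangement described in commit d0015fdd3d above can be sketched as a table of functions that generic code calls without knowing the algorithm behind it. Only the three callback names (orig_free, orig_add_if, orig_del_if) come from the commit message. The classes, the toy algorithm and hard_interface_added are hypothetical and do not reflect the actual batman-adv structures.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class OrigNode:
    """Generic originator; per_if holds algorithm-private, per-interface data."""
    name: str
    per_if: Dict[int, int] = field(default_factory=dict)


@dataclass
class RoutingAlgoOps:
    """Callbacks a routing algorithm fills in (names taken from the commit)."""
    orig_free: Callable[[OrigNode], None]
    orig_add_if: Callable[[OrigNode, int], None]
    orig_del_if: Callable[[OrigNode, int], None]


def toy_orig_free(node: OrigNode) -> None:
    node.per_if.clear()


def toy_orig_add_if(node: OrigNode, if_index: int) -> None:
    node.per_if[if_index] = 0  # start a fresh per-interface counter


def toy_orig_del_if(node: OrigNode, if_index: int) -> None:
    node.per_if.pop(if_index, None)


TOY_ALGO = RoutingAlgoOps(toy_orig_free, toy_orig_add_if, toy_orig_del_if)


def hard_interface_added(ops: RoutingAlgoOps, node: OrigNode, if_index: int) -> None:
    # Generic code: it does not know which algorithm sits behind ops.
    ops.orig_add_if(node, if_index)
```

A second algorithm would supply its own RoutingAlgoOps instance, and hard_interface_added would work with it unchanged.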
Antonio Quartulli 81e26b1a1c batman-adv: adapt the neighbor purging routine to use the new API functions
Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2013-10-23 15:33:12 +02:00
Antonio Quartulli 6680a1249f batman-adv: adapt bonding to use the new API functions
Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2013-10-23 15:33:12 +02:00
Antonio Quartulli c43c981e50 batman-adv: add bat_neigh_is_equiv_or_better API function
Each routing protocol has its own metric semantic and
therefore the protocol itself is the only component able to
compare two metrics to check their "similarity".

This new API allows each routing protocol to implement its
own logic and make the external code protocol agnostic.

Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2013-10-23 15:33:11 +02:00
Antonio Quartulli a3285a8f20 batman-adv: add bat_neigh_cmp API function
This new API allows comparing two neighbours based on
the metric, so that the user does not have to deal with any
routing algorithm specific detail.

Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2013-10-23 15:33:10 +02:00
Antonio Quartulli 737a2a2297 batman-adv: add bat_orig_print API function
Each routing protocol has its own metric and private
variables, therefore it is useful to introduce a new API
for originator information printing.

This API needs to be implemented by each protocol in order
to provide its specific originator table output.

Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2013-10-23 15:33:10 +02:00
Antonio Quartulli bbad0a5e36 batman-adv: make struct batadv_orig_node algorithm agnostic
Some of the struct batadv_orig_node members are B.A.T.M.A.N. IV
specific and therefore they are moved into an algorithm specific
substruct in order to make batadv_orig_node routing algorithm
agnostic.

Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2013-10-23 15:33:09 +02:00
Antonio Quartulli 0538f75991 batman-adv: make struct batadv_neigh_node algorithm agnostic
Some of the fields in struct batadv_neigh_node are strictly
related to the B.A.T.M.A.N. IV algorithm. In order to
make the struct usable by any routing algorithm it has to be
split and made more generic.

Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2013-10-23 15:33:08 +02:00
Linus Lüssing 454594f3b9 Revert "bridge: only expire the mdb entry when query is received"
While this commit was a good attempt to fix issues occurring when no
multicast querier is present, this commit still has two more issues:

1) There are cases where mdb entries do not expire even if there is a
querier present. The bridge will unnecessarily continue flooding
multicast packets on the according ports.

2) Never removing an mdb entry could be exploited for a Denial of
Service by an attacker on the local link, slowly, but steadily eating up
all memory.

Actually, this commit became obsolete with
"bridge: disable snooping if there is no querier" (b00589af3b)
which included fixes for a few more cases.

Therefore reverting the following commits (the commit stated in the
commit message plus three of its follow up fixes):

====================
Revert "bridge: update mdb expiration timer upon reports."
This reverts commit f144febd93.
Revert "bridge: do not call setup_timer() multiple times"
This reverts commit 1faabf2aab.
Revert "bridge: fix some kernel warning in multicast timer"
This reverts commit c7e8e8a8f7.
Revert "bridge: only expire the mdb entry when query is received"
This reverts commit 9f00b2e7cf.
====================

CC: Cong Wang <amwang@redhat.com>
Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Reviewed-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-22 14:41:02 -04:00
ZHAO Gang 0a6957e7d4 net: remove function sk_reset_txq()
All sk_reset_txq() does is call sk_tx_queue_reset(),
and sk_reset_txq() is used only in sock.h, by dst_negative_advice().
Let dst_negative_advice() call sk_tx_queue_reset() directly so we
can remove the unneeded sk_reset_txq().

Signed-off-by: ZHAO Gang <gamerh2o@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-22 14:00:21 -04:00
Neal Cardwell 02cf4ebd82 tcp: initialize passive-side sk_pacing_rate after 3WHS
For passive TCP connections, upon receiving the ACK that completes the
3WHS, make sure we set our pacing rate after we get our first RTT
sample.

On passive TCP connections, when we receive the ACK completing the
3WHS we do not take an RTT sample in tcp_ack(), but rather in
tcp_synack_rtt_meas(). So upon receiving the ACK that completes the
3WHS, tcp_ack() leaves sk_pacing_rate at its initial value.

Originally the initial sk_pacing_rate value was 0, so passive-side
connections defaulted to sysctl_tcp_min_tso_segs (2 segs) in skbuffs
made in the first RTT. With a default initial cwnd of 10 packets, this
happened to be correct for RTTs 5ms or bigger, so it was hard to
see problems in WAN or emulated WAN testing.

Since 7eec4174ff ("pkt_sched: fq: fix non TCP flows pacing"), the
initial sk_pacing_rate is 0xffffffff. So after that change, passive
TCP connections were keeping this value (and using large numbers of
segments per skbuff) until receiving an ACK for data.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-21 18:56:23 -04:00
Hannes Frederic Sowa c2f17e827b ipv6: probe routes asynchronous in rt6_probe
Routes need to be probed asynchronously; otherwise the call stack gets
exhausted when the kernel attempts to deliver another skb inline, like
e.g. xt_TEE does, and we probe at the same time.

We still update neigh->updated at once, otherwise we would send too
many probes.

Cc: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-21 18:56:22 -04:00
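The two points made in commit c2f17e827b above (defer the probe itself, but stamp the neighbour immediately so probes stay rate limited) can be sketched as follows. This is a user-space illustration under assumed names: Neigh, Prober, min_interval and run_pending are hypothetical, and a deque stands in for whatever deferral mechanism the kernel code uses.

```python
import collections


class Neigh:
    def __init__(self):
        self.updated = float("-inf")  # time of the last probe decision


class Prober:
    def __init__(self, min_interval):
        self.min_interval = min_interval
        self.pending = collections.deque()  # probes queued for later sending
        self.sent = []

    def probe(self, neigh, target, now):
        """Called on the hot path: never sends inline, only queues work."""
        if now - neigh.updated < self.min_interval:
            return False  # probed recently, do nothing
        neigh.updated = now          # stamped at once, before the probe is sent
        self.pending.append(target)  # the actual send happens later
        return True

    def run_pending(self):
        """Called later from a separate context with a fresh call stack."""
        while self.pending:
            self.sent.append(self.pending.popleft())
```

Stamping the neighbour before the deferred work runs means a second call arriving in between is rejected by the interval check instead of queueing a duplicate probe.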
Eric Dumazet 61c1db7fae ipv6: sit: add GSO/TSO support
Now that ipv6_gso_segment() is stackable, it's relatively easy to
implement GSO/TSO support for SIT tunnels.

Performance results, when segmentation is done after tunnel
device (as no NIC is yet enabled for TSO SIT support) :

Before patch :

lpq84:~# ./netperf -H 2002:af6:1153:: -Cc
MIGRATED TCP STREAM TEST from ::0 (::) port 0 AF_INET6 to 2002:af6:1153:: () port 0 AF_INET6
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

 87380  16384  16384    10.00      3168.31   4.81     4.64     2.988   2.877

After patch :

lpq84:~# ./netperf -H 2002:af6:1153:: -Cc
MIGRATED TCP STREAM TEST from ::0 (::) port 0 AF_INET6 to 2002:af6:1153:: () port 0 AF_INET6
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

 87380  16384  16384    10.00      5525.00   7.76     5.17     2.763   1.840

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-21 18:49:39 -04:00
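The netperf figures quoted in commit 61c1db7fae above can be turned into relative changes with a few lines of arithmetic. The numbers are copied from the two result rows in the message; the dictionary keys and the pct_change helper are names chosen here for illustration.

```python
before = {"throughput_mbit": 3168.31, "send_us_per_kb": 2.988, "recv_us_per_kb": 2.877}
after = {"throughput_mbit": 5525.00, "send_us_per_kb": 2.763, "recv_us_per_kb": 1.840}


def pct_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


for key in before:
    print(f"{key}: {pct_change(before[key], after[key]):+.1f}%")
```

By this calculation throughput rises by roughly 74%, while the service demand per KB drops on both the sending and the receiving side.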
Eric Dumazet d3e5e0062d ipv6: gso: make ipv6_gso_segment() stackable
In order to support GSO on SIT tunnels, we need to make
ipv6_gso_segment() stackable.

It should not assume network header starts right after mac
header.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-21 18:49:39 -04:00
Eric W. Biederman fd2d5356d9 ipv4: Allow unprivileged users to use per net sysctls
Allow unprivileged users to use:
/proc/sys/net/ipv4/icmp_echo_ignore_all
/proc/sys/net/ipv4/icmp_echo_ignore_broadcasts
/proc/sys/net/ipv4/icmp_ignore_bogus_error_response
/proc/sys/net/ipv4/icmp_errors_use_inbound_ifaddr
/proc/sys/net/ipv4/icmp_ratelimit
/proc/sys/net/ipv4/icmp_ratemask
/proc/sys/net/ipv4/ping_group_range
/proc/sys/net/ipv4/tcp_ecn
/proc/sys/net/ipv4/ip_local_port_range

These are occasionally handy and after a quick review I don't see
any problems with unprivileged users using them.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-21 18:43:03 -04:00
Eric W. Biederman 0a6fa23dcb ipv4: Use math to point per net sysctls into the appropriate struct net.
Simplify maintenance of ipv4_net_table by using math to point the per
net sysctls into the appropriate struct net, instead of manually
reassigning all of the variables into hard coded table slots.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-21 18:43:02 -04:00
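The "use math" approach in commit 0a6fa23dcb above amounts to shifting every table entry by the distance between the template namespace and the new one, instead of assigning each slot by hand. The sketch below models pointers as plain integers. The addresses, the table contents and rebase_table are all hypothetical and only illustrate the offset arithmetic.

```python
def rebase_table(table, init_net_base, net_base):
    """Return a copy of table with every data address shifted into the new net."""
    delta = net_base - init_net_base
    # One uniform shift replaces a hand-written assignment per table slot.
    return [(name, addr + delta) for name, addr in table]


# Hypothetical layout: entries point at fields inside the initial namespace.
INIT_NET = 0x1000
template = [
    ("icmp_echo_ignore_all", INIT_NET + 0x10),
    ("icmp_ratelimit", INIT_NET + 0x18),
    ("tcp_ecn", INIT_NET + 0x20),
]

other_net = 0x8000
per_net = rebase_table(template, INIT_NET, other_net)
```

Adding or reordering entries in the template then needs no matching change in the per-namespace setup code, which is the maintenance benefit the commit message describes.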
Eric W. Biederman 2e685cad57 tcp_memcontrol: Kill struct tcp_memcontrol
Replace the pointers in struct cg_proto with actual data fields and kill
struct tcp_memcontrol as it is now fully redundant.

This removes a confusing, unnecessary layer of abstraction.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-21 18:43:02 -04:00
Eric W. Biederman a4fe34bf90 tcp_memcontrol: Remove the per netns control.
The code that is implemented is per memory cgroup not per netns, and
having per netns bits is just confusing.  Remove the per netns bits to
make it easier to see what is really going on.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-21 18:43:02 -04:00
Eric W. Biederman f594d63199 tcp_memcontrol: Remove setting cgroup settings via sysctl
The code is broken and does not constrain sysctl_tcp_mem as
tcp_update_limit does, with the result that it allows the cgroup tcp
memory limits to be bypassed.

The semantics are broken as the settings are not per netns and are in a
per netns table, and instead look at current.

Since the code is broken in both design and implementation and does not
implement the functionality for which it was written remove it.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-21 18:43:02 -04:00
Eric W. Biederman cd91cce620 tcp_memcontrol: Remove tcp_max_memory
This function is never called. Remove it.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-21 18:43:02 -04:00
Julian Anastasov 56e42441ed netfilter: nf_conntrack: fix rt6i_gateway checks for H.323 helper
Now that rt6_nexthop() can return the nexthop address, we can use it
for proper nexthop comparison of directly connected destinations.
For more information refer to commit bbb5823cf7
("netfilter: nf_conntrack: fix rt_gateway checks for H.323 helper").

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-21 18:37:01 -04:00
Julian Anastasov 550bab42f8 ipv6: fill rt6i_gateway with nexthop address
Make sure rt6i_gateway contains nexthop information in
all routes returned from lookup or when routes are directly
attached to skb for generated ICMP packets.

The effect of this patch should be a faster version of
rt6_nexthop() and the consideration of local addresses as
nexthop.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-21 18:37:01 -04:00
David S. Miller 5bf47256f5 Included changes:
- email addresses update in documentation, source files and MAINTAINERS
- make the TT component distinguish non-mesh clients based on the VLAN they
  belong to
- improve all the internal components to properly work on a per-VLAN basis
  (enabled by the new TT-VLAN feature)
- enhance the sysfs interface in order to provide behaviour switches on a
  per-VLAN basis (enabled by the new TT-VLAN feature)
- improve TT lock mechanism
- improve unicast transmission APIs

Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge

Antonio Quartulli says:

====================
this is another batch intended for net-next/linux-3.13.

This pull request is a bit bigger than usual, but 6 patches are very small
(three of them are about email updates).

Patch 1 is fixing a previous merge conflict resolution that went wrong
(I realised that only now while checking other patches).
Patches 2 to 4 update our emails in all the proper files
(Documentation/, headers and MAINTAINERS).

Patches 5, 6 and 7 are bringing a big improvement to the TranslationTable
component: it is now able to group non-mesh clients based on the VLAN they
belong to. In this way a lot of new enhancements are now possible thanks to the
fact that each batman-adv behaviour can be applied on a per VLAN basis.

And, of course, in patches from 8 to 12 you have some of the enhancements I was
talking about:
- make the batman-Gateway selection VLAN dependent
- make DAT (Distributed ARP Table) group ARP entries on a VLAN basis (this
  allows DAT to work even when the admin decided to use the same IP subnet on
  different VLANs)
- make the AP-Isolation behaviour switchable on each VLAN independently
- export VLAN specific attributes via sysfs. Switches like the AP-Isolation are
  now exported once per VLAN (backward compatibility of the sysfs interface has
  been preserved)

Patches 13 and 14 are small code cleanups.
Patch 15 is a minor improvement in the TT locking mechanism.

Patches 16 and 17 are other enhancements to the TT component. Those allow a
node to parse a "non-mesh client announcement message" and accept only those
TT entries belonging to certain VLANs.

Patch 18 exploits this parse&accept mechanism to make the Bridge Loop Avoidance
component reject only TT entries connected to the VLAN where it is operating.
Previous to this change, BLA was rejecting all the entries coming from any other
Backbone node, regardless of the VLAN (for more details about how the Bridge
Loop Avoidance works please check [1]).

[1] http://www.open-mesh.org/projects/batman-adv/wiki/Bridge-loop-avoidance-II
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-19 19:52:42 -04:00