Commit graph

1999 commits

Author SHA1 Message Date
Andrew Thompson f0ac1eedd5 Simplify ipsec_bpf by using bpf_mtap2(). 2006-06-27 01:53:12 +00:00
Andrew Thompson bdea400f3b Add a pseudo interface for packet filtering IPSec connections before or after
encryption. There are two functions, a bpf tap which has a basic header with
the SPI number which our current tcpdump knows how to display, and handoff to
pfil(9) for packet filtering.

Obtained from:	OpenBSD
Based on:	kern/94829
No objections:	arch, net
MFC after:	1 month
2006-06-26 22:30:08 +00:00
Yaroslav Tykhiy 15ed2fa1f1 Fix the VLAN_ARRAY case, mostly regarding improper use of atomic(9)
in place of conventional rw locking.  Alas, atomic(9) can't buy us
lockless operation so easily.
2006-06-21 13:48:34 +00:00
Yaroslav Tykhiy 5cb8c31af1 Track interface department events and detach vlans from
departing trunk so that we don't get into trouble later
by dereferencing a stale pointer to dead trunk's things.

Prodded by:	oleg
Sponsored by:	RiNet (Cronyx Plus LLC)
MFC after:	1 week
2006-06-21 07:29:44 +00:00
Gleb Smirnoff 457f48e65c - First initialize ifnet, and then insert it into global
list.
- First remove from global list, then start destroying.

PR:		kern/97679
Submitted by:	Alex Lyashkov <shadow itt.net.ru>
Reviewed by:	rwatson, brooks
2006-06-21 06:02:35 +00:00
Andrew Thompson 690d79381a Allow gif interfaces to be added as span ports, the user may want to send a
copy of all packets to the other side of the world.
2006-06-20 21:28:18 +00:00
Max Laier 0dad3f0e15 Import interface groups from OpenBSD. This allows to group interfaces in
order to - for example - apply firewall rules to a whole group of
interfaces.  This is required for importing pf from OpenBSD 3.9

Obtained from:	OpenBSD (with changes)
Discussed on:	-net (back in April)
2006-06-19 22:20:45 +00:00
Andrew Thompson 615fccc52b Fix spelling mistake in comment. 2006-06-19 02:25:11 +00:00
Christian S.J. Peron 19ba8395e1 Since we are doing some bpf(4) clean up, change a couple of function prototypes
to be consistent. Also, ANSI'fy function definitions. There is no functional
change here.
2006-06-15 15:39:12 +00:00
Christian S.J. Peron 7eae78a419 If bpf(4) has not been compiled into the kernel, initialize the bpf interface
pointer to a zeroed, statically allocated bpf_if structure. This way the
LIST_EMPTY() macro will always return true. This allows us to remove the
additional unconditional memory reference for each packet in the fast path.

Discussed with:	sam
2006-06-14 02:23:28 +00:00
Andrew Thompson 80829fccd7 Use bit operations to get a locally administered address rather than using a
hardcoded OUI code.
2006-06-12 22:43:37 +00:00
Max Khon affcaf7871 Fix KASSERT conditions in if_deregister_com_alloc(). 2006-06-11 22:09:28 +00:00
Andrew Thompson b3a1f9373a Allow bridge and carp to play nicely together by returning the packet if its
destined for a carp interface.

Obtained from:	OpenBSD
MFC after:	2 weeks
2006-06-08 23:40:16 +00:00
Qing Li 1a41f91052 Assuming the interface has an address of x.x.x.195, a mask of
255.255.255.0, and a default route with gateway x.x.x.1. Now if
the address mask is changed to something more specific, e.g.,
255.255.255.128, then after the mask change the default gateway
is no longer reachable.

Since the default route is still present in the routing table,
when the output code tries to resolve the address of the default
gateway in function rt_check(), again, the default route will be
returned by rtalloc1(). Because the lock is currently held on the
rtentry structure, one more attempt to hold the lock will trigger
a crash due to "lock recursed on non-recursive mutex ..."

This is a general problem. The fix checks for the above condition
so that an existing route entry is not mistaken for a new cloned
route. Approriately, an ENETUNREACH error is returned back to the
caller

Approved by:	andre
2006-06-05 21:20:21 +00:00
Christian S.J. Peron ffdc0471d4 Back out previous two commits, this caused some problems in the namespace
resulting in some build failures. Instead, to fix the problem of bpf not
being present, check the pointer before dereferencing it.

This is a temporary bandaid until we can decide on how we want to handle
the bpf code not being present. This will be fixed shortly.
2006-06-03 18:48:14 +00:00
Christian S.J. Peron 727b73816c Temporarily include files so that our macro checks do something useful. 2006-06-03 18:16:54 +00:00
Christian S.J. Peron 5255290c9c Make sure we don't try to dereference the the if_bpf pointer when bpf has
not been compiled into the the kernel.

Submitted by:	benno
2006-06-03 06:37:00 +00:00
Sam Leffler ff046a6c6b add missed calls to bpf_peers_present 2006-06-02 23:14:40 +00:00
Christian S.J. Peron 16d878cc99 Fix the following bpf(4) race condition which can result in a panic:
(1) bpf peer attaches to interface netif0
	(2) Packet is received by netif0
	(3) ifp->if_bpf pointer is checked and handed off to bpf
	(4) bpf peer detaches from netif0 resulting in ifp->if_bpf being
	    initialized to NULL.
	(5) ifp->if_bpf is dereferenced by bpf machinery
	(6) Kaboom

This race condition likely explains the various different kernel panics
reported around sending SIGINT to tcpdump or dhclient processes. But really
this race can result in kernel panics anywhere you have frequent bpf attach
and detach operations with high packet per second load.

Summary of changes:

- Remove the bpf interface's "driverp" member
- When we attach bpf interfaces, we now set the ifp->if_bpf member to the
  bpf interface structure. Once this is done, ifp->if_bpf should never be
  NULL. [1]
- Introduce bpf_peers_present function, an inline operation which will do
  a lockless read bpf peer list associated with the interface. It should
  be noted that the bpf code will pickup the bpf_interface lock before adding
  or removing bpf peers. This should serialize the access to the bpf descriptor
  list, removing the race.
- Expose the bpf_if structure in bpf.h so that the bpf_peers_present function
  can use it. This also removes the struct bpf_if; hack that was there.
- Adjust all consumers of the raw if_bpf structure to use bpf_peers_present

Now what happens is:

	(1) Packet is received by netif0
	(2) Check to see if bpf descriptor list is empty
	(3) Pickup the bpf interface lock
	(4) Hand packet off to process

From the attach/detach side:

	(1) Pickup the bpf interface lock
	(2) Add/remove from bpf descriptor list

Now that we are storing the bpf interface structure with the ifnet, there is
is no need to walk the bpf interface list to locate the correct bpf interface.
We now simply look up the interface, and initialize the pointer. This has a
nice side effect of changing a bpf interface attach operation from O(N) (where
N is the number of bpf interfaces), to O(1).

[1] From now on, we can no longer check ifp->if_bpf to tell us whether or
    not we have any bpf peers that might be interested in receiving packets.

In collaboration with:	sam@
MFC after:	1 month
2006-06-02 19:59:33 +00:00
Gleb Smirnoff 6e86062956 Fix gif_output() so that GIF_UNLOCK() is performed only in case
we have locked the softc.

PR:		kern/98298
Submitted by:	Eugene Grosbein
2006-06-02 14:10:52 +00:00
Robert Watson 4421f50dbc raw_disconnect() now disconnects but does not detach the raw pcb. As a
result, raw_uabort() now needs to call raw_detach() directly.  As
raw_uabort() is never called, and raw_disconnect() is probably not ever
actually called in practice, this is likely not a functional change, but
improves congruence between protocols, and avoids a NULL raw cb pointer
after disconnect, which could result in a panic.

MFC after:	1 month
2006-06-02 08:27:15 +00:00
Gleb Smirnoff 4ec449ae88 - Add definition for IFM_10G_CX4.
- Put IFM_10G_CX4 and IFM_10G_SR into IFMEDIA_BAUDRATE array.

Requested by:	Jack Vogel <jfvogel gmail.com>
2006-06-02 07:50:58 +00:00
Andrew Thompson f3b90d48bb Announce all interfaces to devd on attach/detach. This adds a new devctl
notification so all interfaces including pseudo are reported. When netif
creates the clones at startup devctl_disable has not been turned off yet so the
interfaces will not be initialised twice, enforce this by adding an explicit
order between rc.d/netif and rc.d/devd.

This change allows actions to taken in userland when an interface is cloned
and the pseudo interface will be automatically configured if a ifconfig_<int>=""
line exists in rc.conf.

Reviewed by:		brooks
No objections on:	net
2006-06-01 00:41:07 +00:00
Marius Strobl fa67ebf9bb Revert the (int *) -> (intptr_t *) conversion done as part of rev. 1.59
for IOCTLs where casting data to intptr_t * isn't the right thing to do
as _IO() isn't used for them but _IOR(..., int)/_IOW(..., int) are (i.e.
for all IOCTLs except VMIO_SIOCSIFFLAGS), fixing tap(4) on big-endian
LP64 machines.

PR:		sparc64/98084
OK'ed by:	emax
MFC after:	1 week
2006-05-30 20:08:12 +00:00
Ruslan Ermilov 293c06a186 Fix -Wundef warnings. 2006-05-30 19:24:01 +00:00
David Malone a58327bd09 Avoid unwanted sign extension of indexed byte load in bpf code.
PR:		89748
Submitted by:	Guy Harris <guy@alum.mit.edu>
Obtained from:	NetBSD via OpenBSD
MFC after:	2 weeks
2006-05-28 20:00:02 +00:00
Maksim Yevmenkin 7a9adfdd85 Do not call knlist_destroy() in tapclose(). Instead call it when device is
actually destroyed. Also move call to knlist_init() into tapcreate(). This
should fix panic described in kern/95357.

PR:			kern/95357
No response from:	freebsd-current@
MFC after:		3 days
2006-05-17 17:05:02 +00:00
Andrew Thompson dc1b1b7b6a Fix style(9) nits, whitespace and parentheses. 2006-05-16 22:50:41 +00:00
Qing Li e034e82c56 The current routing code allows insertion of indirect routes that have
gateways which are unreachable except through the default router. For
example, assuming there is a default route configured, and inserting
a route

	"route add 64.102.54.0/24 60.80.1.1"

is currently allowed even when 60.80.1.1 is only reachable through
the default route. However, an error is thrown when this route is
utilized, say,

	"ping 64.102.54.1"  will return an error

This type of route insertion should be disallowed becasue:

1) Let's say that somehow our code allowed this packet to flow to
   the default router, and the default router knows the next hop is
   60.80.1.1, then the question is why bother inserting this route in
   the 1st place, just simply use the default route.

2) Since we're not talking about source routing here, the default
   router could very well choose a different path than using 60.80.1.1
   for the next hop, again it defeats the purpose of adding this route.

Reviewed by:	ru, gnn, bz
Approved by:	andre
2006-05-16 19:11:11 +00:00
Daniel Hartmeier 2557a639a5 Recalculate IP checksum after running pfil hooks.
Reviewed by:	thompsa
Tested by:	Adam McDougall <mcdouga9@egr.msu.edu>
2006-05-15 11:49:01 +00:00
Max Laier 656faadcb8 Remove ip6fw. Since ipfw has full functional IPv6 support now and - in
contrast to ip6fw - is properly lockes, it is time to retire ip6fw.
2006-05-12 20:39:23 +00:00
John Baldwin 73dbd3da73 Remove various bits of conditional Alpha code and fixup a few comments. 2006-05-12 05:04:46 +00:00
Jeffrey Hsu a393a28afa Correct test for fragmented packet. 2006-05-11 00:53:43 +00:00
Christian S.J. Peron 1fc9e38706 Pickup locks for the BPF interface structure. It's quite possible that
bpf(4) descriptors can be added and removed on this interface while we
are processing stats.

MFC after:	2 weeks
2006-05-07 03:21:43 +00:00
Bjoern A. Zeeb ac4a76ebc9 In rtrequest and rtinit check for sa_len != 0 for the given
destination. These checks are needed so we do not install
a route looking like this:
(0)                192.0.2.200        UH       tun0 =>

When removing this route  the kernel will start to walk
the address space which looks like a hang on 64bit platforms
because it'll take ages while on 32bit you should see a panic
when kernel debugging options are turned on.

The problem is in rtrequest1:
	if (netmask) {
		rt_maskedcopy(dst, ndst, netmask);
	} else
		bcopy(dst, ndst, dst->sa_len);

In both cases the len might be 0 if the application forgot to
set it.  If so ndst will be all-zero  leading to above
mentioned strange routes.

This is an application error but we must not fail/hang/panic
because of this.

Looks ok:	gnn
No objections:	net@ (silence)
MFC after:	8 weeks
2006-05-04 18:33:37 +00:00
Andrew Thompson 7f87a57ca3 Add support for fragmenting ipv4 packets.
The packet filter may reassemble the ip fragments and return a packet that is
larger than the MTU of the sending interface. There is no check for DF or icmp
replies as we can only get a large packet to fragment by reassembling a
previous fragment, and this only happens after a call to pfil(9).

Obtained from:	OpenBSD (mostly)
Glanced at by:	mlaier
MFC after:	1 month
2006-04-29 05:37:25 +00:00
Robert Watson e0cf89fc53 Use ANSI C function protypes and declarations for if_arcsubr.
MFC after:	1 month
2006-04-12 07:44:31 +00:00
Robert Watson 9d20951479 Correct an assertion in raw_uattach(): this is a library call that other
protocols invoke after allocating a PCB, so so_pcb should be non-NULL.
It is only used by the two IPSEC implementations, so I didn't hit it in
my testing.

Reported by:	pjd
MFC after:	3 months
2006-04-09 15:15:28 +00:00
Andre Oppermann d214ccb6ba Undo damage from wrong MFC to HEAD.
Pointed out by:	jkim, remko
2006-04-04 20:20:51 +00:00
Andre Oppermann bedf8e3354 MFC rev. 1.32: Add link status descriptions and related structures for userland
applications.

Approved by:	re
2006-04-04 20:02:51 +00:00
Robert Watson 0154484bef In raw and raw-derived socket types, maintain and enforce invariant that
the so_pcb pointer on the socket is always non-NULL.  This eliminates
countless unnecessary error checks, replacing them with assertions.

MFC after:	3 months
2006-04-01 15:55:44 +00:00
Robert Watson bc725eafc7 Chance protocol switch method pru_detach() so that it returns void
rather than an error.  Detaches do not "fail", they other occur or
the protocol flags SS_PROTOREF to take ownership of the socket.

soclose() no longer looks at so_pcb to see if it's NULL, relying
entirely on the protocol to decide whether it's time to free the
socket or not using SS_PROTOREF.  so_pcb is now entirely owned and
managed by the protocol code.  Likewise, no longer test so_pcb in
other socket functions, such as soreceive(), which have no business
digging into protocol internals.

Protocol detach routines no longer try to free the socket on detach,
this is performed in the socket code if the protocol permits it.

In rts_detach(), no longer test for rp != NULL in detach, and
likewise in other protocols that don't permit a NULL so_pcb, reduce
the incidence of testing for it during detach.

netinet and netinet6 are not fully updated to this change, which
will be in an upcoming commit.  In their current state they may leak
memory or panic.

MFC after:	3 months
2006-04-01 15:42:02 +00:00
Robert Watson ac45e92ff2 Change protocol switch pru_abort() API so that it returns void rather
than an int, as an error here is not meaningful.  Modify soabort() to
unconditionally free the socket on the return of pru_abort(), and
modify most protocols to no longer conditionally free the socket,
since the caller will do this.

This commit likely leaves parts of netinet and netinet6 in a situation
where they may panic or leak memory, as they have not are not fully
updated by this commit.  This will be corrected shortly in followup
commits to these components.

MFC after:      3 months
2006-04-01 15:15:05 +00:00
Robert Watson a260bd4131 Add IFF_NEEDSGIANT to kernel PPP support. I have no idea why this wasn't
here, but it should have been.

MFC after:	3 days
2006-03-30 08:18:27 +00:00
Andrew Thompson 64cb85059e Assert that the mbuf is not shared to ensure problems like the last commit are
not reintroduced.
2006-03-26 20:52:47 +00:00
Roman Kurakin 5cb7f13aee m_dup () packet not m_copypacket () since we will modify it. For more
details see PR kern/94448.

PR:     kern/94448

Original patch: Eygene A. Ryabinkin <rea-fbsd at rea dot mbslab dot kiae dot ru>Final patch:    thompsa@
Tested by:      thompsa@, Eygene A. Ryabinkin

MFC after:      7 days
2006-03-23 22:57:10 +00:00
Gleb Smirnoff 93a69f5703 No direct call to carp_ifdetach() anymore. It is called by
event handler.

PR:		kern/82908
Submitted by:	Dan Lukes <dan obluda.cz>
2006-03-21 14:31:18 +00:00
Maksim Yevmenkin a9e17e2e05 Add kqueue(2) support on if_tap(4) interfaces. While I'm here, replace
K&R style function declarations with ANSI style. Also fix endian bugs
accessing ioctl arguments that are passed by value.

PR:		kern/93897
Submitted by:	Vilmos Nebehaj < vili at huwico dot hu >
MFC after:	1 week
2006-03-16 18:22:01 +00:00
Andre Oppermann e4bd8f103e Add link status descriptions and related structures for userland
applications.

Open[BGP|OSPF]D make use of this to determine the link status of
interfaces to make the right routing descisions.

Obtained from:	OpenBSD
MFC after:	3 days
2006-03-15 19:43:25 +00:00
Andre Oppermann 22cafcf0b8 - Fill in the correct rtm_index for RTM_ADD and RTM_CHANGE messages.
- Allow RTM_CHANGE to change a number of route flags as specified by
  RTF_FMASK.

- The unused rtm_use field in struct rt_msghdr is redesignated as
  rtm_fmask field to communicate route flag changes in RTM_CHANGE
  messages from userland.  The use count of a route was moved to
  rtm_rmx a long time ago.  For source code compatibility reasons
  a define of rtm_use to rtm_fmask is provided.

These changes faciliate running of multiple cooperating routing
daemons at the same time without causing undesired interference.
Open[BGP|OSPF]D make use of these features to have IGP routes
override EGP ones.

Obtained from:	OpenBSD (claudio@)
MFC after:	3 days
2006-03-15 19:39:09 +00:00
Ruslan Ermilov ceec92fe5d Don't acquire a lock before calling vlan_unconfig().
This fixes a panic when doing "ifconfig ... -vlandev".

OK'ed by:	glebius
2006-03-09 14:42:51 +00:00
Andrew Thompson e1457c3eb1 If we miss the LINK_UP event from the network interface then the bridge port
will remain in the disabled state until another link event happens in the
future (if at all). Add a timer to periodically check the interface state and
recover.

Reported by:	Nik Lam <freebsdnik j2d.lam.net.au>
MFC after:	3 days
2006-03-06 02:28:41 +00:00
Christian S.J. Peron de572b371b Unbreak byte counters when network interfaces are in monitor mode by
re-organizing the monitor return logic. We perform interface monitoring
checks after we have determined if the CRC is still on the packet, if
it is, m_adj() is called which will adjust the packet length. This
ensures that we are not including CRC lengths in the byte counters for
each packet.

Discussed with:	andre, glebius
2006-03-03 17:21:08 +00:00
Andrew Thompson 158a726c96 Since we are using random ethernet addresses for the bridge, it is possible
that we might have address collisions, so make sure that this hardware address
isn't already in use on another bridge.

Submitted by:	csjp
MFC after:	1 month
2006-03-03 09:12:21 +00:00
Christian S.J. Peron 6f75ef188b Slightly re-worked bpf(4) code associated with bridging: if we have a
destination interface as a member of our bridge or this is a unicast packet,
push it through the bpf(4) machinery.

For broadcast or multicast packets, don't bother with the bpf(4) because it will
be re-injected into ether_input. We do this before we pass the packets through
the pfil(9) framework, as it is possible that pfil(9) will drop the packet or
possibly modify it, making it very difficult to debug firewall issues on the
bridge.

Further, implemented IFF_MONITOR for bridge interfaces. This does much the same
thing that it does for regular network interfaces: it pushes the packet to any
bpf(4) peers and then returns. This bypasses all of the bridge machinery,
saving mutex acquisitions, list traversals, and other operations performed by
the bridging code.

This change to the bridging code is useful in situations where individuals use a
bridge to multiplex RX/TX signals from two interfaces, as is required by some
network taps for de-multiplexing links and transmitting the RX/TX signals
out through two separate interfaces. This behaviour is quite common for network
taps monitoring links, especially for certain manufacturers.

Reviewed by:	thompsa
MFC after:	1 month
Sponsored by:	Seccuris Labs
2006-03-03 05:58:18 +00:00
Andrew Thompson 43dc0e8c41 Fix up the Bridge Identifier field in the BPDU packet.
- use the cu_bridge_id rather than the cu_rootid for the bridge address [1]
 - the memcmp return value is not signed so the wrong interface may have been
   selected
 - fix up the calculation of sc_bridge_id

PR:		kern/93909 [1]
MFC after:	3 days
2006-02-28 00:13:24 +00:00
Wojciech A. Koszek 51b4ccb464 This patch fixes a problem, which exists if you have IPSEC in your kernel
and want to have crypto support loaded as KLD. By moving zlib to separate
module and adding MODULE_DEPEND directives, it is possible to use such
configuration without complication. Otherwise, since IPSEC is linked with
zlib (just like crypto.ko) you'll get following error:

	interface zlib.1 already present in the KLD 'kernel'!

Approved by:	cognet (mentor)
2006-02-27 16:56:22 +00:00
Yaroslav Tykhiy 33499e2ae5 Don't to forget to unlock the rwlock on trunk before destroying it.
This should fix panic on "kldunload if_vlan" while vlanX are still there.

Reviewed by:	glebius
2006-02-24 17:25:16 +00:00
Gleb Smirnoff a7c959fe18 Fix build. 2006-02-15 08:25:40 +00:00
Gleb Smirnoff efd19b8fd0 - Introduce ifmedia_baudrate(), which returns correct baudrate of the
given media status. [1]
- Utilize ifmedia_baudrate() in miibus_statchg() to update ifp->if_baudrate.

Obtained from:	NetBSD [1]
2006-02-14 12:10:03 +00:00
Ed Maste 11edc47706 Bump the MODULE_VERSION for HEAD, as the vlan(4) API is different in
RELENG_6, and would require a lower version number.

Requested by:	glebius
Approved by:	rwatson (mentor)
2006-02-10 18:38:33 +00:00
Yaroslav Tykhiy 802dadcfeb Avoid frobbing IFF_UP at any cost (which is close to
zero in this case.)  A kernel driver has IFF_DRV_RUNNING
at its full disposal while IFF_UP may be toggled only by
humans or their daemonic deputies from the userland.

MFC after:	3 days
2006-02-10 11:01:10 +00:00
Ed Maste 7f8b993473 Add a MODULE_VERSION so that other modules (perhaps third-party) can
depend on this one.

Approved by:	rwatson (mentor)
2006-02-09 22:11:58 +00:00
Qing Li 6b7b44acd9 The code in rn_walktree_from() that checks if we backed up too far
did not stop at the right node. Change the backtracking check from
smaller-than to smaller-or-equal to prevent this from happening.
While here fix one additional problem where the insertion of the
default route traversed the entire tree.

PR:		kern/38752
Submitted by:	qingli (before I became committer)
Reviewed by:	andre
MFC after:	3 days
2006-02-07 20:25:39 +00:00
Qing Li d03e5467a4 Remove two unnecessary type casts, of which both had a typo in
it anyways.

Approved by: andre
MFC after: 3 days
2006-02-07 20:09:02 +00:00
Oleg Bulyzhin 3ecf1851df Properly initialize args structure before passing it to ipfw_chk(): having
uninitialized args.inp is unhealthy for uid/gid/jail ipfw rules.

PR:		kern/92589
Approved by:	glebius (mentor)
MFC after:	1 week
2006-02-03 23:03:07 +00:00
Gleb Smirnoff 05a2398f32 In vlan_config() first call vlan_inithash(), then lock mutex, because
vlan_inithash() calls malloc(M_WAITOK).
2006-02-02 22:11:38 +00:00
Christian S.J. Peron fa918e1ef7 define lock.h before rwlock.h for DEBUG_LOCKS 2006-02-02 20:33:10 +00:00
Paul Saab 19cf04981a Implement SIOCGIFCONF for 32bit binaries. 2006-02-02 19:58:37 +00:00
Christian S.J. Peron f5cdbcf14c Use PFIL_HOOKED macros in if_bridge and pass the right argument to
rw_assert. This un-breaks the build.

Submitted by:	Kostik Belousov
Pointy hat to:	csjp
2006-02-02 16:41:20 +00:00
Christian S.J. Peron 604afec496 Somewhat re-factor the read/write locking mechanism associated with the packet
filtering mechanisms to use the new rwlock(9) locking API:

- Drop the variables stored in the phil_head structure which were specific to
  conditions and the home rolled read/write locking mechanism.
- Drop some includes which were used for condition variables
- Drop the inline functions, and convert them to macros. Also, move these
  macros into pfil.h
- Move pfil list locking macros intp phil.h as well
- Rename ph_busy_count to ph_nhooks. This variable will represent the number
  of IN/OUT hooks registered with the pfil head structure
- Define PFIL_HOOKED macro which evaluates to true if there are any
  hooks to be ran by pfil_run_hooks
- In the IP/IP6 stacks, change the ph_busy_count comparison to use the new
  PFIL_HOOKED macro.
- Drop optimization in pfil_run_hooks which checks to see if there are any
  hooks to be ran, and returns if not. This check is already performed by the
  IP stacks when they call:

        if (!PFIL_HOOKED(ph))
                goto skip_hooks;

- Drop in assertion which makes sure that the number of hooks never drops
  below 0 for good measure. This in theory should never happen, and if it
  does than there are problems somewhere
- Drop special logic around PFIL_WAITOK because rw_wlock(9) does not sleep
- Drop variables which support home rolled read/write locking mechanism from
  the IPFW firewall chain structure.
- Swap out the read/write firewall chain lock internal to use the rwlock(9)
  API instead of our home rolled version
- Convert the inlined functions to macros

Reviewed by:	mlaier, andre, glebius
Thanks to:	jhb for the new locking API
2006-02-02 03:13:16 +00:00
Andrew Thompson 6637e0f390 Fix two bugs with the bridge
- code expects memcmp() to return a signed value, our memcmp() returns 0 if
   args are equal and > 0 if not.

 - It's possible to hijack interface for static entry. If bridge recieves
   packet from interface marked as learning it will replace the bridge_rtnode
   entry for the source address even if such entry marked as static.

Submitted by:	Gleb Kurtsov <k-gleb yandex.ru>
MFC after:	3 days
2006-01-31 21:21:28 +00:00
Yaroslav Tykhiy 64a17d2e86 Set IFF_BROADCAST and IFF_MULTICAST on vlan interfaces from the
beginning and simply refuse to attach to a parent without either
flag.

Our network stack cannot handle well IFF_BROADCAST or IFF_MULTICAST
on an interface changing on the fly.  E.g., IP will or won't assign
a broadcast address to an interface and join the all-hosts multicast
group on it depending on its IFF_BROADCAST and IFF_MULTICAST settings.
Should the flags alter later, IP will miss the change and keep using
bogus settings.  This can lead to evil things like supplying an
invalid broadcast address or trying to leave a multicast group that
hasn't been joined.  So just avoid touching the flags since an
interface was created.  This has no practical purpose.

Discussed with:	-net, glebius, oleg
MFC after:	1 week
2006-01-31 16:41:05 +00:00
Gleb Smirnoff 75ee267c22 Merge the //depot/user/yar/vlan branch into CVS. It contains some collective
work by yar, thompsa and myself. The checksum offloading part also involves
work done by Mihail Balikov.

The most important changes:

o   Instead of global linked list of all vlan softc use a per-trunk
  hash. The size of hash is dynamically adjusted, depending on
  number of entries. This changes struct ifnet, replacing counter
  of vlans with a pointer to trunk structure. This change is an
  improvement for setups with big number of VLANs, several interfaces
  and several CPUs. It is a small regression for a setup with a single
  VLAN interface.
    An alternative to dynamic hash is a per-trunk static array with
  4096 entries, which is a compile time option - VLAN_ARRAY. In my
  experiments the array is not an improvement, probably because such
  a big trunk structure doesn't fit into CPU cache.
o   Introduce an UMA zone for VLAN tags. Since drivers depend on it,
  the zone is declared in kern_mbuf.c, not in optional vlan(4) driver.
  This change is a big improvement for any setup utilizing vlan(4).
o   Use rwlock(9) instead of mutex(9) for locking. We are the first
  ones to do this! :)
o   Some drivers can do hardware VLAN tagging + hardware checksum
  offloading. Add an infrastructure for this. Whenever vlan(4) is
  attached to a parent or parent configuration is changed, the flags
  on vlan(4) interface are updated.

In collaboration with:	yar, thompsa
In collaboration with:	Mihail Balikov <mihail.balikov interbgc.com>
2006-01-30 13:45:15 +00:00
Gleb Smirnoff 25af0bb50e Add some initial locking to gif(4). It doesn't covers the whole driver,
however IPv4-in-IPv4 tunnels are now stable on SMP. Details:

- Add per-softc mutex.
- Hold the mutex on output.

The main problem was the rtentry, placed in softc. It could be
freed by ip_output(). Meanwhile, another thread being in
in_gif_output() can read and write this rtentry.

Reported by:	many
Tested by:	Alexander Shiryaev <aixp mail.ru>
2006-01-30 08:39:09 +00:00
Colin Percival 02d4ab93fb Make sure buffers in if_bridge are fully initialized before copying
them to userland.

Security:	FreeBSD-SA-06:06.kmem
2006-01-25 10:00:40 +00:00
Yaroslav Tykhiy 83ec464f61 Be consistent in checking ifa->ifa_addr for NULL.
Found by:	Coverity Prevent (tm)
MFC after:	3 days
2006-01-23 10:30:34 +00:00
Bjoern A. Zeeb 3f2e28fe9f Fix stack corruptions on amd64.
Vararg functions have a different calling convention than regular
functions on amd64. Casting a varag function to a regular one to
match the function pointer declaration will hide the varargs from
the caller and we will end up with an incorrectly setup stack.

Entirely remove the varargs from these functions and change the
functions to match the declaration of the function pointers.
Remove the now unnecessary casts.

Lots of explanations and help from:     peter
Reviewed by:                            peter
PR:                                     amd64/89261
MFC after:                              6 days
2006-01-21 10:44:34 +00:00
Andre Oppermann 5d691e6da8 Return mbuf pointer or NULL from ip_fastforward() as the mbuf pointer
may have changed by m_pullup() during fastforward processing.

While this is a bug it is actually never triggered in real world
situations and it is not remotely exploitable.

Found by:	Coverity Prevent(tm)
Coverity ID:	CID780
Sponsored by:	TCP/IP Optimization Fundraise 2005
2006-01-18 14:24:39 +00:00
Andrew Thompson 7c2fb83a0b Add code that clears certain capabilities from the member interface, these are
restored when its removed from the bridge.

At the moment we only clear IFCAP_TXCSUM. Since a locally generated packet on
the bridge may be sent out any one or more interfaces it cant be assumed that
every card does hardware csums. Most bridges don't generate a lot of traffic
themselves so turning off offloading won't hurt, bridged packets are
unaffected.

Tested by:	Bruce Walker (bmw borderware.com)
MFC after:	5 days
2006-01-14 03:51:31 +00:00
Robert Watson 3208581a15 Check the right ifnet pointer to see if if_alloc() failed or not in
ef_clone(); we were testing the original ifnet, not the one allocated.

When aborting ef_clone() due to if_alloc() failing, free the allocated
efnet structure rather than leaking it.

Noticed by:	Coverity Prevent analysis tool
MFC after:	3 days
2006-01-13 23:24:09 +00:00
Robert Watson ae7c484e82 When freeing the chain of if_ef devices on an aborted load, use
SLIST_FOREACH_SAFE() rather than SLIST_FOREACH(), as elements are
freed on each iteration of the loop.  This prevents use-after-free.

Noticed by:	Coverity Prevent analysis tool
MFC after:	3 days
2006-01-13 23:20:46 +00:00
Brooks Davis 118b438d73 Get rid of the bogus IFP2FC() macro and use IFP2FWC(). IFP2FC()
attempted to cast a struct ifnet to a struct fw_com which resulted in
data corruption.

PR:		kern/91307
Submitted by:	Alex Semenyaka <alex at semenyaka do ru>
MFC After:	6 days
2006-01-11 05:37:21 +00:00
Hartmut Brandt 154508976b Add a new leaf to the net.link.generic.ifdata.%d sysctl to retrieve
the name and unit number assigned by the driver. This is needed by
SNMP to find interfaces after they have been renamed.

MFC after:	4 weeks
2006-01-04 12:57:09 +00:00
Jung-uk Kim 142f81c25d Correctly check the filter length. I committed the wrong version.
Pointy hat to me.
2006-01-03 20:34:41 +00:00
Jung-uk Kim dccb7faff6 - Explicitly validate an empty filter to match bpf_filter() comment[1].
- Do not use BPF JIT compiler for an empty filter.

[1] Pointed out by:	darrenr
2006-01-03 20:26:03 +00:00
Andrew Thompson f0feaf4f19 Fix a brain-o in the last commit, the conditional was always false. 2006-01-02 23:02:43 +00:00
Andrew Thompson 94e45ae5e8 Reorganise bridge_rtupdate slightly to reduce duplication. 2006-01-02 22:44:54 +00:00
Andrew Thompson ef9ac7c49a Reset the route expiry time on each update rather than always letting them get
GC'd and recreated.
2006-01-02 22:29:41 +00:00
Andrew Thompson bc9f74c7cb It is better to use time_uptime here since it is monotonic.
Pointed out by:	glebius
2006-01-02 22:23:03 +00:00
Andrew Thompson ec311647fb Minor whitespace cleanup. 2006-01-02 09:50:34 +00:00
Andrew Thompson f595d62759 Read time_second directly rather than calling getmicrotime().
Obtained from:	DragonflyBSD
2006-01-02 09:36:53 +00:00
Andrew Thompson a47f91cdc4 When pfil(9) is enabled the bridge only considers ETHERTYPE_ARP, ETHERTYPE_IP and
ETHERTYPE_IPV6 frames. Change this to be a sysctl knob so that is able to still
bridge non-IP packets if desired.

Also return early if all pfil_* sysctls are turned off, the user obviously does
not want to filter on the bridge.
2005-12-29 09:39:15 +00:00
Sam Leffler a8af2cc7ce add a sysctl to turn debug msgs on/off when built with IFMEDIA_DEBUG 2005-12-25 23:28:23 +00:00
Oleg Bulyzhin c54c76cc2f 1) remove useless check of loop_copy - corresponding code was removed in
rev. 1.70 five years ago.
2) convert loop_copy to "non-negative" flag

Approved by:	glebius (mentor)
MFC after:	2 weeks
2005-12-22 12:16:20 +00:00
Andrew Thompson 73ff045c57 Add RFC 3378 EtherIP support. This change makes it possible to add gif
interfaces to bridges, which will then send and receive IP protocol 97 packets.
Packets are Ethernet frames with an EtherIP header prepended.

Obtained from:	NetBSD
MFC after:	2 weeks
2005-12-21 21:29:45 +00:00
Andrew Thompson 1e4200620a As of r1.21 all broadcast packets are reprocessed by ether_input as arriving on
the bridge, this caused these packets to show up twice via bpf. Do not process
them twice with BPF_TAP.

MFC after:	3 days
2005-12-21 09:39:59 +00:00
Gleb Smirnoff d147662cd3 - Fix VLAN_INPUT_TAG() macro, so that it doesn't touch mtag in
case if memory allocation failed.
- Remove fourth argument from VLAN_INPUT_TAG(), that was used
  incorrectly in almost all drivers. Indicate failure with
  mbuf value of NULL.

In collaboration with:	yongari, ru, sam
2005-12-18 18:24:27 +00:00
Andrew Thompson 9d5e4aa8b1 Use M_ZERO for the bridge_iflist to ensure there are no unexpected suprises. 2005-12-17 10:12:20 +00:00
Andrew Thompson 6b74382014 Minor whitespace cleanup. 2005-12-17 10:03:48 +00:00
Andrew Thompson e0a87e8acd Change from a callback in if_ethersubr to using EVENTHANDLER in order to detach
span ports when they disappear. The span port does not have a pointer to the
softc so revert r1.31 and bring back the softc linked-list.

MFC after:	2 weeks
2005-12-17 06:33:51 +00:00
Andrew Thompson 7536320f62 It is not safe to use m_copypacket() here as the returned mbuf is readonly,
change to m_dup and keep the alignment on the layer3 header.

MFC after:	1 week
2005-12-15 19:34:39 +00:00
Andrew Thompson 91f6764e93 Add support for creating span ports so that one can snoop bridged traffic
from another interface/machine/network.

Obtained from:	OpenBSD
MFC after:	2 weeks
2005-12-14 02:52:13 +00:00
Jung-uk Kim 200bc1f049 Do not accept an empty bpf program. 2005-12-08 00:05:03 +00:00
Jung-uk Kim 848c454cc1 Add BPF Just-In-Time compiler support for ng_bpf(4).
The sysctl is changed from net.bpf.jitter.enable to net.bpf_jitter.enable
and this controls both bpf(4) and ng_bpf(4) now.
2005-12-07 21:30:47 +00:00
Jung-uk Kim 6a96c4832f s/M_WAITOK/M_NOWAIT/ while mutex is held.
Pointed out by:	csjp
2005-12-06 07:22:01 +00:00
Jung-uk Kim ae275efcae Add experimental BPF Just-In-Time compiler for amd64 and i386.
Use the following kernel configuration option to enable:

	options BPF_JITTER

If you want to use bpf_filter() instead (e. g., debugging), do:

	sysctl net.bpf.jitter.enable=0

to turn it off.

Currently BIOCSETWF and bpf_mtap2() are unsupported, and bpf_mtap() is
partially supported because 1) no need, 2) avoid expensive m_copydata(9).

Obtained from:	WinPcap 3.1 (for i386)
2005-12-06 02:58:12 +00:00
Ruslan Ermilov 3238c6bd33 Fix -Wundef from compiling the amd64 LINT. 2005-12-04 10:06:06 +00:00
Ruslan Ermilov f4e9888107 Fix -Wundef. 2005-12-04 02:12:43 +00:00
Andrew Thompson 53b5c4604a The bridge is capable of sending broadcast packets so enable IFF_BROADCAST
Requested by:	des
2005-11-29 20:29:44 +00:00
Gleb Smirnoff 62f0bf3250 Take if_baudrate from the parent. This fixes problem with SNMP
daemons reporting zero speed for vlan(4) interfaces.
2005-11-28 12:46:35 +00:00
Ruslan Ermilov 434dbbb396 Fix the following bugs:
- In ifc_name2unit(), disallow leading zeroes in a unit.

  Exploit: ifconfig lo01 create

- In ifc_name2unit(), properly handle overflows.  Otherwise,
  either of two local panic()'s can occur, either because
  no interface with such a name could be found after it was
  successfully created, or because the code will bogusly
  assume that it's a wildcard (unit < 0 due to overflow).

  Exploit: ifconfig lo<overflowed_integer> create

- Previous revision made the following sequence trigger
  a KASSERT() failure in queue(3):

  Exploit: ifconfig lo0 destroy; ifconfig lo0 destroy

  This is because IFC_IFLIST_REMOVE() is always called
  before ifc->ifc_destroy() has been run, not accounting
  for the fact that the latter can fail and leave the
  interface operating (like is the case for "lo0").
  So we ended up calling LIST_REMOVE() twice.  We cannot
  defer IFC_IFLIST_REMOVE() until after a call to
  ifc->ifc_destroy() because the ifnet may have been
  removed and its memory has been freed, so recover from
  this by re-inserting the ifnet in the cloned interfaces
  list if ifc->ifc_destroy() indicates a failure.
2005-11-24 18:56:14 +00:00
Andre Oppermann 147f74d176 Purge layer specific mbuf flags on layer crossings to avoid confusing
upper or lower layers.

Sponsored by:	TCP/IP Optimization Fundraise 2005
2005-11-18 16:23:26 +00:00
Andrew Thompson 16e7e7d4bc Fix a second missed case where the refcount is not decremented.
MFC after:	3 days
2005-11-13 20:26:19 +00:00
Andrew Thompson bb4b5f54a5 Fix a mbuf and refcnt leak in the broadcast code.
If the packet is rejected from pfil(9) then continue the loop rather than
returning, this means that we can still try to send it out the remaining
interfaces but more importantly the mbuf is freed and refcount decremented on
exit.
2005-11-13 19:36:59 +00:00
Ruslan Ermilov 4a0d6638b3 - Store pointer to the link-level address right in "struct ifnet"
rather than in ifindex_table[]; all (except one) accesses are
  through ifp anyway.  IF_LLADDR() works faster, and all (except
  one) ifaddr_byindex() users were converted to use ifp->if_addr.

- Stop storing a (pointer to) Ethernet address in "struct arpcom",
  and drop the IFP2ENADDR() macro; all users have been converted
  to use IF_LLADDR() instead.
2005-11-11 16:04:59 +00:00
Ruslan Ermilov f0a2ef4889 Use the more appropriate ifnet_byindex() instead of ifaddr_byindex(). 2005-11-11 12:32:49 +00:00
Gleb Smirnoff d314617e8a Force this interface to be RUNNING. 2005-11-11 11:17:57 +00:00
Ruslan Ermilov d09ed26fd8 - Make IFP2ENADDR() a pointer to IF_LLADDR() rather than another
copy of Ethernet address.

- Change iso88025_ifattach() and fddi_ifattach() to accept MAC
  address as an argument, similar to ether_ifattach(), to make
  this work.
2005-11-11 07:36:14 +00:00
Ruslan Ermilov 303989a2f3 Use sparse initializers for "struct domain" and "struct protosw",
so they are easier to follow for the human being.
2005-11-09 13:29:16 +00:00
Andrew Thompson 4e7e0183e1 Move the cloned interface list management in to if_clone. For some drivers the
softc lists and associated mutex are now unused so these have been removed.

Calling if_clone_detach() will now destroy all the cloned interfaces for the
driver and in most cases is all thats needed to unload.

Idea by:	brooks
Reviewed by:	brooks
2005-11-08 20:08:34 +00:00
Gleb Smirnoff 6d3a3ab735 - Do not raise IFF_DRV_OACTIVE flag in vlan_start, because this
can lead to stalled interface
- Explain this fact in a comment.

Reviewed by:	rwatson, thompsa, yar
2005-11-06 19:43:04 +00:00
Andre Oppermann 34333b16cd Retire MT_HEADER mbuf type and change its users to use MT_DATA.
Having an additional MT_HEADER mbuf type is superfluous and redundant
as nothing depends on it.  It only adds a layer of confusion.  The
distinction between header mbuf's and data mbuf's is solely done
through the m->m_flags M_PKTHDR flag.

Non-native code is not changed in this commit.  For compatibility
MT_HEADER is mapped to MT_DATA.

Sponsored by:	TCP/IP Optimization Fundraise 2005
2005-11-02 13:46:32 +00:00
Andrew Thompson 1a2661371b If we have been called from ether_ifdetach() then do not try and clear the
promisc flag from the member interface, this is a no-op anyway since the
interface is disappearing. The driver may have already released
its resources such as miibus and this is likely to panic the kernel.

Submitted and tested by:	Wojciech A. Koszek
MFC after:			2 weeks
2005-10-23 22:30:07 +00:00
Christian S.J. Peron 57c1493b3a Before we export network interface data through the ifmibdata structure,
OR the flags bits with the driver managed status flags. This fixes an
issue where RUNNING flags would not be reported to processes, which
conflicts with the flags information provided by ifconfig(8).
2005-10-23 01:44:08 +00:00
Poul-Henning Kamp 2cccccddd4 Use new (inline) functions for calls into driver. 2005-10-16 20:44:18 +00:00
Andrew Thompson 4c84347939 Make four more functions static that were missed in the last commit. 2005-10-14 20:57:02 +00:00
Andrew Thompson 6b32f3d3f2 Change most of the bridge and stp funtions to static. This has highlighted
that the following funtions are not used, wrap in '#ifdef noused' for the
moment.

 bstp_enable_change_detection
 bstp_disable_change_detection
 bstp_set_bridge_priority
 bstp_set_port_priority
 bstp_set_path_cost
2005-10-14 10:38:12 +00:00
Andrew Thompson fd6238a659 Further clean up the bridge hooks in if_ethersubr.c and ng_ether.c
- move the function pointer definitions to if_bridgevar.h
- move most of the logic to the new BRIDGE_INPUT and BRIDGE_OUTPUT macros
- remove unneeded functions from if_bridgevar.h and sort a little.
2005-10-14 02:38:47 +00:00
Andrew Thompson 20a65f37a0 From 101 ways to panic your kernel.
Use bridge_ifdetach() to notify the bridge that a member has been detached. The
bridge can then remove it from its interface list and not try to send out via a
dead pointer.
2005-10-13 23:05:55 +00:00
Julian Elischer d0a2acd430 Consolidate two adjacent conditional blocks
I actually believe the code in question should be elsewhere (in the preceding
function).

MFC after:	1 week
2005-10-13 21:48:27 +00:00
Ruslan Ermilov 199474fd36 Remove a stale comment. 2005-10-13 17:26:14 +00:00
Andrew Thompson 9cff52f7f6 Clean up the if_bridge hooks a bit in if_ethersubr.c and ng_ether.c, move
the broadcast/multicast test to bridge_input().

Requested by:	glebius
2005-10-13 09:43:30 +00:00
Andrew Thompson febd0759f3 Change the reference counting to count the number of cloned interfaces for each
cloner. This ensures that ifc->ifc_units is not prematurely freed in
if_clone_detach() before the clones are destroyed, resulting in memory modified
after free. This could be triggered with if_vlan.

Assert that all cloners have been destroyed when freeing the memory.

Change all simple cloners to destroy their clones with ifc_simple_destroy() on
module unload so the reference count is properly updated. This also cleans up
the interface destroy routines and allows future optimisation.

Discussed with:	brooks, pjd, -current
Reviewed by:	brooks
2005-10-12 19:52:16 +00:00
Warner Losh 680d937a4b Be pedantic here: We're converting from network byte order to host
byte order in these cases.  This is a nop in terms of the generated
code, but is logically incorrect.

PR: 73852
2005-10-12 19:12:46 +00:00
Andrew Thompson 8eb8e358a0 Do not unconditionally set a spanning tree port to forwarding as the link may be
down when we attach. We wont get updated until a linkstate change happens.

Go via bstp_ifupdstatus() which checks the media status first.
2005-10-11 02:58:32 +00:00
Gleb Smirnoff 6512768b89 A deja vu of:
http://lists.freebsd.org/pipermail/cvs-src/2004-October/033496.html

The same problem applies to if_bridge(4), too.

- Copy-and-paste the if_bridge(4) related block from
  if_ethersubr.c to ng_ether.c
- Add XXXs, so that copy-and-paste would be noticed by
  any future editors of this code.
- Also add XXXs near if_bridge(4) declarations.

Silence from:	thompsa
2005-10-07 14:14:47 +00:00
Tai-hwa Liang 11e0838887 Fixing a boot time panic(when if_fwip is compiled into kernel) by renaming
module name to something that wouldn't conflict with
sys/dev/firewire/firewire.c.

Submitted by:	Cai, Quanqing <caiquanqing at gmail dot com>
PR:		kern/82727
MFC after:	3 days
2005-10-06 07:09:34 +00:00
Andrew Thompson 64465c6bd3 Fix KASSERT function name in ether_output, use __func__ while I am here. 2005-10-06 01:21:40 +00:00
Gleb Smirnoff f0796cd26c - Don't pollute opt_global.h with DEVICE_POLLING and introduce
opt_device_polling.h
- Include opt_device_polling.h into appropriate files.
- Embrace with HAVE_KERNEL_OPTION_HEADERS the include in the files that
  can be compiled as loadable modules.

Reviewed by:	bde
2005-10-05 10:09:17 +00:00
Christian S.J. Peron cb1d4f92ec Protect PID initializations for statistics by the bpf descriptor
locks. Also while we are here, protect the bpf descriptor during
knlist_remove{add} operations.

Discussed with:	rwatson
2005-10-04 15:06:10 +00:00
Robert Watson cea2165b10 Rename net.isr.enable to net.isr.dispatch.
No compatibility code is provided, as this will be the production name
as of 6.0.

MFC after:	3 days
Requested by:	scottl
2005-10-04 07:59:28 +00:00
Yaroslav Tykhiy 1cf236fb0c Improve handling flags that must be propagated
to the parent interface, such as IFF_PROMISC and
IFF_ALLMULTI.  In addition, vlan(4) gains ability
to migrate from one parent to another w/o losing
its own flags.

PR:		kern/81978
MFC after:	2 weeks
2005-10-03 02:24:21 +00:00
Yaroslav Tykhiy b5c8bd5924 Clean up consistency checks in if_setflag():
. use KASSERT for all checks so that the source of an error can be detected;
. use __func__ instead of spelling function name each time;
. fix a typo.
2005-10-03 02:14:51 +00:00
Yaroslav Tykhiy 7aebc5e86e Log a message about entering or leaving permanently promiscuous mode,
as it is done for usual promiscuous mode already.  This info is important
because promiscuous mode in the hands of a malicious party can jeopardize
the whole network.
2005-10-03 01:47:43 +00:00
Andrew Thompson d5edd47e8f Do not packet filter in the bridge_start() routine, locally generated packets
are already filtered by the higher layers.

Approved by:	mlaier (mentor)
MFC after:	3 days
2005-10-02 19:15:56 +00:00
Gleb Smirnoff 4092996774 Big polling(4) cleanup.
o Axe poll in trap.

o Axe IFF_POLLING flag from if_flags.

o Rework revision 1.21 (Giant removal), in such a way that
  poll_mtx is not dropped during call to polling handler.
  This fixes problem with idle polling.

o Make registration and deregistration from polling in a
  functional way, insted of next tick/interrupt.

o Obsolete kern.polling.enable. Polling is turned on/off
  with ifconfig.

Detailed kern_poll.c changes:
  - Remove polling handler flags, introduced in 1.21. The are not
    needed now.
  - Forget and do not check if_flags, if_capenable and if_drv_flags.
  - Call all registered polling handlers unconditionally.
  - Do not drop poll_mtx, when entering polling handlers.
  - In ether_poll() NET_LOCK_GIANT prior to locking poll_mtx.
  - In netisr_poll() axe the block, where polling code asks drivers
    to unregister.
  - In netisr_poll() and ether_poll() do polling always, if any
    handlers are present.
  - In ether_poll_[de]register() remove a lot of error hiding code. Assert
    that arguments are correct, instead.
  - In ether_poll_[de]register() use standard return values in case of
    error or success.
  - Introduce poll_switch() that is a sysctl handler for kern.polling.enable.
    poll_switch() goes through interface list and enabled/disables polling.
    A message that kern.polling.enable is deprecated is printed.

Detailed driver changes:
  - On attach driver announces IFCAP_POLLING in if_capabilities, but
    not in if_capenable.
  - On detach driver calls ether_poll_deregister() if polling is enabled.
  - In polling handler driver obtains its lock and checks IFF_DRV_RUNNING
    flag. If there is no, then unlocks and returns.
  - In ioctl handler driver checks for IFCAP_POLLING flag requested to
    be set or cleared. Driver first calls ether_poll_[de]register(), then
    obtains driver lock and [dis/en]ables interrupts.
  - In interrupt handler driver checks IFCAP_POLLING flag in if_capenable.
    If present, then returns.This is important to protect from spurious
    interrupts.

Reviewed by:	ru, sam, jhb
2005-10-01 18:56:19 +00:00
Max Laier b6de9e91bd Remove bridge(4) from the tree. if_bridge(4) is a full functional
replacement and has additional features which make it superior.

Discussed on:	-arch
Reviewed by:	thompsa
X-MFC-after:	never (RELENG_6 as transition period)
2005-09-27 18:10:43 +00:00
Andrew Thompson ef64cd1947 Fix an alignment panic my preserving the 2byte padding (ETHER_ALIGN) on our
copied mbuf, which keeps the IP header 32-bit aligned. This copied mbuf is
reinjected back into ether_input and off to the IP routines.

Reported and tested by:	Peter van Dijk
Approved by:		mlaier (mentor)
MFC after:		3 days
2005-09-22 01:46:11 +00:00
Gleb Smirnoff 2d7e9ead07 Several fixes to rt_setgate(), that fix problems with route changing:
- Rearrange code so that in a case of failure the affected
  route is not changed. Otherwise, a bogus rtentry will be
  left and later rt_check() can recurse on its lock. [1]
- Remove comment about protocol cloning.
- Fix two places where rtentry mutex was recursed on, because
  accessed via two different pointers, that were actually pointing
  to the same rtentry in some cases. [1]
- Return EADDRINUSE instead of bogus EDQUOT, in case when gateway
  uses the same route. [2]

Reported & tested by:	ps, Andrej Zverev <az inec.ru> [1]
PR:			kern/64090 [2]
2005-09-21 11:58:10 +00:00