1741 commits

Author SHA1 Message Date
Andre Oppermann 035ba19027 Undo a small optimization to bpf_mtap() introduced in rev. 1.95
which broke the correct handling of the BIOCGSEESENT flag in the bpf
listener.

PR:		kern/56441
Submitted by:	<vys at renet.ru>
MFC after:	3 days
2005-09-14 16:37:05 +00:00
Andre Oppermann 17a8471fcd Remove bogus semicolons at the end of the definitions of
'do { ... } while (0)' macros.

PR:		kern/83088
Submitted by:	<antoine.brodin at laposte.net>
2005-09-14 14:57:04 +00:00
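The hazard removed by this commit can be sketched in user space. A 'do { ... } while (0)' macro is meant to be terminated by the caller's own semicolon; if the definition already ends in one, the expansion inside an unbraced if/else gains an extra empty statement and the else no longer attaches to the if. COUNTER_BUMP and bump_if_positive are hypothetical names used only for illustration, not identifiers from the tree.

```c
#include <assert.h>

/*
 * Hypothetical macro in the corrected form: no semicolon after
 * "while (0)", so the caller supplies it.
 */
#define	COUNTER_BUMP(c, n) do {						\
	assert((n) >= 0);						\
	(c) += (n);							\
} while (0)

/*
 * If the macro definition ended in ';', the first branch would expand
 * to "do { ... } while (0);;" and the 'else' below would be a syntax
 * error.
 */
static int
bump_if_positive(int counter, int delta)
{

	if (delta > 0)
		COUNTER_BUMP(counter, delta);
	else
		COUNTER_BUMP(counter, 0);
	return (counter);
}
```

Calling bump_if_positive(5, 3) takes the first branch and adds the delta; a non-positive delta takes the else branch and leaves the counter unchanged.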
Robert Watson 0a53be4671 In netkqfilter(), return EINVAL instead of 1 (EPERM) when a filter type
is requested on a network interface file descriptor that is non-applicable.

MFC after:	3 days
2005-09-12 19:26:03 +00:00
Craig Rodrigues 6a3d26b2b7 Forward declare z_errmsg with static linkage since it is defined
with static linkage later in the file.  Eliminates GCC 4.0 error.
2005-09-11 16:13:02 +00:00
Christian S.J. Peron fe0fc7efe3 Protect interface and address lists using the appropriate mutex. These
locks were not acquired because the user buffers were not wired, thus it was
possible that SYSCTL_OUT could sleep, causing a number of different
problems such as lock ordering issues and deadlocks.

-Wire user supplied buffer to ensure SYSCTL_OUT will not sleep.
-Pickup ifnet locks to protect the list.
-Where applicable pickup address locks.
-Pickup radix node head locks.
-Remove splnet stubs
-Remove various comments about locking here, because they are no
 longer needed.

It is the hope that these changes will make sysctl_rtsock MP safe.

MFC after:	3 weeks
2005-09-10 15:12:24 +00:00
David E. O'Brien 5b1c0294e4 Forward declaring static variables as extern is invalid ISO-C. Now that
GCC can properly handle forward static declarations, do this properly.
2005-09-07 10:06:14 +00:00
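As a minimal sketch of the pattern described in this commit and in the z_errmsg commit above: a file-scope object that is defined static later in the file is forward declared static as well, so that the identifier has one linkage throughout the translation unit. The array name, size, and strings below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Forward declaration with static (internal) linkage.  Declaring the
 * array 'extern' here and 'static' below would give the same identifier
 * two different linkages, which ISO C does not allow.  The array size is
 * spelled out because a tentative definition with internal linkage must
 * have a complete type.
 */
static const char *const sketch_errmsg[3];

static const char *
sketch_strerror(size_t idx)
{

	assert(idx < 3);
	return (sketch_errmsg[idx]);
}

/* The actual definition, later in the same file. */
static const char *const sketch_errmsg[3] = {
	"ok",
	"stream end",
	"need dictionary",
};
```

The accessor can be written before the table it reads, which is the usual reason for wanting the forward declaration in the first place.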
Andrew Thompson 59280079d3 Add support for multicast to the bridge and allow inet6 addresses to be
assigned to the interface.

IPv6 auto-configuration is disabled. An IPv6 link-local address has a
link-local scope within one link; the spec is unclear for the bridge case and
auto-configuration may cause a scope violation.

An address can be assigned in the usual way:
  ifconfig bridge0 inet6 xxxx:...

Tested by:	bmah
Reviewed by:	ume (netinet6)
Approved by:	mlaier (mentor)
MFC after:	1 week
2005-09-06 21:11:59 +00:00
Christian S.J. Peron b75a24a075 Instead of caching the PID which opened the bpf descriptor, continuously
refresh the PID which has the descriptor open. The PID is refreshed in various
operations like ioctl(2), kevent(2) or poll(2). This produces more accurate
information about current bpf consumers. While we are here remove the bd_pcomm
member of the bpf stats structure because now that we have an accurate PID we
can look up the process name via the kern.proc.pid sysctl variable. This is the trick that
NetBSD decided to use to deal with this issue.

Special care needs to be taken when MFC'ing this change, as we have made a
change to the bpf stats structure. What will end up happening is we will leave
the bd_pcomm member in place but just mark it as unused. This way we keep the ABI
intact.

MFC after:	1 month
Discussed with:	Rui Paulo < rpaulo at NetBSD dot org >
2005-09-05 23:08:04 +00:00
Sam Leffler 62313e4c3f reclaim sbuf and clear lock on error in ifconf
Submitted by:	Ted Unangst
Reviewed by:	rwatson
MFC after:	3 days
2005-09-04 17:32:47 +00:00
Yaroslav Tykhiy eefbcf0e62 Use VLAN_TAG_VALUE() not only to read a dot1q tag
value from an m_tag, but also to set it.  This reduces
complex code duplication and improves its readability.

Alas, we shouldn't rename the macro to VLAN_TAG_LVALUE()
globally because that would cause pain for kernel module
port maintainers and vendors using FreeBSD as their codebase.
Added a clarifying comment instead.

Discussed with:	ru, glebius
X-MFC-After:	6.0-RELEASE (MFC is good just to reduce the diff)
2005-08-31 11:36:50 +00:00
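The idea of one macro serving for both reading and setting a tag value can be sketched as follows. The macro expands to a dereference, which is an lvalue, so it may appear on either side of an assignment. struct sketch_tag and SKETCH_TAG_VALUE are hypothetical stand-ins and are not the kernel's struct m_tag or VLAN_TAG_VALUE().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a tag header that is followed by payload. */
struct sketch_tag {
	struct sketch_tag *next;
	unsigned short	 id;
	unsigned short	 len;
};

/*
 * Expands to the unsigned int stored right after the header.  Because
 * the expansion is a dereference, it is an lvalue and can be used on
 * either side of an assignment.
 */
#define	SKETCH_TAG_VALUE(t)	(*(unsigned int *)((t) + 1))

static struct sketch_tag *
sketch_tag_alloc(unsigned int value)
{
	struct sketch_tag *t;

	t = calloc(1, sizeof(*t) + sizeof(unsigned int));
	if (t == NULL)
		return (NULL);
	t->len = sizeof(unsigned int);
	SKETCH_TAG_VALUE(t) = value;		/* write through the macro */
	assert(SKETCH_TAG_VALUE(t) == value);	/* read through the macro */
	return (t);
}
```

A reader-only name would hide the second use, which is presumably why the commit adds a clarifying comment instead of renaming the macro.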
Gleb Smirnoff ba26134b19 Fix fallout from revision 1.77, mark outgoing packets with M_VLANTAG flag.
PR:		kern/80646
Reviewed by:	yar
MFC after:	3 days
2005-08-30 14:14:08 +00:00
Andrew Thompson 68e84b98b2 Fix a panic in softclock() if the interface is destroyed with a bpf consumer
attached.

This is caused by bpf_detachd clearing IFF_PROMISC on the interface, which does
a SIOCSIFFLAGS ioctl. The problem here is that while the interface has been
stopped, IFF_UP has not been cleared, so the IFF_UP and IFF_DRV_RUNNING states
disagree. This causes the ioctl function to init() the interface, which resets
the callouts.

The destroy then completes and frees the softc but softclock will panic on a
dead callout pointer.

Ensure ifp->if_flags matches reality by clearing IFF_UP when we destroy.

Silence from:	rwatson
Approved by:	mlaier (mentor)
MFC after:	3 days
2005-08-27 01:17:42 +00:00
Robert Watson 7e994955ac De-spl parts of the routing socket code now generally protected
through locking; leave some spl references around code where there
are open questions about global variable references.  Also, add
an XXX regarding locking in sysctl.

MFC after:	3 days
2005-08-25 13:30:04 +00:00
Andrew Thompson dba31bdea1 The mtu check in bridge_enqueue is bogus as the maximum Ethernet frame is
actually 1514, so comparing the mbuf length which includes the Ethernet header
to the interface MTU is wrong.

The check was a little over the top so just remove it.

Approved by:	mlaier (mentor)
MFC after:	3 days
2005-08-23 19:49:00 +00:00
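The arithmetic behind this commit: a full-size Ethernet frame carries 1500 bytes of payload plus a 14-byte header, 1514 bytes in total, so comparing the whole frame length against the MTU rejects legitimate frames. The sketch below contrasts the two comparisons; the function and macro names are hypothetical, and the commit itself removed the check rather than correcting it.

```c
#include <assert.h>
#include <stddef.h>

#define	SKETCH_ETHER_HDR_LEN	14	/* dst + src + ethertype */
#define	SKETCH_ETHERMTU		1500	/* payload bytes, header excluded */

/* The removed style of check: frame length (header included) vs. MTU. */
static int
sketch_frame_exceeds_mtu(size_t framelen, size_t mtu)
{

	return (framelen > mtu);
}

/* Comparing like with like: payload length vs. MTU. */
static int
sketch_payload_exceeds_mtu(size_t framelen, size_t mtu)
{

	assert(framelen >= SKETCH_ETHER_HDR_LEN);
	return (framelen - SKETCH_ETHER_HDR_LEN > mtu);
}
```

For a 1514-byte frame and an MTU of SKETCH_ETHERMTU, the first function reports an oversize frame while the second does not.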
Max Laier 0bdf5171c8 Don't loop back packets that have been routed by pf. This fixes an endless
loop where the same packet is sent over and over again.

Obtained from:	OpenBSD
Reported by:	Sergey Lapin
Tested by:	Sergey Lapin
MFC after:	7 days
2005-08-23 14:13:17 +00:00
Christian S.J. Peron 93e39f0b93 Introduce two new ioctl(2) commands, BIOCLOCK and BIOCSETWF. These commands
enhance the security of bpf(4) by further relinquishing the privilege of
the bpf(4) consumer (assuming the ioctl commands are being implemented).

Once BIOCLOCK is executed, the device becomes locked which prevents the
execution of ioctl(2) commands which can change the underlying parameters of the
bpf(4) device. An example might be the setting of bpf(4) filter programs or
attaching to different network interfaces.

BIOCSETWF can be used to set write filters for outgoing packets. Currently if
a bpf(4) consumer is compromised, the bpf(4) descriptor can essentially be used
as a raw socket, regardless of consumer's UID. Write filters give users the
ability to constrain which packets can be sent through the bpf(4) descriptor.

These features are currently implemented by a couple programs which came from
OpenBSD, such as the new dhclient and pflogd.

-Modify bpf_setf(9) to accept a "cmd" parameter. This will be used to specify
 whether a read or write filter is to be set.
-Add a bpf(4) filter program as a parameter to bpf_movein(9) as we will run the
 filter program on the mbuf data once we move the packet in from user-space.
-Rather than execute two uiomove operations, (one for the link header and the
 other for the packet data), execute one and manually copy the link header
 into the sockaddr structure via bcopy.
-Restructure bpf_setf to compensate for write filters, as well as read.
-Adjust bpf(4) stats structures to include a bd_locked member.

It should be noted that the FreeBSD and OpenBSD implementations differ a bit in
the sense that we unconditionally enforce the lock, where OpenBSD enforces it
only if the calling credential is not root.

Idea from:	OpenBSD
Reviewed by:	mlaier
2005-08-22 19:35:48 +00:00
Christian S.J. Peron 4ddfb5312a Add missing braces around bpf_filter which were missed when I
merged the bpfstat code.

Pointed out by:	iedowse
Pointy hat to:	csjp
MFC after:	3 days
2005-08-18 22:30:52 +00:00
Andrew Thompson 23e7643185 Mark the callouts as MPSAFE as if_bridge has been giant-free since day 1.
Use the SMP friendly callout_init_mtx() while we are here.

Approved by:	mlaier (mentor)
MFC after:	3 days
2005-08-18 20:17:00 +00:00
Brooks Davis dc7c539e33 When we started calling if_findindex() from if_alloc() with an empty
struct ifnet most of if_findindex() became a complex no-op.  Remove it
and replace it with a corrected version of the four line for loop it
devolved to plus some error handling.  This should probably be replaced
with subr_unit at some point.

Switch from checking ifaddr_byindex to ifnet_byindex when looking for
empty indexes.  Since we're doing this from if_alloc/if_free, we can
only be sure that ifnet_byindex will be correct.  This fixes panics when
loading the ef(4) module.  The panics were caused by the fact that
if_alloc was called four times before if_attach was called and thus
ifaddr_byindex was not set and the same unit was allocated again.  This
in turn caused the first if_attach to fail because the ifp was not the
one in ifnet_byindex(ifp->if_index).

Reported by:	"Wojciech A. Koszek" <dunstan at freebsd dot czest dot pl>
PR:		kern/84987
MFC After:	1 day
2005-08-18 18:36:40 +00:00
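A rough user-space sketch of the "short for loop plus error handling" idea: scan the index table for the first empty slot and claim it at allocation time, so that a second allocation before attach cannot receive the same index. The table, its size, and the convention that index 0 is never handed out are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define	SKETCH_MAXINDEX	8

/* Hypothetical index table: slot i is NULL when index i is unused. */
static void *sketch_ifnet_byindex[SKETCH_MAXINDEX + 1];

/*
 * Return the lowest free index, or -1 if the table is full.
 * Index 0 is never handed out (an assumption of this sketch).
 */
static int
sketch_alloc_index(void *ifp)
{
	int idx;

	assert(ifp != NULL);
	for (idx = 1; idx <= SKETCH_MAXINDEX; idx++) {
		if (sketch_ifnet_byindex[idx] == NULL) {
			/* Claim the slot at allocation time, not at attach. */
			sketch_ifnet_byindex[idx] = ifp;
			return (idx);
		}
	}
	return (-1);
}
```

Because the slot is filled by the allocator itself, several allocations in a row receive distinct indexes even if no attach happens in between.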
Brooks Davis 7cf30146f0 - Move IF_ADDR_LOCK_DESTROY(ifp) from if_free to if_free_type.
- Add a note that additions should be made to if_free_type and not
  if_free to help avoid this in the future.

This apparently fixes a use after free in if_bridge and may fix bugs
in other direct if_free_type consumers.

Reported by:	thompsa
2005-08-16 17:02:35 +00:00
Brooks Davis f3447eb493 Vlan interfaces change their type after ether_ifattach() so we need to
use if_free_type(ifp, IFT_ETHER) to delete them and stop leaking struct
arpcoms.

Reported by:	thompsa
MFC After:	3 days
2005-08-15 20:27:34 +00:00
Andrew Thompson 691cdb5351 Ensure that we are holding the lock when initialising the bridge interface. We
could initialise while unlocked: if the bridge is not up when setting the inet
address, ether_ioctl() would call bridge_init.

Change it so bridge_init is always called unlocked and then locks before
calling bstp_initialization().

Reported by:    Michal Mertl
Approved by:    mlaier (mentor)
MFC after:      3 days
2005-08-15 02:54:29 +00:00
Andrew Thompson a1c0fd4dee Ensure that we are holding the lock when initialising the bridge interface. We
could initialise while unlocked: if the bridge is not up when setting the inet
address, ether_ioctl() would call bridge_init.

Change it so bridge_init is always called unlocked and then locks before
calling bstp_initialization().

Reported by:	Michal Mertl
Approved by:	mlaier (mentor)
MFC after:	3 days
2005-08-15 02:50:13 +00:00
Gleb Smirnoff 00ff5c4778 Axe ppp_for_tty(). Use tty->t_lsc pointer to store sc. This
also eliminates recursive use of ppp_softc_list_mtx.

PR:		kern/84686
Reviewed by:	phk
MFC after:	1 week
2005-08-12 08:27:15 +00:00
Gleb Smirnoff 791888619d o To prevent a race between RTM_DELETE message and
  arptimer() deleting stale entry, we need to lock
  rtentry before unlocking radix head.

Reviewed by:	sam
2005-08-11 08:26:31 +00:00
Gleb Smirnoff 530f95fc08 o Make rt_check() function more strict:
  - rt0 passed to rt_check() must not be NULL, assert this.
  - rt returned by rt_check() must be valid locked rtentry,
    if no error occurred.
o Modify callers, so that they never pass NULL rt0
  to rt_check().

Reviewed by:	sam, ume (nd6.c)
2005-08-11 08:14:53 +00:00
Robert Watson fc57457045 For each interface flag, indicate whether or not it is owned by the
device driver, owned by the network stack, or initialized by the device
driver before attach and read-only from then on.

Not all device drivers and network stack components currently follow
these rules, especially with respect to IFF_UP, and a few exceptions
with IFF_ALLMULTI.

MFC after:	7 days
2005-08-09 12:56:20 +00:00
Robert Watson 13f4c340ae Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags.  Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags.  This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by:	pjd, bz
MFC after:	7 days
2005-08-09 10:20:02 +00:00
Robert Watson 292ee7be1c Rename IFF_RUNNING to IFF_DRV_RUNNING, IFF_OACTIVE to IFF_DRV_OACTIVE,
and move both flags from ifnet.if_flags to ifnet.if_drv_flags, making
and documenting the locking of these flags the responsibility of the
device driver, not the network stack.  The flags for these two fields
will be mutually exclusive so that they can be exposed to user space as
though they were stored in the same variable.

Provide #defines to provide the old names #ifndef _KERNEL, so that user
applications (such as ifconfig) can use the old flag names.  Using the
old names in a device driver will result in a compile error in order to
help device driver writers adopt the new model.

When exposing the interface flags to user space, via interface ioctls
or routing sockets, OR the two fields together.  Since the driver flags
cannot currently be set from user space, no new logic is currently
required to handle this case.

Add some assertions that general purpose network stack routines, such
as if_setflags(), are not improperly used on driver-owned flags.

With this change, a large number of very minor network stack races are
closed, subject to correct device driver locking.  Most were likely
never triggered.

Driver sweep to follow; many thanks to pjd and bz for the line-by-line
review they gave this patch.

Reviewed by:	pjd, bz
MFC after:	7 days
2005-08-09 10:16:17 +00:00
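The split described in this commit can be sketched as two flag words with disjoint bit sets that are combined only when a single value is presented to user space. The structure and names below are hypothetical; the bit values follow the traditional IFF_* layout but should be treated as illustrative.

```c
#include <assert.h>

/* Bit values mirror the traditional IFF_* layout; treat as illustrative. */
#define	SKETCH_IFF_UP		0x0001	/* owned by the network stack */
#define	SKETCH_IFF_DRV_RUNNING	0x0040	/* owned by the driver */
#define	SKETCH_IFF_DRV_OACTIVE	0x0400	/* owned by the driver */
#define	SKETCH_DRV_MASK		(SKETCH_IFF_DRV_RUNNING | SKETCH_IFF_DRV_OACTIVE)

struct sketch_ifnet {
	int	if_flags;	/* stack-owned bits */
	int	if_drv_flags;	/* driver-owned bits */
};

/* The view handed to user space: both fields ORed together. */
static int
sketch_user_flags(const struct sketch_ifnet *ifp)
{

	/* The two bit sets must stay disjoint for the OR to be lossless. */
	assert((ifp->if_flags & SKETCH_DRV_MASK) == 0);
	assert((ifp->if_drv_flags & ~SKETCH_DRV_MASK) == 0);
	return (ifp->if_flags | ifp->if_drv_flags);
}
```

Keeping the bit sets mutually exclusive is what lets user applications keep seeing one flags word even though the kernel now stores and locks the two halves separately.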
Gleb Smirnoff 9bd8ca3014 In preparation for fixing races in ARP (and probably in other
L2/L3 mappings) make rt_check() return a locked rtentry.
2005-08-09 08:39:56 +00:00
Andrew Thompson 3155122ec2 Use m_copypacket() which is an optimization of the common case
m_copym(m, 0, M_COPYALL, how).

This is required for strict alignment architectures where we align the IP
header in the input path but m_copym() will create an unaligned copy in
bridge_broadcast(). m_copypacket() preserves alignment of the first mbuf.

Noticed by:	Petri Simolin
Approved by:	mlaier (mentor)
MFC after:	3 days
2005-08-08 22:21:55 +00:00
Robert Watson 6a113b3de7 Merge the dev_clone and dev_clone_cred event handlers into a single
event handler, dev_clone, which accepts a credential argument.
Implementors of the event can ignore it if they're not interested,
and most do.  This avoids having multiple event handler types and
fall-back/precedence logic in devfs.

This changes the kernel API for /dev cloning, and may affect third
party packages containing cloning kernel modules.

Requested by:	phk
MFC after:	3 days
2005-08-08 19:55:32 +00:00
Sam Leffler 456d182d5b destroy lock _before_ free'ing the structure it resides in
2005-08-06 18:42:01 +00:00
John Baldwin 6da3131abd Initialize the if_addr mutex in if_alloc() rather than waiting until
if_attach().  This allows ethernet drivers to use it in their routines
to program their MAC filters before ether_ifattach() is called (de(4) is
one such driver).  Also, the if_addr mutex is destroyed in if_free()
rather than if_detach(), so there was another potential bug in that a
driver that failed during attach and called if_free() without having
called ether_ifattach() would have tried to destroy an uninitialized mutex.

Reported by:	Holm Tiffe holm at freibergnet dot de
Discussed with:	rwatson
2005-08-04 14:39:47 +00:00
Robert Watson c3b31afd92 Protect link layer network interface multicast address list manipulation
using ifp->if_addr_mtx:

- Initialize if_addr_mtx when ifnet is initialized.

- Destroy if_addr_mtx when ifnet is torn down.

- Rename ifmaof_ifpforaddr() to if_findmulti(); assert if_addr_mtx.
  Staticize.

- Extract ifmultiaddr allocation and initialization into if_allocmulti();
  accept a 'mflags' argument to indicate whether or not sleeping is
  permitted.  This centralizes error handling and address duplication.

- Extract ifmultiaddr tear-down and deallocation in if_freemulti().

- Re-structure if_addmulti() to hold if_addr_mtx around manipulation of
  the ifnet multicast address list and reference count manipulation.
  Make use of non-sleeping allocations.  Annotate the fact that we only
  generate routing socket events for explicit address addition, not
  implicit link layer address addition.

- Re-structure if_delmulti() to hold if_addr_mtx around manipulation of
  the ifnet multicast address list and reference count manipulation.
  Annotate the lack of a routing socket event for implicit link layer
  address removal.

- De-spl all and sundry.

Problem reported by:	Ed Maste <emaste at phaedrus dot sandvine dot ca>
MFC after:		1 week
2005-08-02 23:23:26 +00:00
Robert Watson 09df718e0e When allocating link layer ifnet address list entries in
ifp->if_resolvemulti(), do so with M_NOWAIT rather than M_WAITOK, so
that a mutex can be held over the call.  In the FDDI code, add a
missing M_ZERO.  Consumers are already aware that if_resolvemulti()
can fail.

MFC after:	1 week
2005-08-02 17:52:52 +00:00
Robert Watson de6073aab0 Add if_addr_mtx to struct ifnet, a mutex to protect ifnet-related address
lists.  Add accessor macros.

This changes the size of struct ifnet, but ideally, all ifnet consumers
are now using if_alloc() to allocate these structures rather than
embedding them into device driver softc's, so this won't modify the
network device driver ABI.

MFC after:	1 week
2005-08-02 17:43:35 +00:00
Bjoern A. Zeeb 9e669156d4 Add support for IPv6 over GRE [1]. PR kern/80340 includes the
FreeBSD specific ip_newid() changes NetBSD does not have.
Correct handling of non AF_INET packets passed to bpf [2].

PR:		kern/80340[1], NetBSD PRs 29150[1], 30844[2]
Obtained from:	NetBSD ip_gre.c rev. 1.34,1.35, if_gre.c rev. 1.56
Submitted by:	Gert Doering <gert at greenie.muc.de>[2]
MFC after:	4 days
2005-08-01 08:14:21 +00:00
Christian S.J. Peron 422a63da6e Rather than hold a mutex over calls to SYSCTL_OUT allocate a
temporary buffer then pass the array to user-space once we have
dropped the lock.

While we are here, drop an assertion which could result in a
kernel panic under certain race conditions.

Pointed out by:	rwatson
2005-07-26 17:21:56 +00:00
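The approach in this commit, copying the protected data into a temporary buffer while locked and handing it to user space only after unlocking, can be sketched as below. The lock here is a hypothetical stand-in that merely records whether it is held, and sketch_copyout() stands in for SYSCTL_OUT(); neither is the kernel interface.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins: a lock that only records whether it is held. */
struct sketch_lock { int held; };
static struct sketch_lock sketch_list_lock;
static int sketch_list[4] = { 10, 20, 30, 40 };

/* Stand-in for SYSCTL_OUT(): may sleep, so no lock may be held. */
static int
sketch_copyout(void *dst, const void *src, size_t len)
{

	assert(!sketch_list_lock.held);
	memcpy(dst, src, len);
	return (0);
}

static int
sketch_export(int *userbuf, size_t n)
{
	int *tmp, error;

	assert(n > 0 && n <= 4);
	tmp = malloc(n * sizeof(*tmp));		/* allocate before locking */
	if (tmp == NULL)
		return (-1);
	sketch_list_lock.held = 1;		/* lock */
	memcpy(tmp, sketch_list, n * sizeof(*tmp));
	sketch_list_lock.held = 0;		/* unlock */
	error = sketch_copyout(userbuf, tmp, n * sizeof(*tmp));
	free(tmp);
	return (error);
}
```

The snapshot may be slightly stale by the time it reaches the caller, which is the usual trade-off for not sleeping with a mutex held.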
Hajimu UMEMOTO a1f7e5f8ee scope cleanup. with this change
- most of the kernel code will not care about the actual encoding of
  scope zone IDs and won't touch "s6_addr16[1]" directly.
- similarly, most of the kernel code will not care about link-local
  scoped addresses as a special case.
- scope boundary check will be stricter.  For example, the current
  *BSD code allows a packet with src=::1 and dst=(some global IPv6
  address) to be sent outside of the node, if the application does:
    s = socket(AF_INET6);
    bind(s, "::1");
    sendto(s, some_global_IPv6_addr);
  This is clearly wrong, since ::1 is only meaningful within a single
  node, but the current implementation of the *BSD kernel cannot
  reject this attempt.

Submitted by:	JINMEI Tatuya <jinmei__at__isl.rdc.toshiba.co.jp>
Obtained from:	KAME
2005-07-25 12:31:43 +00:00
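The stricter boundary check in the ::1 example above can be approximated in user space with the standard address-classification macros. This is a sketch of the rule only (a loopback source may not be paired with a non-loopback destination); the function name and return convention are hypothetical, and 2001:db8::/32 is the documentation prefix used here as a stand-in for "some global IPv6 address".

```c
#include <assert.h>
#include <stddef.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/*
 * Hypothetical check.  Returns 1 if the source/destination pair crosses
 * the node-local scope boundary, 0 if it is acceptable, and -1 if either
 * string is not a valid IPv6 address.
 */
static int
sketch_scope_violation(const char *src, const char *dst)
{
	struct in6_addr s, d;

	assert(src != NULL && dst != NULL);
	if (inet_pton(AF_INET6, src, &s) != 1 ||
	    inet_pton(AF_INET6, dst, &d) != 1)
		return (-1);
	/* ::1 is only meaningful within a single node. */
	return (IN6_IS_ADDR_LOOPBACK(&s) && !IN6_IS_ADDR_LOOPBACK(&d));
}
```

With this rule, the bind-to-::1 then send-to-global sequence from the commit message is flagged, while loopback-to-loopback and global-to-global traffic passes.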
Andrew Thompson 39bb2fca46 We check that all the member interfaces have the same MTU on attach to the
bridge but the interface can still be changed afterwards.

This falls under the 'don't do that' category but log a warning when INVARIANTS
is defined.

Approved by:	mlaier (mentor)
MFC after:	3 days
2005-07-25 02:22:37 +00:00
Christian S.J. Peron 69f7644bc9 Introduce new sysctl variable: net.bpf.stats. This sysctl variable can
be used to pass statistics regarding dropped, matched and received
packet counts from the kernel to user-space. While we are here
introduce a new counter for filtered or matched packets. We currently
keep track of packets received or dropped by the bpf device, but not
how many packets actually matched the bpf filter.

-Introduce net.bpf.stats sysctl OID
-Move sysctl variables after the function prototypes so we can
 reference bpf_stats_sysctl(9) without build errors.
-Introduce bpf descriptor counter which is used mainly for sizing
 of the xbpf_d array.
-Introduce a xbpf_d structure which will act as an external
 representation of the bpf_d structure.
-Add the following members to the bpfd structure:

	bd_fcount	- Number of packets which matched bpf filter
	bd_pid		- PID which opened the bpf device
	bd_pcomm	- Process name which opened the device.

It should be noted that it's possible that the process which opened
the device could be long gone at the time of stats collection. An
example might be a process that opens the bpf device forks then exits
leaving the child process with the bpf fd.

Reviewed by:	mdodd
2005-07-24 17:21:17 +00:00
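The relationship between the three counters can be sketched as follows: every packet seen is counted as received, only those passing the filter are counted as matched, and a matched packet that cannot be stored is additionally counted as dropped. The structure and function are hypothetical, and the assumption that drops are only counted for matched packets is part of the sketch rather than something stated in the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-descriptor counters, loosely modelled on the text. */
struct sketch_bpf_stats {
	unsigned long	rcount;	/* packets seen by the descriptor */
	unsigned long	fcount;	/* packets that matched the filter */
	unsigned long	dcount;	/* matched packets lost for lack of buffer */
};

static void
sketch_tap(struct sketch_bpf_stats *st, int matched, int buffer_full)
{

	assert(st != NULL);
	st->rcount++;
	if (!matched)
		return;
	st->fcount++;
	if (buffer_full)
		st->dcount++;
}
```

Under these assumptions fcount never exceeds rcount, and the gap between the two shows how selective the installed filter is.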
Robert Watson 638ccea02a Allocate one of the spare ifnet integer fields to hold if_drv_flags,
which in the future will hold IFF_OACTIVE and IFF_RUNNING, and have
its access synchronized by the device driver rather than the
protocol stack.  This will avoid potential races in the management
of flags in if_flags.

Discussed with:	various (scottl, jhb, ...)
MFC after:	1 week
2005-07-21 22:01:06 +00:00
Poul-Henning Kamp 514bcb8955 Add some KASSERTS to catch null pointers.
2005-07-21 09:00:51 +00:00
Andrew Thompson 12b47243c6 Clear the PROMISC flag from the vlan interface when we remove a member. We
checked for IFT_L2VLAN in bridge_ioctl_add() but not bridge_delete_member().

Approved by:	mlaier (mentor)
2005-07-20 19:42:51 +00:00
Robert Watson 2432c31c8b In multicast routines:
Compare pointers with NULL rather than treating them as booleans.

Compare pointers with NULL rather than 0 to make it more clear
they are pointers.

Assign pointers value of NULL rather than 0 to make it more clear
they are pointers.

MFC after:	3 days
2005-07-19 10:12:58 +00:00
Robert Watson d8d5b10e84 Rename equal() macro to sa_equal(), which matches the definitions
of sa_equal() in other files, and makes it more clear what equal()
is comparing.

MFC after:	3 days
2005-07-19 10:03:47 +00:00
Robert Watson f002340544 Lock down netnatm and mark as MPSAFE:
- Introduce a subsystem mutex, natm_mtx, manipulated with accessor macros
  NATM_LOCK_INIT(), NATM_LOCK(), NATM_UNLOCK(), NATM_LOCK_ASSERT().  It
  protects the consistency of pcb-related data structures.  Finer grained
  locking is possible, but should be done in the context of specific
  measurements (as very little work is done in netnatm -- most is in the
  ATM device driver or socket layer, so there's probably not much
  contention).

- Remove GIANT_REQUIRED, mark as NETISR_MPSAFE, remove
  NET_NEEDS_GIANT("netnatm").

- Conditionally acquire Giant when entering network interfaces for
  ifp->if_ioctl() using IFF_LOCKGIANT(ifp)/IFF_UNLOCKGIANT(ifp) in order
  to coexist with non-MPSAFE atm ifnet drivers.

- De-spl.

MFC after:	2 weeks
Reviewed by:	harti, bms (various versions)
2005-07-18 16:55:46 +00:00
George V. Neville-Neil ba7be0a934 Fix for PR 82974. We were not checking that the route looked up in
the case of an RTM_CHANGE was specific, i.e. that it matched completely.  This
led to a route change of a non-existent route changing the default route
as the radix code would simply back track to that point and hand that
route back to the routing socket code.

PR: 82974
Reviewed by: Tai-hwa Liang <avatar@mmlab.cse.yzu.edu.tw>
             Ben Kaduk <minimarmot@gmail.com>
             Bjoern A. Zeeb <bzeeb-lists@lists.zabbadoz.net>
Obtained from:	OpenBSD with modifications.
MFC after: 2 weeks
2005-07-15 09:18:34 +00:00
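The failure mode and the fix can be sketched with a toy routing table. A longest-prefix lookup for a destination that has no route of its own falls back to the default route; a change request must therefore verify that the entry returned is exactly the one asked for before modifying it. The table layout, function names, and the use of ESRCH as the error are assumptions of this sketch, not the kernel's radix code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical route entry: IPv4 prefix, mask, and a gateway id. */
struct sketch_rt {
	uint32_t dst;
	uint32_t mask;
	int	 gw;
};

static struct sketch_rt sketch_table[] = {
	{ 0x00000000, 0x00000000, 1 },	/* default route */
	{ 0x0a000000, 0xff000000, 2 },	/* 10.0.0.0/8 */
};

/* Longest-prefix match: falls back to the default route. */
static struct sketch_rt *
sketch_lookup(uint32_t addr)
{
	struct sketch_rt *best = NULL;
	size_t i;

	for (i = 0; i < sizeof(sketch_table) / sizeof(sketch_table[0]); i++) {
		struct sketch_rt *rt = &sketch_table[i];

		if ((addr & rt->mask) == rt->dst &&
		    (best == NULL || rt->mask > best->mask))
			best = rt;
	}
	return (best);
}

/* Change a route only if the lookup returned exactly that route. */
static int
sketch_change(uint32_t dst, uint32_t mask, int newgw)
{
	struct sketch_rt *rt = sketch_lookup(dst);

	assert(newgw > 0);
	if (rt == NULL || rt->dst != dst || rt->mask != mask)
		return (ESRCH);	/* errno choice is an assumption */
	rt->gw = newgw;
	return (0);
}
```

Without the exact-match test, a change request for the non-existent 192.168.0.0/16 would silently rewrite the default route's gateway.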
Max Laier 52023244de Move eventhandler for 'ifnet_departure_event' to the end of the process.
Some of the (IPv6) cleanup functions send packets to inform peers of the
departure.  These packets confused users of ifnet_departure_event (pf at the
moment).

PR:		kern/80627
Tested by:	Divacky Roman
MFC after:	1 week
2005-07-14 20:26:43 +00:00