Commit graph

40 commits

Author SHA1 Message Date
Warner Losh 95ee2897e9 sys: Remove $FreeBSD$: two-line .h pattern
Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/
2023-08-16 11:54:11 -06:00
Alexander Motin 7f16b501e2 GEOM: Introduce partial confxml API
Traditionally the GEOM's primary channel of information from kernel to
user-space was confxml, fetched by libgeom through kern.geom.confxml
sysctl.  It is convenient and informative, representing full state of
GEOM in a single XML document.  But problems start to arise on systems
with hundreds of disks, where the full confxml size reaches many
megabytes, taking significant time to first write it and then parse.

This patch introduces alternative solution, allowing to fetch much
smaller XML document, subset of the full confxml, limited to 64KB and
representing only one specified geom and optionally its parents.  It
uses existing GEOM control interface, extended with new "getxml" verb.
In case of any error, such as the buffer overflow, it just transparently
falls back to traditional full confxml.  This patch uses the new API in
user-space GEOM tools where it is possible.

Reviewed by:	imp
MFC after:	2 month
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D34529
2022-03-12 11:55:52 -05:00
Alexander Motin 67c58cd729 GEOM: Remove g_wait_sim.
It seems never been used since addition.
2022-01-29 22:12:43 -05:00
Alexander Motin ffc1cc95e7 GEOM: Relax direct dispatch for GEOM threads.
The only cases when direct dispatch does not make sense is for I/O
submission from down thread and for completion from up thread.  In
all other cases, if both consumer and producer are OK about it, we
can save on context switches.

MFC after:	2 weeks
2022-01-28 14:21:21 -05:00
Alexander Motin c4c88d4718 Remove duplicate g_debugflags declaration.
While there, define G_F_FOOTSHOOTING instead of numeric constants.

MFC after:	13 days
X-MFX-with:	r355412
2019-12-05 15:07:32 +00:00
Alexander Motin 49ee0fcea5 Use sbuf_cat() in GEOM confxml generation.
When it comes to megabytes of text, difference between sbuf_printf() and
sbuf_cat() becomes substantial.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2019-06-19 15:36:02 +00:00
Pedro F. Giffuni 3728855a0f sys/geom: adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
2017-11-27 15:17:37 +00:00
Alexander Motin 7ae1a87bfe Escape special XML chars, returned by some devices, confusing XML parsers.
MFC after:	1 month
2013-11-27 14:25:06 +00:00
Alexander Motin 40ea77a036 Merge GEOM direct dispatch changes from the projects/camlock branch.
When safety requirements are met, it allows to avoid passing I/O requests
to GEOM g_up/g_down thread, executing them directly in the caller context.
That allows to avoid CPU bottlenecks in g_up/g_down threads, plus avoid
several context switches per I/O.

The defined now safety requirements are:
 - caller should not hold any locks and should be reenterable;
 - callee should not depend on GEOM dual-threaded concurency semantics;
 - on the way down, if request is unmapped while callee doesn't support it,
   the context should be sleepable;
 - kernel thread stack usage should be below 50%.

To keep compatibility with GEOM classes not meeting above requirements
new provider and consumer flags added:
 - G_CF_DIRECT_SEND -- consumer code meets caller requirements (request);
 - G_CF_DIRECT_RECEIVE -- consumer code meets callee requirements (done);
 - G_PF_DIRECT_SEND -- provider code meets caller requirements (done);
 - G_PF_DIRECT_RECEIVE -- provider code meets callee requirements (request).
Capable GEOM class can set them, allowing direct dispatch in cases where
it is safe.  If any of requirements are not met, request is queued to
g_up or g_down thread same as before.

Such GEOM classes were reviewed and updated to support direct dispatch:
CONCAT, DEV, DISK, GATE, MD, MIRROR, MULTIPATH, NOP, PART, RAID, STRIPE,
VFS, ZERO, ZFS::VDEV, ZFS::ZVOL, all classes based on g_slice KPI (LABEL,
MAP, FLASHMAP, etc).

To declare direct completion capability disk(9) KPI got new flag equivalent
to G_PF_DIRECT_SEND -- DISKFLAG_DIRECT_COMPLETION.  da(4) and ada(4) disk
drivers got it set now thanks to earlier CAM locking work.

This change more then twice increases peak block storage performance on
systems with manu CPUs, together with earlier CAM locking changes reaching
more then 1 million IOPS (512 byte raw reads from 16 SATA SSDs on 4 HBAs to
256 user-level threads).

Sponsored by:	iXsystems, Inc.
MFC after:	2 months
2013-10-22 08:22:19 +00:00
Dag-Erling Smørgrav 1b2cb2b3f0 Introduce a kern.geom.notaste sysctl that can be used to temporarily
disable GEOM tasting to avoid the "bouncing GEOM" problem where, when
you shut down the consumer of a provider which can be viewed in multiple
ways (typically a mirror whose members are labeled partitions), GEOM
will immediately taste that provider's alter ego and reattach the
consumer.

Approved by:	re (glebius)
2013-09-24 20:05:16 +00:00
Alexander Motin 50199fa0d0 Make g_wither_washer() to not loop by itself, but only when there was some
more topology change done that may require its attention.  Add few missing
g_do_wither() calls in respective places to signal it.

This fixes potential infinite loop here when some provider is withered, but
still opened or connected for some reason and so can not be destroyed.  For
example, see r227009 and r227510.
2013-03-24 03:15:20 +00:00
Poul-Henning Kamp 8c24ef5f78 Use unit number allocation functions for GEOM minor numbers. 2004-10-25 12:28:28 +00:00
Poul-Henning Kamp 1b464bd889 Make withering water tight.
When we orphan/wither a provider, an attached geom+consumer could
end up being withered as a result and it may be in front of us in
the normal object scanning order so we need to do multi-pass.  On
the other hand, there may be withering stuff we can't get rid off
(yet), so we need to keep track of both the existence of withering
stuff and if there is more we can do at this time.
2004-07-08 16:17:14 +00:00
Poul-Henning Kamp 3d1d5bc3c3 Rearrange some of the GEOM debugging tools to be more structured.
Retire g_sanity() and corresponding debugflag (0x8)

  Retire g_{stall,release}_events().

  Under #ifdef DIAGNOSTIC:

    Make g_valid_obj() an official function and have it return an an
    non-zero integer which indicates the kind of object when found.

    Implement G_VALID_{CLASS,GEOM,CONSUMER,PROVIDER}() macros based
    on g_valid_obj().

    Sprinkle calls to these macros liberally over the infrastructure.

    Always check that we do not free a live object.
2004-03-10 08:49:08 +00:00
Poul-Henning Kamp a974614b05 More of the event stuff can now be private to geom_event.c 2003-04-23 20:54:42 +00:00
Poul-Henning Kamp 8cd1535a24 Rename g_call_me() to g_post_event(), and give it a flag
argument to determine if we can M_WAITOK in malloc.
2003-04-23 20:46:12 +00:00
Poul-Henning Kamp d98777f8db Remove the now unused hardcoded g_post_event() event support. 2003-04-23 20:25:33 +00:00
Poul-Henning Kamp 9ab3ea7841 Turn EV_NEW_PROVIDER into a g_call_me() event. 2003-04-23 20:16:13 +00:00
Poul-Henning Kamp f2e9a09494 Convert EV_SPOILED event to use g_call_me(). 2003-04-23 20:06:38 +00:00
Poul-Henning Kamp 9972896c00 Turn the hardwired NEW_CLASS event into a g_call_me() event. 2003-04-23 19:34:38 +00:00
Poul-Henning Kamp b5cba4167f Move the shutdown eventhandler stuff to a more logical place. 2003-04-23 19:15:27 +00:00
Poul-Henning Kamp 316aed030e Add handling for cancelled events in the g_call_me() methods. 2003-04-02 21:10:04 +00:00
Poul-Henning Kamp afcbcfaed0 Change events to have an array of "void *" references, and give the
event posting functions varargs to fill these.

Attribute g_call_me() to appropriate g_geom's where necessary.

Add a flag argument to g_call_me() methods which will be used to signal
cancellation of events in the future.

This commit should be a no-op.
2003-04-02 20:41:18 +00:00
Poul-Henning Kamp afa2a5aab7 Remove some debugging in the new OAM[*] and add a debug flag for other
parts of it.

[*] I've been asked what "OAM" means:  It's an acronym used in the
telecom industry, "Operations And Maintenance", and there it covers
anything from a single unlabeled led on the frontpanel the the full
nightmare of CMIP for SS7.
2003-03-31 18:35:37 +00:00
Poul-Henning Kamp d49d7ca591 Turn /dev/geom.ctl from a GEOM class into a plain character device driver
instead, it will never see a disk-I/O transaction, so this is a lot simpler.
2003-03-24 13:37:15 +00:00
Poul-Henning Kamp dddc28bfe0 Introduce g_cancel_events() and use it a couple of places where it makes
sense.
2003-03-23 23:01:40 +00:00
Poul-Henning Kamp d943f1b0b9 Introduce an SX lock which allows us to stall event processing
during OAM operations.
2003-03-23 21:58:09 +00:00
Poul-Henning Kamp 7da1ebfd74 Mitigate deadlock situation pending a more complete solution. 2003-03-21 22:05:33 +00:00
Poul-Henning Kamp e24cbd9017 Retire the GEOM private statistics code and use devstat instead. 2003-03-18 09:42:33 +00:00
Poul-Henning Kamp 8ebd558f5d Implement a handle for efficient implementation of perforations in
lower extremities.

Setting bit 4 in debugflags (sysctl kern.geom.debugflags=16) will
allow any open to succeed on rank#1 providers.  This will generally
correspond to the physical disk devices: ad0, da0, md0 etc.

This fundamentally violates the mechanics of GEOMs autoconfiguration,
and is only provided as a debugging facility, so obviously error
reports on GEOM where this bit is or has been set will not be
accepted.
2003-02-12 09:48:27 +00:00
Poul-Henning Kamp 4ec353005c Move the g_stat struct to its own .h file, we will export it to other code.
Insted of embedding a struct g_stat in consumers and providers, merely
include a pointer.

Remove a couple of <sys/time.h> includes now unneeded.

Add a special allocator for struct g_stat.  This allocator will allocate
entire pages and hand out g_stat functions from there.  The "id" field
indicates free/used status.

Add "/dev/geom.stats" device driver whic exports the pages from the
allocator to userland with mmap(2) in read-only mode.

This mmap(2) interface should be considered a non-public interface and
the functions in libgeom (not yet committed) should be used to access
the statistics data.
2003-02-08 13:03:57 +00:00
Poul-Henning Kamp 91cd3dc6f5 Move #defines of major/minor to internal header file so other bits can
share and coordinate with geom_dev.
2003-02-08 12:30:12 +00:00
Poul-Henning Kamp 801bb689ca Commit the correct copy of the g_stat structure.
Add debug.sizeof.g_stat sysctl.

Set the id field of the g_stat when we create consumers and providers.

Remove biocount from consumer, we will use the counters in the g_stat
structure instead.  Replace one field which will need to be atomically
manipulated with two fields which will not (stat.nop and stat.nend).

Change add companion field to bio_children: bio_inbed for the exact
same reason.

Don't output the biocount in the confdot output.

Fix KASSERT in g_io_request().

Add sysctl kern.geom.collectstats defaulting to off.

Collect the following raw statistics conditioned on this sysctl:

    for each consumer and provider {
        total number of operations started.
        total number of operations completed.
        time last operation completed.
        sum of idle-time.
        for each of BIO_READ, BIO_WRITE and BIO_DELETE {
            number of operations completed.
            number of bytes completed.
            number of ENOMEM errors.
            number of other errors.
            sum of transaction time.
        }
    }

API for getting hold of these statistics data not included yet.
2003-02-07 23:08:24 +00:00
Poul-Henning Kamp d518e53936 Add the remaning part of the new libdisk interaction.
WARNING:  This is not a published interface, it is a stopgap measure for
WARNING:  libdisk so we can get 5.0-R out of the door.

Sponsored by:	DARPA & NAI Labs
2002-10-28 22:43:54 +00:00
Poul-Henning Kamp 2874f1cf36 Properly isolate the locking domains of sysctl from the topology lock
for the sysctls which report the configuration.

Sponsored by:	DARPA & NAI Labs.
2002-10-04 10:38:36 +00:00
Poul-Henning Kamp 5dcf28b202 Disable the g_sanity() check unless people ask for it in the debugflags.
Sponsored by:	DARPA & NAI Labs.
2002-09-30 08:46:29 +00:00
Poul-Henning Kamp 4ae677009e Style, whitespace and lint fixes.
Sponsored by:	DARPA & NAI Labs.
2002-09-28 11:57:20 +00:00
Poul-Henning Kamp 346cd5fe2d Implement g_call_me() as a way for geom methods to schedule operations
to be performed in the event-thread.

To do this, we need to lock the eventlist with g_eventlock (nee g_doorlock),
since g_call_me() being called from the UP/DOWN paths will not be able to
aquire g_topology_lock.

This also means that for now these events are not referenced on any
particular consumer/provider/geom.

For UP/DOWN path use, this will not become a problem since the access()
function will make sure we drain any bio's before we dismantle.

Sponsored by:   DARPA & NAI Labs.
2002-09-27 20:38:36 +00:00
Poul-Henning Kamp 2654e1fc4e s/classs/classes/ to fixup grammer after the previous global renaming.
Sponsored by: DARPA & NAI Labs
2002-04-04 09:41:47 +00:00
Poul-Henning Kamp b1876192f0 Eliminate some thread pointers which do not make sense anymore.
Split private parts of geom.h into geom_int.h.  The latter should
never be included in class implemtations.
2002-03-26 22:07:38 +00:00