Commit graph

216 commits

Author SHA1 Message Date
Mark Johnston 1e91412e40 Don't set the mirror GEOM softc to NULL in g_mirror_destroy().
At this point we have not rendezvous'ed with the mirror worker thread, and
I/O may still be in flight. Various I/O completion paths expect to be able
to obtain a reference to the mirror softc from the GEOM, so setting it to
NULL may result in various NULL pointer dereferences if the mirror is
stopped with -f or the kernel is shut down while a mirror is
synchronizing. The worker thread will clear the softc pointer before
exiting.

Tested by:	pho
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2017-04-14 17:08:37 +00:00
Mark Johnston 77011eac86 Check for a provider error before enqueuing mirror I/O.
We are otherwise susceptible to a race with a concurrent teardown of the
mirror provider, causing the I/O to be left uncompleted after the mirror
started withering.

Tested by:	pho
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2017-04-14 17:03:32 +00:00
Mark Johnston a65d524afc Stop mirror synchronization before draining the I/O queue.
Regular I/O requests may be blocked by concurrent synchronization requests
targeted to the same LBAs, in which case they are moved to a holding queue
until the conflicting I/O completes. We therefore want to stop
synchronization before completing pending I/O in g_mirror_destroy_provider()
since this ensures that blocked I/O requests are completed as well.

Tested by:	pho
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2017-04-14 16:54:50 +00:00
Mark Johnston a4834289d6 Handle NULL entries in gmirror disk ds_bios arrays.
Entries may be removed and freed if an I/O error occurs during mirror
synchronization, so we cannot assume that all entries of ds_bios are
valid.

Also ensure that a synchronization BIO's array index is preserved after
a successful write.

Reported and tested by:	pho
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2017-04-10 17:15:59 +00:00
Mark Johnston 0d75d0dfbc Avoid sleeping when the mirror I/O queue is non-empty.
A request may be queued while the queue lock is dropped when the mirror is
being destroyed. The corresponding wakeup would be lost, possibly resulting
in an apparent hang of the mirror worker thread.

Tested by:	pho (part of a larger patch)
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
2017-03-29 19:39:07 +00:00
Mark Johnston c1ab409cba Remove an unneeded g_mirror_destroy_provider() call.
The worker thread will destroy the mirror provider as part of its teardown
sequence. The call made sense in the initial revision of gmirror, but
became unnecessary in r137248.

Tested by:	pho (part of a larger diff)
MFC afteR:	2 weeks
Sponsored by:	Dell EMC Isilon
2017-03-29 19:30:22 +00:00
Mark Johnston 819cd913f4 Refine r301173 a bit.
- Don't execute any of g_mirror_shutdown_post_sync() when panicking. We
  cannot safely idle the mirror or stop synchronization in that state, and
  the current attempts to do so complicate debugging of gmirror itself.
- Check for a non-NULL panicstr instead of using SCHEDULER_STOPPED(). The
  latter was added for use in the locking primitives.

Reviewed by:	mav, pjd
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2017-03-27 16:25:58 +00:00
Alexander Motin b6fe583c55 Add gmirror create subcommand, alike to gstripe, gconcat, etc.
It is quite specific mode of operation without storing on-disk metadata.
It can be useful in some cases in combination with some external control
tools handling mirror creation and disks hot-plug.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2016-11-30 09:27:08 +00:00
Alexander Motin dc399583ba Use providergone method to cover race between destroy and g_access().
Reviewed by:	markj
MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2016-11-13 03:56:26 +00:00
Mark Johnston 5c2ac5cf2a gmirror: Add a subroutine to free synchronization BIOs.
This addresses a memory leak that occurs upon an I/O error during a mirror
synchronization.

MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2016-10-20 23:08:40 +00:00
Mark Johnston b450976dc2 gmirror: Release pending regular requests when synchronization stops.
Normally gmirror allows colliding requests to proceed whenever a
synchronization request completes and advances to the next offset. However
if an I/O request collides with one of the final g_mirror_syncreqs, nothing
releases it once synchronization completes, resulting in an apparent I/O
hang. The same problem can occur if synchronization is aborted by an
I/O error. Therefore, be sure to requeue pending requests when
mirror synchronization is stopped for any reason.

While here, remove some dead code from g_mirror_regular_release().

MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2016-10-20 23:02:30 +00:00
Alexander Motin 5a236b0ef9 Fix possible geom destruction before final provider close.
Introduce internal counter to track opens.  Using provider's counters is
not very successfull after calling g_wither_provider().

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2016-10-06 15:20:05 +00:00
Mark Johnston 4dea20be45 gmirror: Write an updated syncid before queuing writes.
When a syncid bump is pending, any write to the mirror results in the
updated syncid being written to each component's metadata block. However,
the update was only being performed after the writes to the mirror
componenents were queued. Instead, synchronously update the metadata block
first.

MFC after:	3 weeks
Sponsored by:	Dell EMC Isilon
2016-10-06 00:13:55 +00:00
Mark Johnston 903618cd65 gmirror: Bump the syncid if broken disks are found during startup.
Consider a mirror with two components, m1 and m2. Suppose a hardware error
results in the removal of m2, with m1's genid bumped. Suppose further that
a replacement mirror component m3 is created and synchronized, after which
the system is shut down uncleanly. During a subsequent bootup, if gmirror
tastes m1 and m2 first, m2 will be removed from the mirror because it is
broken, but the mirror will be started without bumping the syncid on m1
because all elements of the mirror are accounted for. Then m3 will be
added to the already-running mirror with the same syncid as m1, so the
components will not be synchronized despite the unclean shutdown.

Handle this scenario by bumping the syncid of healthy components if any
broken mirrors are discovered during mirror startup.

MFC after:	3 weeks
Sponsored by:	Dell EMC Isilon
2016-10-06 00:05:45 +00:00
Mark Johnston fff048e4bc gmirror: Use bool instead of boolean_t.
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
2016-10-05 23:55:01 +00:00
Alexander Motin 8b64f3ca6c Use g_wither_provider() where applicable.
It is just a helper function combining G_PF_WITHER setting with
g_orphan_provider().
2016-09-23 21:29:40 +00:00
Mark Johnston 4bfb585351 Don't treat an error from g_mirror_clear_metadata() as fatal.
Such errors can occur as the result of a write error or because the disk
backing the mirror element was removed. They result in a generation ID bump
on all active elements of the mirror, so we can safely disconnect the mirror
component rather than destroy it.

MFC after:	2 weeks
Sponsored by:	EMC / Isilon Storage Division
Differential Revision:	https://reviews.freebsd.org/D7750
2016-09-06 23:42:59 +00:00
Mark Johnston 40c5032d32 Add some fail points to gmirror.
These are useful for testing changes to I/O error handling, and for
reproducing existing bugs in a controlled manner. The fail points are

    g_mirror_regular_request_read
    g_mirror_regular_request_write
    g_mirror_sync_request_read
    g_mirror_sync_request_write
    g_mirror_metadata_write

They all effectively allow one to inject an error value into the bio_error
field of a corresponding BIO request as it is being completed.

MFC after:	2 weeks
Sponsored by:	EMC / Isilon Storage Division
2016-09-06 23:35:48 +00:00
Mark Johnston 7d31c3939a Move some gmirror metadata update messages to a higher debug level.
These can be printed quite frequently from a mostly-idle mirror, cluttering
the console.

MFC after:	1 week
2016-07-14 00:40:24 +00:00
Mark Johnston be20fc2e90 Do not complete pending gmirror BIOs when tearing down the provider.
This will result in lock recursion and is more generally incorrect since
the completion handlers will just reinsert the BIOs into the queue we're
trying to drain.

Reviewed by:	imp, ngie
Approved by:	re (gjb)
MFC after:	3 weeks
Sponsored by:	EMC / Isilon Storage Division
Differential Revision:	https://reviews.freebsd.org/D6908
2016-06-22 21:00:28 +00:00
Gleb Smirnoff a7c5163b5f When we are in panic, always go the asynchronous path in g_mirror_destroy(),
otherwise the system will hang.

This is a temporarily least intrusive crutch to get certain panicing systems
dumping. The proper fix should question is g_mirror_destroy() should be called
on a panicing system at all.

Discussed with:	mav
2016-06-01 22:11:54 +00:00
Konstantin Belousov 4e2732b550 Removal of Giant droping wrappers for GEOM classes.
Sponsored by:	The FreeBSD Foundation
2016-05-20 08:25:37 +00:00
Pedro F. Giffuni e8d5712284 sys/geom: spelling fixes in comments.
No functional change.
2016-04-29 20:56:58 +00:00
Pedro F. Giffuni b99bce73e2 geom: unsign some types to match their definitions and avoid overflows.
In struct:gctl_req, nargs is unsigned.

In mirror:
g_mirror_syncreqs is unsigned.

In raid:
in struct:g_raid_volume, v_disks_count is unsigned.

In virstor:
in struct:g_virstor_softc, n_components is unsigned.

MFC after:	2 weeks
2016-04-27 15:10:40 +00:00
Warner Losh 9a8fa125c1 Bump bio_cmd and bio_*flags from 8 bits to 16.
Differential Revision: https://reviews.freebsd.org/D5784
2016-04-14 05:10:41 +00:00
Warner Losh c55f57071a Create an API to reset a struct bio (g_reset_bio). This is mandatory
for all struct bio you get back from g_{new,alloc}_bio. Temporary
bios that you create on the stack or elsewhere should use this before
first use of the bio, and between uses of the bio. At the moment, it
is nothing more than a wrapper around bzero, but that may change in
the future. The wrapper also removes one place where we encode the
size of struct bio in the KBI.
2016-02-17 17:16:02 +00:00
Jung-uk Kim fd90e2ed54 CALLOUT_MPSAFE has lost its meaning since r141428, i.e., for more than ten
years for head.  However, it is continuously misused as the mpsafe argument
for callout_init(9).  Deprecate the flag and clean up callout_init() calls
to make them more consistent.

Differential Revision:	https://reviews.freebsd.org/D2613
Reviewed by:	jhb
MFC after:	2 weeks
2015-05-22 17:05:21 +00:00
Alexander Motin 5d85cd2d11 Remove extra semicolon.
MFC after:	1 week
2015-03-27 12:45:20 +00:00
Alexander Motin 3ab0187add Remove request sorting from GEOM_MIRROR and GEOM_RAID.
When CPU is not busy, those queues are typically empty.  When CPU is busy,
then one more extra sorting is the last thing it needs.  If specific device
(HDD) really needs sorting, then it will be done later by CAM.

This supposed to fix livelock reported for mirror of two SSDs, when UFS
fires zillion of BIO_DELETE requests, that totally blocks I/O subsystem by
pointless sorting of requests and responses under single mutex lock.

MFC after:	2 weeks
2015-03-27 12:44:28 +00:00
Alexander Motin 41fe4ba647 Fix bug on memory allocation error in split method.
While there, use bioq_takefirst() in place where it is convenient.

MFC after:	1 week
2015-03-27 11:14:12 +00:00
Alexander Motin 7715befdf2 Fix couple BIO_DELETE bugs in geom_mirror.
Do not report GEOM::candelete if none of providers support BIO_DELETE.
If consumer still requests BIO_DELETE, report error instead of hanging.

MFC after:	2 weeks
2015-03-12 10:20:53 +00:00
Hans Petter Selasky af3b2549c4 Pull in r267961 and r267973 again. Fix for issues reported will follow. 2014-06-28 03:56:17 +00:00
Glen Barber 37a107a407 Revert r267961, r267973:
These changes prevent sysctl(8) from returning proper output,
such as:

 1) no output from sysctl(8)
 2) erroneously returning ENOMEM with tools like truss(1)
    or uname(1)
 truss: can not get etype: Cannot allocate memory
2014-06-27 22:05:21 +00:00
Hans Petter Selasky 3da1cf1e88 Extend the meaning of the CTLFLAG_TUN flag to automatically check if
there is an environment variable which shall initialize the SYSCTL
during early boot. This works for all SYSCTL types both statically and
dynamically created ones, except for the SYSCTL NODE type and SYSCTLs
which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to
be used in the case a tunable sysctl has a custom initialisation
function allowing the sysctl to still be marked as a tunable. The
kernel SYSCTL API is mostly the same, with a few exceptions for some
special operations like iterating childrens of a static/extern SYSCTL
node. This operation should probably be made into a factored out
common macro, hence some device drivers use this. The reason for
changing the SYSCTL API was the need for a SYSCTL parent OID pointer
and not only the SYSCTL parent OID list pointer in order to quickly
generate the sysctl path. The motivation behind this patch is to avoid
parameter loading cludges inside the OFED driver subsystem. Instead of
adding special code to the OFED driver subsystem to post-load tunables
into dynamically created sysctls, we generalize this in the kernel.

Other changes:
- Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask"
to "hw.pcic.intr_mask".
- Removed redundant TUNABLE statements throughout the kernel.
- Some minor code rewrites in connection to removing not needed
TUNABLE statements.
- Added a missing SYSCTL_DECL().
- Wrapped two very long lines.
- Avoid malloc()/free() inside sysctl string handling, in case it is
called to initialize a sysctl from a tunable, hence malloc()/free() is
not ready when sysctls from the sysctl dataset are registered.
- Bumped FreeBSD version to indicate SYSCTL API change.

MFC after:	2 weeks
Sponsored by:	Mellanox Technologies
2014-06-27 16:33:43 +00:00
Bryan Drewery 09adfca39f Show error code when failing to destroy a mirror on delay
Sponsored by:	EMC / Isilon Storage Division
MFC after:	2 weeks
2014-04-05 03:01:29 +00:00
Andrey V. Elsukov ae3bc0acff Add an ability to stop gmirror and clear its metadata in one command.
This fixes the problem, when gmirror starts again just after stop.

The problem occurs when gmirror's component has geom label with equal size.
E.g. gpt and gptid have the same size as partition, diskid has the same
size as entire disk. When gmirror's geom has been destroyed, glabel
creates its providers and this initiate retaste.

Now "gmirror destroy" command is available. It destroys geom and also
erases gmirror's metadata.

MFC after:	2 weeks
2013-12-27 02:43:53 +00:00
Andrey V. Elsukov 7c5710dbaf Prevent users from deactivating the last component of a mirror.
PR:		184985
MFC after:	1 week
2013-12-19 22:13:12 +00:00
Andrey V. Elsukov 32cea4ca0f Add "resize" verb to gmirror(8) and such functionality to geom_mirror(4).
Now it is easy to expand the size of the mirror when all its components
are replaced. Also add g_resize method to geom_mirror class. It will write
updated metadata to new last sector, when parent provider is resized.

Silence from:	geom@
MFC after:	1 month
2013-11-19 22:55:17 +00:00
Alexander Motin 40ea77a036 Merge GEOM direct dispatch changes from the projects/camlock branch.
When safety requirements are met, it allows to avoid passing I/O requests
to GEOM g_up/g_down thread, executing them directly in the caller context.
That allows to avoid CPU bottlenecks in g_up/g_down threads, plus avoid
several context switches per I/O.

The defined now safety requirements are:
 - caller should not hold any locks and should be reenterable;
 - callee should not depend on GEOM dual-threaded concurency semantics;
 - on the way down, if request is unmapped while callee doesn't support it,
   the context should be sleepable;
 - kernel thread stack usage should be below 50%.

To keep compatibility with GEOM classes not meeting above requirements
new provider and consumer flags added:
 - G_CF_DIRECT_SEND -- consumer code meets caller requirements (request);
 - G_CF_DIRECT_RECEIVE -- consumer code meets callee requirements (done);
 - G_PF_DIRECT_SEND -- provider code meets caller requirements (done);
 - G_PF_DIRECT_RECEIVE -- provider code meets callee requirements (request).
Capable GEOM class can set them, allowing direct dispatch in cases where
it is safe.  If any of requirements are not met, request is queued to
g_up or g_down thread same as before.

Such GEOM classes were reviewed and updated to support direct dispatch:
CONCAT, DEV, DISK, GATE, MD, MIRROR, MULTIPATH, NOP, PART, RAID, STRIPE,
VFS, ZERO, ZFS::VDEV, ZFS::ZVOL, all classes based on g_slice KPI (LABEL,
MAP, FLASHMAP, etc).

To declare direct completion capability disk(9) KPI got new flag equivalent
to G_PF_DIRECT_SEND -- DISKFLAG_DIRECT_COMPLETION.  da(4) and ada(4) disk
drivers got it set now thanks to earlier CAM locking work.

This change more then twice increases peak block storage performance on
systems with manu CPUs, together with earlier CAM locking changes reaching
more then 1 million IOPS (512 byte raw reads from 16 SATA SSDs on 4 HBAs to
256 user-level threads).

Sponsored by:	iXsystems, Inc.
MFC after:	2 months
2013-10-22 08:22:19 +00:00
Ed Schouten 647a92d62b Fix the formatting of the error message.
The G_MIRROR_DEBUG() macro already appends a newline. Also, most of the
log messages emitted by gmirror start with an uppercase letter.
2013-08-12 18:17:45 +00:00
Scott Long f07b69478e Fix a mystery cut-n-paste corruption from the previous commit.
Submitted by:	Brenden Fabeny
2013-06-19 23:09:10 +00:00
Scott Long 2084cbe975 Mark geom_mirror as capable of unmapped i/o
Obtained from:	Netflix
MFC after:	3 days
2013-06-19 21:52:32 +00:00
Andriy Gapon 1f1088b843 g_mirror: g_getattr() failure should not be fatal
This allows to use gmirror e.g. on top of ZVOLs.

PR:		kern/175323
Submitted by:	Alexei.Volkov@softlynx.ru, mav
Reported by:	Alexei.Volkov@softlynx.ru
Tested by:	Alexei.Volkov@softlynx.ru
Reviewed by:	ae, mav, pjd
MFC after:	1 week
2013-01-26 10:50:04 +00:00
Alexander Motin cbab616174 Alike to r242314 for GRAID make GMIRROR more aggressive in marking volumes
as clean on shutdown and move that action from shutdown_pre_sync stage to
shutdown_post_sync to avoid extra flapping.

ZFS tends to not close devices on shutdown, that doesn't allow GEOM RAID
to shutdown gracefully.  To handle that, mark volume as clean just when
shutdown time comes and there are no active writes.

PR:		kern/113957
MFC after:	2 weeks
2013-01-15 01:13:55 +00:00
Gleb Smirnoff 4a7f7b10b5 When synchronizing, include in the config dump amount of
bytes syncronized.
  The rationale behind this is the following: for large disks the
percent synchronisation counter ticks too seldom, and monitoring
software (as well as human operator) can't tell whether
synchronisation goes on or one of disks got stuck. On an idle
server one can look into gstat and see whether synchronisation goes
on or not, but on a busy server that won't work. Also, new value
monitored can be differentiated obtaining the synchronisation speed
quite precisely.

Submitted by:	Konstantin Kukushkin <dark ramtel.ru>
Reviewed by:	pjd
2012-09-11 20:20:13 +00:00
Gleb Smirnoff d89862ac87 Make geom_mirror more friendly to SSDs. To properly support TRIM,
we need to pass BIO_DELETE requests down to providers that support
it. Also, we need to announce our support for BIO_DELETE to upper
consumer. This requires:

- In g_mirror_start() return true for "GEOM::candelete" request.
- In g_mirror_init_disk() probe below provider for "GEOM::candelete"
  attribute, and mark disk with a flag if it does support BIO_DELETE.
- In g_mirror_register_request() distribute BIO_DELETE requests only
  to those disks, that do support it.

Note that we announce "GEOM::candelete" as true unconditionally of
whether we have TRIM-capable media down below or not. This is made
intentionally, because upper consumer (usually UFS) requests the
attribite only once at mount time. And if user ever migrates his
mirror from HDDs to SSDs, then he/she would get TRIM working without
remounting filesystem.

Reviewed by:	pjd
2012-07-01 15:43:52 +00:00
Gleb Smirnoff b0ae63ca25 In g_mirror_regular_request() upon successful delivery treat
BIO_DELETE requests same way as BIO_WRITE removing them from
queue. This fixes panic with BIO_DELETE operations on geom_mirror.

Reviewed by:	pjd
2012-07-01 15:30:43 +00:00
Andrey V. Elsukov f931cd70af Prevent removing of the last active component from a mirror.
PR:		kern/154860
Reviewed by:	pjd
MFC after:	1 week
2012-05-18 09:22:21 +00:00
Andrey V. Elsukov 1ee0138d2f Introduce new device flag G_MIRROR_DEVICE_FLAG_TASTING. It should
protect geom from destroying while it is tasting.

PR:		kern/154860
Reviewed by:	pjd
MFC after:	1 week
2012-05-18 09:19:07 +00:00
Ed Schouten 6472ac3d8a Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.
The SYSCTL_NODE macro defines a list that stores all child-elements of
that node. If there's no SYSCTL_DECL macro anywhere else, there's no
reason why it shouldn't be static.
2011-11-07 15:43:11 +00:00
Andrey V. Elsukov 5d807a0e1a Include sys/sbuf.h directly.
Reviewed by:	pjd
2011-07-11 05:22:31 +00:00
Alexander Motin 90f2be2430 Implement relaxed comparision for hardcoded provider names to make it
ignore adX/adaY difference in both directions to simplify migration to
the CAM-based ATA or back.
2011-04-27 00:10:26 +00:00
Alexander Leidinger cb08c2cc83 Add some FEATURE macros for various GEOM classes.
No FreeBSD version bump, the userland application to query the features will
be committed last and can serve as an indication of the availablility if
needed.

Sponsored by:	Google Summer of Code 2010
Submitted by:	kibab
Reviewed by:	silence on geom@ during 2 weeks
X-MFC after:	to be determined in last commit with code from this project
2011-02-25 10:24:35 +00:00
Pawel Jakub Dawidek a478ea7490 - Allow to specify value as const pointers.
- Make optional string values always an empty string.
2010-09-13 08:56:07 +00:00
Alexander Motin 3d7cfb15f5 Remove bintime_cmp() function, unused since r200086.
MFC after:	1 week
2010-08-18 15:38:10 +00:00
Alexander Motin 86de0ca52c Move wakeup() out of mutex to reduce contention. 2010-01-05 10:30:56 +00:00
Alexander Motin 92f60381d9 As soon as mirror has no own stripes, report largest stripe of unrerlying
components, hoping others fit, if they are not equal.
2009-12-24 12:17:22 +00:00
Alexander Motin 891852cc12 Change 'load' balancing mode algorithm:
- Instead of measuring last request execution time for each drive and
choosing one with smallest time, use averaged number of requests, running
on each drive. This information is more accurate and timely. It allows to
distribute load between drives in more even and predictable way.
- For each drive track offset of the last submitted request. If new request
offset matches previous one or close for some drive, prefer that drive.
It allows to significantly speedup simultaneous sequential reads.

PR:		kern/113885
Reviewed by:	sobomax
2009-12-03 21:47:51 +00:00
Pawel Jakub Dawidek b740e905a4 Add support for changing providers priority.
Submitted by:	Mel Flynn
2009-09-06 06:52:06 +00:00
Andrew Thompson 853a10a581 Revert r190676,190677
The geom and CAM changes for root_hold are the wrong solution for USB design
quirks.

Requested by:	scottl
2009-04-10 04:08:34 +00:00
Andrew Thompson 626fc9fe3d Add a how argument to root_mount_hold() so it can be passed NOWAIT and be called
in situations where sleeping isnt allowed.
2009-04-03 19:46:12 +00:00
Julian Elischer 3745c395ec Rename the kthread_xxx (e.g. kthread_create()) calls
to kproc_xxx as they actually make whole processes.
Thos makes way for us to add REAL kthread_create() and friends
that actually make theads. it turns out that most of these
calls actually end up being moved back to the thread version
when it's added. but we need to make this cosmetic change first.

I'd LOVE to do this rename in 7.0  so that we can eventually MFC the
new kthread_xxx() calls.
2007-10-20 23:23:23 +00:00
Jeff Roberson 982d11f836 Commit 14/14 of sched_lock decomposition.
- Use thread_lock() rather than sched_lock for per-thread scheduling
   sychronization.
 - Use the per-process spinlock rather than the sched_lock for per-process
   scheduling synchronization.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-05 00:00:57 +00:00
Pawel Jakub Dawidek 501250ba60 Now, that we have gjournal in the tree add possibility to configure
gmirror and graid3 in a way that it is not resynchronized after a
power failure or system crash.
It is safe when gjournal is running on top of gmirror/graid3.
2006-11-01 22:51:49 +00:00
Pawel Jakub Dawidek 42461fba65 Implement BIO_FLUSH handling by simply passing it down to the components.
Sponsored by:	home.pl
2006-10-31 21:23:51 +00:00
Pawel Jakub Dawidek 8e007c52fd Fix synchronization in gmirror and graid3 which I broken. Synchronization
request can still have bio_to set to sc_provider (this is READ part of a
synchronization request) and in this case g_{mirror,raid3}_sync() wasn't
called as it should be.

MFC after:	1 week
2006-09-13 15:46:49 +00:00
John-Mark Gurney 0cca572e64 move created/detected/activated under debug level 1 to quiet the common case..
add count of active and total components to the launched line so you can
see at a glance if your mirror/raid3 is complete...

now:
GEOM_MIRROR: Device mirror/sam launched (2/2).

Reviewed by:	pjd
2006-09-09 21:45:37 +00:00
Pawel Jakub Dawidek de6f1c7c6c Not only a request from us can be passed to g_{mirror,raid3}_worker()
function, but also a request to us, in which case checking bio_cflags
is wrong, because the class above us is controling it, not we.

MFC after:	1 week
2006-08-09 09:41:53 +00:00
Yaroslav Tykhiy 776fc0e90e Commit the results of the typo hunt by Darren Pilgrim.
This change affects documentation and comments only,
no real code involved.

PR:		misc/101245
Submitted by:	Darren Pilgrim <darren pilgrim bitfreak org>
Tested by:	md5(1)
MFC after:	1 week
2006-08-04 07:56:35 +00:00
Pawel Jakub Dawidek 3c57a41d7b Don't use f-word in comments. We are gentlemans.
Pointed out by:	Maciej Sobczak
2006-08-01 23:17:33 +00:00
Pawel Jakub Dawidek 86ed3c25e1 Always allow to specify components with /dev/ prefix.
MFC after:	3 days
2006-07-13 20:37:59 +00:00
Pawel Jakub Dawidek 3525bb6b98 Use proper defines instead of magic values.
MFC after:	1 week
2006-07-10 21:18:00 +00:00
Pawel Jakub Dawidek 1f7fec3cb5 Allow to close access even if device is already destroyed.
Reported by:	Ulrich Spoerlein <uspoerlein@gmail.com>
PR:		kern/98093
MFC after:	1 week
2006-07-03 10:32:38 +00:00
Pawel Jakub Dawidek a2fe5c6676 - Remove dead code.
- Comment possible event miss, which isn't critical, but probably can be
  fixed by replacing the event lock usage with the queue lock.

MFC after:	2 weeks
2006-04-28 12:13:49 +00:00
Pawel Jakub Dawidek a063667622 Be sure to not destroy device twice. This is not possible in theory, but
with this change there is even no theoretical race.

MFC after:	2 weeks
2006-04-28 11:47:28 +00:00
Pawel Jakub Dawidek 712fe9bd7a Introduce and use delayed-destruction functionality from a pre-sync hook,
which means that devices will be destroyed on last close.

This fixes destruction order problems when, eg. RAID3 array is build on
top of RAID1 arrays.

Requested, reviewed and tested by:	ru
MFC after:	2 weeks
2006-04-10 10:32:22 +00:00
Pawel Jakub Dawidek 2e128ca835 - 'ndisks' variable is not boolean, so compare it with a value.
- Keep conditions order consistent with the comment above.

MFC after:	3 days
2006-03-30 12:15:41 +00:00
Pawel Jakub Dawidek 9bfdf5987d Update copyright for 2006. 2006-03-19 12:55:51 +00:00
Pawel Jakub Dawidek 18d370acae kern.geom.mirror.sync_requests=2 seems to be a better default - it still
keeps disks very busy, but makes system much more responsive.

While here, kill extra space.
2006-03-19 10:49:05 +00:00
Ruslan Ermilov ef25813de6 Fix build on 64-bit platforms. 2006-03-13 14:48:45 +00:00
Pawel Jakub Dawidek 855761d5db - Speed up synchronization process by using configurable number of I/O
requests in parallel.
  + Add kern.geom.mirror.sync_requests tunable which defines how many parallel
    I/O requests should be used.
  + Retire kern.geom.mirror.reqs_per_sync and kern.geom.mirror.syncs_per_sec
    sysctls.
- Fix race between regular and synchronization requests.
- Reimplement mirror's data synchronization - do not use the topology lock
  for this purpose, as it may case deadlocks.
- Stop synchronization from pre-sync hook.
- Fix some other minor issues.

MFC after:	3 days
2006-03-13 00:58:41 +00:00
Pawel Jakub Dawidek 9d793bdd46 When inserting a new component md_provsize metadata field wasn't set, which
means that old problem was triggered (when two providers end at the same
offset, eg. ad0 and ad0s1 and the wrong was is picked up by gmirror/graid3).

Reported by:	Michal Suszko <dry@dry.pl>
MFC after:	3 days
2006-03-10 07:41:31 +00:00
Pawel Jakub Dawidek 4686187543 Allow to dump kernel to gmirror providers.
Some conditions have to be met to make it work properly. This will be
described in the manual page.

MFC after:	3 days
2006-03-08 08:27:33 +00:00
Pawel Jakub Dawidek bf31327cca On component state change to ACTIVE don't forget to update metadata.
MFC after:	3 days
2006-02-12 17:38:09 +00:00
Pawel Jakub Dawidek 01f1f41c25 Use time_uptime instead of time_second, as the latter may go backwards.
Suggested by:	ru
MFC after:	3 days
2006-02-12 17:36:09 +00:00
Pawel Jakub Dawidek d4b0268a24 - Add kern.geom.mirror.disconnect_on_failure sysctl/tunnable (default to 1
to preserve currect behaviour). When set to 0, components are not
  disconnected - gmirror will try to still use them (only first error will
  be logged). This is helpful when we have two broken components, but in
  different places, so actually all data is available.
  Such buggy component will be visible in 'gmirror list' output with flag
  BROKEN.
- Never disconnect the last valid component. If we detect errors there we
  will just pass them up. This wasn't reasonable to deny access to the
  whole provider because of one broken sector.

Prodded by:	ru
MFC after:	3 days
2006-02-11 17:39:29 +00:00
Pawel Jakub Dawidek fe6f94ea84 Mark array as CLEAN when there are no write requests in
kern.geom.mirror.idletime seconds. Write, not any requests.
Mark array as clean immediatelly on last write close.

Prodded by:	ru
MFC after:	3 days
2006-02-11 14:42:23 +00:00
Pawel Jakub Dawidek 38ea96ac99 Remove trailing spaces. 2006-02-01 12:06:01 +00:00
Pawel Jakub Dawidek a49c0bd40a Remove dead code.
Found by:	Coverity Prevent(tm)
Coverity ID:	CID104
MFC after:	3 days
2006-01-18 21:42:19 +00:00
Maxim Sobolev 8a4a44b5aa Check for g_read_data(9) errors properly:
o The only indication of error condition is NULL value returned by
  the function;

o value pointed to by error argument is undefined in the case when
  operation completes successfully.

Discussed with: phk
2005-11-30 19:24:51 +00:00
Robert Watson 5bb84bc84b Normalize a significant number of kernel malloc type names:
- Prefer '_' to ' ', as it results in more easily parsed results in
  memory monitoring tools such as vmstat.

- Remove punctuation that is incompatible with using memory type names
  as file names, such as '/' characters.

- Disambiguate some collisions by adding subsystem prefixes to some
  memory types.

- Generally prefer lower case to upper case.

- If the same type is defined in multiple architecture directories,
  attempt to use the same name in additional cases.

Not all instances were caught in this change, so more work is required to
finish this conversion.  Similar changes are required for UMA zone names.
2005-10-31 15:41:29 +00:00
Pawel Jakub Dawidek 59ddf345d5 After provider creation!! 2005-05-25 15:54:17 +00:00
Pawel Jakub Dawidek 0f2bbe5ba4 - Call root_mount_rel() when provider IS created, not earlier.
This should close the race observed by Daniel Eriksson.
- Remove redundant wakeup().
2005-05-25 13:10:04 +00:00
Pawel Jakub Dawidek 4eafb037f6 Add some debug code to diagnose root-on-mirror problems with recent -current.
Reported by:	Daniel Eriksson
2005-05-23 13:05:07 +00:00
Pawel Jakub Dawidek 862f5624ea Add KASSERT() to be sure there is an active component.
Suggested by:	Coverity Prevent analysis tool
2005-05-11 18:13:51 +00:00
Pawel Jakub Dawidek 3865ca2e13 Fix provider's size check for 'insert' command.
Before this fix one was able to insert one sector too small provider.

MFC after:	3 days
2005-04-25 10:41:26 +00:00
Pawel Jakub Dawidek 7979b3683c Remove the hack which allowed to use gmirror for root file system,
use root_mount KPI instead.
2005-04-19 21:47:25 +00:00
Pawel Jakub Dawidek c2ca10933d Make the code more obvious - when an error occurs in g_mirror_connect_disk(),
detach and destroy consumer before returning.
2005-03-26 17:23:01 +00:00
Pawel Jakub Dawidek cc6aa917b9 Check for return values.
Submitted by:	sam
Found by:	Coverity Prevent analysis tool
2005-03-26 16:51:19 +00:00
Pawel Jakub Dawidek e68909854c - Add md_provsize field to metadata, which will help with
shared-last-sector problem.
  After this change, even if there is more than one provider with the same
  last sector, the proper one will be chosen based on its size.
  It still doesn't fix the 'c' partition problem (when da0s1 can be confused
  with da0s1c) and situation when 'a' partition starts at offset 0
  (then da0s1a can be confused with da0s1 and da0s1c). One can use '-h'
  option there, when creating device or avoid sharing last sector.
  Actually, when providers share the same last sector and their size is equal,
  they provide exactly the same data, so the name (da0s1, da0s1a, da0s1c)
  isn't important at all.
- Provide backward compatibility.
- Update copyright's year.

MFC after:	1 week
2005-02-27 23:07:47 +00:00