Commit graph

55 commits

Author SHA1 Message Date
Poorna d1e775313d
support decommissioning of tiered objects (#16751) 2023-03-16 07:48:05 -07:00
Harshavardhana b984bf8d1a
allow expiration of all versions during Listing() (#16757) 2023-03-09 15:15:30 -08:00
Harshavardhana 31188e9327
add parallel workers in batch replication (#16609) 2023-02-13 12:07:58 -08:00
Florian Schwab d67a846ec4
allow restarting of decommissioning if completed, failed or canceld (#16464) 2023-01-24 07:07:59 -08:00
Harshavardhana 095fc0561d
feat: allow decom of multiple pools (#16416) 2023-01-16 21:36:34 +05:30
Anis Elleuch 475a88b555
fix: error out if an object is found after a full decom (#16277) 2023-01-12 05:52:51 +05:30
Harshavardhana b882310e2b
avoid locks for internal and invalid buckets in MakeBucket() (#16302) 2022-12-23 07:46:00 -08:00
Aditya Manthramurthy a30cfdd88f
Bump up madmin-go to v2 (#16162) 2022-12-06 13:46:50 -08:00
Harshavardhana 5a8df7efb3
re-implement StorageInfo to be a peer call (#16155) 2022-12-01 14:31:35 -08:00
jiuker fe8eed963e
fix: wrapped error will not equal in decommissioning (#16113) 2022-11-24 08:00:42 -08:00
Harshavardhana 853c4de75a
allow changing endpoints in distributed setups (#16071) 2022-11-16 07:59:10 -08:00
Harshavardhana 1f3db03bf0
allow changing argument for path for SNSD setup (#16013) 2022-11-07 00:11:58 -08:00
Krishnan Parthasarathi 4523da6543
feat: introduce pool-level rebalance (#15483) 2022-10-25 12:36:57 -07:00
Anis Elleuch fc6c794972
Audit dangling object removal (#15933) 2022-10-24 11:35:07 -07:00
Anis Elleuch ac85c2af76
lifecycle: refactor rules filtering and tagging support (#15914) 2022-10-21 10:46:53 -07:00
Harshavardhana 928feb0889
remove unused debug param from evalActionFromLifecycle (#15813) 2022-10-07 10:24:12 -07:00
Anis Elleuch 158d0e26a2
decom: Ignore object/version error during deletion (#15806) 2022-10-06 09:41:58 -07:00
Klaus Post a9f1ad7924
Add extended checksum support (#15433) 2022-08-29 16:57:16 -07:00
Harshavardhana 433b6fa8fe
upgrade golang-lint to the latest (#15600) 2022-08-26 12:52:29 -07:00
Krishnan Parthasarathi 91e6af4470
Add trace support for decommissioning (#15502)
* Add trace support for decommissioning
* Add support for tracing errors during decommission
2022-08-10 12:46:45 -07:00
ebozduman b57e7321e7
Replaces 'disk'=>'drive' visible to end user (#15464) 2022-08-04 16:10:08 -07:00
Poorna 426c902b87
site replication: fix healing of bucket deletes. (#15377)
This PR changes the handling of bucket deletes for site 
replicated setups to hold on to deleted bucket state until 
it syncs to all the clusters participating in site replication.
2022-07-25 17:51:32 -07:00
Harshavardhana 7da9e3a6f8
support encrypted/compressed objects properly during decommission (#15320)
fixes #15314
2022-07-16 19:35:24 -07:00
Harshavardhana e7ac1ea54c
allow decommission to continue when healing (#15312)
Bonus:

- heal buckets in-case during startup the new
  pools have bucket missing.
2022-07-15 21:03:23 -07:00
Harshavardhana 1b339ea062
allow force delete on decom pool (#15302)
Bonus:

- skip suspended pool from being
  considered for multipart uploads

- add more context for decomErrors()
2022-07-14 20:44:22 -07:00
Harshavardhana 236ef03dbd
fix: skip objects expired via lifecycle rules during decommission (#15300) 2022-07-14 16:47:09 -07:00
Harshavardhana 0a8b78cb84
fix: simplify passing auditLog eventType (#15278)
Rename Trigger -> Event to be a more appropriate
name for the audit event.

Bonus: fixes a bug in AddMRFWorker() it did not
cancel the waitgroup, leading to waitgroup leaks.
2022-07-12 10:43:32 -07:00
Harshavardhana ae92521310
remove unnecessary nAgreed value in partial() func (#15242) 2022-07-07 13:45:34 -07:00
Harshavardhana 5802df4365
retry and resume decom operation upon retriable failures (#15244)
it is possible in a k8s-like system reading pool.bin
might not have quorum during startup, however, add
a way to retry after this failure.
2022-07-07 12:31:44 -07:00
Harshavardhana 9d80ff5a05
fix: decommission delete markers for non-current objects (#15225)
versioned buckets were not creating the delete markers
present in the versioned stack of an object, this essentially
would stop decommission to succeed.

This PR fixes creating such delete markers properly during
a decommissioning process, adds tests as well.
2022-07-05 07:37:24 -07:00
Harshavardhana b311abed31
decom IAM, Bucket metadata properly (#15220)
Current code incorrectly passed the
config asset object name while decommissioning,
make sure that we pass the right object name
to be hashed on the newer set of pools.

This PR fixes situations after a successful
decommission, the users and policies might go
missing due to wrong hashed set.
2022-07-04 14:02:54 -07:00
Harshavardhana 0fee993a4b
return appropriate error under 'decom status' (#15213)
fixes #15208
2022-07-01 16:21:23 -07:00
Harshavardhana 30c9e50701
make sure to ignore expected errors and dirname deletes (#14945) 2022-05-18 17:58:19 -07:00
Krishna Srinivas e34ca9acd1
retry each object decom upto 3 times, in-case of failure (#14861) 2022-05-11 11:37:32 -07:00
Krishnan Parthasarathi ad8e611098
feat: implement prefix-level versioning exclusion (#14828)
Spark/Hadoop workloads which use Hadoop MR 
Committer v1/v2 algorithm upload objects to a 
temporary prefix in a bucket. These objects are 
'renamed' to a different prefix on Job commit. 
Object storage admins are forced to configure 
separate ILM policies to expire these objects 
and their versions to reclaim space.

Our solution:

This can be avoided by simply marking objects 
under these prefixes to be excluded from versioning, 
as shown below. Consequently, these objects are 
excluded from replication, and don't require ILM 
policies to prune unnecessary versions.

-  MinIO Extension to Bucket Version Configuration
```xml
<VersioningConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/"> 
        <Status>Enabled</Status>
        <ExcludeFolders>true</ExcludeFolders>
        <ExcludedPrefixes>
          <Prefix>app1-jobs/*/_temporary/</Prefix>
        </ExcludedPrefixes>
        <ExcludedPrefixes>
          <Prefix>app2-jobs/*/__magic/</Prefix>
        </ExcludedPrefixes>

        <!-- .. up to 10 prefixes in all -->     
</VersioningConfiguration>
```
Note: `ExcludeFolders` excludes all folders in a bucket 
from versioning. This is required to prevent the parent 
folders from accumulating delete markers, especially
those which are shared across spark workloads 
spanning projects/teams.

- To enable version exclusion on a list of prefixes

```
mc version enable --excluded-prefixes "app1-jobs/*/_temporary/,app2-jobs/*/_magic," --exclude-prefix-marker myminio/test
```
2022-05-06 19:05:28 -07:00
Anis Elleuch 44a3b58e52
Add audit log for decommissioning (#14858) 2022-05-04 00:45:27 -07:00
Anis Elleuch 46de9ac03e
Decom: Easily restart decommission when it is done (#14855)
When a decommission task is successfully completed, failed, or canceled,
this commit allows restarting the decommission again. Restarting is not
allowed when there is an ongoing decommission task.
2022-05-03 13:36:08 -07:00
Harshavardhana 424b44c247
allow changing server command line from http->https (#14832)
this is allowed as long as order is preserved as is
on an existing setup, the new command line is updated
in `pool.bin` to facilitate future decommission's on
these pools.
2022-04-28 16:27:53 -07:00
Harshavardhana c56a139fdc
fix: support decommissioning directory objects (#14822)
improvements in this PR include

- decommission objects that have __XLDIR__ suffix
- decommission objects that have `null` version on
  a versioned bucket.
- make sure to look for any "decom" failures to ensure
  that we do not wrong conclude decom as complete without
  all files getting copied over.
- break out eagerly upon first error for objects with
  multiple versions, leave the object as is for support
  debugging and analysis.
2022-04-26 20:06:41 -07:00
Harshavardhana 7e248fc0ba
wait on parallel decom to complete before returning (#14764)
without this wait there is a potential for some objects
that are in actively being decommissioned would cancel,
however the decommission status might wrongly conclude
this as "Complete".

To avoid this make sure to add waitgroups on the parallel
workers, allowing parallel copies to complete fully before
we return.
2022-04-18 13:26:29 -07:00
Harshavardhana 8318aa0113
cancel active routine only after metadata has been saved (#14757)
currently updated pool.bin was not saved properly, that would
lead to unable to remove a pool upon a successful decommission.

fixes #14756
2022-04-15 13:16:15 -07:00
Krishna Srinivas 5f94cec1e2
Allow parallel decom migration threads to be more than erasure sets (#14733) 2022-04-12 10:49:53 -07:00
Krishna Srinivas 48594617b5
Parallelize decommissioning process (#14704) 2022-04-07 23:19:13 -07:00
Harshavardhana ee49a23220
resume/start decommission on the first node of the pool under decommission (#14705)
Additionally fixes

- IsSuspended() can use read locks
- Avoid double cancels panic on canceler
2022-04-06 23:42:05 -07:00
Krishna Srinivas bdd816488d
Get the BackendInfo to fill the apporpriate struct fields (#14660) 2022-03-30 10:48:35 -07:00
Krishna Srinivas 36dcfee2f7
Allow decomission of pool even if a drive in it is down (#14656) 2022-03-29 22:51:31 -07:00
Harshavardhana bd6f7b6d83
fix: make decommission restart non-blocking (#14591)
currently an on-going decommission, during a server
restart might block the startup sequence for relatively
longer periods, instead start the decommission in
background lazily.
2022-03-20 14:46:43 -07:00
Harshavardhana e3071157f0
allow MakeBucketLocation to work for metaBucket (#14548)
decommission would fail to start due to failure
in MakeBucketLocation() error on .minio.sys/ bucket
creation.

Allow these special buckets.
2022-03-14 11:25:24 -07:00
Harshavardhana 5d6f6d8d5b
create missing .minio.sys/config, .minio.sys/buckets during decommission (#14497) 2022-03-07 16:18:57 -08:00
Harshavardhana aaea94a48d
update quorum requirement to list all objects (#14201)
some upgraded objects might not get listed due
to different quorum ratios across objects.

make sure to list all objects that satisfy the
maximum possible quorum.
2022-01-27 17:00:15 -08:00