Commit graph

3429 commits

Author SHA1 Message Date
Anis Elleuch 14d89eaae4
mrf: Enhance behavior for better results (#11788)
MRF was starting to heal when it receives a disk connection event, which
is not good when a node having multiple disks reconnects to the cluster.

Besides, MRF needs Remove healing option to remove stale files.
2021-03-18 11:19:02 -07:00
Harshavardhana add3cd4e44
allow configuring delete cleanup interval from default 10minutes (#11818) 2021-03-17 15:15:58 -07:00
Harshavardhana 60b0f2324e
storage write call path optimizations (#11805)
- write in o_dsync instead of o_direct for smaller
  objects to avoid unaligned double Write() situations
  that may arise for smaller objects < 128KiB
- avoid fallocate() as its not useful since we do not
  use Append() semantics anymore, fallocate is not useful
  for streaming I/O we can save on a syscall
- createFile() doesn't need to validate `bucket` name
  with a Lstat() call since createFile() is only used
  to write at `minioTmpBucket`
- use io.Copy() when writing unAligned writes to allow
  usage of ReadFrom() from *os.File providing zero
  buffer writes().
2021-03-17 09:38:38 -07:00
Anis Elleuch 0eb146e1b2
add additional metrics per disk API latency, API call counts #11250)
```
mc admin info --json
```

provides these details, for now, we shall eventually 
expose this at Prometheus level eventually. 

Co-authored-by: Harshavardhana <harsha@minio.io>
2021-03-16 20:06:57 -07:00
Andreas Auernhammer e197800f90
s3v4: read and verify S3 signature v4 chunks separately (#11801)
This commit fixes a security issue in the signature v4 chunked
reader. Before, the reader returned unverified data to the caller
and would only verify the chunk signature once it has encountered
the end of the chunk payload.

Now, the chunk reader reads the entire chunk into an in-memory buffer,
verifies the signature and then returns data to the caller.

In general, this is a common security problem. We verifying data
streams, the verifier MUST NOT return data to the upper layers / its
callers as long as it has not verified the current data chunk / data
segment:
```
func (r *Reader) Read(buffer []byte) {
   if err := r.readNext(r.internalBuffer); err != nil {
      return err
   }
   if err := r.verify(r.internalBuffer); err != nil {
      return err
   }
   copy(buffer, r.internalBuffer)
}
```
2021-03-16 13:33:40 -07:00
Klaus Post 771dea175c
erasure pools enable faster checks for file not found (#11799)
For operations that require the object to exist make it possible to 
detect if the file isn't found in *any* pool.

This will allow these to return the error early without having to re-check.
2021-03-16 11:02:20 -07:00
Harshavardhana 6160188bf3
fix: erasure index based reading based on actual ParityBlocks (#11792)
in some setups with ordering issues in drive configuration,
we should rely on expected parityBlocks instead of `len(disks)/2`
2021-03-15 20:03:13 -07:00
Steve Wills 642ba3f2d6
fix: runtime issue on FreeBSD due to missing O_NOATIME/O_DSYNC support (#11790)
See also:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=253937
2021-03-15 14:02:36 -07:00
Harshavardhana afbd3e41eb
add missing principalId in web notifications (#11777)
fixes #11561
2021-03-13 10:52:43 -08:00
Poorna Krishnamoorthy 5e003549cc
Replication: Enforce DeleteMarker disable setting (#11720)
This PR also enforces DeleteReplication
disable setting
2021-03-13 10:28:35 -08:00
Nitish Tiwari 7fa3e4106b
Add consoleAdmin as a default canned policy (#11770) 2021-03-12 12:51:43 -08:00
Philip Brown 75db500e85
cmd/os-readdir_other.go - return nil with err (#11772) 2021-03-12 07:22:25 -08:00
Harshavardhana feafccf007
handle trimming '/' if present in the object names (#11765)
- MultipleDeletes should handle '/' prefix for objectnames
- Trimming the slash alone is enough for ListObjects()
  prefix and markers

fixes #11769
2021-03-11 13:57:03 -08:00
Anis Elleuch f92b7a5621
Browser: Shared link has content-disposition header (#11712)
The shared link will be automatically downloadable when the user opens
the shared link in a browser.
2021-03-10 23:02:16 -08:00
Poorna Krishnamoorthy c25e75f0b5
Fix redact LDAP password properly (#11762)
fixes #11742 previous pull request #11750 fixed only the web trace
2021-03-10 11:05:38 -08:00
Harshavardhana 777344a594
add release build-arg to docker multiarch builds (#11754)
additional paths to ignore for healing
2021-03-10 09:38:35 -08:00
Poorna Krishnamoorthy 878bc6c72b
Redact LDAP password if any in request trace (#11750)
Fixes: #11742
2021-03-09 14:43:16 -08:00
Klaus Post fdc2f69218
truncate xl.meta files upon rewrites #11749)
If the destination files exist and is larger - junk data will be left at the end of the file.
2021-03-09 14:42:24 -08:00
Anis Elleuch 0d124095ea
lc: Return expiration header only when version id is unspecified (#11718)
Follow S3 specification to return Expiration header in HEAD/GET call
only when version-id is not passed in the request.
2021-03-09 13:19:08 -08:00
Harshavardhana 691035832a
fix: normalize object layer inputs (#11534)
Cases where we have applications making request
for `//` in object names make sure that all
are normalized to `/` and all such requests that
are prefixed '/' are removed. To ensure a
consistent view from all operations.
2021-03-09 12:58:22 -08:00
Anis Elleuch eac66e67ec
Use maximum parity for config files (#11740)
Some deployments have low parity (EC:2), but we really do not need to
save our config data with the same parity configuration.

N/2 would be better to keep MinIO configurations intact when unexpected
a number of drives fail.
2021-03-09 10:19:47 -08:00
Anis Elleuch 57f3ed22d4
erasure: Reduce the interval of cleaning up .trash folder (#11741)
Reduce from 30 to 10 minutes.
2021-03-09 09:45:38 -08:00
Poorna Krishnamoorthy 2f29719e6b
resize replication worker pool dynamically after config update (#11737) 2021-03-09 02:56:42 -08:00
Andreas Auernhammer 209fe61dcc
vault: disable Hashicorp Vault with opt-in (#11711)
This commit disables the Hashicorp Vault
support but provides a way to temp. enable
it via the `MINIO_KMS_VAULT_DEPRECATION=off`

Vault support has been deprecated long ago
and this commit just requires users to take
action if they maintain a Vault integration.
2021-03-09 00:02:35 -08:00
Harshavardhana 8ecffdb7a7 Revert "Revert "heal: Heal bucket metadata when a fresh disk is inserted (#11734)""
This reverts commit 806df164b2.
2021-03-08 16:12:17 -08:00
Harshavardhana 806df164b2 Revert "heal: Heal bucket metadata when a fresh disk is inserted (#11734)"
This reverts commit 64662a49ff.
2021-03-08 14:43:24 -08:00
Klaus Post 4ac9ed4248
CopyObject: Do not remove crypto info when compressed (#11702)
Removing crypto info makes it impossible to copy encrypted+compressed objects.

Disable destination compression when encrypted.
2021-03-08 12:57:54 -08:00
Klaus Post 3ff5f55dcb
Fetch fileinfo concurrently (#11700)
For non-erasure setups fetch up to 10 fileinfos concurrently.

Fixes #11625
2021-03-08 11:30:43 -08:00
Max Xu 097e5eba9f
feat: remove go-bindata-assetfs in favor of embed by upgrading to go1.16 (#11733) 2021-03-08 11:26:43 -08:00
Anis Elleuch 64662a49ff
heal: Heal bucket metadata when a fresh disk is inserted (#11734)
Replacing disk with a fresh one never heals bucket metadata (policy,
notification, etc..). This commit fixes the issue.
2021-03-08 10:54:13 -08:00
Harshavardhana 78e867e145
ignore healing .trash, .metacache amd .multipart paths (#11725) 2021-03-07 09:38:31 -08:00
Harshavardhana 9ccc483df6
[feat]: change erasure coding default block size from 10MiB to 1MiB (#11721)
major performance improvements in range GETs to avoid large
read amplification when ranges are tiny and random

```
-------------------
Operation: GET
Operations: 142014 -> 339421
Duration: 4m50s -> 4m56s
* Average: +139.41% (+1177.3 MiB/s) throughput, +139.11% (+658.4) obj/s
* Fastest: +125.24% (+1207.4 MiB/s) throughput, +132.32% (+612.9) obj/s
* 50% Median: +139.06% (+1175.7 MiB/s) throughput, +133.46% (+660.9) obj/s
* Slowest: +203.40% (+1267.9 MiB/s) throughput, +198.59% (+753.5) obj/s
```

TTFB from 10MiB BlockSize
```
* First Access TTFB: Avg: 81ms, Median: 61ms, Best: 20ms, Worst: 2.056s
```

TTFB from 1MiB BlockSize
```
* First Access TTFB: Avg: 22ms, Median: 21ms, Best: 8ms, Worst: 91ms
```

Full object reads however do see a slight change which won't be
noticeable in real world, so not doing any comparisons

TTFB still had improvements with full object reads with 1MiB

```
* First Access TTFB: Avg: 68ms, Median: 35ms, Best: 11ms, Worst: 1.16s
```

v/s

TTFB with 10MiB
```
* First Access TTFB: Avg: 388ms, Median: 98ms, Best: 20ms, Worst: 4.156s
```

This change should affect all new uploads, previous uploads should
continue to work with business as usual. But dramatic improvements can
be seen with these changes.
2021-03-06 14:09:34 -08:00
Anis Elleuch abce040088
fix: Remove repetitive IAM ready message (#11723)
"IAM initialization complete" is printed each 5 minutes, avoid this by
printing it only during the first initialization of IAM.
2021-03-06 09:27:46 -08:00
Anis Elleuch 558762bdf6
iam: Return a slice of policies for a group (#11722)
A group can have multiple policies, a user subscribed to readwrite &
diagnostics can perform S3 operations & admin operations as well.
However, the current code only returns one policy for one group.
2021-03-06 09:27:06 -08:00
Harshavardhana d971061305
use listPathRaw for HealObjects() instead of expensive WalkVersions() (#11675) 2021-03-06 09:25:48 -08:00
Andreas Auernhammer 509bcc01ad
fips: do not use SHA-3 when building a FIPS-140 2 binary (#11710)
This commit disables SHA-3 for OpenID when building a
FIPS-140 2 compatible binary. While SHA-3 is a
crypto. hash function accepted by NIST there is no
FIPS-140 2 compliant implementation available when
using the boringcrypto Go branch.

Therefore, SHA-3 must not be used when building
a FIPS-140 2 binary.
2021-03-05 20:43:42 -08:00
Krishnan Parthasarathi bcf9825082
Data usage should account for transitioned objects (#11717) 2021-03-05 14:15:53 -08:00
sgandon 124816f6a6
fix : IAM Intialization failing with a large number of users/policies (#11701) 2021-03-05 08:36:16 -08:00
Klaus Post fa9cf1251b
Imporve healing and reporting (#11312)
* Provide information on *actively* healing, buckets healed/queued, objects healed/failed.
* Add concurrent healing of multiple sets (typically on startup).
* Add bucket level resume, so restarts will only heal non-healed buckets.
* Print summary after healing a disk is done.
2021-03-04 14:36:23 -08:00
Harshavardhana d73d756a80
fix: incorrect errors thrown by lint (#11699)
fixes #11698
2021-03-04 14:27:38 -08:00
Aditya Manthramurthy 7488c77e7c
Test LDAP connection configuration at startup (#11684) 2021-03-04 12:17:36 -08:00
Harshavardhana 786585009e
fix: capture disks when entire peer is offline (#11697)
currently when one of the peer is down, the
drives from that peer are reported as '0/0'
offline instead we should capture/filter the
drives from the peer and populate it appropriately
such that `mc admin info` displays correct info.
2021-03-04 10:07:05 -08:00
Anis Elleuch 7be7109471
locking: Add Refresh for better locking cleanup (#11535)
Co-authored-by: Anis Elleuch <anis@min.io>
Co-authored-by: Harshavardhana <harsha@minio.io>
2021-03-03 18:36:43 -08:00
Klaus Post c3217bd6eb
Use actual size for buffer selection (#11687)
For compressed inputs, this will be -1, but the object may be small.
2021-03-03 16:28:10 -08:00
Andreas Auernhammer f14cc6c943
etag: add FromContentMD5 to parse content-md5 as ETag (#11688)
This commit adds the `FromContentMD5` function to
parse a client-provided content-md5 as ETag.

Further, it also adds multipart ETag computation
for future needs.
2021-03-03 12:58:28 -08:00
Harshavardhana 2c198ae7b6
fix: prometheus metrics disks_online count when disks are down (#11689)
prometheus metrics was using total disks instead
of online disk count, when disks were down, this
PR fixes this and also adds a new metric for
total_disk_count
2021-03-03 11:18:41 -08:00
Poorna Krishnamoorthy 690434514d
Avoid notification event for replicas (#11683)
Creating notification events for replica creation
is not particularly useful to send as the notification
event generated at source already includes replication
completion events.

For applications using replica cluster as failover, avoiding
duplicate notifications for replica event will allow seamless
failover.
2021-03-03 11:13:31 -08:00
Harshavardhana 039f59b552
fix: missing user policy enforcement in PostPolicyHandler (#11682) 2021-03-03 08:47:08 -08:00
Harshavardhana c6a120df0e
fix: Prometheus metrics to re-use storage disks (#11647)
also re-use storage disks for all `mc admin server info`
calls as well, implement a new LocalStorageInfo() API
call at ObjectLayer to lookup local disks storageInfo

also fixes bugs where there were double calls to StorageInfo()
2021-03-02 17:28:04 -08:00
Klaus Post cd9e30c0f4
IAM: Block while loading users (#11671)
While starting up a request that needs all IAM data will start another load operation if the first on startup hasn't finished. This slows down both operations.

Block these requests until initial load has completed.

Blocking calls will be ListPolicies, ListUsers, ListServiceAccounts, ListGroups - and the calls that eventually trigger these. These will wait for the initial load to complete.

Fixes issue seen in #11305
2021-03-02 17:08:25 -08:00