Commit graph

198 commits

Author SHA1 Message Date
Harshavardhana 5cfedcfe33
askDisks for strict quorum to be equal to read quorum (#14623) 2022-03-25 16:29:45 -07:00
Andreas Auernhammer 4d2fc530d0
add support for SSE-S3 bulk ETag decryption (#14627)
This commit adds support for bulk ETag
decryption for SSE-S3 encrypted objects.

If KES supports a bulk decryption API, then
MinIO will check whether its policy grants
access to this API. If so, MinIO will use
a bulk API call instead of sending encrypted
ETags serially to KES.

Note that MinIO will not use the KES bulk API
if its client certificate is an admin identity.

MinIO will process object listings in batches.
A batch has a configurable size that can be set
via `MINIO_KMS_KES_BULK_API_BATCH_SIZE=N`.
It defaults to `500`.

This env. variable is experimental and may be
renamed / removed in the future.

Signed-off-by: Andreas Auernhammer <hi@aead.dev>
2022-03-25 15:01:41 -07:00
Aditya Manthramurthy 79ba458051
fix: free up reader resources in S3Select properly (#14600) 2022-03-23 20:58:53 -07:00
Avimitin fb9b53026d
Add riscv64 support (#14601)
In riscv64, the `syscall.Uname` function will return a uint8 slice.

    func main() {
      var buf syscall.Utsname
      fmt.Printf("Buffer Type: %T\n", buf.Release)
    }

    output:
      Buffer Type: [65]uint8

This is tested in the Arch Linux RISC-V 64 QEMU environment.

Signed-off-by: Avimitin <avimitin@gmail.com>
2022-03-22 20:36:59 -07:00
Klaus Post 472c2d828c
Fix waitgroup add after wait on config reload (#14584)
Fix `panic: "POST /minio/peer/v21/signalservice?signal=2": sync: WaitGroup is reused before previous Wait has returned`

Log entries already on the channel would cause `logEntry` to increment the
 waitgroup when sending messages, after Cancel has been called.

Instead of tracking every single message, just check the send goroutine. Faster 
and safe, since it will not decrement until the channel is closed.

Regression from #14289
2022-03-19 09:15:45 -07:00
Anis Elleuch b20ecc7b54
Add support of TLS session tickets with KES server (#14577)
Reduce overhead for communication between MinIO server and KES server.
2022-03-18 15:14:10 -07:00
Harshavardhana 43eb5a001c
re-use transport for AdminInfo() call (#14571)
avoids creating new transport for each `isServerResolvable`
request, instead re-use the available global transport and do
not try to forcibly close connections to avoid TIME_WAIT
build upon large clusters.

Never use httpClient.CloseIdleConnections() since that can have
a drastic effect on existing connections on the transport pool.

Remove it everywhere.
2022-03-17 16:20:10 -07:00
Aditya Manthramurthy ce97313fda
Add extra LDAP configuration validation (#14535)
- The result now contains suggestions on fixing common configuration issues.
- These suggestions will subsequently be exposed in console/mc
2022-03-16 19:57:36 -07:00
Harshavardhana ae3b369fe1
logger webhook failure can overrun the queue_size (#14556)
PR introduced in #13819 was incorrect and was not
handling the situation where a buffer is full can
cause incessant amount of logs that would keep the
logger webhook overrun by the requests.

To avoid this only log failures to console logger
instead of all targets as it can cause self reference,
leading to an infinite loop.
2022-03-15 17:45:51 -07:00
Klaus Post c07af89e48
select: Add ScanRange to CSV&JSON (#14546)
Implements https://docs.aws.amazon.com/AmazonS3/latest/API/API_SelectObjectContent.html#AmazonS3-SelectObjectContent-request-ScanRange

Fixes #14539
2022-03-14 09:48:36 -07:00
Aditya Manthramurthy b7ed3b77bd
Indicate required fields in LDAP configuration correctly (#14526) 2022-03-10 19:03:38 -08:00
Poorna 75b925c326
Deprecate root disk for disk caching (#14527)
This PR modifies #14513 to issue a deprecation
warning rather than reject settings on startup.
2022-03-10 18:42:44 -08:00
Harshavardhana 91d419ee6c
warn issues about large block I/O performance for Linux older than 4.0.0 (#14524)
This PR simply adds a warning message when it detects older kernel
versions and warn's them about potential performance issues on this
kernel.

The issue can be seen only with parallel I/O across all drives
on denser setups such as 90 drives or 45 drives per server configurations.
2022-03-10 17:36:13 -08:00
Poorna 7ce91ea1a1
Disallow root disk to be used for cache drives (#14513) 2022-03-10 02:45:31 -08:00
Klaus Post b890bbfa63
Add local disk health checks (#14447)
The main goal of this PR is to solve the situation where disks stop 
responding to operations. This generally causes an FD build-up and 
eventually will crash the server.

This adds detection of hung disks, where calls on disk get stuck.

We add functionality to `xlStorageDiskIDCheck` where it keeps 
track of the number of concurrent requests on a given disk.

A total number of 100 operations are allowed. If this limit is reached 
we will block (but not reject) new requests, but we will monitor the 
state of the disk.

If no requests have been completed or updated within a 15-second 
window, we mark the disk as offline. Requests that are blocked will be 
unblocked and return an error as "faulty disk".

New requests will be rejected until the disk is marked OK again.

Once a disk has been marked faulty, a check will run every 5 seconds that 
will attempt to write and read back a file. As long as this fails the disk will 
remain faulty.

To prevent lots of long-running requests to mark the disk faulty we 
implement a callback feature that allows updating the status as parts 
of these operations are running.

We add a reader and writer wrapper that will update the status of each 
successful read/write operation. This should allow fine enough granularity 
that a slow, but still operational disk will not reach 15 seconds where 
50 operations have not progressed.

Note that errors themselves are not enough to mark a disk faulty. 
A nil (or io.EOF) error will mark a disk as "good".

* Make concurrent disk setting configurable via `_MINIO_DISK_MAX_CONCURRENT`.

* de-couple IsOnline() from disk health tracker

The purpose of IsOnline() is to ensure that we
reconnect the drive only when the "drive" was

- disconnected from network we need to validate
  if the drive is "correct" and is the same drive
  which belongs to this server.

- drive was replaced we have to format it - we
  support hot swapping of the drives.

IsOnline() is not meant for taking the drive offline
when it is hung, it is not useful we can let the
drive be online instead "return" errors for relevant
calls.

* return errFaultyDisk for DiskInfo() call

Co-authored-by: Harshavardhana <harsha@minio.io>

Possible future Improvements:

* Unify the REST server and local xlStorageDiskIDCheck. This would also improve stats significantly.
* Allow reads/writes to be aborted by the context.
* Add usage stats, concurrent count, blocked operations, etc.
2022-03-09 11:38:54 -08:00
Klaus Post 7060c809c0
Add authorization header to HEAD requests (#14510)
Add Authorization to network check requests.

Fixes #14507
2022-03-09 10:48:56 -08:00
Harshavardhana 0e3bafcc54
improve logs, fix banner formatting (#14456) 2022-03-03 13:21:16 -08:00
Andreas Auernhammer b48f719b8e
kes: remove unnecessary error conversion (#14459)
This commit removes some duplicate code that
converts KES API errors.

This code was added since KES `0.18.0` changed
some exported API errors. However, the KES SDK
handles this error conversion itself.
Therefore, it is not necessary to duplicate this
behavior in MinIO.

See: 21555fa624/error.go (L94)

Signed-off-by: Andreas Auernhammer <hi@aead.dev>
2022-03-03 09:42:37 -08:00
Lenin Alevski 289fcbd08c
KES dependency upgrade (#14454)
- Updating KES dependency to v.0.18.0
- Fixing incompatibility issue when checking for errors during KES key creation

Signed-off-by: Lenin Alevski <alevsk.8772@gmail.com>
2022-03-02 23:03:40 -08:00
Harshavardhana f6875bb893
fix: regression from refactor in AMQP notification (#14455)
fixes a regression introduced in #14269 that refactored
the notification registration logic, all the amqp targets
however online will not be available for use anymore.

fixes #14451
2022-03-02 21:35:48 -08:00
Klaus Post b030ef1aca
tests: Clean up dsync package (#14415)
Add non-constant timeouts to dsync package.

Reduce test runtime by minutes. Hopefully not too aggressive.
2022-03-01 11:14:28 -08:00
Klaus Post 88fd1cba71
select: add MISSING operator support (#14406)
Probably not full support, but for regular checks it should work.

Fixes #14358
2022-02-25 12:31:19 -08:00
hellivan 5307e18085
use keycloak_realm properly for keycloak user lookups (#14401)
In case a user-defined a value for the MINIO_IDENTITY_OPENID_KEYCLOAK_REALM 
environment variable, construct the path properly.
2022-02-24 10:16:53 -08:00
Klaus Post 2cea944cdb
select: Allow lower case 'is' (#14405)
Ref: #14358
2022-02-24 09:10:48 -08:00
Shireesh Anjal 3934700a08
Make audit webhook and kafka config dynamic (#14390) 2022-02-24 09:05:33 -08:00
hellivan 0913eb6655
fix: openid config provider not initialized correctly (#14399)
Up until now `InitializeProvider` method of `Config` struct was
implemented on a value receiver which is why changes on `provider`
field where never reflected to method callers. In order to fix this
issue, the method was implemented on a pointer receiver.
2022-02-23 23:42:37 -08:00
Harshavardhana 1bfbe354f5
fix: clientId must be unique for all servers (#14398)
This is a regression from #14037, distributed setups
with MQTT was not working anymore. According to MQTT
spec it is expected this is unique per server.

We shall proceed to use unix nano timestamp hex
value instead here.
2022-02-23 20:19:59 -08:00
Shireesh Anjal 25144fedd5
Send deployment id and minio version in http header (#14378) 2022-02-23 13:36:01 -08:00
Shireesh Anjal 94d37d05e5
Apply dynamic config at sub-system level (#14369)
Currently, when applying any dynamic config, the system reloads and
re-applies the config of all the dynamic sub-systems.

This PR refactors the code in such a way that changing config of a given
dynamic sub-system will work on only that sub-system.
2022-02-22 10:59:28 -08:00
Shireesh Anjal c1437c7b46
allow config reset api to work by overloading default values (#14368)
The `LookupConfig` code was not using `GetWithDefault`, because of which
some of the config values were being returned as empty string, and calls
like `strconv.Atoi` and `time.ParseDuration` on these were failing.
2022-02-21 15:50:45 -08:00
Aditya Manthramurthy bc110d8055
fix: mysql notification target table creation (#14350)
Add a generated hash column as the primary key for the key name as 
MySQL does not allow indexes on long VARCHAR columns.
2022-02-18 12:13:49 -08:00
Harshavardhana 65b1a4282e
fix: console logger regression with dynamic logger webhook registration (#14346)
fixes a regression from #14289
2022-02-17 17:50:10 -08:00
Shireesh Anjal 28f188e3ef
Make logger webhook config dynamic (#14289)
It should not be required to restart the 
server after setting the logger webhook config.
2022-02-17 11:11:15 -08:00
Shireesh Anjal 1a5496eced
Add enable key to logger webhook help (#14326)
This key is supported by the logger webhook config - but is not returned in the help.
2022-02-16 11:59:50 -08:00
Shireesh Anjal 16939ca192
Mark SUBNET credentials as sensitive (#14320)
So that they are redacted in the health report
2022-02-16 08:40:34 -08:00
Klaus Post 5ec57a9533
Add GetObject gzip option (#14226)
Enabled with `mc admin config set alias/ api gzip_objects=on`

Standard filtering applies (1K response minimum, not compressed content 
type, not range request, gzip accepted by client).
2022-02-14 09:19:01 -08:00
Shireesh Anjal 9890f579f8
Add subsystem level validation on config set (#14269)
When setting a config of a particular sub-system, validate the existing
config and notification targets of only that sub-system, so that
existing errors related to one sub-system (e.g. notification target
offline) do not result in errors for other sub-systems.
2022-02-08 10:36:41 -08:00
Shireesh Anjal 3882da6ac5
Add subnet proxy config (#14225)
Will store the HTTP(S) proxy URL to use for connecting to SUBNET.
2022-02-01 09:52:38 -08:00
Harshavardhana c39eb3bacd
fix: possible crash if private.key is empty (#14208)
Before
```
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x9f54f7]

goroutine 1 [running]:
crypto/x509.IsEncryptedPEMBlock(...)
	crypto/x509/pem_decrypt.go:105
github.com/minio/minio/internal/config.LoadX509KeyPair({0xc00061e270, 0x0}, {0xc00061e2d0, 0x25})
	github.com/minio/minio/internal/config/certs.go:88 +0xf7
github.com/minio/pkg/certs.(*Manager).AddCertificate(0xc000576150, {0xc00061e270, 0x25}, {0xc00061e2d0, 0x25})
	github.com/minio/pkg@v1.1.15/certs/certs.go:132 +0x368
github.com/minio/pkg/certs.NewManager({0x51f5910, 0xc00053e140}, {0xc00061e270, 0xc000580400}, {0xc00061e2d0, 0x25}, 0x4dc5880)
	github.com/minio/pkg@v1.1.15/certs/certs.go:97 +0x170
github.com/minio/minio/cmd.getTLSConfig()
```

After
```
ERROR Unable to load the TLS configuration: The private key is not readable
      > Please check your certificate
```
2022-01-30 12:55:21 -08:00
Poorna a4be47d7ad
Validate config before saving changes after config reset (#14203) 2022-01-27 18:28:16 -08:00
Aditya Manthramurthy 7dfa565d00
Identity LDAP: Allow multiple search base DNs (#14191)
This change allows the MinIO server to lookup users in different directory
sub-trees by allowing specification of multiple search bases separated by
semicolons.
2022-01-26 15:05:59 -08:00
Poorna 295730408b
Disallow delete replication for tag based rules (#14167) 2022-01-24 15:22:20 -08:00
Harshavardhana 5a9f133491
speed up startup sequence for all operations (#14148)
This speed-up is intended for faster startup times
for almost all MinIO operations. Changes here are

- Drives are not re-read for 'format.json' on a regular
  basis once read during init is remembered and refreshed
  at 5 second intervals.

- Do not do O_DIRECT tests on drives with existing 'format.json'
  only fresh setups need this check.

- Parallelize initializing erasureSets for multiple sets.

- Avoid re-reading format.json when migrating 'format.json'
  from really old V1->V2->V3

- Keep a copy of local drives for any given server in memory
  for a quick lookup.
2022-01-24 11:28:45 -08:00
Anis Elleuch 3e9bd931ed
tests: Remove RPC wording from the code (#14142)
The lock was using net/rpc in the past but it got replaced with a REST API. 
This commit will fix function names/comments to avoid confusion.
2022-01-20 09:36:09 -08:00
Harshavardhana 1a56ebea70
cleanup dsync tests and remove net/rpc references (#14118) 2022-01-18 12:44:38 -08:00
Harshavardhana 70e1cbda21
allow disabling O_DIRECT in certain environments for reads (#14115)
repeated reads on single large objects in HPC like
workloads, need the following option to disable
O_DIRECT for a more effective usage of the kernel
page-cache.

However this optional should be used in very specific
situations only, and shouldn't be enabled on all
servers.

NVMe servers benefit always from keeping O_DIRECT on.
2022-01-17 08:34:14 -08:00
Anis Elleuch b106b1c131
lock: Fix decision when a lock needs to be removed (#14095)
The code was not properly deciding if a lock needs to be removed 
when it doesn't have quorum anymore. After this commit, a lock will be
forcefully unlocked if nodes reporting they are not able to find a lock
internally breaks the quorum.

Simplify the code as well.
2022-01-14 10:33:08 -08:00
Harshavardhana 76b21de0c6
feat: decommission feature for pools (#14012)
```
λ mc admin decommission start alias/ http://minio{1...2}/data{1...4}
```

```
λ mc admin decommission status alias/
┌─────┬─────────────────────────────────┬──────────────────────────────────┬────────┐
│ ID  │ Pools                           │ Capacity                         │ Status │
│ 1st │ http://minio{1...2}/data{1...4} │ 439 GiB (used) / 561 GiB (total) │ Active │
│ 2nd │ http://minio{3...4}/data{1...4} │ 329 GiB (used) / 421 GiB (total) │ Active │
└─────┴─────────────────────────────────┴──────────────────────────────────┴────────┘
```

```
λ mc admin decommission status alias/ http://minio{1...2}/data{1...4}
Progress: ===================> [1GiB/sec] [15%] [4TiB/50TiB]
Time Remaining: 4 hours (started 3 hours ago)
```

```
λ mc admin decommission status alias/ http://minio{1...2}/data{1...4}
ERROR: This pool is not scheduled for decommissioning currently.
```

```
λ mc admin decommission cancel alias/
┌─────┬─────────────────────────────────┬──────────────────────────────────┬──────────┐
│ ID  │ Pools                           │ Capacity                         │ Status   │
│ 1st │ http://minio{1...2}/data{1...4} │ 439 GiB (used) / 561 GiB (total) │ Draining │
└─────┴─────────────────────────────────┴──────────────────────────────────┴──────────┘
```

> NOTE: Canceled decommission will not make the pool active again, since we might have
> Potentially partial duplicate content on the other pools, to avoid this scenario be
> very sure to start decommissioning as a planned activity.

```
λ mc admin decommission cancel alias/ http://minio{1...2}/data{1...4}
┌─────┬─────────────────────────────────┬──────────────────────────────────┬────────────────────┐
│ ID  │ Pools                           │ Capacity                         │ Status             │
│ 1st │ http://minio{1...2}/data{1...4} │ 439 GiB (used) / 561 GiB (total) │ Draining(Canceled) │
└─────┴─────────────────────────────────┴──────────────────────────────────┴────────────────────┘
```
2022-01-10 09:07:49 -08:00
Aditya Manthramurthy 1981fe2072
Add internal IDP and OIDC users support for site-replication (#14041)
- This allows site-replication to be configured when using OpenID or the
  internal IDentity Provider.

- Internal IDP IAM users and groups will now be replicated to all members of the
  set of replicated sites.

- When using OpenID as the external identity provider, STS and service accounts
  are replicated.

- Currently this change dis-allows root service accounts from being
  replicated (TODO: discuss security implications).
2022-01-06 15:52:43 -08:00
Minio Trusted 76877eb6fa move gofumpt to golang-ci 2022-01-06 13:08:21 -08:00