Commit graph

505 commits

Author SHA1 Message Date
Harshavardhana 005a4a275a
add more bootstrap messages to provide latency (#17650)
- simplify refreshing bucket metadata, wait() to
  depend on how fast the bucket metadata can load.

- simplify resync to start resync in single pass.
2023-07-14 04:00:29 -07:00
jiuker 183428db03
fear: Implement 'mc support top net' (#17598) 2023-07-13 11:41:19 -07:00
Harshavardhana 7f782983ca
fix: for FTP server driver allow implicit trust of TLS (#17541)
fixes #17535
2023-06-30 08:04:13 -07:00
Harshavardhana d3e5e607a7
allow site-replication checks to work on non-distributed setups (#17524)
fixes #17523
2023-06-27 09:23:50 -07:00
Anis Eleuch d8dad5c9ea
s3: Make/Delete buckets to use error quorum per pool (#17467) 2023-06-23 11:48:23 -07:00
Harshavardhana 65c31fab12
fix: do not crash rebalance code instead set the object layer (#17465)
fixes #17421
2023-06-20 09:28:23 -07:00
Aditya Manthramurthy 5a1612fe32
Bump up madmin-go and pkg deps (#17469) 2023-06-19 17:53:08 -07:00
Anis Eleuch bb24346e04
listen: Only error out if not able to bind any interface (#17353) 2023-06-12 09:09:28 -07:00
Klaus Post 6e38d0f3ab
Add more bootstrap info in debug mode (#17362) 2023-06-08 08:39:47 -07:00
Harshavardhana d1448adbda
use slices package and remove some helpers (#17342) 2023-06-06 10:12:52 -07:00
Praveen raj Mani ecfb18b26a
Freeze the s3 APIs until the notification sub-system initializes completely (#17182) 2023-05-19 08:44:48 -07:00
Harshavardhana b62791617c
fix: notify systemd as soon as we wait on the OS signal (#17199) 2023-05-12 16:42:17 -07:00
Praveen raj Mani 57acacd5a7
Support persistent queue store for loggers (#17121) 2023-05-08 21:20:31 -07:00
Poorna c5c1426262
Validate if replication config being added is self referential (#17142) 2023-05-06 13:35:43 -07:00
Harshavardhana 5569acd95c
disallow EC:0 if not set during server startup (#17141) 2023-05-04 14:44:30 -07:00
Harshavardhana 9571b0825e
add configurable VRF interface and user-timeout (#17108) 2023-05-03 14:12:25 -07:00
WGH ab34f0065c
Support systemd notify protocol (#17062) 2023-05-01 23:15:08 -07:00
Harshavardhana dbd53af369
fix: initialize reverse proxy forwarder with right public certs (#17080) 2023-04-25 15:50:32 -07:00
Harshavardhana 477230c82e
avoid attempting to migrate old configs (#17004) 2023-04-21 13:56:08 -07:00
Harshavardhana dd9ed85e22
implement support for FTP/SFTP server (#16952) 2023-04-15 07:34:02 -07:00
Anis Eleuch 91b6fe1af3
trace: Bootstrap to show the correct source line number (#16989) 2023-04-06 17:51:53 -07:00
Krishnan Parthasarathi 31fba6f434
Save bootstrap trace events in a circular buffer (#16823) 2023-03-17 16:01:03 -07:00
Harshavardhana 0c1f8b4e0f
add user-agent for all minio.Client usage (#16619) 2023-02-14 13:19:30 -08:00
Harshavardhana 71f02adfca Revert "Print golang http errors in MinIO log format (#16465)"
This reverts commit 1fd7946dce.
2023-02-09 09:27:27 +05:30
Krishnan Parthasarathi 990fc415f7
Ensure safety of transitionState at startup (#16563) 2023-02-07 23:11:42 -08:00
Harshavardhana 747d475e76
initialize subsystems that are not dependent on buckets first (#16559) 2023-02-07 12:46:47 -08:00
Anis Elleuch 095b518802
Show a better error msg when internal data encryption key is incorrect (#16549) 2023-02-07 05:22:54 -08:00
Anis Elleuch 1fd7946dce
Print golang http errors in MinIO log format (#16465) 2023-01-26 22:46:16 +05:30
Harshavardhana 54b561898f
fix: anonymize the x-amz-id-2 value from hostname (#16478) 2023-01-25 10:25:36 -08:00
Shireesh Anjal 5a9f7516d6
Add monthly license update job (#16391) 2023-01-17 05:08:15 +05:30
Anis Elleuch 2146ed4033
xl: Quit early when EC config is incorrect (#16390)
Co-authored-by: Anis Elleuch <anis@min.io>
2023-01-09 23:07:45 -08:00
Harshavardhana e0086c1be7
reduce startup delays on kubernetes (#16356) 2023-01-05 02:32:43 -08:00
Harshavardhana 1cd8e1d8b6
remove the startup jitter before locks() (#16340) 2023-01-02 01:40:09 -08:00
Anis Elleuch acc9c033ed
debug: Add X-Amz-Request-ID to lock/unlock calls (#16309) 2022-12-23 19:49:07 -08:00
Anis Elleuch 34167c51d5
trace: Add bootstrap tracing events (#16286) 2022-12-21 15:52:29 -08:00
Harshavardhana 5a218f38a1
allow retries for transaction lock on startup (#16273) 2022-12-19 22:00:00 -08:00
Anis Elleuch e57e946206
Do not save credentials in config.json (#16275) 2022-12-19 12:27:06 -08:00
Harshavardhana 80fc3a8a52
use newDynamicTimeoutWithOpts() when appropriate (#16266) 2022-12-15 13:11:37 -08:00
Klaus Post 988a2e8fed
Faster startup of large distributed systems with latency (#16259) 2022-12-15 08:31:21 -08:00
Anis Elleuch 939c0100a6
log: Do not interpret verbs in object names in console output (#16233) 2022-12-13 08:27:40 -08:00
Aditya Manthramurthy 2d60bf8c50
Refactor HTTP transports (#16222) 2022-12-12 20:31:21 -08:00
Harshavardhana 37e20f6ef2
feat: allow listening specific addrs for API port (#16223) 2022-12-12 18:48:46 -08:00
Harshavardhana 853c4de75a
allow changing endpoints in distributed setups (#16071) 2022-11-16 07:59:10 -08:00
Poorna d6bc141bd1
feat: Add support for site level resync (#15753) 2022-11-14 07:16:40 -08:00
Anis Elleuch 3b1a9b9fdf
Use the same lock for the scanner and site replication healing (#15985) 2022-11-08 08:55:55 -08:00
Harshavardhana 9547b7d0e9
add deadlineConnections on remoteTransport (#16010) 2022-11-05 11:09:21 -07:00
Harshavardhana 6e4acf0504
add a message of removal for gateway and hide the command (#15965) 2022-10-28 14:11:20 -07:00
Harshavardhana 23b329b9df
remove gateway completely (#15929) 2022-10-24 17:44:15 -07:00
Anis Elleuch 58d776daa0
Set CONSOLE_MINIO_SERVER to 127.0.0.1 by default (#15887) 2022-10-21 14:42:28 -07:00
Anis Elleuch de5070446d
Deprecate --listeners flag (#15900) 2022-10-19 08:45:50 -07:00
Harshavardhana 6cb2f56395 Revert "Revert "tests: Add context cancelation (#15374)""
This reverts commit 564a0afae1.
2022-10-14 03:08:40 -07:00
Harshavardhana 2a13cc28f2 feat: implement support batch replication (#15554) 2022-10-05 23:00:43 -07:00
Anis Elleuch 86bb48792c
non-blocking initialization of bucket target notifications (#15571) 2022-09-27 17:23:28 -07:00
Harshavardhana 9d6fddcfdf
persist the non-default creds in config (#15711) 2022-09-21 16:14:47 -07:00
Klaus Post ff12080ff5
Remove deprecated io/ioutil (#15707) 2022-09-19 11:05:16 -07:00
Aditya Manthramurthy afbb63a197
Factor out external event notification funcs (#15574)
This change moves external event notification functionality into
`event-notification.go`. This simplifies notification related code.
2022-08-24 06:42:36 -07:00
Poorna 21fe14201f
replication: centralize healthcheck for remote targets (#15516)
This PR moves health check from minio-go client to being
managed on the server.

Additionally integrating health check into site replication
2022-08-16 17:46:22 -07:00
Harshavardhana c7d535c648
init console after IAM init() (#15531)
fixes #15527
2022-08-13 12:54:41 -07:00
ebozduman b57e7321e7
Replaces 'disk'=>'drive' visible to end user (#15464) 2022-08-04 16:10:08 -07:00
Poorna 426c902b87
site replication: fix healing of bucket deletes. (#15377)
This PR changes the handling of bucket deletes for site 
replicated setups to hold on to deleted bucket state until 
it syncs to all the clusters participating in site replication.
2022-07-25 17:51:32 -07:00
Minio Trusted 564a0afae1 Revert "tests: Add context cancelation (#15374)"
This reverts commit 1e332f0eb1.

Reverting this as tests are failing randomly.
2022-07-21 13:58:56 -07:00
Klaus Post 1e332f0eb1
tests: Add context cancelation (#15374)
A huge number of goroutines would build up from various monitors

When creating test filesystems provide a context so they can shut down when no longer needed.
2022-07-21 11:52:18 -07:00
LHHDZ e68e76e143
fix: data race, which caused tests execution to fail (#15313) 2022-07-16 07:57:55 -07:00
Harshavardhana b4eb74f5ff
allow custom speedtest bucket (#15271)
this allows for specifying existing buckets with

- object replication enabled
- object encryption enabled
- object versioning enabled
- object locking enabled
2022-07-12 10:12:47 -07:00
Harshavardhana 1a40c7c27c
use signature-v2 for 'object perf' tests to avoid CPU using sha256 (#15151)
It is observed in a local 8 drive system the CPU seems to be
bottlenecked at

```
(pprof) top
Showing nodes accounting for 1385.31s, 88.47% of 1565.88s total
Dropped 1304 nodes (cum <= 7.83s)
Showing top 10 nodes out of 159
      flat  flat%   sum%        cum   cum%
      724s 46.24% 46.24%       724s 46.24%  crypto/sha256.block
   219.04s 13.99% 60.22%    226.63s 14.47%  syscall.Syscall
   158.04s 10.09% 70.32%    158.04s 10.09%  runtime.memmove
   127.58s  8.15% 78.46%    127.58s  8.15%  crypto/md5.block
    58.67s  3.75% 82.21%     58.67s  3.75%  github.com/minio/highwayhash.updateAVX2
    40.07s  2.56% 84.77%     40.07s  2.56%  runtime.epollwait
    33.76s  2.16% 86.93%     33.76s  2.16%  github.com/klauspost/reedsolomon._galMulAVX512Parallel84
     8.88s  0.57% 87.49%     11.56s  0.74%  runtime.step
     7.84s   0.5% 87.99%      7.84s   0.5%  runtime.memclrNoHeapPointers
     7.43s  0.47% 88.47%     22.18s  1.42%  runtime.pcvalue
```

Bonus changes:

- re-use transport for bucket replication clients, also site replication clients.
- use 32KiB buffer for all read and writes at transport layer seems to help
  TLS read connections.
- Do not have 'MaxConnsPerHost' this is problematic to be used with net/http
  connection pooling 'MaxIdleConnsPerHost' is enough.
2022-06-22 16:28:25 -07:00
Andreas Auernhammer cd7a0a9757
fips: simplify TLS configuration (#15127)
This commit simplifies the TLS configuration.
It inlines the FIPS / non-FIPS code.

Signed-off-by: Andreas Auernhammer <hi@aead.dev>
2022-06-21 07:54:48 -07:00
Poorna 55ee94bed0
initialize site replication subsys after loading metadata (#15099) 2022-06-16 19:00:35 -07:00
Anis Elleuch 0d00f3a55b
kms: initialize after cli parsing (#15076)
KMS depends on the --certs-dir flag. 

Ensure KMS is initialized after loading the flag.
2022-06-13 13:06:13 -07:00
Harshavardhana af1944f28d
support reading systemctl config automatically on baremetal setups (#15066)
this allows for customers to use `mc admin service restart`
directly even when performing RPM, DEB upgrades. Upon such 'restart'
after upgrade MinIO will re-read the /etc/default/minio for any
newer environment variables.

As long as `MINIO_CONFIG_ENV_FILE=/etc/default/minio` is set, this
is honored.
2022-06-10 09:59:15 -07:00
Harshavardhana 52221db7ef
fix: for unexpected errors in reading versioning config panic (#14994)
We need to make sure if we cannot read bucket metadata
for some reason, and bucket metadata is not missing and
returning corrupted information we should panic such
handlers to disallow I/O to protect the overall state
on the system.

In-case of such corruption we have a mechanism now
to force recreate the metadata on the bucket, using
`x-minio-force-create` header with `PUT /bucket` API
call.

Additionally fix the versioning config updated state
to be set properly for the site replication healing
to trigger correctly.
2022-05-31 02:57:57 -07:00
Harshavardhana f1abb92f0c
feat: Single drive XL implementation (#14970)
Main motivation is move towards a common backend format
for all different types of modes in MinIO, allowing for
a simpler code and predictable behavior across all features.

This PR also brings features such as versioning, replication,
transitioning to single drive setups.
2022-05-30 10:58:37 -07:00
Harshavardhana 5792be71fa
fix: add timeouts to avoid goroutine leaks in net/http (#14995)
Following code can reproduce an unending go-routine buildup,
while keeping connections established due to lack of client
not closing the connections.

https://gist.github.com/harshavardhana/2d00e6f909054d2d2524c71485ad02e1

Without this PR all MinIO deployments can be put into
denial of service attacks, causing entire service to be
unavailable.

We bring in two timeouts at this stage to control such
go-routine build ups, new change

- IdleTimeout (to kill off idle connections)
- ReadHeaderTimeout (to kill off connections that are too slow)

This new change also brings two hidden options to make any
additional relevant changes if desired in some setups.
2022-05-30 06:24:51 -07:00
Harshavardhana 5cfedcfe33
askDisks for strict quorum to be equal to read quorum (#14623) 2022-03-25 16:29:45 -07:00
Harshavardhana f6113264f4 add detection for GOMAXPROCS < NumCPU 2022-03-21 19:05:10 -07:00
Harshavardhana 91d419ee6c
warn issues about large block I/O performance for Linux older than 4.0.0 (#14524)
This PR simply adds a warning message when it detects older kernel
versions and warn's them about potential performance issues on this
kernel.

The issue can be seen only with parallel I/O across all drives
on denser setups such as 90 drives or 45 drives per server configurations.
2022-03-10 17:36:13 -08:00
Harshavardhana 0e3bafcc54
improve logs, fix banner formatting (#14456) 2022-03-03 13:21:16 -08:00
Shireesh Anjal 3934700a08
Make audit webhook and kafka config dynamic (#14390) 2022-02-24 09:05:33 -08:00
Shireesh Anjal 25144fedd5
Send deployment id and minio version in http header (#14378) 2022-02-23 13:36:01 -08:00
Poorna ed3418c046
Refactor replication resync to be an active process (#14266)
When resync is triggered, walk the bucket namespace and
resync objects that are unreplicated. This PR also adds
an API to report resync progress.
2022-02-10 10:16:52 -08:00
Harshavardhana 3c87e1e60d
fix: rename some function names to avoid confusion (#14262) 2022-02-07 11:49:07 -08:00
Harshavardhana 0cac868a36
speed-up startup time, do not block on ListBuckets() (#14240)
Bonus fixes #13816
2022-02-07 10:39:57 -08:00
Harshavardhana 186c477f3c init console server after server config is initialized
fixes #14259
2022-02-07 00:17:33 -08:00
Harshavardhana 6123377e66
speedup getFormatErasureInQuorum use driveCount (#14239)
startup speed-up, currently getFormatErasureInQuorum()
would spend up to 2-3secs when there are 3000+ drives
for example in a setup, simplify this implementation
to use drive counts.
2022-02-04 12:21:21 -08:00
Harshavardhana dbd05d6e82
remove FIFO bucket quota, use ILM expiration instead (#14206) 2022-01-31 11:07:04 -08:00
Harshavardhana 7f214a0e46
use dnscache resolver for resolving command line endpoints (#14135)
this helps in caching the resolved values early on, avoids
causing further resolution for individual nodes when
object layer comes online.

this can speed up our startup time during, upgrades etc by
an order of magnitude.

additional changes in connectLoadInitFormats() and parallelize
all calls that might be potentially blocking.
2022-01-20 13:03:15 -08:00
Harshavardhana 60f2df54e0
Add envVars for CLI arguments (#14114)
fixes #14107
2022-01-15 16:20:02 -08:00
Harshavardhana 38ccc4f672
fix: make sure to avoid calling RenameData() on disconnected disks. (#14094)
Large clusters with multiple sets, or multi-pool setups at times might
fail and report unexpected "file not found" errors. This can become
a problem during startup sequence when some files need to be created
at multiple locations.

- This PR ensures that we nil the erasure writers such that they
  are skipped in RenameData() call.

- RenameData() doesn't need to "Access()" calls for `.minio.sys`
  folders they always exist.

- Make sure PutObject() never returns ObjectNotFound{} for any
  errors, make sure it always returns "WriteQuorum" when renameData()
  fails with ObjectNotFound{}. Return appropriate errors for all
  other cases.
2022-01-12 18:49:01 -08:00
Klaus Post 3d66d053c7
Add small client TLS PSK cache (#14039) 2022-01-06 11:34:02 -08:00
Harshavardhana 42ba0da6b0
fix: initialize new drwMutex for each attempt in 'for {' loop. (#14009)
It is possible that GetLock() call remembers a previously
failed releaseAll() when there are networking issues, now
this state can have potential side effects.

This PR tries to avoid this side affect by making sure
to initialize NewNSLock() for each GetLock() attempts
made to avoid any prior state in the memory that can
interfere with the new lock grants.
2022-01-02 09:15:34 -08:00
Poorna K 111c6177d2
Deprecate caching for erasure/distributed mode (#13909)
Fixes: #13907

Also removing default value of `writethrough` for cache commit
which was interfering with cache_after setting
2021-12-15 16:48:34 -08:00
Harshavardhana 8144a125ce
check for update in background (#13889) 2021-12-13 09:43:03 -08:00
Aditya Manthramurthy 42d11d9e7d
Move IAM notifications into IAM system functions (#13780) 2021-11-29 14:38:57 -08:00
Harshavardhana e49c184595
add configurable 'shutdown-timeout' for HTTP server (#13771)
fixes #12317
2021-11-29 09:06:56 -08:00
Harshavardhana fb268add7a
do not flush if Write() failed (#13597)
- Go might reset the internal http.ResponseWriter() to `nil`
  after Write() failure if the go-routine has returned, do not
  flush() such scenarios and avoid spurious flushes() as
  returning handlers always flush.
- fix some racy tests with the console 
- avoid ticker leaks in certain situations
2021-11-18 17:19:58 -08:00
Harshavardhana 20c43c447d
de-couple bucket metadata loading with lock context (#13679)
avoid passing lock context while loading bucket
metadata, refactor such that we can de-couple things
for subsystem loading.
2021-11-17 13:42:08 -08:00
Harshavardhana 4545ecad58
ignore swapped drives instead of throwing errors (#13655)
- add checks such that swapped disks are detected
  and ignored - never used for normal operations.

- implement `unrecognizedDisk` to be ignored with
  all operations returning `errDiskNotFound`.

- also add checks such that we do not load unexpected
  disks while connecting automatically.

- additionally humanize the values when printing the errors.

Bonus: fixes handling of non-quorum situations in
getLatestFileInfo(), that does not work when 2 drives
are down, currently this function would return errors
incorrectly.
2021-11-15 09:46:55 -08:00
Aditya Manthramurthy 79a58e275c
fix: race in delete user functionality (#13547)
- The race happens with a goroutine that refreshes IAM cache data from storage.
- It could lead to deleted users re-appearing as valid live credentials.
- This change also causes CI to run tests without a race flag (in addition to
running it with).
2021-11-01 15:03:07 -07:00
Harshavardhana 6d53e3c2d7
reduce number of middleware handlers (#13546)
- combine similar looking functionalities into single
  handlers, and remove unnecessary proxying of the
  requests at handler layer.

- remove bucket forwarding handler as part of default setup
  add it only if bucket federation is enabled.

Improvements observed for 1kiB object reads.
```
-------------------
Operation: GET
Operations: 4538555 -> 4595804
* Average: +1.26% (+0.2 MiB/s) throughput, +1.26% (+190.2) obj/s
* Fastest: +4.67% (+0.7 MiB/s) throughput, +4.67% (+739.8) obj/s
* 50% Median: +1.15% (+0.2 MiB/s) throughput, +1.15% (+173.9) obj/s
```
2021-11-01 08:04:03 -07:00
Aditya Manthramurthy 5f1af8a69d
For IAM with etcd backend, avoid sending notifications (#13472)
As we use etcd's watch interface, we do not need the 
network notifications as they are no-ops anyway.

Bonus: Remove globalEtcdClient global usage in IAM
2021-10-20 03:22:35 -07:00
Harshavardhana acc9645249
allow more socket listeners per instance for multi-core setups (#13385) 2021-10-08 16:58:24 -07:00