Cluster name can be missing in profiles created by older tsh versions.
Trying to load the client.Key without a cluster name now causes a
failure when using WithAllCerts (because ssh/db/kube certs are
per-cluster).
Also added some output to `tsh status` when no profiles can be loaded.
```diff
  ~/.tsh/
  └── keys
      ├── one.example.com         --> Proxy hostname
      │   ├── certs.pem           --> TLS CA certs for the Teleport CA
      │   ├── foo                 --> RSA Private Key for user "foo"
      │   ├── foo.pub             --> Public Key
-     │   ├── foo-cert.pub        --> SSH certificate for proxies and nodes
      │   ├── foo-x509.pem        --> TLS client certificate for Auth Server
+     │   ├── foo-ssh             --> SSH certs for user "foo"
+     │   │   ├── root-cert.pub   --> SSH cert for Teleport cluster "root"
+     │   │   └── leaf-cert.pub   --> SSH cert for Teleport cluster "leaf"
```
When `-J` is provided, this also loads/reissues the SSH cert for the cluster associated with the jumphost's certificate. Fixes #5637.
* Added support for connecting API client through tunnel proxy and web proxy addresses (with identity file).
* Added concurrent dialing logic to dial several possible dialing combinations and seamlessly return the first client to connect.
* mfa: per-session MFA certs for SSH and Kubernetes
This is client-side support for requesting single-use certs with an MFA
check.
The client doesn't know whether it needs an MFA check when accessing a
resource; this is decided during an RBAC check on the server. So a
client will always try to get a single-use cert, and the server will
respond with NotNeeded if MFA is not required. This is an extra
round-trip for every session, which causes a ~20% slowdown in SSH logins:
```
$ hyperfine '/tmp/tsh-old ssh talos date' '/tmp/tsh-new ssh talos date'
Benchmark #1: /tmp/tsh-old ssh talos date
Time (mean ± σ): 49.9 ms ± 1.0 ms [User: 15.1 ms, System: 7.4 ms]
Range (min … max): 48.4 ms … 54.1 ms 59 runs
Benchmark #2: /tmp/tsh-new ssh talos date
Time (mean ± σ): 60.2 ms ± 1.6 ms [User: 19.1 ms, System: 8.3 ms]
Range (min … max): 59.0 ms … 69.7 ms 50 runs
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
Summary
'/tmp/tsh-old ssh talos date' ran
1.21 ± 0.04 times faster than '/tmp/tsh-new ssh talos date'
```
A few other internal changes:
- client.LocalKeyAgent will now always have a non-nil LocalKeyStore.
Previously, it would be nil (e.g. in a web UI handler or when using an
identity file) which easily causes panics. I added a noLocalKeyStore
type instead that returns errors from all methods.
- requesting a user cert with a TTL < 1min will now succeed and return a
1min cert instead of failing
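The noLocalKeyStore change is a null-object pattern. A rough sketch, assuming a much smaller interface than the real LocalKeyStore (`keyStore`, `noKeyStore`, `keyAgent` and `newKeyAgent` are illustrative names only):

```go
package main

import "errors"

// keyStore is a simplified, hypothetical key store interface; the real
// LocalKeyStore has a different method set.
type keyStore interface {
	GetKey(proxy, user string) ([]byte, error)
	AddKey(proxy, user string, key []byte) error
	DeleteKey(proxy, user string) error
}

var errNoKeyStore = errors.New("there is no local keystore")

// noKeyStore satisfies keyStore but returns an error from every method,
// so callers get an error instead of a nil-pointer panic.
type noKeyStore struct{}

func (noKeyStore) GetKey(proxy, user string) ([]byte, error)   { return nil, errNoKeyStore }
func (noKeyStore) AddKey(proxy, user string, key []byte) error { return errNoKeyStore }
func (noKeyStore) DeleteKey(proxy, user string) error          { return errNoKeyStore }

// keyAgent is a hypothetical agent that always holds a non-nil store.
type keyAgent struct {
	store keyStore
}

// newKeyAgent substitutes noKeyStore when no store is supplied.
func newKeyAgent(store keyStore) *keyAgent {
	if store == nil {
		store = noKeyStore{}
	}
	return &keyAgent{store: store}
}
```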
* Capture access approvals on MFA-issued certs
* Address review feedback
* Address review feedback
* mfa: accept unknown nodes during short-term MFA cert creation
An unknown node could be an OpenSSH node set up via
https://goteleport.com/teleport/docs/openssh-teleport/
In this case, we shouldn't prevent the user from connecting.
There's a small risk of authz bypass - an attacker might know a
different name/IP for a registered node which Teleport doesn't know
about. But a Teleport node will still check RBAC and reject the
connection.
* Validate username against unmapped user identity
IssueUserCertsWithMFA is called on the leaf auth server in case of
trusted clusters. Username in the request object will be that of the
original unmapped caller.
* mfa: add IsMFARequired RPC
This RPC is run before every connection to check whether MFA is
required. If a connection is against the leaf cluster, this request is
forwarded from root to leaf for evaluation.
* Fix integration tests
* Correctly treat "Username" as login name in IsMFARequired
Also, move the logic into auth.Server out of ServerWithRoles.
* Fix TestHA
* Address review feedback
An extra Dockerfile for gRPC generation is an extra maintenance burden. It
was also using a really old base image that has a ton of known vulns.
Also, update GOGO_PROTO_TAG to match the version we have vendored via
go.mod.
* mfa: reuse the same challenge for all U2F devices
Challenge is a random string that U2F devices must sign. The JS API
requires you to use the same challenge for all registered devices,
instead of one challenge per device that we had previously. See
https://fidoalliance.org/specs/fido-u2f-v1.0-nfc-bt-amendment-20150514/fido-u2f-javascript-api.html#dictionary-u2fsignrequest-members
Reuse the same challenge for U2F devices to match the JS API. Also,
propagate the version string to follow the spec exactly.
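As an illustration only: one random challenge is generated and shared by every registered key in the request. The struct and field names below loosely follow the JS API dictionaries and are assumptions, not the types used in lib/auth:

```go
package main

import (
	"crypto/rand"
	"encoding/base64"
	"fmt"
)

// registeredKey describes one device (hypothetical, simplified type).
type registeredKey struct {
	Version   string
	KeyHandle string
}

// signRequest carries a single challenge for all registered keys.
type signRequest struct {
	AppID          string
	Challenge      string
	RegisteredKeys []registeredKey
}

// newSignRequest generates one 32-byte random challenge and attaches
// every key handle to it, instead of one challenge per device.
func newSignRequest(appID, version string, keyHandles []string) (signRequest, error) {
	if len(keyHandles) == 0 {
		return signRequest{}, fmt.Errorf("no registered devices")
	}
	raw := make([]byte, 32)
	if _, err := rand.Read(raw); err != nil {
		return signRequest{}, fmt.Errorf("generating challenge: %w", err)
	}
	req := signRequest{
		AppID:     appID,
		Challenge: base64.RawURLEncoding.EncodeToString(raw),
	}
	for _, kh := range keyHandles {
		req.RegisteredKeys = append(req.RegisteredKeys, registeredKey{
			Version:   version,
			KeyHandle: kh,
		})
	}
	return req, nil
}
```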
* Update lib/auth/auth.go
Co-authored-by: a-palchikov <deemok@gmail.com>
* auth: API for requesting per-connection certificates
See https://github.com/gravitational/teleport/blob/master/rfd/0014-session-2FA.md#api
This API is a wrapper around GenerateUserCerts with a few differences:
- performs an MFA check before generating a cert
- enforces a single usage (ssh/k8s/db for now)
- embeds client IP in the cert
- marks a cert to distinguish from regular user certs
- enforces a 1min TTL
* Apply suggestions from code review
Co-authored-by: a-palchikov <deemok@gmail.com>
* Use fake clock consistently in unit tests.
* Split web session management into two interfaces and implement them separately for clear separation
* Split session management into New/Validate to make it apparent where the sessions are created and where existing sessions are managed. Remove ttlmap in favor of a simple map and handle expirations explicitly.
Add web session management to gRPC server for the cache.
* Reintroduce web sessions APIs under a getter interface.
* Add SubKind to WatchKind for gRPC and add conversions from/to protobuf. Fix web sessions unit tests.
* lib/web: create/insert session context in ValidateSession if the session has not yet been added to session cache.
lib/cache: add event filter for web session in auth cache.
lib/auth: propagate web session subkind in gRPC event.
* Add implicit migrations for legacy web session key path for queries.
* Integrate web token in lib/web
* Add a bearer token when upserting a web session
* Fix tests. Use fake clock wherever possible.
* Converge session cache handling in lib/web
* Clean up and add doc comments where necessary
* Use correct form of sessions/tokens controller for ServerWithRoles. Use fake time in web tests
* Converge the web sessions/tokens handling in lib/auth to match the old behavior w.r.t access checking (e.g. implicit handling of the local user identity).
* Use cached reads and waiters only when necessary. Query sessions/tokens using best-effort - first looking in the cache and falling back to a proxy client
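The best-effort lookup can be pictured as below. This is a sketch with hypothetical names (`sessionGetter`, `getSession`), and it assumes that any cache error triggers the fallback:

```go
package main

import "errors"

var errNotFound = errors.New("session not found")

// sessionGetter is a hypothetical lookup function; one instance reads
// from the cache, another goes through the proxy client.
type sessionGetter func(id string) (string, error)

// getSession tries the cache first and falls back to the slower,
// authoritative lookup if the cache cannot serve the request.
func getSession(id string, cache, fallback sessionGetter) (string, error) {
	if s, err := cache(id); err == nil {
		return s, nil
	}
	return fallback(id)
}
```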
* Properly propagate events about deletes for values with subkind.
* Update to retrofit changes after recent teleport API refactorings
* Update comment on removing legacy code to move the deadline to 7.x
* Do not close the resources on the session when it expires - this defeats the purpose of this PR.
Also avoid a race between closing the cached clients and an existing reference to the session by letting the session linger for longer before removing it.
* Move web session/token request structs to the api client proto package
* Only set HTTP fs on the web handler if the UI is enabled
* Properly tear down web session test by releasing resources at the end. Fix the web UI assets configuration by removing DisableUI and instead use the presence of assets (HTTP file system) as an indicator that the web UI has been enabled.
* Decrease the expired session cache clean up threshold to 2m. Only log the expiration error message for errors other than not found
* Add test for terminal disconnect when using two proxies in HA mode
* mfa: implement management commands in tsh
New commands are:
- tsh mfa ls
- tsh mfa add
- tsh mfa rm
There are 2 problems intentionally left in this PR to keep it small:
1. TOTP registration requires user to manually enter the secret in the
app. When there's free time, I'll add platform-specific QR code display
to make this easier.
2. U2F authentication only checks one of the registered devices. This is
a limitation of the u2f-host binary, which can't check multiple devices
at once (even if spawning multiple u2f-host commands in parallel). In
the next PR, I'll replace u2f-host with a Go library that supports this.
* Address review feedback
Add 3 new RPCs for the auth server:
- AddMFADevice
- DeleteMFADevice
- GetMFADevices
All RPCs act on the user calling them, rather than specifying the user
in parameters. It's one less thing to validate and also prevents authz
bugs with one user messing with another user's MFA devices.
Add and Delete RPCs are streaming both ways, to allow MFA using an
existing device (prevents MFA bypass) and a challenge/response
registration used in U2F and TOTP. This approach makes the challenge
bound to the RPC connection and doesn't require backend storage.
Each user can now have multiple devices. This commit only changes the
backend structure to support it, the client and API haven't been updated
yet.
Also added a migration for existing MFA data on auth server startup.