* mfa: add cluster-level require_session_mfa option
Users may choose to require MFA for all sessions in some clusters, or
only on select roles in others.
This PR adds the cluster-level option in the `auth_service` config.
To expose this field to non-auth services, opened up `GetAuthPreference`
access over the auth API, including all the cache plumbing.
* Address review feedback
With the introduction of `kube_listen_addr`, some users are confused on
how to set a public address for k8s access that's different from
`public_addr` of the proxy. `kube_public_addr` removes that confusion
and more closely resembles the other proxy endpoints.
This config:
```yaml
proxy_service:
kube_listen_addr: 0.0.0.0:3026
kube_public_addr: kube.example.com:3026
```
translates to the old format:
```yaml
proxy_service:
kubernetes:
enabled: yes
listen_addr: 0.0.0.0:3026
public_addr: kube.example.com:3026
```
* mfa: add new second_factor options "on" and "optional"
"on" means that 2FA is required for all users, either TOTP or U2F.
"optional" means that 2FA is supported for all users, but not required.
Only users with MFA devices registered will be prompted for 2FA on
login.
The login with both supported methods is using the same API as the U2F
login. It just now supports TOTP in addition. The API endpoints are
still named after "u2f", I'll rename those in a future PR (in a
backwards-compatible way).
* Apply suggestions from code review
Co-authored-by: Gus Luxton <gus@gravitational.com>
Co-authored-by: a-palchikov <deemok@gmail.com>
* Address reivew feedback
Co-authored-by: Gus Luxton <gus@gravitational.com>
Co-authored-by: a-palchikov <deemok@gmail.com>
This commit fixes#5177
Initial implementation uses dir backend as a cache and is OK
for small clusters, but will be a problem for many proxies.
This implementation uses Go autocert that is quite limited
compared to Caddy's certmagic or lego.
Autocert has no OCSP stapling and no locking for cache for example.
However, it is much simpler and has no dependencies.
It will be easier to extend to use Teleport backend as a cert cache.
```yaml
proxy_service:
public_addr: ['example.com']
# ACME - automatic certificate management environment.
#
# It provisions certificates for domains and
# valid subdomains in public_addr section.
#
# The sudomains are valid if there is a registered application.
# For example, app.example.com will get a cert if app is a regsitered
# application access app. The sudomain cookie.example.com is not.
#
# Teleport acme is using TLS-ALPN-01 challenge:
#
# https://letsencrypt.org/docs/challenge-types/#tls-alpn-01
#
acme:
# By default acme is disabled.
enabled: true
# Use a custom URI, for example staging is
#
# https://acme-staging-v02.api.letsencrypt.org/directory
#
# Default is letsencrypt.org production URL:
#
# https://acme-v02.api.letsencrypt.org/directory
uri: ''
# Set email to receive alerts and other correspondence
# from your certificate authority.
email: 'alice@example.com'
```
* Use strict teleport.yaml validation in warning mode
Strict YAML validation catches the cases where a valid config key is
placed in the wrong location in the config. These errors were not
caught by the old validation.
The failure is always reported, but only fails startup when both old and
new validations fail. This will let the users fix their configs during
6.0 release and we will start enforcing it in 7.0.
Example:
```yaml
auth_service:
data_dir: "/foo" # this field must live under "teleport:", not "auth_service:"
```
Output:
```
$ teleport start -c teleport-invalid.yaml
ERRO "Teleport configuration is invalid: yaml: unmarshal errors:\n line 6: field data_dir not found in type config.Auth." config/fileconf.go:303
ERRO This error will be enforced in the next Teleport release. config/fileconf.go:304
[AUTH] Auth service 5.0.0-dev:v4.4.0-alpha.1-262-g307040886-dirty is starting on 0.0.0.0:3025.
... continues startup ...
```
* Remove newlines from YAML error
* Update logrus package to fix data races
* Introduce a logger that uses the test context to log the messages so they are output if a test fails for improved trouble-shooting.
* Revert introduction of test logger - simply leave logger configuration at debug level outputting to stderr during tests.
* Run integration test for e as well
* Use make with a cap and append to only copy the relevant roles.
* Address review comments
* Update integration test suite to use test-local logger that would only output logs iff a specific test has failed - no logs from other test cases will be output.
* Revert changes to InitLoggerForTests API
* Create a new logger instance when applying defaults or merging with file service configuration
* Introduce a local logger interface to be able to test file configuration merge.
* Fix kube integration tests w.r.t log
* Move goroutine profile dump into a separate func to handle parameters consistently for all invocations
- 'tsh kube login' fetches the latest list of kube clusters instead of
only using existing kubeconfig contexts.
This makes 'tsh kube login' succeed when a kube cluster was added
after last 'tsh login'.
- 'tsh kube ls' no longer wrongly marks selected clusters, if they
weren't generated by tsh.
- 'tctl rm' now works with kube_service objects.
- 'tsh login' now updates kubeconfig entries when a login session is
already active
- 'teleport.yaml' now uses 'labels' and 'commands' for RBAC labels on
kubernetes_service; this is consistent with ssh and app services.
Added validation check that ensures application names are valid DNS
subdomains. This is because and application name can potentially be used
in the DNS name of the application if either a public address is not
provided or the application is accessed via a trusted cluster.
* Add labels to KubernetesCluster resources
Plumb from config to the registered object, keep dynamic labels updated.
* Check kubernetes RBAC
Checks are in some CRUD operations on the auth server and in the
kubernetes forwarder (both proxy or kubernetes_service).
The logic is essentially copy-paste of the TAA version.
Fixes#3604
This commit adds support for cluster_labels
role parameter limiting access to remote clusters by label.
New tctl update rc provides interface to set labels on remote clusters.
Consider two clusers, `one` - root and `remote` - leaf.
```bash
$ tsh clusters
Cluster Name Status
------------ ------
one online
two online
```
Create the trusted cluster join token with labels:
```bash
$ tctl tokens add --type=trusted_cluster --labels=env=prod
```
Every cluster joined using this token will inherit env:prod labels.
Alternatively, update remote cluster labels by modifying
`rc` command. Letting remote clusters to propagate their labels
creates a problem of rogue clusters updating their labels to bad values.
Instead, administrator of root cluster control the labels
using remote clusters API without fear of override:
```bash
$ tctl get rc
kind: remote_cluster
metadata:
name: two
status:
connection: online
last_heartbeat: "2020-09-14T03:13:59.35518164Z"
version: v3
```
```bash
$ tctl update rc/two --set-labels=env=prod
cluster two has been updated
```
```bash
$ tctl get rc
kind: remote_cluster
metadata:
labels:
env: prod
name: two
status:
connection: online
last_heartbeat: "2020-09-14T03:13:59.35518164Z"
```
Update the role to deny access to prod env:
```yaml
kind: role
metadata:
name: dev
spec:
allow:
logins: [root]
node_labels:
'*': '*'
# Cluster labels control what clusters user can connect to. The wildcard ('*') means
# any cluster. If no role in the role set is using labels and cluster is not labeled,
# the cluster labels check is not applied. Otherwise, cluster labels are always enforced.
# This makes the feature backwards-compatible.
cluster_labels:
'env': 'staging'
deny:
# cluster labels control what clusters user can connect to. The wildcard ('*') means
# any cluster. By default none is set in deny rules to preserve backwards compatibility
cluster_labels:
'env': 'prod'
```
```bash
$ tctl create -f dev.yaml
```
Cluster two is now invisible to user with `dev` role.
```bash
$ tsh clusters
Cluster Name Status
------------ ------
one online
```
Added support for an identity aware, RBAC enforcing, mutually
authenticated, web application proxy to Teleport.
* Updated services.Server to support an application servers.
* Updated services.WebSession to support application sessions.
* Added CRUD RPCs for "AppServers".
* Added CRUD RPCs for "AppSessions".
* Added RBAC support using labels for applications.
* Added JWT signer as a services.CertAuthority type.
* Added support for signing and verifying JWT tokens.
* Refactored dynamic label and heartbeat code into standalone packages.
* Added application support to web proxies and new "app_service" to
proxy mutually authenticated connections from proxy to an internal
application.
* Implement kubernetes_service registration and sratup
The new service now starts, registers (locally or via a join token) and
heartbeats its presence to the auth server.
This service can handle k8s requests (like a proxy) but not to remote
teleport clusters. Proxies will be responsible for routing those.
The client (tsh) will not yet go to this service, until proxy routing is
implemented. I manually tweaked server addres in kubeconfig to test it.
You can also run `tctl get kube_service` to list all registered
instances. The self-reported info is currently limited - only listening
address is set.
* Address review feedback
This is a shorthand for the larger kubernetes section:
```
proxy_service:
kube_listen_addr: "0.0.0.0:3026"
```
if equivalent to:
```
proxy_service:
kubernetes:
enabled: yes
listen_addr: "0.0.0.0:3026"
```
This shorthand is meant to be used with the new `kubernetes_service`:
https://github.com/gravitational/teleport/pull/4455
It reduces confusion when both `proxy_service` and `kubernetes_service`
are configured in the same process.
* Fix local etcd test failures when etcd is not running
* Add kubernetes_service to teleport.yaml
This plumbs config fields only, they have no effect yet.
Also, remove `cluster_name` from `proxy_config.kubernetes`. This field
will only exist under `kubernetes_service` per
https://github.com/gravitational/teleport/pull/4455
* Handle IPv6 in kubernetes_service and rename label fields
* Disable k8s cluster name defaulting in user TLS certs
Need to implement service registration first.
Most users won't need this, so the behavior is optional. Default system
configs will usually trigger a password prompt, which is why this
feature is disabled by default.
Cluster name from this field plug all clusters from kubeconfig are
stored on the auth server via heartbeats.
This info will later be used to route k8s requests back to proxies.
Updates https://github.com/gravitational/teleport/issues/3952
This commit introduces GRPC API for streaming sessions.
It adds structured events and sync streaming
that avoids storing events on disk.
You can find design in rfd/0002-streaming.md RFD.
Adds support for Concurrent Session Control and a new
semaphore API. Roles now support two new configuration
options, `max_ssh_connections` and `max_ssh_sessions`
which correspond to the total number of authenticated
ssh connections per cluster, and the number of ssh sessions
within a connection respectively. Attempting to exceed
these limits generate variants of the `session.rejected`
audit event and cause the connection/session to be
rejected.
This allows users to manually switch to a different algorithm by:
- setting the config file field
- running "tctl auth rotate"
If config file field is not set, existing signing algorithm of the CA is
preserved.
Store the signing algorithm along the CA private key. When reading old
CAs that don't have it set, default to UNKNOWN proto enum which
corresponds to the old SHA1-based signing alg.
The only time you get a SHA2 signature is when creating a fresh cluster
and generating a new CA. This can be disabled in the config.
This allows users to override the SHA2 signing algorithms we default to
now for compatibility with the (very) old OpenSSH versions.
For host and user certs, use the CA signing algo for their own
handshakes. This allows us to propagate the signing algo from auth
server everywhere else.
List of fixed items:
```
integration/helpers.go:1279:2 gosimple S1000: should use for range instead of for { select {} }
integration/integration_test.go:144:5 gosimple S1009: should omit nil check; len() for nil slices is defined as zero
integration/integration_test.go:173:5 gosimple S1009: should omit nil check; len() for nil slices is defined as zero
integration/integration_test.go:296:28 gosimple S1019: should use make(chan error) instead
integration/integration_test.go:570:41 gosimple S1019: should use make(chan interface{}) instead
integration/integration_test.go:685:40 gosimple S1019: should use make(chan interface{}) instead
integration/integration_test.go:759:33 gosimple S1019: should use make(chan string) instead
lib/auth/init_test.go:62:2 gosimple S1021: should merge variable declaration with assignment on next line
lib/auth/tls_test.go:1658:22 gosimple S1024: should use time.Until instead of t.Sub(time.Now())
lib/backend/dynamo/dynamodbbk.go:420:5 gosimple S1004: should use !bytes.Equal(expected.Key, replaceWith.Key) instead
lib/backend/dynamo/dynamodbbk.go:656:12 gosimple S1039: unnecessary use of fmt.Sprintf
lib/backend/etcdbk/etcd.go:458:5 gosimple S1004: should use !bytes.Equal(expected.Key, replaceWith.Key) instead
lib/backend/firestore/firestorebk.go:407:5 gosimple S1004: should use !bytes.Equal(expected.Key, replaceWith.Key) instead
lib/backend/lite/lite.go:317:5 gosimple S1004: should use !bytes.Equal(expected.Key, replaceWith.Key) instead
lib/backend/lite/lite.go:336:6 gosimple S1004: should use !bytes.Equal(value, expected.Value) instead
lib/backend/memory/memory.go:365:5 gosimple S1004: should use !bytes.Equal(expected.Key, replaceWith.Key) instead
lib/backend/memory/memory.go:376:5 gosimple S1004: should use !bytes.Equal(existingItem.Value, expected.Value) instead
lib/backend/test/suite.go:327:10 gosimple S1024: should use time.Until instead of t.Sub(time.Now())
lib/client/api.go:1410:9 gosimple S1003: should use strings.ContainsRune(name, ':') instead
lib/client/api.go:2355:32 gosimple S1019: should use make([]ForwardedPort, len(spec)) instead
lib/client/keyagent_test.go:85:2 gosimple S1021: should merge variable declaration with assignment on next line
lib/client/player.go:54:33 gosimple S1019: should use make(chan int) instead
lib/config/configuration.go:1024:52 gosimple S1019: should use make(services.CommandLabels) instead
lib/config/configuration.go:1025:44 gosimple S1019: should use make(map[string]string) instead
lib/config/configuration.go:930:21 gosimple S1003: should use strings.Contains(clf.Roles, defaults.RoleNode) instead
lib/config/configuration.go:931:22 gosimple S1003: should use strings.Contains(clf.Roles, defaults.RoleAuthService) instead
lib/config/configuration.go:932:23 gosimple S1003: should use strings.Contains(clf.Roles, defaults.RoleProxy) instead
lib/service/supervisor.go:387:2 gosimple S1001: should use copy() instead of a loop
lib/tlsca/parsegen.go:140:9 gosimple S1034: assigning the result of this type assertion to a variable (switch generalKey := generalKey.(type)) could eliminate type assertions in switch cases
lib/utils/certs.go:140:9 gosimple S1034: assigning the result of this type assertion to a variable (switch generalKey := generalKey.(type)) could eliminate type assertions in switch cases
lib/utils/certs.go:167:40 gosimple S1010: should omit second index in slice, s[a:len(s)] is identical to s[a:]
lib/utils/certs.go:204:5 gosimple S1004: should use !bytes.Equal(certificateChain[0].SubjectKeyId, certificateChain[0].AuthorityKeyId) instead
lib/utils/parse/parse.go:116:45 gosimple S1003: should use strings.Contains(variable, "}}") instead
lib/utils/parse/parse.go:116:6 gosimple S1003: should use strings.Contains(variable, "{{") instead
lib/utils/socks/socks.go:192:10 gosimple S1025: should use String() instead of fmt.Sprintf
lib/utils/socks/socks.go:199:10 gosimple S1025: should use String() instead of fmt.Sprintf
lib/web/apiserver.go:1054:18 gosimple S1024: should use time.Until instead of t.Sub(time.Now())
lib/web/apiserver.go:1954:9 gosimple S1039: unnecessary use of fmt.Sprintf
tool/tsh/tsh.go:1193:14 gosimple S1024: should use time.Until instead of t.Sub(time.Now())
```
Fixed findings:
```
lib/sshutils/server_test.go:163:2: SA4006: this value of `clt` is never used (staticcheck)
clt, err := ssh.Dial("tcp", srv.Addr(), &cc)
^
lib/sshutils/server_test.go:91:3: SA5001: should check returned error before deferring ch.Close() (staticcheck)
defer ch.Close()
^
lib/shell/shell_test.go:33:2: SA4006: this value of `shell` is never used (staticcheck)
shell, err = GetLoginShell("non-existent-user")
^
lib/cgroup/cgroup_test.go:111:2: SA9003: empty branch (staticcheck)
if err != nil {
^
lib/cgroup/cgroup_test.go:119:2: SA5001: should check returned error before deferring service.Close() (staticcheck)
defer service.Close()
^
lib/client/keystore_test.go:138:2: SA4006: this value of `keyCopy` is never used (staticcheck)
keyCopy, err = s.store.GetKey("host.a", "bob")
^
lib/client/api.go:1604:3: SA4004: the surrounding loop is unconditionally terminated (staticcheck)
return makeProxyClient(sshClient, m), nil
^
lib/backend/test/suite.go:156:2: SA4006: this value of `err` is never used (staticcheck)
result, err = s.B.GetRange(ctx, prefix("/prefix/c/c1"), backend.RangeEnd(prefix("/prefix/c/cz")), backend.NoLimit)
^
lib/utils/timeout_test.go:84:2: SA1019: t.Dial is deprecated: Use DialContext instead, which allows the transport to cancel dials as soon as they are no longer needed. If both are set, DialContext takes priority. (staticcheck)
t.Dial = func(network string, addr string) (net.Conn, error) {
^
lib/utils/websocketwriter.go:83:3: SA4006: this value of `err` is never used (staticcheck)
utf8, err = w.encoder.String(string(data))
^
lib/utils/loadbalancer_test.go:134:2: SA4006: this value of `out` is never used (staticcheck)
out, err = Roundtrip(frontend.String())
^
lib/utils/loadbalancer_test.go:209:2: SA4006: this value of `out` is never used (staticcheck)
out, err = RoundtripWithConn(conn)
^
lib/srv/forward/sshserver.go:582:3: SA4004: the surrounding loop is unconditionally terminated (staticcheck)
return
^
lib/service/service.go:347:4: SA4006: this value of `err` is never used (staticcheck)
i, err = auth.GenerateIdentity(process.localAuth, id, principals, dnsNames)
^
lib/service/signals.go:60:3: SA1016: syscall.SIGKILL cannot be trapped (did you mean syscall.SIGTERM?) (staticcheck)
syscall.SIGKILL, // fast shutdown
^
lib/config/configuration_test.go:184:2: SA4006: this value of `conf` is never used (staticcheck)
conf, err = ReadFromFile(s.configFileBadContent)
^
lib/config/configuration.go:129:2: SA5001: should check returned error before deferring reader.Close() (staticcheck)
defer reader.Close()
^
lib/kube/kubeconfig/kubeconfig_test.go:227:2: SA4006: this value of `err` is never used (staticcheck)
tlsCert, err := ca.GenerateCertificate(tlsca.CertificateRequest{
^
lib/srv/sess.go:720:3: SA4006: this value of `err` is never used (staticcheck)
result, err := s.term.Wait()
^
lib/multiplexer/multiplexer_test.go:169:11: SA1006: printf-style function with dynamic format string and no further arguments should use print-style function instead (staticcheck)
_, err = fmt.Fprintf(conn, proxyLine.String())
^
lib/multiplexer/multiplexer_test.go:221:11: SA1006: printf-style function with dynamic format string and no further arguments should use print-style function instead (staticcheck)
_, err = fmt.Fprintf(conn, proxyLine.String())
^
```