Our current parsing code runtime grows exponentially with nested
selectors (e.g. '{{a.b.c.d.e.f}}'), mostly due to memory churn from
slice allocations. With 100,000 levels of selectors, parsing takes ~80s
on my machine.
If an attacker can submit these expressions for parsing, they can DoS
the auth server with relatively small payloads (<1MB).
All real-world expressions are <10 AST nodes deep. Add a sanity check of
1000 levels to protect against malicious inputs.
We can optimize the code later on, but it's not very useful for real
world performance.
Fixes #3604
This commit adds support for the cluster_labels role parameter,
which limits access to remote clusters by label.
The new `tctl update rc` command provides an interface to set labels on remote clusters.
Consider two clusters: `one` (root) and `two` (leaf).
```bash
$ tsh clusters
Cluster Name Status
------------ ------
one online
two online
```
Create the trusted cluster join token with labels:
```bash
$ tctl tokens add --type=trusted_cluster --labels=env=prod
```
Every cluster joined using this token will inherit the env:prod label.
Alternatively, update remote cluster labels with the
`tctl update rc` command. Letting remote clusters propagate their own labels
would allow rogue clusters to set their labels to bad values.
Instead, the administrator of the root cluster controls the labels
using the remote clusters API without fear of override:
```bash
$ tctl get rc
kind: remote_cluster
metadata:
name: two
status:
connection: online
last_heartbeat: "2020-09-14T03:13:59.35518164Z"
version: v3
```
```bash
$ tctl update rc/two --set-labels=env=prod
cluster two has been updated
```
```bash
$ tctl get rc
kind: remote_cluster
metadata:
labels:
env: prod
name: two
status:
connection: online
last_heartbeat: "2020-09-14T03:13:59.35518164Z"
```
Update the role to deny access to prod env:
```yaml
kind: role
metadata:
name: dev
spec:
allow:
logins: [root]
node_labels:
'*': '*'
# Cluster labels control what clusters user can connect to. The wildcard ('*') means
# any cluster. If no role in the role set is using labels and cluster is not labeled,
# the cluster labels check is not applied. Otherwise, cluster labels are always enforced.
# This makes the feature backwards-compatible.
cluster_labels:
'env': 'staging'
deny:
# cluster labels control what clusters user can connect to. The wildcard ('*') means
# any cluster. By default none is set in deny rules to preserve backwards compatibility
cluster_labels:
'env': 'prod'
```
```bash
$ tctl create -f dev.yaml
```
Cluster `two` is now invisible to users with the `dev` role.
```bash
$ tsh clusters
Cluster Name Status
------------ ------
one online
```
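The allow/deny evaluation above can be sketched roughly as follows. This is a simplified illustration, not the actual role-set code: it assumes one string value per label key, and it does not model the backwards-compatibility rule for unlabeled clusters. `Labels`, `matches` and `canAccess` are hypothetical names.

```go
package main

import "fmt"

// Labels is a simplified stand-in for a label set or a label selector.
type Labels map[string]string

// matches reports whether every key/value pair in selector is satisfied by
// labels. "*" as a value matches any value of that key, and the pair
// "*": "*" matches every cluster.
func matches(selector, labels Labels) bool {
	if len(selector) == 0 {
		return false // an empty selector matches nothing
	}
	for key, want := range selector {
		if key == "*" && want == "*" {
			continue
		}
		got, ok := labels[key]
		if !ok || (want != "*" && got != want) {
			return false
		}
	}
	return true
}

// canAccess applies deny rules first, then allow rules.
func canAccess(allow, deny, cluster Labels) bool {
	if matches(deny, cluster) {
		return false
	}
	return matches(allow, cluster)
}

func main() {
	allow := Labels{"env": "staging"}
	deny := Labels{"env": "prod"}
	fmt.Println("prod cluster:", canAccess(allow, deny, Labels{"env": "prod"}))
	fmt.Println("staging cluster:", canAccess(allow, deny, Labels{"env": "staging"}))
}
```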
Added support for an identity-aware, RBAC-enforcing, mutually
authenticated web application proxy to Teleport.
* Updated services.Server to support application servers.
* Updated services.WebSession to support application sessions.
* Added CRUD RPCs for "AppServers".
* Added CRUD RPCs for "AppSessions".
* Added RBAC support using labels for applications.
* Added JWT signer as a services.CertAuthority type.
* Added support for signing and verifying JWT tokens.
* Refactored dynamic label and heartbeat code into standalone packages.
* Added application support to web proxies and new "app_service" to
proxy mutually authenticated connections from proxy to an internal
application.
* Implement kubernetes_service registration and startup
The new service now starts, registers (locally or via a join token) and
heartbeats its presence to the auth server.
This service can handle k8s requests (like a proxy), but cannot forward
them to remote Teleport clusters. Proxies will be responsible for routing
those.
The client (tsh) will not use this service until proxy routing is
implemented. I manually tweaked the server address in kubeconfig to test it.
You can also run `tctl get kube_service` to list all registered
instances. The self-reported info is currently limited: only the listening
address is set.
* Address review feedback
The uploader now retries more slowly on network errors and picks the
pace back up after any upload has succeeded.
Records that were corrupted will never upload successfully.
Previously the uploader kept creating streams for them indefinitely,
clogging the auth server. Now the uploader writes a marker for bad
session uploads and does not attempt to reupload them.
`require` is a sister package to `assert` that terminates the test on
failure. `assert` records the failure but lets the test proceed, which
is unintuitive.
Also update all existing tests to match.
Matchers use a similar syntax to Expressions, but behave differently:
- Expressions get evaluated - they interpolate some values and return a
final string.
- Matchers check whether some string matches a value
Matchers implement the same logic as utils.SliceMatchesRegex and add 2
new functions:
- {{regexp.match("foo")}} - match input against a raw regex
- {{regexp.not_match("foo")}} - same as match, but inverts the result
No need to handle literal expressions (i.e. those without "{{foo.bar}}"
substitutions) at the higher level. Something like "foo" is a valid
expression which always returns "foo" regardless of traits.
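A rough sketch of the two matcher functions, assuming a small `Matcher` interface. The type and constructor names here are hypothetical, and whether the real implementation anchors the raw regex is not stated in this message; the sketch passes the pattern to `regexp` unchanged.

```go
package main

import (
	"fmt"
	"regexp"
)

// Matcher checks whether an input string matches some value.
type Matcher interface {
	Match(in string) bool
}

// regexpMatcher corresponds to {{regexp.match("...")}}.
type regexpMatcher struct{ re *regexp.Regexp }

func (m regexpMatcher) Match(in string) bool { return m.re.MatchString(in) }

// notMatcher corresponds to {{regexp.not_match("...")}}: it inverts the
// result of the wrapped matcher.
type notMatcher struct{ inner Matcher }

func (m notMatcher) Match(in string) bool { return !m.inner.Match(in) }

// newRegexpMatcher compiles the raw pattern as-is (no implicit anchoring).
func newRegexpMatcher(raw string) (Matcher, error) {
	re, err := regexp.Compile(raw)
	if err != nil {
		return nil, err
	}
	return regexpMatcher{re: re}, nil
}

func main() {
	match, err := newRegexpMatcher("^foo$")
	if err != nil {
		panic(err)
	}
	notMatch := notMatcher{inner: match}
	fmt.Println(match.Match("foo"), match.Match("bar"))
	fmt.Println(notMatch.Match("foo"), notMatch.Match("bar"))
}
```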
This commit introduces a gRPC API for streaming sessions.
It adds structured events and sync streaming
that avoids storing events on disk.
The design is described in the rfd/0002-streaming.md RFD.
Adds support for Concurrent Session Control and a new
semaphore API. Roles now support two new configuration
options, `max_ssh_connections` and `max_ssh_sessions`
which correspond to the total number of authenticated
ssh connections per cluster, and the number of ssh sessions
within a connection respectively. Attempting to exceed
these limits generates variants of the `session.rejected`
audit event and causes the connection/session to be
rejected.
* return only trace.NotFound
* suppressed error message when user does not have ~/.tsh directory
* return only trace.NotFound
* handled permission issue and file path not found error
* formatting
* nit
* Make milv happy
* Revert "Make milv happy"
This reverts commit 5a24e9f725.
Co-authored-by: Ben Arent <ben@gravitational.com>
List of fixed items:
```
integration/helpers.go:1279:2 gosimple S1000: should use for range instead of for { select {} }
integration/integration_test.go:144:5 gosimple S1009: should omit nil check; len() for nil slices is defined as zero
integration/integration_test.go:173:5 gosimple S1009: should omit nil check; len() for nil slices is defined as zero
integration/integration_test.go:296:28 gosimple S1019: should use make(chan error) instead
integration/integration_test.go:570:41 gosimple S1019: should use make(chan interface{}) instead
integration/integration_test.go:685:40 gosimple S1019: should use make(chan interface{}) instead
integration/integration_test.go:759:33 gosimple S1019: should use make(chan string) instead
lib/auth/init_test.go:62:2 gosimple S1021: should merge variable declaration with assignment on next line
lib/auth/tls_test.go:1658:22 gosimple S1024: should use time.Until instead of t.Sub(time.Now())
lib/backend/dynamo/dynamodbbk.go:420:5 gosimple S1004: should use !bytes.Equal(expected.Key, replaceWith.Key) instead
lib/backend/dynamo/dynamodbbk.go:656:12 gosimple S1039: unnecessary use of fmt.Sprintf
lib/backend/etcdbk/etcd.go:458:5 gosimple S1004: should use !bytes.Equal(expected.Key, replaceWith.Key) instead
lib/backend/firestore/firestorebk.go:407:5 gosimple S1004: should use !bytes.Equal(expected.Key, replaceWith.Key) instead
lib/backend/lite/lite.go:317:5 gosimple S1004: should use !bytes.Equal(expected.Key, replaceWith.Key) instead
lib/backend/lite/lite.go:336:6 gosimple S1004: should use !bytes.Equal(value, expected.Value) instead
lib/backend/memory/memory.go:365:5 gosimple S1004: should use !bytes.Equal(expected.Key, replaceWith.Key) instead
lib/backend/memory/memory.go:376:5 gosimple S1004: should use !bytes.Equal(existingItem.Value, expected.Value) instead
lib/backend/test/suite.go:327:10 gosimple S1024: should use time.Until instead of t.Sub(time.Now())
lib/client/api.go:1410:9 gosimple S1003: should use strings.ContainsRune(name, ':') instead
lib/client/api.go:2355:32 gosimple S1019: should use make([]ForwardedPort, len(spec)) instead
lib/client/keyagent_test.go:85:2 gosimple S1021: should merge variable declaration with assignment on next line
lib/client/player.go:54:33 gosimple S1019: should use make(chan int) instead
lib/config/configuration.go:1024:52 gosimple S1019: should use make(services.CommandLabels) instead
lib/config/configuration.go:1025:44 gosimple S1019: should use make(map[string]string) instead
lib/config/configuration.go:930:21 gosimple S1003: should use strings.Contains(clf.Roles, defaults.RoleNode) instead
lib/config/configuration.go:931:22 gosimple S1003: should use strings.Contains(clf.Roles, defaults.RoleAuthService) instead
lib/config/configuration.go:932:23 gosimple S1003: should use strings.Contains(clf.Roles, defaults.RoleProxy) instead
lib/service/supervisor.go:387:2 gosimple S1001: should use copy() instead of a loop
lib/tlsca/parsegen.go:140:9 gosimple S1034: assigning the result of this type assertion to a variable (switch generalKey := generalKey.(type)) could eliminate type assertions in switch cases
lib/utils/certs.go:140:9 gosimple S1034: assigning the result of this type assertion to a variable (switch generalKey := generalKey.(type)) could eliminate type assertions in switch cases
lib/utils/certs.go:167:40 gosimple S1010: should omit second index in slice, s[a:len(s)] is identical to s[a:]
lib/utils/certs.go:204:5 gosimple S1004: should use !bytes.Equal(certificateChain[0].SubjectKeyId, certificateChain[0].AuthorityKeyId) instead
lib/utils/parse/parse.go:116:45 gosimple S1003: should use strings.Contains(variable, "}}") instead
lib/utils/parse/parse.go:116:6 gosimple S1003: should use strings.Contains(variable, "{{") instead
lib/utils/socks/socks.go:192:10 gosimple S1025: should use String() instead of fmt.Sprintf
lib/utils/socks/socks.go:199:10 gosimple S1025: should use String() instead of fmt.Sprintf
lib/web/apiserver.go:1054:18 gosimple S1024: should use time.Until instead of t.Sub(time.Now())
lib/web/apiserver.go:1954:9 gosimple S1039: unnecessary use of fmt.Sprintf
tool/tsh/tsh.go:1193:14 gosimple S1024: should use time.Until instead of t.Sub(time.Now())
```
This code is not caught by linters because it's exported and they assume
there are external users.
Since Teleport is relatively self-contained, we can tell for sure
whether something is called or not.
Fixed findings:
```
lib/sshutils/server_test.go:163:2: SA4006: this value of `clt` is never used (staticcheck)
clt, err := ssh.Dial("tcp", srv.Addr(), &cc)
^
lib/sshutils/server_test.go:91:3: SA5001: should check returned error before deferring ch.Close() (staticcheck)
defer ch.Close()
^
lib/shell/shell_test.go:33:2: SA4006: this value of `shell` is never used (staticcheck)
shell, err = GetLoginShell("non-existent-user")
^
lib/cgroup/cgroup_test.go:111:2: SA9003: empty branch (staticcheck)
if err != nil {
^
lib/cgroup/cgroup_test.go:119:2: SA5001: should check returned error before deferring service.Close() (staticcheck)
defer service.Close()
^
lib/client/keystore_test.go:138:2: SA4006: this value of `keyCopy` is never used (staticcheck)
keyCopy, err = s.store.GetKey("host.a", "bob")
^
lib/client/api.go:1604:3: SA4004: the surrounding loop is unconditionally terminated (staticcheck)
return makeProxyClient(sshClient, m), nil
^
lib/backend/test/suite.go:156:2: SA4006: this value of `err` is never used (staticcheck)
result, err = s.B.GetRange(ctx, prefix("/prefix/c/c1"), backend.RangeEnd(prefix("/prefix/c/cz")), backend.NoLimit)
^
lib/utils/timeout_test.go:84:2: SA1019: t.Dial is deprecated: Use DialContext instead, which allows the transport to cancel dials as soon as they are no longer needed. If both are set, DialContext takes priority. (staticcheck)
t.Dial = func(network string, addr string) (net.Conn, error) {
^
lib/utils/websocketwriter.go:83:3: SA4006: this value of `err` is never used (staticcheck)
utf8, err = w.encoder.String(string(data))
^
lib/utils/loadbalancer_test.go:134:2: SA4006: this value of `out` is never used (staticcheck)
out, err = Roundtrip(frontend.String())
^
lib/utils/loadbalancer_test.go:209:2: SA4006: this value of `out` is never used (staticcheck)
out, err = RoundtripWithConn(conn)
^
lib/srv/forward/sshserver.go:582:3: SA4004: the surrounding loop is unconditionally terminated (staticcheck)
return
^
lib/service/service.go:347:4: SA4006: this value of `err` is never used (staticcheck)
i, err = auth.GenerateIdentity(process.localAuth, id, principals, dnsNames)
^
lib/service/signals.go:60:3: SA1016: syscall.SIGKILL cannot be trapped (did you mean syscall.SIGTERM?) (staticcheck)
syscall.SIGKILL, // fast shutdown
^
lib/config/configuration_test.go:184:2: SA4006: this value of `conf` is never used (staticcheck)
conf, err = ReadFromFile(s.configFileBadContent)
^
lib/config/configuration.go:129:2: SA5001: should check returned error before deferring reader.Close() (staticcheck)
defer reader.Close()
^
lib/kube/kubeconfig/kubeconfig_test.go:227:2: SA4006: this value of `err` is never used (staticcheck)
tlsCert, err := ca.GenerateCertificate(tlsca.CertificateRequest{
^
lib/srv/sess.go:720:3: SA4006: this value of `err` is never used (staticcheck)
result, err := s.term.Wait()
^
lib/multiplexer/multiplexer_test.go:169:11: SA1006: printf-style function with dynamic format string and no further arguments should use print-style function instead (staticcheck)
_, err = fmt.Fprintf(conn, proxyLine.String())
^
lib/multiplexer/multiplexer_test.go:221:11: SA1006: printf-style function with dynamic format string and no further arguments should use print-style function instead (staticcheck)
_, err = fmt.Fprintf(conn, proxyLine.String())
^
```
All changes should be noop, except for
`integration/integration_test.go`.
The integration test was ignoring `recordingMode` test case parameter
and always used `RecordAtNode`. When switching to `recordingMode`, test
cases with `RecordAtProxy` fail with a confusing error about missing
user agent. Filed https://github.com/gravitational/teleport/issues/3606
to track that separately and unblock enabling `structcheck` linter.
`SyncBuffer` has a goroutine running `io.Copy` to read from the
underlying pipe. Close stops the pipe, but doesn't wait for the last
chunk of data to be written by `io.Copy` to the buffer.
Both `Bytes` and `String` assume that the buffer received no further
writes after `Close`.
Add explicit synchronization between `io.Copy` goroutine and `Close`.
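A minimal sketch of the synchronization, assuming the buffer is fed through an `io.Pipe`. The type and field names below are illustrative and not necessarily those of the real `SyncBuffer`.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// syncBuffer copies everything written to it into an in-memory buffer
// using a background goroutine.
type syncBuffer struct {
	reader *io.PipeReader
	writer *io.PipeWriter
	buf    *bytes.Buffer
	done   chan struct{} // closed when the io.Copy goroutine has exited
}

func newSyncBuffer() *syncBuffer {
	r, w := io.Pipe()
	b := &syncBuffer{reader: r, writer: w, buf: &bytes.Buffer{}, done: make(chan struct{})}
	go func() {
		defer close(b.done)
		// Returns once the write side of the pipe has been closed.
		_, _ = io.Copy(b.buf, r)
	}()
	return b
}

func (b *syncBuffer) Write(p []byte) (int, error) { return b.writer.Write(p) }

// Close stops the pipe and then waits for io.Copy to flush the last chunk,
// so String and Bytes see no further writes afterwards.
func (b *syncBuffer) Close() error {
	err := b.writer.Close()
	<-b.done
	return err
}

// String must only be called after Close.
func (b *syncBuffer) String() string { return b.buf.String() }

func main() {
	b := newSyncBuffer()
	if _, err := b.Write([]byte("hello")); err != nil {
		panic(err)
	}
	if err := b.Close(); err != nil {
		panic(err)
	}
	fmt.Println(b.String())
}
```

Without the wait on `done`, a pipe write can return before `io.Copy` has appended the data to the buffer, which is the race described above.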
Spring cleaning!
A very mechanical cleanup using several linters (unused, deadcode,
structcheck). Build and tests still pass so no behavior should be
affected.
This commit fixes #3369, refs #3374
It adds support for the kubernetes_users section in roles,
allowing the Teleport proxy to impersonate user identities.
It also extends the variable interpolation syntax by adding
suffixes and prefixes to variables and the function `email.local`.
Example:
```yaml
kind: role
version: v3
metadata:
name: admin
spec:
allow:
# extract email local part from the email claim
logins: ['{{email.local(external.email)}}']
# impersonate a kubernetes user with IAM prefix
kubernetes_users: ['IAM#{{external.email}}']
# the deny section uses the identical format as the 'allow' section.
# the deny rules always override allow rules.
deny: {}
```
Some notes on email.local behavior:
* This is the only function supported in the template variables for now.
* If email.local encounters an invalid email address,
it interpolates to an empty value, which is removed from the resulting
output.
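The email.local behavior described above could be sketched like this. It assumes the address is validated with Go's `net/mail`, which this message does not confirm; `emailLocal` and `interpolateLogins` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"net/mail"
	"strings"
)

// emailLocal returns the local part of an email address, or an empty
// string if the input is not a valid address.
func emailLocal(in string) string {
	addr, err := mail.ParseAddress(in)
	if err != nil {
		return ""
	}
	at := strings.LastIndex(addr.Address, "@")
	if at < 0 {
		return ""
	}
	return addr.Address[:at]
}

// interpolateLogins applies emailLocal to each value and drops empty
// results from the output.
func interpolateLogins(emails []string) []string {
	var out []string
	for _, e := range emails {
		if local := emailLocal(e); local != "" {
			out = append(out, local)
		}
	}
	return out
}

func main() {
	fmt.Println(interpolateLogins([]string{"alice@example.com", "not-an-email"}))
}
```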
Changes in impersonation behavior:
* By default, if no kubernetes_users is set, which is the majority of cases,
the user will impersonate themselves, which is the backwards-compatible behavior.
* As long as at least one `kubernetes_users` is set, the forwarder will start
limiting the list of users the client is allowed to impersonate.
* If the user's role set does not include the actual user name, it will be rejected
(otherwise there would be no way to exclude the user from the list).
* If the `kubernetes_users` role set includes only one user
(quite frequently that's the real intent), Teleport will default to it,
otherwise it will refuse to select.
This enables the use case where `kubernetes_users` has just one field to
link the user identity with the IAM role, for example `IAM#{{external.email}}`.
* Previous versions of the forwarding proxy denied all external
impersonation headers; this commit allows 'Impersonate-User' and
'Impersonate-Group' header values that are allowed by the role set.
* Previous versions of the forwarding proxy ignored the 'Deny' section of the roles
when applied to impersonation; this commit fixes that: roles with deny
kubernetes_users and kubernetes_groups sections will not allow
impersonation of those users and groups.
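The selection rules above can be sketched as a single function. This is an illustration under assumptions, not the forwarder's code: `pickKubeUser` is a hypothetical name, `requested` stands for the value of an 'Impersonate-User' header (empty if absent), and the deny section is not modeled.

```go
package main

import (
	"errors"
	"fmt"
)

// pickKubeUser decides which Kubernetes user a request runs as.
// self is the Teleport user name, allowed is the interpolated
// kubernetes_users list from the role set.
func pickKubeUser(self string, allowed []string, requested string) (string, error) {
	if len(allowed) == 0 {
		// No kubernetes_users set: the user impersonates themselves.
		if requested != "" && requested != self {
			return "", fmt.Errorf("impersonation of %q is not allowed", requested)
		}
		return self, nil
	}
	if requested != "" {
		// An explicit request must be on the allowed list.
		for _, u := range allowed {
			if u == requested {
				return requested, nil
			}
		}
		return "", fmt.Errorf("impersonation of %q is not allowed", requested)
	}
	if len(allowed) == 1 {
		// A single allowed user is used as the default.
		return allowed[0], nil
	}
	return "", errors.New("multiple kubernetes_users are allowed, refusing to select one")
}

func main() {
	user, err := pickKubeUser("alice", []string{"IAM#alice@example.com"}, "")
	fmt.Println(user, err)
}
```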
Added package cgroup to orchestrate cgroups. Only cgroup2 is supported,
because cgroup2 cgroups have unique IDs that can be
correlated with BPF events.
Added bpf package that contains three BPF programs: execsnoop,
opensnoop, and tcpconnect. The bpf package starts and stops these
programs as well as correlating their output with Teleport sessions
and emitting it to the audit log.
Added support for Teleport to re-exec itself before launching a shell.
This allows Teleport to start a child process, capture its PID, place
the PID in a cgroup, and then continue the process. Once the process is
continued it can be tracked by its cgroup ID.
Reduced the total number of connections to a host so Teleport does not
quickly exhaust all file descriptors. File descriptors are exhausted
very quickly when disk events, which are emitted at a very high rate,
are written to the audit log.
Added tarballs for exec sessions. Updated session.start and session.end
events with additional metadata. Updated the format of session tarballs
to include enhanced events.
Added file configuration for enhanced session recording. Added code to
startup enhanced session recording and pass package to SSH nodes.
When the remote address is an IP address, we cannot directly convert
the byte array to a string. Otherwise, we will see hex strings like
"\xac\x11\x01\xc8:443".
See more details about the bug in:
https://community.gravitational.com/t/ssh-dynamic-port-forwarding-not-working/432
The above reported issue is resolved after applying the patch.
Drop TLS_RSA_WITH_AES_128_GCM_SHA256 and TLS_RSA_WITH_AES_256_GCM_SHA384
from the default ciphersuites because they are prohibited by HTTP/2, which
breaks gRPC clients. For more information see:
https://tools.ietf.org/html/rfc7540#appendix-A.
This has a nice side effect: Teleport now only supports ciphersuites
that provide PFS.
Note that both ciphersuites can still be added back in file
configuration.
Validate all incoming events (and archives) to ensure that the server ID
within the event matches the x509 identity of the connected host. This
check makes sure nodes can only submit events for themselves.
In addition, make sure session recordings to disk or S3 can not be
overwritten.
Fixes #2648
Teleport does not support SAML identity provider
initiated logins. This commit gives the user a better
error message instructing them
what to do.
Update utils.CertChecker to only check key and certificate algorithms
when in FIPS mode. Otherwise accept keys and certificates generated with
any algorithm.
This commit implements #2543
In SSH terms, ProxyJump is a shortcut for the SSH client
connecting to the proxy/jumphost and requesting port forwarding to the
target node.
This commit adds support for direct-tcpip requests
in the Teleport proxy service, as an alias to the existing proxy
subsystem that reuses most of the code.
This commit also adds support for "route to cluster" metadata
encoded in the SSH certificate, making it possible for client
SSH certificates to include metadata that causes the proxy
to route client requests to a specific cluster.
`tsh ssh -J proxy:port` is supported in a limited way:
only one jump host is supported (-J supports chaining,
which Teleport does not utilise) and tsh will return an error
in case of two jump hosts: -J a,b will not work.
If `tsh ssh -J user@proxy` is used, it overrides
the SSH proxy coming from the tsh profile, and port forwarding
is used instead of the existing Teleport proxy subsystem.
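The single-hop restriction on `-J` could be enforced with a parser along these lines. This is a sketch only; `jumpHost` and `parseJumpHost` are hypothetical names and do not claim to match how tsh parses the flag.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// jumpHost is one [user@]host[:port] entry of a -J specification.
type jumpHost struct {
	User string
	Addr string
}

// parseJumpHost accepts exactly one jump host and rejects chained
// specifications such as "a,b".
func parseJumpHost(spec string) (jumpHost, error) {
	if spec == "" {
		return jumpHost{}, errors.New("empty jump host")
	}
	if strings.Contains(spec, ",") {
		return jumpHost{}, errors.New("only one jump host is supported")
	}
	var jh jumpHost
	if at := strings.LastIndex(spec, "@"); at >= 0 {
		jh.User, jh.Addr = spec[:at], spec[at+1:]
	} else {
		jh.Addr = spec
	}
	if jh.Addr == "" {
		return jumpHost{}, errors.New("jump host address is missing")
	}
	return jh, nil
}

func main() {
	fmt.Println(parseJumpHost("user@proxy:3023"))
	fmt.Println(parseJumpHost("a,b"))
}
```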
This commit implements #2872.
Similarly to `file://`, the `stdout://` scheme can be used in addition
to the existing external schemes to write audit logs
to stdout:
```yaml
audit_events_uri: ['dynamodb://events', 'stdout://',]
```
Just like the `file://` scheme, the 'stdout://' scheme can only be used
when external events and session uploaders are defined,
so that all audit upload and search features of Teleport keep working.
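The constraint above could be checked at configuration time roughly as follows. This is a sketch under assumptions: `validateAuditURIs` is a hypothetical name, any scheme other than `file` and `stdout` is treated as external, and the session uploader part of the requirement is not modeled.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// validateAuditURIs rejects configurations that use file:// or stdout://
// without at least one external events backend.
func validateAuditURIs(uris []string) error {
	var local, external int
	for _, raw := range uris {
		u, err := url.Parse(raw)
		if err != nil {
			return fmt.Errorf("invalid audit URI %q: %w", raw, err)
		}
		switch u.Scheme {
		case "":
			return fmt.Errorf("audit URI %q has no scheme", raw)
		case "file", "stdout":
			local++
		default:
			external++
		}
	}
	if local > 0 && external == 0 {
		return errors.New("file:// and stdout:// require an external events backend")
	}
	return nil
}

func main() {
	fmt.Println(validateAuditURIs([]string{"dynamodb://events", "stdout://"}))
	fmt.Println(validateAuditURIs([]string{"stdout://"}))
}
```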