* Remove JSON schema validation
Removing JSON schema validation from all resource unmarshalers.
--- what JSON schema gets us
Looking at the JSON schema spec and our usage, here are the supposed benefits:
- type validation - make sure incoming data uses the right types for the right fields
- required fields - make sure that mandatory fields are set
- defaulting - set defaults for fields
- documentation - schema definition for our API objects
Note that it does _not_ do:
- fail on unknown fields in data
- fail on a required field with an empty value
--- what replaces it
Based on the above, it may seem like JSON schema provides value.
But that's not the case; let's break it down one by one:
- type validation - unmarshaling JSON into a typed Go struct does this
- required fields - only checks that the field was provided, doesn't actually check that a value is set (e.g. `"name": ""` will pass the `required` check)
- so it's pretty useless for any real validation
- and we already have a separate place for proper validation - `CheckAndSetDefaults` methods
- defaulting - done in `CheckAndSetDefaults` methods
- `Version` is the only annoying field; it had to be added to a bunch of objects
- documentation - protobuf definitions are the source of truth for our API schema
--- the benefits
- performance - schema validation does a few rounds of `json.Marshal/Unmarshal` in addition to actual validation; now we simply skip all that
- maintenance - no need to keep protobuf and JSON schema definitions in sync anymore
- creating new API objects - one error-prone step removed
- (future) fewer dependencies - we can _almost_ remove the Go libraries for schema validation (one transitive dependency keeps them around)
* Remove services.SkipValidation
No more JSON schema validation, so this option is now a no-op.
In an attempt to make it easier to
1) navigate the integration test output,
2) find the cause of test failures, and
3) run individual integration tests from the command line,
...this change ports some of the OSS integration tests away from
GoCheck and implements them in terms of the standard `testing`
package.
The main changes are:
* Test suites are now constructed as a normal Test function
with many subtests.
* The GoCheck assertions have been replaced with equivalent
assertions from `testify/require`, for example:
`c.Assert(err, check.IsNil)`
becomes
`require.NoError(t, err)`
... and so on
* Avoid test flake by ensuring the gRPC server is shut down gracefully before closing the audit log
* Fix lint warnings. Move the tunnel server's Close earlier to close the proxy watcher and release gRPC traffic
* Use graceful shutdown selectively until all tests have improved support for it
* Move session recorder clean up to session.Close
* Always use graceful shutdown for TLS.
Prior to this change, `tsh` will only ever forward the internal key
agent managed by `tsh` to a remote machine.
This change allows a user to specify that `tsh` should forward either
the `tsh`-internal keystore, or the system key agent at `$SSH_AUTH_SOCK`.
This change also brings the `-A` command-line option into line with
OpenSSH.
For more info refer to RFD-0022.
See-Also: #1571
Currently, an app's target FQDN can be obtained only using the endpoint
for creating new app sessions. The OAuth-style back-and-forth redirects
between the app launcher and the app itself are therefore forced to
generate an unnecessary additional app session just to resolve the FQDN.
The new endpoint introduced here allows resolving such FQDNs directly,
without creating an app session.
* Attempt to isolate and improve state handling of a NodeSession.
* Add terminal close for kube terminal tests
* Address review comments
* Small tweaks
Co-authored-by: Andrew Lytvynov <andrew@goteleport.com>
```diff
 ~/.tsh/
 └── keys
    ├── one.example.com            --> Proxy hostname
    │   ├── certs.pem              --> TLS CA certs for the Teleport CA
    │   ├── foo                    --> RSA Private Key for user "foo"
    │   ├── foo.pub                --> Public Key
-   │   ├── foo-cert.pub           --> SSH certificate for proxies and nodes
    │   ├── foo-x509.pem           --> TLS client certificate for Auth Server
+   │   ├── foo-ssh                --> SSH certs for user "foo"
+   │   │   ├── root-cert.pub      --> SSH cert for Teleport cluster "root"
+   │   │   └── leaf-cert.pub      --> SSH cert for Teleport cluster "leaf"
```
When `-J` is provided, this also loads/reissues the SSH cert for the cluster associated with the jumphost's certificate. Fixes #5637.
* mfa: per-session MFA certs for SSH and Kubernetes
This is client-side support for requesting single-use certs with an MFA
check.
The client doesn't know whether an MFA check is needed when accessing a
resource; this is decided during an RBAC check on the server. So the
client will always try to get a single-use cert, and the server will
respond with NotNeeded if MFA is not required. This is an extra
round-trip for every session, which causes a ~20% slowdown in SSH logins:
```
$ hyperfine '/tmp/tsh-old ssh talos date' '/tmp/tsh-new ssh talos date'
Benchmark #1: /tmp/tsh-old ssh talos date
Time (mean ± σ): 49.9 ms ± 1.0 ms [User: 15.1 ms, System: 7.4 ms]
Range (min … max): 48.4 ms … 54.1 ms 59 runs
Benchmark #2: /tmp/tsh-new ssh talos date
Time (mean ± σ): 60.2 ms ± 1.6 ms [User: 19.1 ms, System: 8.3 ms]
Range (min … max): 59.0 ms … 69.7 ms 50 runs
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
Summary
'/tmp/tsh-old ssh talos date' ran
1.21 ± 0.04 times faster than '/tmp/tsh-new ssh talos date'
```
A few other internal changes:
- client.LocalKeyAgent will now always have a non-nil LocalKeyStore.
Previously, it would be nil (e.g. in a web UI handler or when using an
identity file) which easily causes panics. I added a noLocalKeyStore
type instead that returns errors from all methods.
- requesting a user cert with a TTL < 1min will now succeed and return a
1min cert instead of failing
* Capture access approvals on MFA-issued certs
* Address review feedback
* Address review feedback
* mfa: accept unknown nodes during short-term MFA cert creation
An unknown node could be an OpenSSH node set up via
https://goteleport.com/teleport/docs/openssh-teleport/
In this case, we shouldn't prevent the user from connecting.
There's a small risk of authz bypass - an attacker might know a
different name/IP for a registered node which Teleport doesn't know
about. But a Teleport node will still check RBAC and reject the
connection.
* Validate username against unmapped user identity
IssueUserCertsWithMFA is called on the leaf auth server in case of
trusted clusters. Username in the request object will be that of the
original unmapped caller.
* mfa: add IsMFARequired RPC
This RPC is run before every connection to check whether MFA is
required. If a connection is against the leaf cluster, this request is
forwarded from root to leaf for evaluation.
* Fix integration tests
* Correctly treat "Username" as login name in IsMFARequired
Also, move the logic into auth.Server out of ServerWithRoles.
* Fix TestHA
* Address review feedback
Cluster labels were added in 5.0 to restrict access to trusted clusters.
Enforce this restriction on `tsh login leafName` (aka `GenerateUserCerts`).
Note: access check is already enforced on actual user connections
(ssh/k8s/etc) and listing of trusted clusters (`tsh clusters`). You
cannot bypass authz to actually connect to that cluster.
* auth: API for requesting per-connection certificates
See https://github.com/gravitational/teleport/blob/master/rfd/0014-session-2FA.md#api
This API is a wrapper around GenerateUserCerts with a few differences:
- performs an MFA check before generating a cert
- enforces a single usage (ssh/k8s/db for now)
- embeds client IP in the cert
- marks a cert to distinguish from regular user certs
- enforces a 1min TTL
* Apply suggestions from code review
Co-authored-by: a-palchikov <deemok@gmail.com>
Co-authored-by: a-palchikov <deemok@gmail.com>
* Update logrus package to fix data races
* Introduce a logger that uses the test context to log messages so they are output if a test fails, for improved troubleshooting.
* Revert introduction of test logger - simply leave logger configuration at debug level outputting to stderr during tests.
* Run integration test for e as well
* Use make with a cap and append to only copy the relevant roles.
* Address review comments
* Update integration test suite to use a test-local logger that only outputs logs if a specific test has failed - no logs from other test cases will be output.
* Revert changes to InitLoggerForTests API
* Create a new logger instance when applying defaults or merging with file service configuration
* Introduce a local logger interface to be able to test file configuration merge.
* Fix kube integration tests w.r.t. logging
* Move goroutine profile dump into a separate func to handle parameters consistently for all invocations
This change has several parts: cluster registration, cache updates,
routing and a new tctl flag.
> cluster registration
Cluster registration means adding `KubernetesClusters` to `ServerSpec`
for servers with `KindKubeService`.
`kubernetes_service` instances will parse their kubeconfig or local
`kube_cluster_name` and add them to their `ServerSpec` sent to the auth
server. They are effectively declaring that "I can serve k8s requests
for k8s cluster X".
> cache updates
This is just cache plumbing for `kubernetes_service` presence, so that
other teleport processes can fetch all kube services. This was missed
in the previous PR implementing CRUD for `kubernetes_service`.
> routing
Now the fun part - routing logic. This logic lives in
`/lib/kube/proxy/forwarder.go` and is shared by both `proxy_service`
(with kubernetes integration enabled) and `kubernetes_service`.
The target k8s cluster name is passed in the client cert, along with k8s
users/groups information.
`kubernetes_service` only serves requests for its direct k8s cluster
(from `Forwarder.creds`) and doesn't route requests to other teleport
instances.
`proxy_service` can serve requests:
- directly to a k8s cluster (the way it works pre-5.0)
- to a leaf teleport cluster (also same as pre-5.0, based on
`RouteToCluster` field in the client cert)
- to a `kubernetes_service` (directly or over a tunnel)
The last two modes require the proxy to generate an ephemeral client TLS
cert to do an outbound mTLS connection.
> tctl flag
A flag `--kube-cluster-name` for `tctl auth sign --format=kubernetes`
which allows generating client certs for a non-default k8s cluster name
(as long as it's registered in the cluster).
I used this for testing, but it could be used for automation too.
Fixes #3604
This commit adds support for the `cluster_labels` role parameter,
limiting access to remote clusters by label.
A new `tctl update rc` command provides an interface to set labels on remote clusters.
Consider two clusters: `one`, the root, and `two`, the leaf.
```bash
$ tsh clusters
Cluster Name Status
------------ ------
one online
two online
```
Create the trusted cluster join token with labels:
```bash
$ tctl tokens add --type=trusted_cluster --labels=env=prod
```
Every cluster joined using this token will inherit the `env: prod` label.
Alternatively, update remote cluster labels with the `tctl update rc`
command. Letting remote clusters propagate their own labels would create
a problem of rogue clusters updating their labels to bad values.
Instead, the administrator of the root cluster controls the labels
using the remote clusters API, without fear of override:
```bash
$ tctl get rc
kind: remote_cluster
metadata:
  name: two
status:
  connection: online
  last_heartbeat: "2020-09-14T03:13:59.35518164Z"
version: v3
```
```bash
$ tctl update rc/two --set-labels=env=prod
cluster two has been updated
```
```bash
$ tctl get rc
kind: remote_cluster
metadata:
  labels:
    env: prod
  name: two
status:
  connection: online
  last_heartbeat: "2020-09-14T03:13:59.35518164Z"
```
Update the role to deny access to prod env:
```yaml
kind: role
metadata:
  name: dev
spec:
  allow:
    logins: [root]
    node_labels:
      '*': '*'
    # Cluster labels control what clusters a user can connect to. The wildcard ('*') means
    # any cluster. If no role in the role set is using labels and the cluster is not labeled,
    # the cluster labels check is not applied. Otherwise, cluster labels are always enforced.
    # This makes the feature backwards-compatible.
    cluster_labels:
      'env': 'staging'
  deny:
    # Cluster labels control what clusters a user can connect to. The wildcard ('*') means
    # any cluster. By default none are set in deny rules to preserve backwards compatibility.
    cluster_labels:
      'env': 'prod'
```
```bash
$ tctl create -f dev.yaml
```
Cluster two is now invisible to users with the `dev` role.
```bash
$ tsh clusters
Cluster Name Status
------------ ------
one online
```
Added support for an identity-aware, RBAC-enforcing, mutually
authenticated web application proxy to Teleport.
* Updated services.Server to support application servers.
* Updated services.WebSession to support application sessions.
* Added CRUD RPCs for "AppServers".
* Added CRUD RPCs for "AppSessions".
* Added RBAC support using labels for applications.
* Added JWT signer as a services.CertAuthority type.
* Added support for signing and verifying JWT tokens.
* Refactored dynamic label and heartbeat code into standalone packages.
* Added application support to web proxies and new "app_service" to
proxy mutually authenticated connections from proxy to an internal
application.
Previously, we needed:
- create on namespaces
- impersonate on all users/groups/service accounts
- list pods in kube-system namespace (via teleport-ci-test-group)
- exec/portforward on kube-dns pod in kube-system namespace (via teleport-ci-test-group)
Now, we need:
- create on namespaces
- create on pods in namespace teletest
- impersonate on all users/groups
- get/exec/portforward on pod test-pod in namespace teletest (via teleport-ci-test-group)
Unfortunately, `resourceNames` in RBAC doesn't work with `create` verbs,
so we can't scope down impersonation to just the right users/groups.
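A hedged sketch of the namespaced Role implied by the list above; the exact CI manifest may differ. Note that `resourceNames` does restrict subresource requests like exec/portforward (the pod name is in the request URL), while plain pod creation and impersonation cannot be scoped this way:

```yaml
# Illustrative Role for the teletest namespace, not the exact CI manifest.
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: teleport-ci-test-group
  namespace: teletest
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["create"]        # resourceNames cannot restrict create here
- apiGroups: [""]
  resources: ["pods"]
  resourceNames: ["test-pod"]
  verbs: ["get"]
- apiGroups: [""]
  resources: ["pods/exec", "pods/portforward"]
  resourceNames: ["test-pod"]
  verbs: ["create"]        # exec/portforward are create on subresources
```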