2334 commits

Author SHA1 Message Date
Andrew Lytvynov 4fc106553f
Upload k8s session recordings regardless of request context (#5145)
The HTTP request context is canceled when the client disconnects. Using
this context in the session recorder prevents it from uploading the
session when it's finished.

Use the server context instead, to prevent lost recordings.
2020-12-16 11:46:59 -08:00
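
The context problem described in the commit above can be illustrated with a minimal sketch. It is not the recorder's actual code: `uploadRecording` and `simulateDisconnect` are hypothetical names, and `context.Background()` stands in for whatever long-lived server context the process really uses.

```go
package main

import (
	"context"
	"fmt"
)

// uploadRecording is a hypothetical stand-in for the session uploader.
// Like most context-aware code, it gives up once its context is canceled.
func uploadRecording(ctx context.Context) error {
	if err := ctx.Err(); err != nil {
		return fmt.Errorf("upload aborted: %w", err)
	}
	return nil
}

// simulateDisconnect returns the upload result for a canceled request
// context and for a longer-lived server context.
func simulateDisconnect() (requestErr, serverErr error) {
	// Assumption: the server context lives as long as the process.
	serverCtx := context.Background()
	// The request context is derived per request and canceled on disconnect.
	requestCtx, disconnect := context.WithCancel(serverCtx)
	disconnect() // the client goes away before the session is uploaded

	return uploadRecording(requestCtx), uploadRecording(serverCtx)
}

func main() {
	requestErr, serverErr := simulateDisconnect()
	fmt.Println("request context:", requestErr)
	fmt.Println("server context:", serverErr)
}
```

Canceling the derived request context does not affect its parent, which is why work that must outlive the request is better tied to the parent.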
Andrej Tokarčík 1fe6226803
Improve error message reported when node is offline (#5036) 2020-12-15 16:36:39 +01:00
Andrew Lytvynov 05c73c9372
Upgrade gosaml2 library to v0.6.0 (#5118)
See https://github.com/russellhaering/gosaml2/security/advisories/GHSA-xhqq-x44f-9fgg
2020-12-14 11:34:20 -08:00
Andrej Tokarčík ee87fce040
Don't log error on tunnel node after its serving agent is stopped (#5042) 2020-12-11 17:39:19 +01:00
a-palchikov ca60c7eb35
Add SetLevel to utils.Logger interface (#5082) 2020-12-11 12:59:09 +01:00
a-palchikov 7809a47356
Fix a flaky test in lib/srv/app (#5079) 2020-12-11 12:36:02 +01:00
a-palchikov c94e5042c7
Server data race (#4790)
* Add logger attributes to be able to propagate logger from tests for identifying tests
* Add test case for Server's DeepCopy.
* Update test to use the testing package directly. Update dependency after upstream PR.
2020-12-09 16:46:33 +01:00
Andrew Lytvynov 3fa6904377
Multiple fixes for k8s forwarder (#5038)
* kube: emit audit events using process context

Using the request context can prevent audit events from getting emitted,
if client disconnected and request context got closed.
We shouldn't be losing audit events like that.

Also, log all response errors from exec handler.

* kube: cleanup forwarder code

Rename a few config fields to be more descriptive.
Avoid embedding unless necessary, to keep the package API clean.

* kube: cache only user certificates, not the entire session

The expensive part that we need to cache is the client certificate.
Making a new one requires a round-trip to the auth server, plus entropy
for crypto operations.

The rest of clusterSession contains request-specific state, and only
adds problems if cached.
For example: clusterSession stores a reference to a remote teleport
cluster (if needed); caching requires extra logic to invalidate the
session when that cluster disappears (or tunnels drop out). Same problem
happens with kubernetes_service tunnels.

Instead, the forwarder now picks a new target for each request from the
same user, providing a kind of "load-balancing".

* Init session uploader in kubernetes service

It's started in all other services that upload sessions (app/proxy/ssh),
but was missing here. Because of this, the session storage directory for
async uploads wasn't created on disk and caused interactive sessions to
fail.
2020-12-08 11:12:07 -08:00
a-palchikov 673c2907f2
Augment session events with cluster name (#4994)
Add cluster name to event metadata
2020-12-08 13:33:44 +01:00
a-palchikov 7c87576a8b
flaky tests: consistent logging (#4849)
* Update logrus package to fix data races
* Introduce a logger that uses the test context to log the messages so they are output if a test fails for improved trouble-shooting.
* Revert introduction of test logger - simply leave logger configuration at debug level outputting to stderr during tests.
* Run integration test for e as well
* Use make with a cap and append to only copy the relevant roles.
* Address review comments
* Update integration test suite to use test-local logger that would only output logs iff a specific test has failed - no logs from other test cases will be output.
* Revert changes to InitLoggerForTests API
* Create a new logger instance when applying defaults or merging with file service configuration
* Introduce a local logger interface to be able to test file configuration merge.
* Fix kube integration tests w.r.t log
* Move goroutine profile dump into a separate func to handle parameters consistently for all invocations
2020-12-07 15:35:15 +01:00
Andrew Lytvynov 11f5dc6c39
Set TTL on kube_service resources (#5008)
Without this, deleted kube_services linger in the backend and show up as
obsolete kubernetes clusters in tsh.

Ideally, this TTL logic should be enforced centrally, but I'd like to
fix the bug first, and do a larger refactoring later.
2020-12-03 15:51:32 -08:00
jane (quin) 9c26188d30
Fix coordinated omission bug (#4643)
* benchmark package

* use default config if path is not specified

* progressiveBench as a config method

* implement a main.go approach to run progressive tests

* make teleport client, run specified benchmark

* function and method descriptions

* make teleport client

* testing

* change interface method signatures

* dry up bench.go code, move producer goroutines to own function

* output formatting

* remove yaml

* fix linter errors

* remove print

* PR suggested changes, moved export latency profile functionality to the benchmark package

* PR fixes

* method description

* update testing

* linter

* docs and example

* PR suggestion changes

* fix coord omission bug

* remove benchmark struct

* remove threads, using open system

* recover in run

* close channel, check if open with each execution

* update testing, pr suggestions

* add more instructions to readme

* update example.go

* pass back context

* use SyncBuffer

* export response and service histograms

* update readme, exporting profiles section

* return from execute()

* export singular latency profile

* export response profile

* Revert "export response profile"

This reverts commit 5a21cb034c.

* export response profile

* update branch

* format example.go

* remove threads

* update example.go

* update branch

* goimports

* add signal handler & update docs

* PR suggestions

* exit out of interactive session

* revert execute

* PR suggestion

* run command on non-interactive instead of nil
2020-12-01 11:04:31 -08:00
Andrew Lytvynov c4583b7a1a
Fix response flushing on streaming k8s requests (#5009)
Streaming requests, like `kubectl logs -f` will slowly write response
data over time. The `http.ResponseWriter` wrapper we added for capturing
the response code didn't propagate `http.Flusher` interface and
prevented the forwarder library from periodically flushing response
contents.

This caused `kubectl logs -f` results to be delayed, delivered in
batches as some internal buffer filled up.
2020-11-30 17:41:50 -08:00
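
A rough sketch of the kind of wrapper the commit above describes, under the assumption that it embeds `http.ResponseWriter` to record the status code. `statusWriter` and `flushThroughWrapper` are hypothetical names, not the forwarder's real types. Embedding the interface only promotes `Header`, `Write` and `WriteHeader`, so `Flush` has to be forwarded explicitly.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// statusWriter is a hypothetical wrapper that records the response code.
type statusWriter struct {
	http.ResponseWriter
	status int
}

// WriteHeader remembers the status code before passing it on.
func (w *statusWriter) WriteHeader(code int) {
	w.status = code
	w.ResponseWriter.WriteHeader(code)
}

// Flush forwards to the wrapped writer when it supports http.Flusher.
// Without this method, a type assertion to http.Flusher on the wrapper fails
// and callers silently fall back to buffered writes.
func (w *statusWriter) Flush() {
	if f, ok := w.ResponseWriter.(http.Flusher); ok {
		f.Flush()
	}
}

// flushThroughWrapper reports the recorded status and whether a flush issued
// on the wrapper reached the underlying writer.
func flushThroughWrapper() (status int, flushed bool) {
	rec := httptest.NewRecorder()
	sw := &statusWriter{ResponseWriter: rec}
	var w http.ResponseWriter = sw

	w.WriteHeader(http.StatusOK)
	if f, ok := w.(http.Flusher); ok {
		f.Flush()
	}
	return sw.status, rec.Flushed
}

func main() {
	status, flushed := flushThroughWrapper()
	fmt.Println("status:", status, "flushed:", flushed)
}
```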
Vladimir Kochnev b911f4b551
Fix JWK kty from "rsa" to "RSA" (#4993)
JWKS libraries expect it to be "RSA", not "rsa", example:
6cfa98f8ac/src/JwksClient.js (L79-L81)

According to RFCs, "kty" field seems to be case-sensitive, though there
cannot be names matching in a case-insensitive manner:
https://tools.ietf.org/html/rfc7518#section-7.4.1

The list of key types available in RFC 7518:
https://tools.ietf.org/html/rfc7518#section-6.1

Co-authored-by: Gus Luxton <gus@gravitational.com>
2020-11-27 11:07:41 -04:00
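
For illustration, a minimal sketch of encoding an RSA public key as a JWK with the upper-case key type. The `jwk` struct, `marshalJWK` and `exampleKeyType` are hypothetical helpers, not Teleport's code. Only the `kty`, `n` and `e` members are emitted, with `n` and `e` in unpadded base64url form.

```go
package main

import (
	"crypto/rand"
	"crypto/rsa"
	"encoding/base64"
	"encoding/json"
	"fmt"
	"math/big"
)

// jwk holds the minimal RSA public key members of a JSON Web Key.
type jwk struct {
	KeyType string `json:"kty"`
	N       string `json:"n"`
	E       string `json:"e"`
}

// marshalJWK encodes an RSA public key as a JWK.
func marshalJWK(pub *rsa.PublicKey) ([]byte, error) {
	return json.Marshal(jwk{
		KeyType: "RSA", // case-sensitive: "rsa" is rejected by strict clients
		N:       base64.RawURLEncoding.EncodeToString(pub.N.Bytes()),
		E:       base64.RawURLEncoding.EncodeToString(big.NewInt(int64(pub.E)).Bytes()),
	})
}

// exampleKeyType generates a throwaway key, encodes it, and returns the
// "kty" value found in the resulting JSON.
func exampleKeyType() (string, error) {
	key, err := rsa.GenerateKey(rand.Reader, 2048)
	if err != nil {
		return "", err
	}
	data, err := marshalJWK(&key.PublicKey)
	if err != nil {
		return "", err
	}
	var decoded map[string]string
	if err := json.Unmarshal(data, &decoded); err != nil {
		return "", err
	}
	return decoded["kty"], nil
}

func main() {
	kty, err := exampleKeyType()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("kty:", kty)
}
```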
a-palchikov 9b73af55ab
Fix local etcd backend tests (#4986)
* Fix etcd backend tests to properly skip if etcd is not requested/available
* Address review comments
2020-11-26 13:56:28 +01:00
jane (quin) 6eaaf3a27e
Linear benchmark generator (#4588)
* benchmark package

* use default config if path is not specified

* progressiveBench as a config method

* implement a main.go approach to run progressive tests

* make teleport client, run specified benchmark

* function and method descriptions

* make teleport client

* testing

* change interface method signatures

* dry up bench.go code, move producer goroutines to own function

* output formatting

* remove yaml

* fix linter errors

* remove print

* PR suggested changes, moved export latency profile functionality to the benchmark package

* PR fixes

* method description

* update testing

* linter

* docs and example

* PR suggestion changes

* PR changes

* wrap errors

* move bench to benchmark & testing updates

* PR changes

* PR suggestions
2020-11-25 15:47:39 -08:00
Andrew Lytvynov c6832ec606
Set server_addr in audit events from connection info (#4985)
This sets a useful server IP, when no advertise_ip is set. Previously,
the address was taken from the listener, and is usually "0.0.0.0:3022"
or "[::]:3022".

Also, add some test cases in utils for IPv6 handling.
2020-11-25 12:08:37 -08:00
Ben Arent 09928a7f2b
Cherry pick Gravitational -> GoTeleport (#4932) 2020-11-25 11:18:55 -08:00
Andrew Lytvynov cdf26c74e5
Change log about missing kube clusters on login to debug (#4935)
This is a totally OK situation in clusters without k8s integration, so
it shouldn't be a warning.
2020-11-23 18:02:35 +00:00
Russell Jones d0a202f1bc Added error checking to Application Access CLI.
Check if both application name and URI are provided when attempting to
join an application service process to a cluster.
2020-11-20 16:38:52 -08:00
Russell Jones b66ca14f61 Added HTTP method to app.session.request.
Added HTTP method field to "app.session.request" events.
2020-11-20 16:38:40 -08:00
Lisa Kim c56df637d1
Add AuthType field for web config (#4946) 2020-11-20 11:21:07 -08:00
Brian Joerger 1439f35902
[docs] Go API Docs CA (#4777) 2020-11-20 10:17:39 -08:00
a-palchikov 09064cbc6f
Configure etcd client's message size (#4800)
* lib/backend/etcdbk: add a configuration attribute to set the client's
send message size limit.
* Update etcd backend section w.r.t new client configuration attribute

Updates https://github.com/gravitational/teleport/issues/4786.
2020-11-19 14:03:51 +01:00
a-palchikov ab205963f5
Fix typos (#4903) 2020-11-19 13:39:16 +01:00
Forrest Marshall 5ad1a9025c fix early watcher closure 2020-11-18 15:40:56 -08:00
Forrest Marshall 68adee36a9 fix tsh login with trusted clusters 2020-11-18 15:40:56 -08:00
Andrew Lytvynov 645ac573c5
UX improvements for kube CLI interactions (#4893)
- 'tsh kube login' fetches the latest list of kube clusters instead of
  only using existing kubeconfig contexts.
  This makes 'tsh kube login' succeed when a kube cluster was added
  after last 'tsh login'.
- 'tsh kube ls' no longer wrongly marks selected clusters, if they
  weren't generated by tsh.
- 'tctl rm' now works with kube_service objects.
- 'tsh login' now updates kubeconfig entries when a login session is
  already active
- 'teleport.yaml' now uses 'labels' and 'commands' for RBAC labels on
  kubernetes_service; this is consistent with ssh and app services.
2020-11-18 22:31:04 +00:00
Russell Jones 48a37af5ad Updated default admin role.
Updated default admin role to support reading services.KindProxy. This
is needed by "tctl" when using credentials from ~/.tsh to generate the
join message.
2020-11-18 11:49:23 -08:00
Andrew Lytvynov 05f5f2d241
Prevent a panic in tsh kube login when logged out (#4885)
Turns out, client.Status can return a nil error *and* profile.
Handle nil profile separately and return a simple error.
2020-11-18 17:51:28 +00:00
AdamKorcz c0ecb0a081
Minor update to fuzzing README (#4889) 2020-11-18 11:56:01 -04:00
Russell Jones 986bf08ab3 Consolidated application checks.
Consolidated application validation checks. The previous implementation
had a bug in it where it would fail if no /etc/teleport.yaml existed.
2020-11-17 17:57:00 -08:00
Russell Jones 898088a282 Fixed application dialing in proxy recording mode.
Only use the forwarded agent when dialing in proxy recording mode when
the connection type is SSH.
2020-11-17 17:57:00 -08:00
Andrew Lytvynov aceffd9a35
Add more data to k8s session events (#4858)
Added fields:
- kube users/groups
- pod name/namespace
- container name/image
- node name

Container image and node name need to be fetched from the k8s API, they
are not known from just the client request. This fetch is optional, and
if it fails (like due to permission errors), those fields will be
missing.

Since kubernetes_service can talk to k8s API and proxy_service can't,
all session events are now emitted by kubernetes_service and skipped by
the proxy (used to be the other way around).
2020-11-17 23:46:51 +00:00
Andrew Lytvynov 679b3e6719
Fix a server parsing regression between 4.4 and 5.0 (#4865)
The `KubernetesClusters` field in `ServerSpecV2` used to be a
`[]string`:
https://github.com/gravitational/teleport/pull/4354/files#diff-50ec8b71306e75db3cb193b581cdd51139b03f90e23e7804cbef7edf712bbfac
Later, it was changed to `[]*services.KubernetesCluster`, which is
incompatible when parsing.

Unfortunately, the string version slipped into 4.4. When upgrading to
5.0, teleport fails to parse the old server object at startup and
crashes.

Rename the JSON tag from `kubernetes_clusters` to `kube_clusters` to
distinguish the different versions of this field when parsing. The old
`kubernetes_clusters` will just be ignored.
2020-11-17 21:50:12 +00:00
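
The effect of the tag rename in the commit above can be sketched with simplified stand-ins for the real types. `kubeCluster`, `specOldTag`, `specNewTag` and `parseOldObject` are hypothetical and far smaller than `ServerSpecV2`. An object written with the string form fails to decode while the tag is shared, and decodes cleanly once the new field has its own tag, because `encoding/json` ignores unknown keys by default.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// kubeCluster is a simplified stand-in for the structured cluster type.
type kubeCluster struct {
	Name string `json:"name"`
}

// specOldTag models the structured field still sharing the old JSON tag.
type specOldTag struct {
	KubernetesClusters []*kubeCluster `json:"kubernetes_clusters,omitempty"`
}

// specNewTag models the structured field after the tag rename.
type specNewTag struct {
	KubernetesClusters []*kubeCluster `json:"kube_clusters,omitempty"`
}

// parseOldObject decodes an object written with the []string form of the
// field into both variants and returns the two errors.
func parseOldObject() (oldTagErr, newTagErr error) {
	stored := []byte(`{"kubernetes_clusters": ["cluster-a", "cluster-b"]}`)

	var withOldTag specOldTag
	oldTagErr = json.Unmarshal(stored, &withOldTag)

	var withNewTag specNewTag
	newTagErr = json.Unmarshal(stored, &withNewTag)
	return oldTagErr, newTagErr
}

func main() {
	oldTagErr, newTagErr := parseOldObject()
	fmt.Println("shared tag: ", oldTagErr)
	fmt.Println("renamed tag:", newTagErr)
}
```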
Andrew Lytvynov 43178f34d8
Add a depth limit to RBAC expression parser (#4848)
Our current parsing code runtime grows exponentially with nested
selectors (e.g. '{{a.b.c.d.e.f}}'), mostly due to memory churn from
slice allocations. With 100,000 levels of selectors, parsing takes ~80s
on my machine.
If an attacker can submit these expressions for parsing, they can DoS
the auth server with relatively small payloads (<1MB).

All real-world expressions are <10 AST nodes deep. Add a sanity check of
1000 levels to protect against malicious inputs.

We can optimize the code later on, but it's not very useful for real
world performance.
2020-11-17 18:53:38 +00:00
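
A depth check of the sort described in the commit above might look like the following sketch. It uses the standard `go/parser` package on the inner expression only (without the `{{ }}` delimiters) and is not the actual Teleport parser: `walk`, `checkExpr` and `errTooDeep` are hypothetical names, and the limit is a parameter so it can be exercised with small inputs.

```go
package main

import (
	"errors"
	"fmt"
	"go/ast"
	"go/parser"
)

var errTooDeep = errors.New("expression exceeds the maximum allowed depth")

// walk descends through nested selectors, carrying the current depth and
// bailing out as soon as it exceeds the limit.
func walk(e ast.Expr, depth, limit int) error {
	if depth > limit {
		return errTooDeep
	}
	switch n := e.(type) {
	case *ast.SelectorExpr:
		return walk(n.X, depth+1, limit)
	case *ast.Ident:
		return nil
	default:
		return fmt.Errorf("unsupported expression node %T", e)
	}
}

// checkExpr parses src and rejects it when its nesting exceeds limit.
func checkExpr(src string, limit int) error {
	expr, err := parser.ParseExpr(src)
	if err != nil {
		return err
	}
	return walk(expr, 0, limit)
}

func main() {
	fmt.Println(checkExpr("a.b.c.d.e.f", 1000)) // well within the limit
	fmt.Println(checkExpr("a.b.c.d.e.f", 2))    // rejected as too deep
}
```

Checking the depth at the top of every recursive step means a malicious input is rejected after a bounded amount of work, whatever its length.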
Andrew Lytvynov 4b2247f340
Rename tsh kube clusters to tsh kube ls (#4850) 2020-11-16 18:50:49 +00:00
Sasha Klizhentas e6681abe6a Fan out events in async mode for async recordings.
This commit fixes #4695.

Teleport in async recording mode sends all events to disk,
and uploads them to the server later.

It uploads some events synchronously to the audit log so
they show up in the global event log right away.

However if the auth server is slow, the fanout blocks the session.

This commit makes the fanout of some events fast,
nonblocking, and unable to fail, so sessions will not hang
unless the disk writes hang.

It adds a backoff period and timeout after which some
events will be lost, but session will continue without locking.
2020-11-13 17:10:35 -08:00
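
The timeout-plus-backoff behaviour described in the commit above can be sketched roughly as follows. This is a simplified illustration, not the actual emitter: the `fanout` type, its fields and `simulateSlowConsumer` are hypothetical, and the sketch is not safe for concurrent use.

```go
package main

import (
	"fmt"
	"time"
)

// fanout forwards events to a possibly slow consumer without ever blocking
// the caller for longer than timeout.
type fanout struct {
	ch           chan string
	timeout      time.Duration
	backoff      time.Duration
	backoffUntil time.Time
}

// emit reports whether the event was delivered. After one timeout, events
// are dropped immediately until the backoff period has passed.
func (f *fanout) emit(event string) bool {
	if time.Now().Before(f.backoffUntil) {
		return false // still backing off: drop without waiting
	}
	select {
	case f.ch <- event:
		return true
	case <-time.After(f.timeout):
		f.backoffUntil = time.Now().Add(f.backoff)
		return false
	}
}

// simulateSlowConsumer emits three events to a channel nobody reads from.
func simulateSlowConsumer() [3]bool {
	f := &fanout{
		ch:      make(chan string, 1), // room for exactly one pending event
		timeout: 10 * time.Millisecond,
		backoff: time.Minute,
	}
	return [3]bool{f.emit("start"), f.emit("print"), f.emit("end")}
}

func main() {
	fmt.Println(simulateSlowConsumer())
}
```

The first event fits in the buffer, the second is dropped after the timeout, and the third is dropped immediately because the backoff window is still open, so the caller never waits more than one timeout per backoff period.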
Forrest Marshall 45cc314426 improve cache correctness
improves the reliability and correctness of the cache via
various small improvements, including preventing reads of
partially initialized/reset state, and delaying watcher
init events until unhealthy states recover.

fixes an issue where reads could result in missing or
inconsistent results.
2020-11-13 16:29:18 -08:00
Russell Jones 768b58351f Changed redirect endpoint when session is expired.
When the user does not have a session, if the user tries to access a
proxied application at its FQDN, Teleport does best effort resolution.

This fix changes the behavior of what happens when the user has a
session but the session is expired. The user was being redirected to the
login page. This fix changes the behavior to be in sync with the
no-session behavior in doing best effort resolution.
2020-11-13 16:15:33 -08:00
Russell Jones 211116adaa Added username to "app.session.chunk" event. 2020-11-13 15:04:32 -08:00
Russell Jones cf635a7e60 Addressed code review comments. 2020-11-13 14:52:00 -08:00
Andrew Lytvynov dd3977957a Register a kubernetes cluster from proxy_service
A proxy running in pre-5.0 mode (e.g. with local kubeconfig) should
register an entry in `tsh kube clusters`.
After upgrading to 5.0, without migration to kubernetes_service, all the
new `tsh kube` commands will work as expected.
2020-11-13 14:52:00 -08:00
Russell Jones 8f8b94bcc9 Added application name validation.
Added validation check that ensures application names are valid DNS
subdomains. This is because an application name can potentially be used
in the DNS name of the application if either a public address is not
provided or the application is accessed via a trusted cluster.
2020-11-13 13:50:52 -08:00
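
As a sketch of what such a check could look like, the following validates a name against the common lower-case DNS subdomain convention (labels of letters, digits and hyphens, separated by dots, at most 253 characters in total). `validAppName` and the pattern are assumptions made for illustration; the rules the commit above actually enforces may differ.

```go
package main

import (
	"fmt"
	"regexp"
)

// dnsSubdomain matches dot-separated labels that start and end with a
// lower-case letter or digit and contain only letters, digits and hyphens.
var dnsSubdomain = regexp.MustCompile(
	`^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$`)

// maxSubdomainLen is the usual upper bound on a full DNS name.
const maxSubdomainLen = 253

// validAppName reports whether name could be used as part of a DNS name.
func validAppName(name string) bool {
	return len(name) <= maxSubdomainLen && dnsSubdomain.MatchString(name)
}

func main() {
	for _, name := range []string{"grafana", "my-app.internal", "My_App", "-app"} {
		fmt.Printf("%q valid: %v\n", name, validAppName(name))
	}
}
```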
Russell Jones 6a4436107c Updated "teleport start" help message. 2020-11-12 20:46:52 -08:00
Russell Jones f13040a433 Added integration tests for Application Access. 2020-11-12 18:01:45 -08:00
Andrew Lytvynov 271d7ea4e7
Migrate kube_service CRUD endpoints to gRPC (#4792)
The REST endpoints weren't used in any release yet, so we don't need to
worry about backwards-compatibility.
2020-11-12 18:35:34 +00:00
Andrew Lytvynov 450f3e7b81
Add kubernetes_cluster to all kube-related events (#4794)
Also, fix a regression in tsh, where it wouldn't update the kubeconfig
on older clusters without kubernetes_service.
2020-11-12 17:20:42 +00:00
Andrew Lytvynov 4bc8011722
RBAC for kubernetes clusters (#4782)
* Add labels to KubernetesCluster resources

Plumb from config to the registered object, keep dynamic labels updated.

* Check kubernetes RBAC

Checks are in some CRUD operations on the auth server and in the
kubernetes forwarder (both proxy or kubernetes_service).
The logic is essentially copy-paste of the TAA version.
2020-11-11 22:58:33 +00:00
Andrew Lytvynov 52c52c7e20
Add "tsh kube" commands (#4769)
1. `tsh kube clusters` - lists registered kubernetes clusters
   note: this only includes clusters connected via `kubernetes_service`

2. `tsh kube credentials` - returns TLS credentials for a specific kube
   cluster; this is a hidden command used as an exec plugin for kubectl

3. `tsh kube login` - switches the kubectl context to one of the
   registered clusters; roughly equivalent to `kubectl config
   use-context`

When updating kubeconfigs, tsh now uses the exec plugin mode:
https://kubernetes.io/docs/reference/access-authn-authz/authentication/#client-go-credential-plugins
This means that on each kubectl run, kubectl will execute tsh with
special arguments to get the TLS credentials.

Using tsh as exec plugin allows us to put a login prompt when certs
expire. It also lets us lazy-initialize TLS certs for kubernetes
clusters.
2020-11-11 22:22:01 +00:00