Commit graph

15336 commits

Author SHA1 Message Date
Yassine Bounekhla 33cfd9dc5f
add usage events for eice discover (#32617) 2023-09-29 21:13:58 +00:00
Nic Klaassen ca27ce9166
fix: improve reconnection reliability after process reloads (#32707)
This commit includes a 7 character fix in lib/service/connect.go to call
connector.Close() instead of connector.Client.Close() when a new client
fails to ping the auth server.
connector.Close() correctly avoids closing the client if it is a shared
copy of the Instance client.
The call to connector.Client.Close() was causing intermittent problems
where reconnectToAuthService could get stuck repeatedly trying to use
the same client that was just closed.
This appears to be fixed now that the Instance client is not being
improperly closed by other components.

I discovered this issue because it manifested itself in flaky failures
of TestHSMMigrate, where logs indicated that the Instance client was
being repeatedly reused but the connection was never successful

```
{"caller":"service/connect.go:1057","component":"proc:18","level":"info","message":"Reusing Instance client for Proxy. additionalSystemRoles=[Proxy]","pid":"34558.18","timestamp":"2023-09-27T21:30:05Z"}
{"caller":"service/connect.go:166","component":"proc:18","level":"debug","message":"Connected client: Identity(Proxy, cert(c90e905c-76e7-4c68-803b-ba364167ec6f.testcluster issued by testcluster:173887050308815087166604899475019267945),trust root(testcluster:322819974523436048061473591931335284057),trust root(testcluster:173887050308815087166604899475019267945),trust root(testcluster:135083743987735629230336583041497316143))","pid":"34558.18","timestamp":"2023-09-27T21:30:05Z"}
{"caller":"service/connect.go:98","component":"proc:18","level":"debug","message":"Connected client Proxy failed to execute test call: rpc error: code = Canceled desc = grpc: the client connection is closing. Node or proxy credentials are out of sync.","pid":"34558.18","timestamp":"2023-09-27T21:30:05Z"}
time="2023-09-27T21:30:13Z" level=warning msg="connection problem: readfrom tcp 172.18.0.2:38114->172.18.0.2:41065: use of closed network connection *net.OpError" dest="172.18.0.2:41065" source="172.18.0.2:37496" trace.component=loadbalancer trace.fields="map[listen:8ce2fe8a89f0:0]"
time="2023-09-27T21:30:13Z" level=warning msg="Failed to forward connection: readfrom tcp 172.18.0.2:38114->172.18.0.2:41065: use of closed network connection." trace.component=loadbalancer trace.fields="map[listen:8ce2fe8a89f0:0]"
time="2023-09-27T21:30:17Z" level=warning msg="Failed to create inventory control stream: rpc error: code = Canceled desc = grpc: the client connection is closing."
{"caller":"service/connect.go:124","component":"proc:18","level":"debug","message":"Retrying connection to auth server after waiting 41.323026451s.","pid":"34558.18","timestamp":"2023-09-27T21:30:46Z"}
{"caller":"service/connect.go:189","component":"proc:18","level":"debug","message":"Connected state: rotating servers (mode: manual, started: Sep 27 2023 21:29:24 UTC, ending: Sep 29 2023 03:29:24 UTC).","pid":"34558.18","timestamp":"2023-09-27T21:30:46Z"}
{"caller":"service/connect.go:1057","component":"proc:18","level":"info","message":"Reusing Instance client for Proxy. additionalSystemRoles=[Proxy]","pid":"34558.18","timestamp":"2023-09-27T21:30:46Z"}
...repeating...
```

The HSM tests have become flaky in the past when reload/reconnect bugs like
this have been introduced, but they are long tests that are a bit tricky
to run locally and issues like this one can be difficult to diagnose.
To try to improve our chances of catching these issues in the future,
I've written a new test that starts up an Auth and Proxy process and
repeatedly reloads both of them, asserting that the reload is always
successful in a reasonable amount of time.

The new test is able to catch the bug every time I have run it locally,
usually in ~4 out of the 8 parallel invocations to runs.
I have not seen any failures with the fix applied.
The entire test completes in ~12 seconds on my local machine.
2023-09-29 17:50:27 +00:00
Andrew Burke 15d190b26c
Add support for Client ID to Azure VM auto-discovery (#32360)
This change adds the `client_id` optio nto the Discovery Service for
Azure VMs, which sets the client ID of the managed identity for discovered
nodes to use when joining the cluster. This allows the discovered nodes
to be discovered while having multiple managed identities assigned.
2023-09-29 16:22:08 +00:00
Noah Stride c8a509c4eb
Improve explanation of TBOT_GITLAB_JWT config in GitLab guide (#32785)
* Improve explanation of `TBOT_GITLAB_JWT` config in GitLab guide

* Explain why

* Explain purpose of `id_tokens`

* Clarify a little further

* Update docs/pages/machine-id/guides/gitlab.mdx

Co-authored-by: Zac Bergquist <zac.bergquist@goteleport.com>

---------

Co-authored-by: Zac Bergquist <zac.bergquist@goteleport.com>
2023-09-29 15:21:52 +00:00
STeve (Xin) Huang 5fa66ab37c
Improve RDS MySQL IAM auth error message (#32742) 2023-09-29 14:31:06 +00:00
Michael Wilson 0bf204cd6b
Check to make sure defaultAllowRules matches preset roles. (#32128)
* Check to make sure defaultAllowRules matches preset roles.

The tests for role defaults now ensure that `defaultAllowRules` will add
the rules that are defined in the `access`, `auditor`, and `editor` presets.
Additionally, several missing rules have been added into `defaultAllowRules`.

* Use presets for defaultAllowRules.

* Remove comments referring to defaultAllowRules in preset definitions.
2023-09-29 14:02:28 +00:00
Rafał Cieślak 8098f63d2f
Improve Connect My Computer UI & logout experience (#32786)
* Kill agent after removing workspace on logout

Quick shutdown of an agent now induces a 3 second timeout if there are
active connections (#31869). I changed the logout procedure so that we
first remove the workspace (and thus close all tabs) and only then kill
the agent. This makes it so that if there were any open connections from
the app, we'll close them before killing the agent, which means the app
will close without that 3 second timeout.

* Make Start Agent button's handler ignore errors

* Avoid rendering labels if there's no node

This could happen if someone started and stopped the agent and then clicked
Start Agent with expired certs. agentNode would go from undefined to defined
back to undefined, but it seems that <Transition> doesn't unmount immediately.

* clusters.Cluster.Logout: Make error handling more explicit

!trace.IsNotFound(err) returns true both when the error is nil and when
it's not nil but the error is not NotFound. Meaning that code after the
conditional would run only when err is NotFound.

I added a bogus error to test something after the conditional and then
spent 30 minutes wondering what's going on.

* Display an empty ring if agent is set up but not running

This makes it consistent with the Connections list which also displays
an empty ring if there are no connections.

* Expand NavigationMenu stories, add story for agent starting

* Avoid calculating indicator status if user cannot use feature

* Copy progress icon from Cloud's ProgressBar

* Return 'not-configured' early if agent is not configured

* Add story for when agent is configured but not started, reorder stories
2023-09-29 12:55:58 +00:00
Krzysztof Skrzętnicki 818d9edb2a
Fix data race in Postgres engine on connection close (#32650)
* Fix data race in Postgres engine.

* Use channel instead of `sync.WaitGroup`.
2023-09-29 07:24:09 +00:00
Michael Wilson e93d165ed1
Redundant event prefix check before event watcher is created. (#32748)
* Revert "Error when redundant prefixes are detected in events. (#32652)"

This reverts commit 84dbc45e1d.

* Error when redundant prefixes are detected in events.

When creating a new events watcher, redundant prefixes will be detected
and produce an error. This should prevent developer mistakes where watched
prefixes overlap, causing subsets of events not to be parsed. This has been
verified manually.

* Fix buffer_test.go

* Update lib/services/local/events.go

Co-authored-by: Edoardo Spadolini <edoardo.spadolini@goteleport.com>

---------

Co-authored-by: Edoardo Spadolini <edoardo.spadolini@goteleport.com>
2023-09-29 01:22:08 +00:00
Brian Joerger 2c53f04310
Client check for YubiKey private key touch/pin policies (#32264)
* Preemptively check yubikey private key touch/pin policies on the client side.

* Replace hardware key signer functions with private key methods.

* Refactor overwrite/generate key logic.
2023-09-29 00:10:05 +00:00
Andrew LeFevre 492e3b36d0
re-add agentless node manual installation docs (#31647)
* re-add agentless node manual installation docs

* fix linter issues

* Add missing closing paren

Co-authored-by: Paul Gottschling <paul.gottschling@goteleport.com>

* use tabs in manual install guide

* removed more ScopedBlocks I missed

* add note about upgrading to v14

* add redirect and notes linking the two guides to one another

* addressed feedback

* Apply suggestions from code review

Co-authored-by: Paul Gottschling <paul.gottschling@goteleport.com>

* fix links

---------

Co-authored-by: Paul Gottschling <paul.gottschling@goteleport.com>
2023-09-28 23:00:29 +00:00
lsgunn-teleport a1ac96a677
Replace Access Plane with Access Platform (#32769)
* Replace Access Plane with Access Platform

* Fix Telleport
2023-09-28 22:40:16 +00:00
Forrest e3b574a797
allow teleport to start when some etcd nodes are unreachable (#32438) 2023-09-28 22:34:44 +00:00
Forrest 45649440b0
reduce log spam (#31208) 2023-09-28 22:26:21 +00:00
Tiago Silva 61e04e0f2e
Fix remote pool of signed certs when exec into leaf clusters (#32644)
* Fix remote pool of signed certs when exec into leaf clusters

This PR fixes the list of acceptable CAs from the leaf cluster when
exec into a leaf cluster pod.

Fixes #32380

Signed-off-by: Tiago Silva <tiago.silva@goteleport.com>

* add unit test

---------

Signed-off-by: Tiago Silva <tiago.silva@goteleport.com>
2023-09-28 17:55:11 +00:00
Jakub Nyckowski eaa5d5be61
Revise Docker handling in OS compatibility script (#32711)
* Revise Docker handling in OS compatibility script

This commit revises how Docker containers are interacted with in build-test-compat.sh. Optimized Docker image pulling process by pulling images in parallel to speed up the testing process. Makefile targets in Github workflow are also parallelized to speed up the build process.

* Simplify and parallel docker logic
2023-09-28 17:39:24 +00:00
Alan Parra ca4caf92bf
Allow MFA deletion using challenges (#32702)
* Add ExistingMFAResponse to DeleteMFADeviceSyncRequest

* Update generated protos

* Support challenge-based deletion on DeleteMFADeviceSync

* Refactor TestDeleteMFADeviceSync; use optional mode and drop spares
2023-09-28 16:53:51 +00:00
Yassine Bounekhla 10a1f2d1d1
add eice discover flow (#32202) 2023-09-28 16:38:26 +00:00
Zac Bergquist c2f470fe66
Fix the in-product link to trusted cluster docs (#32671)
Closes #31354
2023-09-28 15:36:17 +00:00
Anton Miniailo 4def17e560
Add connection information to multiplexer logs so it's easier to investigate (#32712) 2023-09-28 14:52:21 +00:00
Rafał Cieślak 4ff4eb32a0
Make spacing of Connect My Computer status more consistent (#32733)
* Make spacing of Connect My Computer status more consistent

* Add server labels to Running story, add ErrorWithAlertAndLogs story

* Change copy depending on whether the agent is running

* Fix proxyVersion in story

Co-authored-by: Grzegorz Zdunek <gzdunek@users.noreply.github.com>

---------

Co-authored-by: Grzegorz Zdunek <gzdunek@users.noreply.github.com>
2023-09-28 14:33:48 +00:00
Alan Parra 32bcd71591
fix: Fix panic on tsh device enroll --current-device (#32694)
* Test RunAdmin enrollment failure

* Fix RunAdmin when enrollment fails, protect tsh from nil device
2023-09-28 13:56:46 +00:00
Alan Parra 601c9525b7
Cleanup and simplify MFA test helpers (#32592)
* Cleanup getMockedWebauthnAndRegisterRes - TestCompleteAccountRecovery

* Cleanup getMockedWebauthnAndRegisterRes - TestChangeUserAuthentication

* Cleanup getMockedWebauthnAndRegisterRes - createUserWithSecondFactors

* Cleanup and drop getMockedWebauthnAndRegisterRes

* Simplify addOneOfEachMFADevice

* Use SetEmitter()

Co-authored-by: rosstimothy <39066650+rosstimothy@users.noreply.github.com>

---------

Co-authored-by: rosstimothy <39066650+rosstimothy@users.noreply.github.com>
2023-09-28 13:53:48 +00:00
Grzegorz Zdunek 288241d696
Update e (#32721) 2023-09-28 08:29:23 +00:00
Lisa Kim a2d788d698
Add promoted access list title to teleterm access request (#32708) 2023-09-28 06:13:12 +00:00
Jakub Nyckowski 6e7f538aae
An attempt to fix our failing builds (#32681)
* An attempt to fix our failing builds

* Add merge_group condition to checkout step in workflows

This update adds a condition to the checkout step in various GitHub workflows to ensure it only runs when the event_name is "merge_group".

* Fix syntax

* Use v4 tag for checkout action instead of pinned commit

Co-authored-by: Reed Loden <reed@goteleport.com>

---------

Co-authored-by: Reed Loden <reed@goteleport.com>
2023-09-27 23:20:56 +00:00
fheinecke d20606f54e
Cut CI unit test runtime in half (#32706) 2023-09-27 23:17:19 +00:00
fheinecke 2322e007cf
Fixed issue where prerelease container image tags can overwrite production container image tags (#32699) 2023-09-27 22:14:11 +00:00
Steven Martin b97856865f
docs: helm updates (#32639) 2023-09-27 21:36:39 +00:00
Michelle Bergquist b2e6c16565
conditionally show assist popover (#32267) 2023-09-27 20:36:51 +00:00
Steven Martin 3d16a9383f
docs: update Teleport Team prereqs (#32687)
* docs: update Teleport Team prereqs

* update Team prereqs

* correct link

* correct link
2023-09-27 20:22:19 +00:00
Marco André Dinis 9a428beac6
DiscoveryConfig: add service with rbac support (#32371)
* DiscoveryConfig: add service with rbac support

This PR adds the DiscoveryConfig service protected by RBAC rules.

A PR will follow that uses this service to expose the service in the
gRPC server.

* review pt1
2023-09-27 20:09:26 +00:00
Michael Wilson 84dbc45e1d
Error when redundant prefixes are detected in events. (#32652)
* Error when redundant prefixes are detected in events.

When creaeting a new events watcher, redundant prefixes will be detected
and produce an error. This should prevent developer mistakes where watched
prefixes overlap, causing subsets of events not to be parsed. This has been
verified manually.

* Add in test for event watcher verification.

* Run GCI.
2023-09-27 20:05:04 +00:00
Lisa Kim 13f749a971
Web: Extract re-usable parts and add new icons (#32529)
* Create useRefClickOutside

* Add notification icon

* Add teleport gear icon

* Extract components for re-use

* Add list magnifying glass icon

* Address CRs
2023-09-27 19:08:14 +00:00
Brian Joerger f87c46ad73
Make Hardware Key unit test interactive (#32235)
* Make yubikey unit test interactive and add to test plan.

* Move yubikey hardware signer method tests to interactive yubikey test.

* Remove hardware key interactive unit test from testplan
2023-09-27 18:55:47 +00:00
Jakub Nyckowski eb82a9b5db
Add Access List title to access request. (#32618)
* Add Access List title to access request.

* Add test
Squash some bugs
2023-09-27 18:45:49 +00:00
Forrest b3822abafa
various loadtest improvements (#31804) 2023-09-27 18:09:40 +00:00
Anton Miniailo 31de09a02c
Fix usage of ClusterName from config when starting Auth server (#32624) 2023-09-27 17:42:51 +00:00
Steven Martin b654e1d0b1
docs: remove multi level claim reference (#32613)
* docs: remove multi level claim reference

* fix ref
2023-09-27 17:14:46 +00:00
Andrew Burke cf3bf183b9
Drain unused SSH channels (#32505)
This change drains unused SSH channels and requests to prevent a
situation where an attacker could repeatedly open channels and
send data that won't be read, causing Teleport to eventually run
out of memory.
2023-09-27 17:05:48 +00:00
Michael 45b7ee73cd
Sort cloud label names to the back (#32589) 2023-09-27 15:53:32 +00:00
Reed Loden 1a9c026741
Convert examples/teleport-usage to use distroless image (#32532)
Standardize `examples/teleport-usage` to use the same base image
and other (general) build commands as `integrations/kube-agent-updater`
and `integrations/operator`.

The main change is moving from `debian:stable-slim` to `distroless/static-debian12`.
2023-09-27 15:26:08 +00:00
STeve (Xin) Huang d2a284be11
[docs] DB access troubleshoot sts:AssumeRole not authorized (#32406)
* [docs] DB access troubleshot sts:AssumeRole not authorized

* fix spelling

* review comments
2023-09-27 15:02:56 +00:00
Rafał Cieślak aa02d5b75a
Connect: Add --debug flag, don't pass --insecure flag in dev mode by default (#32629)
* Do not pass --insecure to tshd in dev mode by default

We ran into a bug related to certs in tshd that wasn't uncovered during
development because of the --insecure flag being passed by default.

* Add --debug flag

We need a way to pass --debug to tshd from a packaged app.
2023-09-27 14:30:23 +00:00
Krzysztof Skrzętnicki f47a1b4b53
Properly apply client_idle_timeout to database access sessions (#32485)
* Add test for auto disconnect (disconnect happens, query updates the timer).

* Fix: In database service, `clientConn` returned from `MonitorConn`
was never used, causing unjustified idle timeouts.
2023-09-27 14:18:59 +00:00
rosstimothy 5dc2841177
Start migrating tctl edit away from force creating resources (#32604)
`tctl edit` was always performing a forceful update in the same way
that `tctl create -f` was. This prevents optimistic from being
enforced during the update step of the edit command and thus nullifies
some of the usefulness of the feature(preventing concurrent updates
to a resource).

Since not all resources support Update operations, some only support
Upsert, and optimistic locking will slowly be added one resource at
at time the new behavior was only implemented for user resources.
The UpdateHandlers will be updated in follow up PRs when the resource
has support for optimistic locking added.
2023-09-27 13:53:49 +00:00
Grzegorz Zdunek cc8fb14258
Connect My Computer: Agent compatibility fixes (#32477)
* Allow agent compatibility state to be 'unknown'

* Show warning icon instead of read dot as the status indicator

* Add a comment

* Remove unnecessary check for agent compatibility

* Avoid creating portal if topBarContainerRef is not present

* Check compatibility when autostarting, not before autostart

* Flip autoStart to false on failed autostart

* Don't show Alert from CompatibilityError if current action has failed

* Run prettier

---------

Co-authored-by: Rafał Cieślak <rafal.cieslak@goteleport.com>
2023-09-27 13:21:45 +00:00
Michael Wilson bccb25bb8f
Fix Access List Members cache and eventing. (#32619)
* Fix Access List Members cache and eventing.

Two things were happening that were shadowing the Access List members cache
and eventing.

1. In the cache collections, the wrong reader was being assigned to the
   lookup map. The correct reader was being used elsewhere, however, so the
   caching tests appear to have still been working.
2. The watcher in lib/services/local/events.go apparently collapses prefixes
   if they overlap. Prefix `access_list_members` is encompassed by
   `access_list`, so the access list members prefix was eliminated from the
   watcher. As a result, access list member events were being processed by
   the access list parser, which resulted in non-critical warnings.

Local testing and dogfooding has yielded that this has had no apparent impact,
at least in situations without cache propagation. However, I've got a feeling
that this could affect situations with multiple auth servers.

While I'm here, I've eliminated the pointer-to-pointer logic in the access
list unmarshaling, which was excised elsewhere and should be excised here as
well.

* Use ExactKey, fix accessListMemberParser as well.
2023-09-27 13:04:52 +00:00
rosstimothy 06e49aaef1
Use Proxy gRPC API during log in (#32388) 2023-09-27 11:16:40 +00:00
dependabot[bot] 7cb13b09cc
Bump graphql from 16.6.0 to 16.8.1 (#32329)
Bumps [graphql](https://github.com/graphql/graphql-js) from 16.6.0 to 16.8.1.
- [Release notes](https://github.com/graphql/graphql-js/releases)
- [Commits](https://github.com/graphql/graphql-js/compare/v16.6.0...v16.8.1)

---
updated-dependencies:
- dependency-name: graphql
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-27 08:50:48 +00:00