Commit graph

680 commits

Author SHA1 Message Date
fheinecke b178b8b732
Updated Teleport codebase to AGPL3 license (#35259)
Signed-off-by: Fred Heinecke <fred.heinecke@goteleport.com>
2023-12-01 17:48:14 +00:00
Reed Loden 6cd68f0be6
Don't force the use of FIPS endpoints for DynamoDB Streams and Application Auto Scaling (#34876)
DynamoDB Streams and Application Auto Scaling do not currently have FIPS endpoints in
non-GovCloud, leading to invalid endpoints for FIPS users running in AWS Standard.

See also: https://aws.amazon.com/compliance/fips/#FIPS_Endpoints_by_Service

Regression from #34170.

Fixes #34804.

Additionally, clean up a few more AWS session initiations to be consistent and clear.
2023-11-29 21:18:04 +00:00
Edoardo Spadolini 93be07558e
Deflake TestIntegrations/Discovery again (#34953)
* Fix integration/helpers.WaitForClusters

* fixes to all remote cluster integration tests

* remove leftover debug message

* use random load balancer
2023-11-27 15:26:09 +00:00
Rafał Cieślak d5f36a02be
Connect My Computer: Reload certs if user already has role but role got updated with new login (#34717)
* Untangle assert on CertsReloaded from assigning existing role

We used to assume that if the CMC role already exists and it's assigned
to the user already, then CertsReloaded must be false.

This assumption is wrong. If the user already has the role but the role
gets updated because it didn't include the current system username,
the certs need to be reloaded in order to refresh the list of available
logins.

To be able to represent this scenario as a test case, we must untangle
the expectation on CertsReloaded from the act of adding the existing role
to the user.

* Reload certs if user already has role but logins in role got updated
2023-11-22 09:12:43 +00:00
Anton Miniailo a4be12fbcf
Reorganize service config test fields (#34208)
* Reorganize process config test fields

* Move PollingPeriod back from Testing field

* Fix comment text

Co-authored-by: Nic Klaassen <nic@goteleport.com>

---------

Co-authored-by: Nic Klaassen <nic@goteleport.com>
2023-11-16 05:29:07 +00:00
Anton Miniailo c979952058
Fix PROXY protocol handling of dedicated kube listener with TLS routing (#34317)
* Fix PROXY protocol handling of dedicated kube listener with TLS routing

* Improve test by checking both addresses in multiplexed mode
2023-11-15 20:36:23 +00:00
Andrew Burke 055d49bae8
Fix data race in file descriptors (#34183)
This change fixes a data race in the tests caused by file descriptors
being closed in more than one place.
2023-11-11 02:49:36 +00:00
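
The commit above (#34183) does not say how the double close was removed. As an illustration only, one common way to make a close safe when it can be reached from more than one place is to guard it with `sync.Once`. The `onceCloser` type below is hypothetical and is not taken from the Teleport codebase.

```go
package main

import (
	"fmt"
	"os"
	"sync"
)

// onceCloser is a hypothetical wrapper (not a Teleport type) that makes
// Close safe to call from several places: only the first call closes the file.
type onceCloser struct {
	f    *os.File
	once sync.Once
	err  error
}

// Close closes the underlying file exactly once and returns the result of
// that first close on every call.
func (c *onceCloser) Close() error {
	c.once.Do(func() { c.err = c.f.Close() })
	return c.err
}

func main() {
	r, w, err := os.Pipe()
	if err != nil {
		panic(err)
	}
	defer r.Close()

	c := &onceCloser{f: w}
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = c.Close() // concurrent closes are serialized by sync.Once
		}()
	}
	wg.Wait()
	fmt.Println("close error:", c.Close())
}
```

Every caller observes the result of the single real close, so no caller ever closes a descriptor that another goroutine has already released.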
Rafał Cieślak 90e5fab548
Connect My Computer: Derive agent label from username in main process (#34302)
* Rename cluster to rootCluster

* Rename AgentConfigFileClusterProperties to CreateAgentConfigFileArgs

* Connect My Computer: Derive agent label from username in main process
2023-11-08 17:22:04 +00:00
Andrew Burke 42822dab68
Deflake HTTP_PROXY tests (#33614)
This change rewrites a few HTTP_PROXY tests to be less flaky.
2023-11-08 17:01:05 +00:00
rosstimothy d5a796c056
Enable testify lint (#34222)
Updates our golangci-lint configuration to enable testifylint and
fixes all issues found.

Bump e ref to include gravitational/teleport.e#2567
2023-11-06 20:38:38 +00:00
Andrew Burke 34492de3ee
Improve tsh ssh parallel output (#33429)
This change improves the output of tsh ssh when running on multiple
nodes. Stdout and stderr are now labeled with the hostname of the
node they came from. The --log-dir flag on tsh ssh will create a
directory where the separated output of each command will be stored.
2023-10-30 21:51:36 +00:00
Brian Joerger 129e5901d0
Fix agentless leaf node authorization (#33993)
* Use remoteClient for remoteSite to ensure the correct authorization mechanism is used for openssh leaf nodes.

* Use remoteClient only for auth handler access point.

* Resolve nomenclature comments.

* Update integration test to cover same-name role mapping logic.

* Add nil check.

* Apply suggestions from code review

Co-authored-by: rosstimothy <39066650+rosstimothy@users.noreply.github.com>

---------

Co-authored-by: rosstimothy <39066650+rosstimothy@users.noreply.github.com>
2023-10-30 19:23:57 +00:00
Anton Miniailo 8a1be0cfd5
Improve UX for headless kube proxy by giving user more time when reissuing expired certificates (#33728)
* Improve UX for headless kube proxy by giving user more time when reissuing expired certs.

* Add support for '--set-context-name' to 'tsh proxy kube'
2023-10-23 20:56:35 +00:00
rosstimothy 2087a2fda8
Implement Create/UpdateRole on the auth server (#33491)
In addition to adding server and backend handling for create and
update roles, the services.Access interface was updated to return
a role from the existing Create/UpsertRole methods. Bumps the e
ref to incorporate the associated changes needed there to prevent
breaking the build.
2023-10-18 17:06:50 +00:00
Brian Joerger ae80f05398
Extend test timeouts. (#33587) 2023-10-17 23:45:29 +00:00
Alex McGrath 481617ef75
wait for nodes to register in 'TestIntegrations/DataTransfer' (#33568)
* wait for nodes to register in 'TestIntegrations/DataTransfer'

* resolve comments
2023-10-17 14:49:09 +00:00
STeve (Xin) Huang 103dd6e7b5
Fix an issue where tsh fails to connect to a Proxy behind a TLS-terminated load balancer in separate port mode (#33374) 2023-10-12 18:33:15 +00:00
Alex McGrath dcb2f13af4
Wait for nodes to be available in disconnection tests (#33298) 2023-10-12 13:17:38 +00:00
Andrew Burke c4b2861f70
Show resources in Slack notification for access requests (#32887)
This change updates Slack notifications for resource-based access
requests to include the resources being requested.
2023-10-10 21:01:32 +00:00
Nic Klaassen a635ce84ec
disable TestHSMDualAuthRotation (#33242) 2023-10-10 19:02:56 +00:00
rosstimothy b60ea81d54
Update users interface (#32987)
services.UsersService now takes a context and returns the user
from write operations as shown in the diff below. The bulk of the
changes are from modifying code to account for the additional
parameter and/or return value. Functional changes to better make
use of the new API will come in follow up PRs.

```diff
// UserGetter is responsible for getting users
type UserGetter interface {
	// GetUser returns a user by name
-	GetUser(user string, withSecrets bool) (types.User, error)
+	GetUser(ctx context.Context, user string, withSecrets bool) (types.User, error)
}

// UsersService is responsible for basic user management
type UsersService interface {
	UserGetter
	// CreateUser creates user, only if the user entry does not exist
-	CreateUser(user types.User) error
+	CreateUser(ctx context.Context, user types.User) (types.User, error)
	// UpdateUser updates an existing user.
-	UpdateUser(ctx context.Context, user types.User) error
+	UpdateUser(ctx context.Context, user types.User) (types.User, error)
	// UpdateAndSwapUser reads an existing user, runs `fn` against it and writes
	// the result to storage. Return `false` from `fn` to avoid storage changes.
	// Roughly equivalent to [GetUser] followed by [CompareAndSwapUser].
	// Returns the storage user.
	UpdateAndSwapUser(ctx context.Context, user string, withSecrets bool, fn func(types.User) (changed bool, err error)) (types.User, error)
	// UpsertUser updates parameters about user
-	UpsertUser(user types.User) error
+	UpsertUser(ctx context.Context, user types.User) (types.User, error)
	// CompareAndSwapUser updates an existing user, but fails if the user does
	// not match an expected backend value.
	CompareAndSwapUser(ctx context.Context, new, existing types.User) error
	// DeleteUser deletes a user with all the keys from the backend
	DeleteUser(ctx context.Context, user string) error
	// GetUsers returns a list of users registered with the local auth server
-	GetUsers(withSecrets bool) ([]types.User, error)
+	GetUsers(ctx context.Context, withSecrets bool) ([]types.User, error)
	// DeleteAllUsers deletes all users
-	DeleteAllUsers() error
+	DeleteAllUsers(ctx context.Context) error
}
```

Depends on gravitational/teleport.e#2346
Implements step 3 of #32949
2023-10-10 14:07:46 +00:00
Anton Miniailo a26b6d88bf
Fix Proxy Kube listener behavior regarding PROXY protocol usage (#32893)
* Fix Proxy Kube listener behavior regarding PROXY protocol usage

We always provided Proxy's PROXYProtocolMode to the listening kube server,
but its listener could already be behind an ALPN-multiplexed listener,
which already consumed PROXY protocol.

* Use clusterNetworkConfig

Co-authored-by: Tiago Silva <tiago.silva@goteleport.com>

* Improve wording.

* Add option for testing proxy kube multiplexer

* Modify option for setting IgnoreSelfConnections on kube's multiplexer

* Fix spelling

---------

Co-authored-by: Tiago Silva <tiago.silva@goteleport.com>
2023-10-08 23:16:04 +00:00
Nic Klaassen db39fb56f9
Reliability improvements for HSM tests (#32911)
* log message improvements

* fix etcd cleanup

* re-enable TestHSMDualAuthRotation

* retry client connection tests

* fixes based on code review

* make fix-imports

* fix: use EventuallyWithT

* set short polling period
2023-10-06 18:30:42 +00:00
Andrew LeFevre 31ac8ee746
fix leaf SSH sessions not getting recorded (#32163)
* fix leaf SSH sessions not getting recorded

* add integration test

* address feedback, overhaul integration test

* make each test case use fresh clusters to fix failing case

* address feedback

* Apply suggestions from code review

Co-authored-by: rosstimothy <39066650+rosstimothy@users.noreply.github.com>

* fix integration test failures

---------

Co-authored-by: rosstimothy <39066650+rosstimothy@users.noreply.github.com>
2023-10-06 12:54:57 +00:00
Nic Klaassen ca27ce9166
fix: improve reconnection reliability after process reloads (#32707)
This commit includes a 7 character fix in lib/service/connect.go to call
connector.Close() instead of connector.Client.Close() when a new client
fails to ping the auth server.
connector.Close() correctly avoids closing the client if it is a shared
copy of the Instance client.
The call to connector.Client.Close() was causing intermittent problems
where reconnectToAuthService could get stuck repeatedly trying to use
the same client that was just closed.
This appears to be fixed now that the Instance client is not being
improperly closed by other components.

I discovered this issue because it manifested itself in flaky failures
of TestHSMMigrate, where logs indicated that the Instance client was
being repeatedly reused but the connection was never successful:

```
{"caller":"service/connect.go:1057","component":"proc:18","level":"info","message":"Reusing Instance client for Proxy. additionalSystemRoles=[Proxy]","pid":"34558.18","timestamp":"2023-09-27T21:30:05Z"}
{"caller":"service/connect.go:166","component":"proc:18","level":"debug","message":"Connected client: Identity(Proxy, cert(c90e905c-76e7-4c68-803b-ba364167ec6f.testcluster issued by testcluster:173887050308815087166604899475019267945),trust root(testcluster:322819974523436048061473591931335284057),trust root(testcluster:173887050308815087166604899475019267945),trust root(testcluster:135083743987735629230336583041497316143))","pid":"34558.18","timestamp":"2023-09-27T21:30:05Z"}
{"caller":"service/connect.go:98","component":"proc:18","level":"debug","message":"Connected client Proxy failed to execute test call: rpc error: code = Canceled desc = grpc: the client connection is closing. Node or proxy credentials are out of sync.","pid":"34558.18","timestamp":"2023-09-27T21:30:05Z"}
time="2023-09-27T21:30:13Z" level=warning msg="connection problem: readfrom tcp 172.18.0.2:38114->172.18.0.2:41065: use of closed network connection *net.OpError" dest="172.18.0.2:41065" source="172.18.0.2:37496" trace.component=loadbalancer trace.fields="map[listen:8ce2fe8a89f0:0]"
time="2023-09-27T21:30:13Z" level=warning msg="Failed to forward connection: readfrom tcp 172.18.0.2:38114->172.18.0.2:41065: use of closed network connection." trace.component=loadbalancer trace.fields="map[listen:8ce2fe8a89f0:0]"
time="2023-09-27T21:30:17Z" level=warning msg="Failed to create inventory control stream: rpc error: code = Canceled desc = grpc: the client connection is closing."
{"caller":"service/connect.go:124","component":"proc:18","level":"debug","message":"Retrying connection to auth server after waiting 41.323026451s.","pid":"34558.18","timestamp":"2023-09-27T21:30:46Z"}
{"caller":"service/connect.go:189","component":"proc:18","level":"debug","message":"Connected state: rotating servers (mode: manual, started: Sep 27 2023 21:29:24 UTC, ending: Sep 29 2023 03:29:24 UTC).","pid":"34558.18","timestamp":"2023-09-27T21:30:46Z"}
{"caller":"service/connect.go:1057","component":"proc:18","level":"info","message":"Reusing Instance client for Proxy. additionalSystemRoles=[Proxy]","pid":"34558.18","timestamp":"2023-09-27T21:30:46Z"}
...repeating...
```

The HSM tests have become flaky in the past when reload/reconnect bugs like
this have been introduced, but they are long tests that are a bit tricky
to run locally and issues like this one can be difficult to diagnose.
To try to improve our chances of catching these issues in the future,
I've written a new test that starts up an Auth and Proxy process and
repeatedly reloads both of them, asserting that the reload is always
successful in a reasonable amount of time.

The new test is able to catch the bug every time I have run it locally,
usually in ~4 out of the 8 parallel invocations.
I have not seen any failures with the fix applied.
The entire test completes in ~12 seconds on my local machine.
2023-09-29 17:50:27 +00:00
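
The distinction #32707 above draws between `connector.Close()` and `connector.Client.Close()` can be sketched as follows. This is a simplified model, not the code in lib/service/connect.go: the `client` and `connector` types and the `sharedClient` flag are hypothetical.

```go
package main

import "fmt"

// client is a hypothetical stand-in for an auth client; it only tracks
// whether it has been closed.
type client struct {
	name   string
	closed bool
}

// Close marks the client as closed.
func (c *client) Close() error {
	c.closed = true
	return nil
}

// connector is a simplified, hypothetical model of a connector that may
// hold either its own client or a shared copy of the instance client.
type connector struct {
	Client       *client
	sharedClient bool // assumption: the connector knows whether it owns Client
}

// Close closes the client only when the connector owns it, so a shared
// instance client stays usable by other components.
func (c *connector) Close() error {
	if c.Client == nil || c.sharedClient {
		return nil
	}
	return c.Client.Close()
}

func main() {
	instance := &client{name: "instance"}
	proxy := &connector{Client: instance, sharedClient: true}

	_ = proxy.Close()
	fmt.Println("after connector.Close():", instance.closed)

	_ = proxy.Client.Close() // bypasses the ownership check
	fmt.Println("after connector.Client.Close():", instance.closed)
}
```

Calling `Close` on the inner client directly skips the ownership check, which in this model is how a shared client ends up closed while other holders still expect to use it.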
Brian Joerger 1c88f9ed1b
Move lib/utils/prompt to api/utils/prompt (#32334)
* Move /lib/utils/prompt to /api/utils/prompt.

* Replace uses of lib/utils/prompt with api/utils/prompt and delete package.

* go mod tidy.
2023-09-25 19:31:37 +00:00
Alex McGrath ab80b3d693
Allow sudoer files to be created separately from host user creation (#31793)
* Allow sudoer files to be created without host users

* Don't sort when fetching HostUsers

* Don't cleanup at session end

* Resolve comments

* Resolve comments, switch to a notimplemented host sudoers
2023-09-22 15:00:48 +00:00
Grzegorz Zdunek 027566a8cb
Connect My Computer: Remove the agent (#31020)
* Add RPCs for removing the node and reading its name

* Extract `isAccessDeniedError`

* Add a function to remove agent directory

* Add methods in Connect My Computer service to remove node, agent directory and connections

* Do not print warning when there is no agent to kill. The agent could not be started or even configured, so there is no point in showing that warning.

* Remove agent by clicking a button in the status document

* Remove agent by logging out

* Improve comments and error message

* `getConnectMyComputerNodeName` should return `string`, not `ServerUri`

* Move `removeConnections` method from `ConnectMyComputerService` to `ConnectMyComputerContext`

* Simplify integration test

* Document that connections have to be removed before removing agent dir

* Ignore NOT_FOUND errors

* Show a notification after removing the agent and close the tab

* `readUUid` -> `readUUID`

* Run prettier

* Extract a function that renders `useConnectMyComputerContext` hook to avoid duplicating the setup

* Move showing notification outside `catch` block, add tests

* Use `connection.kind` instead of parsing the resource URI

* Add `assertUnreachable`

* Pass `closeDocument` function to the status component instead of a document object

* Post-rebase fixes
2023-09-21 13:07:58 +00:00
Rafał Cieślak 847a1b1167
Implement waiting for Connect My Computer node to join cluster (#30905)
* Add daemon.Service.ResolveClusterURI

* Accept agents dir through command line flag

tshd needs to know this out of band, so that when the Electron app tells
it to watch for host UUID file for a specific cluster, the Electron app
can send just the profile name of the cluster instead of an arbitrary path
on the computer.

* Implement WaitForConnectMyComputerNodeJoin in tsh daemon

* wait: Use addEventListener instead of onabort

* Make TshAbortController emit abort event only once

This aligns it with a regular AbortController, which also emits the event
only once.

* Refactor how types are imported in tshd fixtures

* Implement WaitForConnectMyComputerNodeJoin in Electron app

* createAbortController: Add signal.aborted, use emitter.once

* Improve wait function based on Deno implementation

72d6e6641e/async/delay.ts (L39)

* Add a comment about the events package
2023-09-21 11:43:10 +00:00
Alex McGrath d8e05dd3ae
Manually create the user's HOME rather than letting useradd do it (#32207)
Test that CreateHomeDirectory does not follow symlinks

resolve comments

add our own recursive directory copy

Resolve comments

Fix for "Manually create the users HOME rather than letting useradd do it"
2023-09-20 19:06:10 +00:00
Michael Wilson b87a2c9853
Remove Cf-Access-Token header. (#32139)
The Cf-Access-Token header seems to be an infrequently used header that can
easily increase the size of the header by `len(roles) + len(traits)`, which
can cause problems. Users are able to add this in on their own if they need
it using header rewriting, so we'll remove this.
2023-09-19 22:14:57 +00:00
Andrew Burke da680d8303
Add btmp support for user accounting (#31546)
This change adds support for the btmp file (failed logins)
for user accounting. It also fixes a bug where the remote
address of a connection was not being correctly logged.
2023-09-16 00:13:53 +00:00
Forrest 2d8e6d3776
always generate request IDs server-side (#31760)
* server-side request ids

* update e-ref
2023-09-13 16:08:11 +00:00
Anton Miniailo 56d6ec4eb3
Change 'proxy_protocol' default mode and behavior (#31622)
* Change 'proxy_protocol' default mode and behavior

Now by default `proxy_protocol` is unspecified, and in that mode
we don't allow IP-pinned connections and mark incoming connections by setting
source port = 0.
'on' mode now requires a PROXY header.

* Don't require PROXY headers for connection to itself.

There are cases when Teleport will call itself and it can go directly,
avoiding the load balancer, so the connection will not have an unsigned PROXY header.

* Refactor validation of 'proxy_protocol' input

* Clarify PROXY protocol modes description

* Tweak unexpected PROXY protocol line error message

* Add comments about setting port to 0

* Return an error if we get unsigned PROXY line after signed

* Improve wording for error when IP pinning is not allowed when source port = 0

* Add test for checking unspecified PROXY protocol mode on multiplexer

* Fix the test

* Refactor tracking of receiving of unsigned PROXY line.

It makes sure we don't allow multiple PROXY lines with LOCAL command.

* Improve comment wording.

* Fix typo and improve wording.

Co-authored-by: Roman Tkachenko <roman@goteleport.com>

* Clarify PROXYProtocolMode description

* Add comment about self connections

* Set correct PROXYProtocolMode for multiplexers that don't need unsigned PROXY support

* Fix typo.

Co-authored-by: Roman Tkachenko <roman@goteleport.com>

* Add comment about PROXY protocol mode on kube service.

* Clarify comment for starting auth service in unspecified PROXY protocol mode

* Add dedicated error for rejecting IP pinning on connection with port=0

* Improve error message

* Rate limit logging error about unexpected PROXY line.

* Tweak error message.

---------

Co-authored-by: Roman Tkachenko <roman@goteleport.com>
2023-09-12 21:41:20 +00:00
Zac Bergquist 616032bced
Fix some lint warnings (#31740)
Including redundant types in composite literals and duplicate
imports in the same file.
2023-09-12 16:18:23 +00:00
Noah Stride c7cc451667
[Buddy 30860] Added --insecure flag to tbot (#31093)
* Added --insecure flag to tbot; Added test; Added test-setup

* Removed old file

* Review comments

* Updated tests; Rework CAPins & CAPath verification; Split functions

* Cleanup old debug lines

* Cleaned up tests; Remove unnecessary InsecureSkipVerify;

* Add back InsecureSkipVerify to fix Authentication

* renamed DefaultBotConfigOpts parameter; remove some stale debug code; restored wrongfully deleted InsecureSkipVerify;

* remove stale newline

* Improved warnings

* Updated tbot usage example

* Fix failing test; Cleanup Makefile target

* Removed unused config option from OnboardingConfig; Fixed import order

* Rename test; Rework if statement; Fix newline

* Updated shell script to comply with shellcheck; added example yaml to gitignore

* Remove example file

* Tidier comments for reg code

* behavior

* Remove unused var from Makefile test-go-unit-tbot

* Further simplify makefile

---------

Co-authored-by: FireDrunk <thijs.cramer@gmail.com>
2023-09-11 08:16:18 +00:00
Andrew LeFevre 5054cb685c
return an error when attempting to join a session of an OpenSSH node (#31472)
* return an error when attempting to join a session of an OpenSSH node

* remove item from test plan and note to docs

* add test coverage to integration test

* fix integration test

* fixed linter issue
2023-09-08 22:16:02 +00:00
Edoardo Spadolini dbc0996999
Deflake TestIntegrations/Discovery (#31580)
* Deflake TestIntegrations/Discovery

* Shadow the testing.T in EventuallyWithT
2023-09-07 17:11:21 +00:00
Andrew Burke 4cb4ac8291
Make TestIntegrations/ReconcileLabels a unit test (#31124)
* Increase timeout for waiting for label update

* Advance clock more often

* Make TestReconcileLabels a unit test

* Fix imports

* Fix test

* Increase require.Eventually wait time

* Mock control stream
2023-09-07 16:44:46 +00:00
Forrest beeabd6419
improve ControlMaster error reporting (#31296) 2023-09-01 21:07:53 +00:00
STeve (Xin) Huang d7a7a7e9eb
Attempt to refactor gateway CLI command (#31035) 2023-08-30 19:19:51 +00:00
Zac Bergquist c3e6173651
Remove use of require assertions inside Eventually calls (#31112)
* Remove use of require assertions inside Eventually calls

require.Eventually runs the predicate function in a background
goroutine. It is invalid to use require to make assertions
inside the eventually, because require will fail the test if the
assertion fails, and tests can only be failed from the test's
main goroutine.

* Use EventuallyWithT
2023-08-30 16:54:53 +00:00
Alex McGrath e0086909cc
Don't allow directly dialing to servers not in inventory (#30323)
* Don't allow directly dialing to servers not in inventory

add direct dial escape hatch

* Fix failing unit test

* Fix TestProxySSH

* Fix TestTraitsPropagation

* resolve comment

* fix non-multiplexed trusted cluster setup in tsh test suite

* Fix TestProxySSH

* wait on nodes

* Skip the flaky check for TestSSHLoadAllCAs

---------

Co-authored-by: Forrest Marshall <forrest@goteleport.com>
2023-08-29 11:52:55 +00:00
Rafał Cieślak 04aafb51e8
Get accessInfo based on user on access request drop (#31068)
* Get accessInfo based on user on access request drop

That's how it used to be before user login state was introduced. When
dropping a resource access request, we want to restore certs back to the
state before the access request was assumed, so that the user access is
not limited only to select resources. In the past, this was done by
calculating accessInfo from a plain user object.

This approach had the side effect of refreshing the role list of the user
based on the current backend state without the need to provide credentials
again. Teleport Connect used this side effect to make the setup of Connect
My Computer interaction-free.

Theoretically, it'd be beneficial for `tsh request drop` to use login state
rather than the current backend state, as it'd make it impossible to "escalate"
privileges by refreshing the list of roles without authenticating again.
However, this breaks the setup of Connect My Computer as it expects
GenerateUserCerts to return a role list based on a current user role list.

This commit reverts that change. An alternative would be to change Connect
My Computer setup to require a one-time relogin midway through.

* Use regular login flow in CreateConnectMyComputerRole integration test

* Fix typo
2023-08-28 15:54:10 +00:00
Marek Smoliński 21e9b85774
Try to Fix Flaky Test (#27939) 2023-08-27 20:06:52 +00:00
Andrew Burke 2f99a9a172
Allow Azure/IAM join over reverse tunnel (#30720)
This change adds support for gRPC-based join methods (Azure and IAM)
over the reverse tunnel port.
2023-08-24 18:35:26 +00:00
Forrest 37e18d94b6
improve ControlMaster error reporting (#30631) 2023-08-18 15:21:18 +00:00
Forrest 47414c7aae
single auth client per instance (#30384) 2023-08-17 15:16:20 +00:00
Brian Joerger fc6bcf3cfb
Remove exported Webauthn functions (#30420)
* Add WebauthnLogin field to teleportClient and tsh for tests.

* Use custom WebauthnLogin func instead of test export.

* Remove HasPlatformSupport exported function.

* Add todo to remove lib/client/export.go.

* Parallelize affected tests.

* Apply suggestions from CR.
2023-08-17 02:18:23 +00:00
Gavin Frazar 2cb26477f2
Add RDS Postgres end-to-end tests (#29755)
* test RDS database discovery
* test RDS postgres instance connection
* organize some common test helpers for eks/rds e2e tests
* exclude e2e tests from flaky test base step
* exclude e2e tests in other test flows
* skip e2e db tests by default via env var check
* add postgres web conn test
2023-08-16 22:37:20 +00:00