Commit graph

680 commits

Author SHA1 Message Date
Nic Klaassen 46f0265546
cleanup: shrink/remove keystore interface (#17908)
* shrink/remove keystore interface

This commit introduces the keystore.Manager type to handle all
interaction between CA and the keystore backends.

Why:

* reduces the code that needs to be implemented per keystore backend to
  only the necessary operations
* separate concerns of managing key material and handling CA data
  structures
* define interfaces where they're used, not implemented
* delete net 245 lines of code
* reduce keystore.KeyStore stutter
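
The "define interfaces where they're used" point can be sketched in Go as follows. This is a minimal illustration, not the actual keystore.Manager API: `signerSource`, `memoryBackend`, `manager`, and `SignWithCA` are hypothetical names, and the "signature" is a toy.

```go
package main

import (
	"errors"
	"fmt"
)

// signerSource is declared next to its consumer and lists only the one
// operation the consumer needs. The name and method are hypothetical.
type signerSource interface {
	Sign(keyID string, msg []byte) ([]byte, error)
}

// memoryBackend is an illustrative backend. It satisfies signerSource
// implicitly and never refers to the interface.
type memoryBackend struct {
	keys map[string][]byte
}

func (b *memoryBackend) Sign(keyID string, msg []byte) ([]byte, error) {
	key, ok := b.keys[keyID]
	if !ok {
		return nil, errors.New("unknown key: " + keyID)
	}
	// Toy "signature": key bytes followed by the message. Not real crypto.
	return append(append([]byte{}, key...), msg...), nil
}

// manager is a hypothetical stand-in for a type that sits between
// CA-handling code and the backends.
type manager struct {
	backend signerSource
}

func (m *manager) SignWithCA(caName string, msg []byte) ([]byte, error) {
	return m.backend.Sign(caName, msg)
}

func main() {
	m := &manager{backend: &memoryBackend{keys: map[string][]byte{"host": []byte("k")}}}
	sig, err := m.SignWithCA("host", []byte("msg"))
	fmt.Println(string(sig), err)
}
```

A new backend in this sketch only has to provide `Sign`; everything else lives in the manager.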
2022-11-09 01:44:22 +00:00
Michael Wilson 85fac93653
Make TestTCP* tests in appaccess more deterministic. (#18233)
The appaccess TestTCP* tests are highly reliant on time. This has been
reduced (but not eliminated) by using a fakeClock and a channel for
signaling monitor triggered connection closures.
2022-11-08 21:40:54 +00:00
Gavin Frazar b48e711fdf
Fix flaky basic auth dialer test (#18239)
* Fix flaky basic auth dialer test

* Block on sending error to waiters in ProxyAuthorizer, to avoid racing
  against the waiter.
* Change the test to start the node in a separate goroutine and
  manipulate the auth creds while it is still attempting to connect.
  This will allow us to significantly speed up the test since we can
  verify that the proxy authorizer is rejecting bad credentials but then
  change the credentials to be valid afterwards, allowing the node to
  succeed in registering and avoid long wait for it to fail to register.

* Use buffered chan

* Move the stopall defer up in case the test fails earlier

* Remove extra zero count check
2022-11-08 02:32:25 +00:00
Michael Wilson 3d483e2d13
Add in app access connection monitoring. (#17436)
Application access connection monitoring has been introduced so that, when a
lock is created, application access connections will be interrupted until the
lock has been cleared. This includes web sockets and TCP applications.
2022-11-05 02:44:57 +00:00
rosstimothy 514bfc7ac6
Ensure invalid tunnel agent connections get closed (#17899)
* Ensure invalid tunnel agent connections get closed

Connections from reverse tunnel agents were being marked
as invalid by the proxy under certain conditions but would
ultimately never be closed. This could lead to scenarios where
the agent thought things were fine but the proxy considered
that agent unhealthy and unroutable.

Pruning of invalid connections used to occur when a proxy
tried to retrieve a connection for that tunnel. This also
further muddied the point in time at which the proxy could
close a connection as it never explicitly stopped tracking
the connection and closed it at the same time.

To remedy this, connections are explicitly closed by the proxy
and removed from the mapping to stop tracking immediately. In order
to prevent a connection that is servicing an active connection
from being closed the proxy now tracks which connections have
sessions. Closing does not occur when there are any active
sessions to prevent them from being force terminated.

When the proxy receives a heartbeat from an agent it now restores
the connection to a valid state. In the event that too many heart
beats have been missed for an agent, the proxy will now terminate
the connection, again only if it is not serving any sessions.
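
The rules described above can be sketched as a small registry. This is an illustration under stated assumptions, not the proxy's real code: `tracker`, `trackedConn`, and every method name are hypothetical, and deleting the map entry stands in for closing the underlying connection.

```go
package main

import (
	"fmt"
	"sync"
)

// trackedConn is a hypothetical record for one agent connection.
type trackedConn struct {
	invalid        bool
	activeSessions int
}

// tracker is an illustrative registry keyed by agent ID.
type tracker struct {
	mu    sync.Mutex
	conns map[string]*trackedConn
}

func newTracker() *tracker {
	return &tracker{conns: make(map[string]*trackedConn)}
}

func (t *tracker) add(id string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.conns[id] = &trackedConn{}
}

func (t *tracker) sessionStarted(id string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if c, ok := t.conns[id]; ok {
		c.activeSessions++
	}
}

func (t *tracker) sessionEnded(id string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if c, ok := t.conns[id]; ok && c.activeSessions > 0 {
		c.activeSessions--
	}
}

// heartbeat restores a tracked connection to the valid state.
func (t *tracker) heartbeat(id string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if c, ok := t.conns[id]; ok {
		c.invalid = false
	}
}

// markInvalid flags the connection. If it has no active sessions it is
// closed and untracked in the same step; the return value reports that.
func (t *tracker) markInvalid(id string) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	c, ok := t.conns[id]
	if !ok {
		return false
	}
	c.invalid = true
	if c.activeSessions > 0 {
		return false // still serving sessions; do not force-terminate
	}
	delete(t.conns, id) // stands in for closing the real connection
	return true
}

// valid reports whether the connection is tracked and not marked invalid.
func (t *tracker) valid(id string) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	c, ok := t.conns[id]
	return ok && !c.invalid
}

func main() {
	t := newTracker()
	t.add("agent-1")
	t.sessionStarted("agent-1")
	fmt.Println("closed while busy:", t.markInvalid("agent-1"))
	t.sessionEnded("agent-1")
	fmt.Println("closed once idle:", t.markInvalid("agent-1"))
}
```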

Fixes #15911
2022-11-04 18:05:13 +00:00
Gavin Frazar 6e316fcadb
fix test alpn proxy http proxy basic auth dial flakiness (#17909)
* Update proxy handler and authorizer mocks

* Use a condition variable to properly sync checks for the last error received

* Fix test to check for all 3 nodes registering correctly

* Update naming of LastError to be more descriptive

* Refactor authorizer to not use a condition variable

* Remove 3rd node to speed up test

* Make test more robust

* Remove the reset count func

* Close client conns before new requests to be sure

* Connection -> Request terminology

* Change test to not mutate the environment variable, but instead manipulate the auth proxy credentials

* this way we can be sure that the test will work correctly when the credentials match. If we mutate env, we don't know whether the callers are still holding a dialer using the old env variable

* Remove extra node

* fix lint

* Fix req waiting

* Change wording in debug message

* fix comment

* Update integration/helpers/proxy.go

Co-authored-by: Edoardo Spadolini <edoardo.spadolini@goteleport.com>

* Fix defer func

Co-authored-by: Edoardo Spadolini <edoardo.spadolini@goteleport.com>
2022-11-02 01:45:55 +00:00
Jakub Nyckowski 0ee91f6c37
Enable GCI linter (#17894) 2022-10-28 20:20:28 +00:00
Jakub Nyckowski 0f51099e3b
Skip AuditOn test (#17813) 2022-10-26 18:22:04 +00:00
Jakub Nyckowski b30df65631
Fix UACC paths in Close() (#17812)
* Fix UACC paths in Close()

* Add Close() call to TestRootUsernameLimit test

* Add comment explaining the getDefaultPaths() behavior.
2022-10-26 01:07:58 +00:00
Gabriel Corado 65c022893d
Add Azure AD user managed identity authentication for SQL server (#17142) 2022-10-21 15:06:51 +00:00
Michael Wilson 756eb91ede
Add X-Forwarded-SSL and X-Forwarded-Port to appaccess. (#16965)
* Add X-Forwarded-SSL and X-Forwarded-Port to appaccess.

Application Access now adds in X-Forwarded-Ssl and X-Forwarded-Port headers.
Tests have been added and adjusted to look for these new headers as well.

* Update lib/srv/app/header_rewriter.go

Co-authored-by: Ryan Clark <ryan.clark@goteleport.com>

* Update integration/appaccess/fixtures.go

Co-authored-by: Roman Tkachenko <roman@goteleport.com>

* Remove common.XForwardedPort

* Change order of websocket delegates.

* Make ReservedHeaders more future-proofed.
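
A rough Go sketch of setting the two headers named above. The helper name `forwardedHeaders` is hypothetical, and the `on`/`off` values for X-Forwarded-Ssl are an assumption based on common proxy convention, not taken from this change.

```go
package main

import (
	"fmt"
	"net/http"
)

// forwardedHeaders builds the two forwarding headers for a request.
// The "on"/"off" values are an assumed convention.
func forwardedHeaders(overTLS bool, port string) http.Header {
	h := http.Header{}
	ssl := "off"
	if overTLS {
		ssl = "on"
	}
	h.Set("X-Forwarded-Ssl", ssl)
	h.Set("X-Forwarded-Port", port)
	return h
}

func main() {
	h := forwardedHeaders(true, "443")
	fmt.Println(h.Get("X-Forwarded-Ssl"), h.Get("X-Forwarded-Port"))
}
```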
2022-10-12 16:54:53 +00:00
Mike Wilson e15f4f351e Add traits to JWT payload.
The JWT payload now includes user traits.
2022-10-10 14:52:06 -04:00
Marek Smoliński 7aa224e430
Add Cassandra/Scylla database support (#15895) 2022-10-10 12:37:51 +02:00
Gavin Frazar ba7df65a0c
support proxy db tunnel mfa access (#16958)
* Add local proxy middleware for db cert checking

* Use tls conversion util instead of inline

* Add middleware to local proxy config

* Add middleware configuration in tsh

* Use route to database check and set defaults func

* Dont trigger normal db login flow if using local proxy tunnel

* Split out adding client creds into helper func for testing

* Add integration test for local proxy tunnel db cert middleware

* Add unit test for local proxy middleware

* Update comment

* Make middleware on new conn block

* godoc

* Make any cert check error trigger cert renewal in local proxy middleware

* Move dbcertchecker into lib/client

* Remove unneeded mutex in local proxy and unused func in lib/utils

* Make local proxy middleware integration test more robust

* Print message before mfa prompt in proxy tunnel

* Add before prompt option to test

* Remove unneeded comment

* Change local proxy messages to be more clear

* Pass local proxy opts by reference

* Pass certs in opts instead of cert/key file path

This is so we can check if the error is recoverable while preparing
local proxy options. A tunneled local proxy can ignore the error because
it does not rely on cert files - it can just renew its certs if
necessary.

* Move db route checking back to tsh

* Fix lint err

* Fix typo and print the hint to same writer as the mfa prompt
2022-10-07 21:29:51 +00:00
Jakub Nyckowski 0df39758a0
Refactor OpenSSH config generation (#17138)
Unify OpenSSH config generation between tsh and tbot.

Co-authored-by: Roman Tkachenko <roman@goteleport.com>
2022-10-07 18:11:57 +00:00
Zac Bergquist 1a62376b78
Fix uses of require in goroutines (#16953)
The require checks from testify call (*testing.T).FailNow, which should
only be called from the test goroutine.
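
One common way to respect that rule is to have worker goroutines hand their errors back and assert on them from the test goroutine. A minimal sketch, with hypothetical names (`runWorkers`, `errWorker`) and no testify dependency so it stays self-contained:

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errWorker is a sample error used by the demonstration below.
var errWorker = errors.New("worker failed")

// runWorkers starts n goroutines and returns every error they produce.
// Workers only send errors; they never touch testing.T themselves.
func runWorkers(n int, work func(id int) error) []error {
	errCh := make(chan error, n) // buffered so no worker blocks on send
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			if err := work(id); err != nil {
				errCh <- err
			}
		}(i)
	}
	wg.Wait()
	close(errCh)

	var errs []error
	for err := range errCh {
		errs = append(errs, err)
	}
	return errs
}

func main() {
	errs := runWorkers(4, func(id int) error {
		if id == 2 {
			return errWorker
		}
		return nil
	})
	// In a real test this loop runs on the test goroutine, where a
	// call such as require.NoError(t, err) is safe.
	for _, err := range errs {
		fmt.Println("error:", err)
	}
}
```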
2022-10-05 19:14:30 +00:00
Andrew Burke db7fdff809
Add option for tsh to load all CAs (#15178)
This change adds an option to let tsh load CAs for all clusters when logging in, instead of just the current cluster.
2022-10-05 18:29:09 +00:00
Tiago Silva 1e415b33b8
Add a legacy Heartbeat to Kubernetes clusters to maintain support for older clients (#16977)
* Adds a legacy Heartbeat to Kubernetes clusters to maintain support for legacy clients
2022-10-05 10:45:21 +00:00
Mike Wilson a4af8ae256 Revert "Ensure audit logging of tsh app login."
This reverts commit 31859e5d30.
2022-10-04 11:14:12 -04:00
Noah Stride eb42cabbea
Revert "Introduce ProvisionTokenV3 (#16361)" (#16934)
This reverts commit 3fba50261f.
2022-10-03 14:14:01 +00:00
Nic Klaassen 60950a9aa0
security: log non-interactive SSH commands at beginning of session (#16872)
* include exec command in session.start.initial_command

* trim oversized events
2022-09-30 19:57:01 +00:00
Mike Wilson 31859e5d30 Ensure audit logging of tsh app login.
Application sessions were previously only logged when launching an application
session via the UI, and not from the `tsh app login` command. This has been
corrected. The AppName and AppURI are now passed in as part of the gRPC
request to the auth server, which is then used to emit the audit event.
2022-09-30 12:53:13 -04:00
Andrew Burke ac257084a7
Automatically import Azure tags (#16218)
This change lets Teleport automatically import tags from the Azure instance it's running on.
2022-09-28 23:40:13 +00:00
Ryan Clark 806a568ada
Introduce config v3, add auth_server and proxy_server, remove auth_addresses (#15761) 2022-09-28 15:30:15 +00:00
Noah Stride 3fba50261f
Introduce ProvisionTokenV3 (#16361)
* Introduce proto types for ProvisionTokenV3

* Add methods to ProvisionTokenV3 to support ProvisionToken iface

* Start building v3 support into the client

* add support for marshalling and unmarshalling ProvisionTokenV3

* Start unit testing ProvisionTokenV3

* Remove oneof to support yaml marshal/unmarshal

* Client should try V3 methods and fallback to v2

* More tests

* Fix join tests

* Fix integration tests

* Switch integration tests to use v3 spec

* Switch iam tests to use ProvisionTokenV3

* Change ec2 join tests to use V3 tokens

* Fix events tests for V3 token

* support ProvisionTokenV3 within API client events handler

* Explicitly specify JoinMethod

* Tidy up final usage of NewProvisionTokenV2FromSpec in tests

* Improve proto docs on ProvisionTokenV3

* Fix bot join tests

* Clarify error message for invalid join method

* Adjust resource version comment

* Fix comments and return error rather than bool in V2() method

* Catch incompatible conversions case

* Include V2 ProvisionToken in tests and add appropriate DELETE IN notes

* Fix linter warnings/unit test failures

* Use nolint rather than lint:ignore

* Add more DELETE IN notes

* Run goimports on join_ec2_test.go

* Address PR comments from tim.

* Add more deprecation/delete in notices

* Improve godoc comments on checkAndSetDEfaults for provider config

* Simplify implementation by dropping client-ahead compatibility

* Add some support for client-ahead but with conversion to v3

* Update code comments to include responsible party

* Rename `Role` to `RoleARN` in EC2 configuration for clarity

* Fix tests for Role -> RoleARN rename

* Move MustCreateProvisionToken out of API and into test packages

* Properly go imports files
2022-09-27 17:05:32 +00:00
rosstimothy 649958b6e3
Reduce number of auth dials for tsh commands (#16367)
* Reduce number of auth dials for tsh commands

One of the major areas of latency for `tsh ssh` is creating multiple
auth clients. Since the auth client is lazy and only actually performs
the dial on first use we can create an auth client once and simply
reuse it. This is done by adding an `auth.ClientI` to `ProxyClient`
which is created via `connectToProxy`. All attempts to connect to
the current auth server via the `ProxyClient` will be given the
cached `auth.ClientI`.
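
The caching idea can be sketched like this. It is a simplified illustration: `authClient`, `proxyClient`, `dial`, and `currentCluster` are hypothetical stand-ins, and the sketch caches on first use with `sync.Once`, whereas the change described above creates the client in `connectToProxy`.

```go
package main

import (
	"fmt"
	"sync"
)

// authClient is a hypothetical stand-in for an auth client.
type authClient struct {
	cluster string
}

// proxyClient caches one auth client for the current cluster.
type proxyClient struct {
	cluster string
	dial    func(cluster string) (*authClient, error)

	once   sync.Once
	cached *authClient
	err    error
}

// currentCluster returns the cached client, creating it on the first
// call only. Later calls reuse the same client and the same error.
func (p *proxyClient) currentCluster() (*authClient, error) {
	p.once.Do(func() {
		p.cached, p.err = p.dial(p.cluster)
	})
	return p.cached, p.err
}

func main() {
	dials := 0
	p := &proxyClient{
		cluster: "root",
		dial: func(cluster string) (*authClient, error) {
			dials++
			return &authClient{cluster: cluster}, nil
		},
	}
	p.currentCluster()
	p.currentCluster()
	fmt.Println("dials:", dials)
}
```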

The new method of retrieving the current auth client also made it
possible to remove a number of calls to `GetSites` which were used to obtain
the current cluster name. The local profile already contains the name
of the cluster and calls to `GetSites` were unnecessary. All instances
which relied on the site name now retrieve from information that the
`ProxyClient` already has.

In an effort to reduce ambiguity and confusion `CurrentClusterAccessPoint`
and `ClusterAccessPoint` were also removed. AccessPoint denotes that
you are connecting to a cache, but the `ProxyClient` is always going
to be hitting the auth server directly. The two have been replaced
with `CurrentCluster` and `ConnectToCluster`, which they were merely
wrappers for anyhow.
2022-09-27 15:37:51 +00:00
Zac Bergquist 149e61b7db
Disable ControlMaster test for proxy recording mode (#16720)
Control master functionality is currently broken in proxy recording
mode. We're aware of the issue and will disable the test until we
are able to fix the underlying issue.

Updates #16224
2022-09-26 18:05:53 +00:00
Brian Joerger 4c0a6ff5b1
tsh PIV login integration (#15335)
* Add Yubikey PrivateKey implementation for use by Teleport clients.

  - Add yubikey login logic, reusing previously stored private keys.

  - Fix identity file decoding with PIV keys, which sign ecdsa certificates.

  - Add libpcsclite-dev pre-req for building on linux.

  - Remove unnecessary keys.Signer interface and move its functionality to keys.PrivateKey.

  - Move retry and jitter utils to new api/utils/retryutils package.
2022-09-23 19:44:10 +00:00
Marek Smoliński cbfd90601d
Fix flaky integration test: TestAppServersHA/RootServer (#16628) 2022-09-23 13:01:52 +02:00
Alan Parra fe3f9332ee
Update WebAuthn and U2F dependencies (#16572)
Update `duo-labs/webauthn` up to `20220122034320`, which is the latest version
we can get without dipping into dependency hell (`etcd` and `opentelemetry` woes
ensue after [2365c59d9f][1]).

`tstranex` could be dropped for a while now (we moved on to WebAuthn-like
interfaces for mocks). `cfssl` was only imported due to what I assume was an
IDE mishap.

I've elected to keep `fxamacker/cbor`, instead of trying to move to
[webauthncbor][2]. fxamacker is solid, past v0, seems more appropriate for
client-side libs and still backs webauthncbor.

There are no updates for `flynn/hid` and `flynn/u2f`.

Release notes for fxamacker/cbor:
https://github.com/fxamacker/cbor/releases/tag/v2.4.0.

[1]: 2365c59d9f
[2]: https://pkg.go.dev/github.com/duo-labs/webauthn@v0.0.0-20220815211337-00c9fb5711f5/protocol/webauthncbor

* Drop tstranex/u2f dependency
* Drop direct dependency to cloudflare/cfssl
* Update fxamacker/cbor/v2 to v2.4.0
* Update duo-labs/webauthn to 2022-01-22
* Fix: Make sure all credentials are set in the user
* Simplify: Drop now unnecessary AuthenticationSelection copy
2022-09-22 17:08:47 +00:00
rosstimothy ebfbfd496e
Use testauthority instead of native to generate keys in tests (#16486)
* use test authority

* use testauthority for InitConfig RSAKeyPairSource

* add named returns to test authority
2022-09-21 20:53:09 +00:00
Alan Parra a75fcc21d8
Update golangci-lint to 1.49.0 (#16507)
Update metalinter, fix a few lint warnings and replace deprecated linters.

`deadcode`, `structcheck` and `varcheck` are abandoned and now replaced by [`unused`][1].

Since 1.19, `go fmt` reformats godocs according to https://go.dev/doc/comment. I've done a bulk-reformatting of the codebase to keep the linter happy. Backporting is mostly harmless (the exception being `lib/services/role_test.go`, that for some reason breaks the _old_ linter using the new format).

[1]: https://golangci-lint.run/usage/linters/

* Bump golangci-lint version
* Replace abandoned linters
* Fix bodyclose on lib/auth/github.com
* Fix bodyclose on lib/kube/proxy/streamproto/proto_test.go
* Fix bodyclose on lib/srv/alpnproxy/proxy_test.go
* Fix bodyclose on lib/web/conn_upgrade_test.go
* Silence staticcheck on lib/kube/proxy/forwarder_test.go
* Silence staticcheck on lib/utils/certs_test.go
* Address BuildNameToCertificate deprecation warnings
* Run `go fmt ./...`
* Run `go fmt ./...` on api/
* Ignore formatting in role_test.go
* Remove redundant initializers in lib/srv/uacc/
* Update e/
2022-09-19 22:38:59 +00:00
Roman Tkachenko 29e46a2a6a
buddy: Fix incorrect use of loop variables (#16306)
* Fix incorrect use of loop variables

This commit fixes a few occurrences of loop variables being
incorrectly used in the context of Go-routines or (most frequently)
parallel tests. To fix the issues, we create a local copy of the range
variables before the parallel tests (or Go-routine), as suggested in
the documentation of the `testing` package:

https://pkg.go.dev/testing#hdr-Subtests_and_Sub_benchmarks

Issues were found using the `loopvarcapture` linter.
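
A minimal sketch of the fix pattern described above, using a hypothetical helper `collect`. The per-iteration copy is needed before Go 1.22; from Go 1.22 on each iteration already gets its own variable, so the copy is harmless but redundant.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// collect runs one goroutine per input and returns the values the
// goroutines observed, sorted for a stable result.
func collect(inputs []string) []string {
	var (
		mu   sync.Mutex
		seen []string
		wg   sync.WaitGroup
	)
	for _, in := range inputs {
		in := in // local copy so each goroutine captures its own value
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			defer mu.Unlock()
			seen = append(seen, in)
		}()
	}
	wg.Wait()
	sort.Strings(seen)
	return seen
}

func main() {
	fmt.Println(collect([]string{"b", "a", "c"}))
}
```

The same copy applies to table-driven tests that call `t.Parallel()` inside `t.Run`, where the subtest body runs after the loop has moved on.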

Signed-off-by: Roman Tkachenko <roman@goteleport.com>

* fix TestTraceProvider/spans_exported_with_gRPC+TLS

* run TestSSH serially

* operator: Conserve 'created_by' data in user spec

Signed-off-by: Roman Tkachenko <roman@goteleport.com>
Co-authored-by: Renato Costa <renato@cockroachlabs.com>
Co-authored-by: Tim Ross <tim.ross@goteleport.com>
Co-authored-by: Hugo Hervieux <hugo.hervieux@goteleport.com>
2022-09-14 14:31:56 +00:00
Trent Clarke 0136f8a0ab
Remove more integration test port list allocations (#16266)
Following on from #13658, this patch removes more (but unfortunately not
all) usages of the deprecated, list-based port-allocation scheme.

This patch:

1. Updates the integration test `TeleInstance` fixture to use injected 
   listeners rather than static ports when creating a new proxy node in
   a cluster,
2. Updates tests affected by (1) to pre-allocate and inject listeners,
   including handling caching the listener FDs between proxy restarts
3. Removed unnecessary port allocations when creating LoadBalancer 
   fixtures, and 
4. Moved the remaining list-based port allocation functions out of helpers
   and back into integrations and made them private. These functions should
   never be used by more than one test package concurrently or there is a
   very high chance of a port collision. Rather than just write that rule
   down in the comments, I have contained the deprecated code in the
   affected package and made the compiler enforce the rule for us.

See-Also: #12421
See-Also: #13658
See-Also: #14408
2022-09-14 06:53:19 +00:00
Trent Clarke 948417257f
Split out appaccess and proxy integration tests (#16232)
* Proxy tests running

* rollback

* whitespace fix

* Rollback port fix

* Linter appeasement

* License fix

* Update signals.go
2022-09-08 08:27:51 -06:00
Trent Clarke 9514a313c3
Break DB integration tests out into their own package (#16133)
Making all of our integration tests run entirely in parallel requires
a large engineering effort to enforce test isolation and remove all race
conditions between tests.

A lower-effort alternative may be to split apart the various test suites
into their own Go packages, and test those packages in parallel, even if
the tests inside are still executed serially. Auditing the test suites
for races on system-level resources (e.g. files, ports) is much easier
than chasing down every possible race in the testing system.

This patch acts as a trial run, breaking a fairly well-defined and
self-contained test suite out into its own package. Note that the goal of
this change is not necessarily to shave minutes off the build (although
that would be nice), but to act as an illustration of how other, less
well-formed test suites might be broken apart.

See-Also: #12421
See-Also: #14408
2022-09-07 11:04:35 +10:00
STeve (Xin) Huang 8394f4fb48
ALPN connection upgrade for MySQL behind ALB (#15669) 2022-09-01 16:05:03 +00:00
Gabriel Corado a3a65e863b
TLS routing ping for database protocols (#14887) 2022-08-26 16:42:23 +00:00
Brian Joerger 3a5a285883
Generalize private keys in tsh (PIV integration) (#15334)
Primary Changes:
 - Remove reliance on Private Key PEM:
 - Update native and keygen packages to return PrivateKey instead of PEM key
 - Add new PrivateKey interface which implements crypto.Signer
 - Replace PEM encoded private key usage where possible
 - Replace calls to tls.(Load)X509KeyPair with keys.(Load)X509KeyPair in
client packages

Minor Changes:
 - Remove unused agent.AddedKey return from LoadKey
 - Simplify sshutils and removed unused code paths
 - Add ecdsa and ed25519 key support
2022-08-25 23:26:44 +00:00
Andrew LeFevre 7edf9c333f
Merge pull request #15144 from gravitational/capnspacehook/file-copy-role-option
Added file copying role option and node config option
2022-08-24 20:13:41 +00:00
Roman Tkachenko 22dc9dceef
Deflake TestEC2Hostname (#15794) 2022-08-24 17:17:58 +00:00
Marco André Dinis 4163bbb1d7
Ignore Logins when listing Nodes (#15597)
Currently, we require users to have at least one Allowed Login in order
for them to list/read nodes.

This is different from the other resources.
In those, only the `<resource>_labels` needs to match what the roleset
allows/denies to the user.

This could lead to, for example, not being able to list Nodes even
though the user had a role allowing them to access any Node (ie
```
NodeLabels:
  - '*' : '*'
```

### When we don't have any login:
We can list the servers:
![image](https://user-images.githubusercontent.com/689271/185152432-8508df7c-774e-4d41-963f-f94d5edda114.png)

Trying to ssh into a node returns an error (web ui and `tsh`)
```bash
$ tsh --insecure --proxy 127.0.0.1.nip.io:3080 ssh marco@lenix
ERROR: access denied to marco connecting to lenix on cluster lenix
```

![image](https://user-images.githubusercontent.com/689271/185938766-ba6db481-8ccd-4d13-8c21-51e1cc01f544.png)


Adding a single login and then trying to login with a different login
(in this case we added a `andre` login but tried to login as `marco`)
```bash
$ tsh --insecure --proxy 127.0.0.1.nip.io:3080 ssh marco@lenix
ERROR: access denied to marco connecting to lenix on cluster lenix
```
![image](https://user-images.githubusercontent.com/689271/185939601-83210370-97ad-4d25-aba2-d565785de1bf.png)

Setting the `marco` as a denied login, we can't use it anymore even if it's part of the allowed logins:
```bash
$ tsh --insecure --proxy 127.0.0.1.nip.io:3080 ssh marco@lenix
Enter password for Teleport user marco:
WARNING: You are using insecure connection to SSH proxy https://127.0.0.1.nip.io:3080
ERROR: access denied to marco connecting to lenix on cluster lenix
```

![image](https://user-images.githubusercontent.com/689271/185940230-1dfe2afb-7909-4c75-8ebc-bad1dc5b69c1.png)
![image](https://user-images.githubusercontent.com/689271/185940272-ca948272-eefc-4e6b-be8d-d59b17dcffec.png)

Removing the login denial allows for a successful login:
![image](https://user-images.githubusercontent.com/689271/185940527-0acba499-541a-4ef8-ba6b-fb8bc9c867af.png)
2022-08-24 09:13:54 +00:00
Roman Tkachenko 31b4a00a86
(buddy) Pass JWT headers on websocket requests (#15667)
* transport: Rewrite headers, including JWTs, for websockets.

Applications can otherwise 401 on websocket requests, as they do not
present any authentication headers.

docs: Fix the reserved JWT header name.

Signed-off-by: Roman Tkachenko <roman@goteleport.com>

* Add test for JWT header in websocket apps

Signed-off-by: Roman Tkachenko <roman@goteleport.com>
Co-authored-by: Alex Vandiver <alex@chmrr.net>
2022-08-22 20:43:50 +00:00
Andrew Burke 9607fdd78c
Allow reverse tunnel join without exposing the web API (#13598)
This change allows agents to join over a reverse tunnel (port 3024 by default) only, instead of also requiring access to the web API (port 3080).
2022-08-15 21:28:24 +00:00
Joel f2dd75801a
Remove legacy session service (#15155) 2022-08-12 16:39:45 +00:00
Ryan Clark 29175e57d3
Use a getter/setter for reading the token value from the config (#14080) 2022-08-10 08:50:21 +00:00
NajiObeid 787395395a
Add config setting for proxy peering public addr (#14905)
* peer proxy public addr config

* address pr comments

* address pr comments

* address pr comments
2022-08-09 15:16:22 +00:00
Zac Bergquist d1c6b0618e
Fix lint warnings (#15312)
Mostly duplicated imports and redundant types in struct literals.
2022-08-08 20:20:29 +00:00
rosstimothy 0cb248ddd3
Trace ssh sessions (#14966)
Adds a wrapper around `ssh.Session` which injects tracing context
in a similar manner to the `ssh.Client` wrapper. All usages of
`ssh.Session` have now been replaced and have the appropriate
`context.Context` passed along

Part of #12241
2022-08-04 22:14:37 +00:00
Gabriel Corado ced6276c7b
Use waitForError instead of require.Eventually in SessionRecordingModes integration tests (#15212) 2022-08-04 20:51:34 +00:00
Forrest Marshall 142333e509 fix peer addr for in-memory control stream 2022-08-04 09:43:31 -07:00
Tiago Silva 037daad083
Introduce dedicated server type for Kubernetes resources (#14389)
## What

First part of the Kubernetes [Discovery RFD](https://github.com/gravitational/teleport/pull/13376/) to introduce a Kubernetes server per cluster. 

This PR introduces a separate Kubernetes server that uses the already introduced `KubernetesClusterV3`. 

## Compatibility

In previous versions, Kubernetes Clusters were part of the regular `ServerV2` resource. This refactoring deprecates the `ServerV2` usage but keeps it for compatibility with previous versions.

Everything is backward compatible, so v10 kubernetes agents and trusted clusters can connect fine.

## Next steps

Once this is merged, a new PR will introduce dynamic registration for Kubernetes Clusters discovered through EKS Discovery.
2022-08-04 14:21:11 +00:00
Edoardo Spadolini fa65fd02b1
Refactor Supervisor.WaitForEvent (#14940) 2022-07-28 13:34:27 +00:00
Joel b7a319d40d
Correctly propagate information about the target during forwarding (#14564) 2022-07-28 11:05:45 +00:00
Edoardo Spadolini 58b01b964b
Embed auth.Cache in auth.Server (#14698)
* Embed auth.Cache in auth.Server

* Hit the backend during Auth initialization

* Bypass the cache when rotating CAs

* Services.UpsertTrustedCluster is different

* Bypass the cache in waitForTunnelConnections

* Fix infinite recursion

* More cache bypassing during init and rotations

* Rename Services to Uncached in auth.Server

* Further cleanups

* Don't start the auth cache immediately

* Go back to Services rather than Uncached

* Comments and a missing method
2022-07-27 21:05:53 +00:00
Roman Tkachenko 38b8bb4307
Add support for proxying TCP apps (#13455)
Add support for proxying tcp apps
2022-07-26 19:01:39 +00:00
rosstimothy fba159e9c4
Add context.Context to session.Service interface (#14668)
* Add context.Context to session.Service interface

Updates GetSessions, GetSession, CreateSession, UpdateSession, and
DeleteSession to take a context.Context. All call paths are updated
to properly pass along a real context, which eliminates the
context.TODOs.
2022-07-25 22:05:09 +00:00
Marco Dinis 5effbd8359 Add Teleport operator
This commit adds the Teleport operator. The operator reconciles
TeleportUsers and TeleportRoles Kubernetes resources with Users and
Roles Teleport resources.
2022-07-25 15:27:10 -04:00
Zac Bergquist 13d68af6f4
Ensure that the WindowsDesktopReady event is emitted (#14804)
When desktop access is enabled, the TeleportReady event will not
be emitted until the WindowsDesktopReadyEvent is emitted, and it
turns out we have *never* emitted a WindowsDesktopReadyEvent.

This is likely due to desktop access being copied from kube access
since the very beginning. The same issue was recently fixed for
kube access in #9418.
2022-07-22 20:41:36 +00:00
Trent Clarke 1686a71c8a
Remove centralised port allocation for tests (#13658)
Ports used by the unit tests have been allocated by pulling them out of a list, with no guarantee that the port is not actually in use. This central allocation point also means that tests cannot be split into separate packages to be run in parallel, as the ports allocated between the various packages will be allocated multiple times and end up intermittently clashing.

There is also no guarantee, even when the tests are run serially, that the ports will not clash with services already running on the machine.

This patch (largely) replaces the use of this centralised port allocation with pre-created listeners injected into the test via the file descriptor import mechanism used by Teleport to pass open ports to child processes.

There are still some cases where the old port allocation system is still in use. I felt this was already getting beyond the bounds of sensibly reviewable, so I have left those for a further PR after this.
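
The core of the listener-based approach is letting the kernel pick a free port by binding to port 0. A minimal sketch with hypothetical helper names (`newLoopbackListener`, `portOf`); it shows only how a bound listener is obtained, not the file descriptor hand-off to the process under test.

```go
package main

import (
	"fmt"
	"net"
)

// newLoopbackListener asks the kernel for any free TCP port on the
// loopback interface. The port stays reserved while the listener is open.
func newLoopbackListener() (net.Listener, error) {
	return net.Listen("tcp", "127.0.0.1:0")
}

// portOf reports the port the kernel assigned to a TCP listener.
func portOf(l net.Listener) int {
	return l.Addr().(*net.TCPAddr).Port
}

func main() {
	l, err := newLoopbackListener()
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	defer l.Close()
	fmt.Println("listening on port", portOf(l))
}
```

Because the listener is already bound when the test starts, two packages running in parallel cannot be handed the same port.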

See-Also: #12421
See-Also: #14408
2022-07-20 12:04:54 +10:00
David Boslee 27c04c5f94
Fix TestProxyTunnelStrategyAgentMesh flakiness (#14398)
Fixes an issue where the agentpool backoff channel would be redefined 
each time an event was received while waiting for the backoff to complete.
This could lead to a longer backoff period than expected.

Waits for each resource to connect individually by splitting up the test into
multiple runs that run in parallel
2022-07-14 10:49:11 -06:00
Noah Stride 02b4f8575f
Configure linter to catch British 🇬🇧 spellings 🇺🇸 🦅 📖 (#14363)
* configure golangci-lint misspell to check for anglicized spellings

* Americanize spellings

* fix aws constant value with british spelling 🇬🇧

* update api types with americanized spellings

* use american spellings .cloudbuild/scripts
2022-07-14 10:51:23 +00:00
Joel 63e17f8a0f
Honor --no-enable-escape-sequences in tsh (#13507) 2022-07-13 11:48:21 +00:00
Alex McGrath 59063b1078
Replace occurrences of . and ~ with _ when creating sudoers files. (#14300) 2022-07-12 09:22:38 +00:00
Marek Smoliński a47b62d60f
Boost database integration tests (#14226)
* Boost database integration tests

* Make linter happy again

* update
2022-07-11 07:38:34 +00:00
Andrew LeFevre a150b0c8e1
SFTP server side support (#13491)
add sftp server functionality
2022-07-07 20:08:26 +00:00
Zac Bergquist 75fa968e28
Make proxy peering an enterprise only feature (#14155) 2022-07-06 23:49:00 +00:00
Zac Bergquist 3d72b702db
Make source IP-pinning an enterprise feature (#14141) 2022-07-06 17:25:31 +00:00
Gavin Frazar 187d2e04d3
Gavinfrazar/start postgres listener with no tls no mux (#13998)
* Start postgres without TLS when multiplexing is disabled

* Add integration test for starting postgres with --insecure-no-tls

* Fix dupe postgres listener mistake

* Log the actual address of listeners

* Remove unnecessary error checking
2022-07-06 02:33:47 +00:00
Gabriel Corado fec42e3895
Wait for application servers tunnel connection before integration tests (#14084) 2022-07-06 00:31:04 +00:00
David Boslee 0f7762c41b
Fix agent mesh integration test (#13954)
By using a randomized load balancer we improve the chances of an agent
connecting to all proxy servers within the given time period.
2022-07-05 16:01:03 +00:00
Alex McGrath aee44e5678
Prefix sudoers lines with the user that is logging in instead of requiring a trait be templated. (#14007)
Prefix sudoers lines with user being logged in as
2022-07-01 09:28:14 +00:00
Russell Jones c0cd120820 Fixed TestAppServersHA. 2022-06-30 16:59:11 -07:00
STeve (Xin) Huang 86d9e30765
Fix an issue where DB rotation events get sent to an older remote cluster (#13857) 2022-06-30 21:16:12 +00:00
Marek Smoliński 86ac49b10e
Try to fix TestAppServersHA flakiness (#13992) 2022-06-30 15:41:39 +02:00
Forrest Marshall b0bac8e546 fix ec2 join check 2022-06-28 18:05:30 -07:00
Marek Smoliński 20b63e071e
Fix JumpHost TLSRouting flow when root cluster is offline (#13791) 2022-06-28 14:09:37 +02:00
Roman Tkachenko 3ba3c429f4
Speed up app access integration tests (#13867) 2022-06-25 13:20:12 +00:00
Gavin Frazar 1858aafa15
Fix http proxy basic auth (#13140)
* Fix http proxy basic auth

* Update docs about HTTP CONNECT env var formats
2022-06-23 00:27:29 +00:00
Noah Stride 5e8cfb345c
Correct terminology from SSHAddr to ListenAddr for Auth server (#13725)
Rename auth SSHAddr to ListenAddr
2022-06-22 23:03:08 +00:00
Nic Klaassen a3e8bdcdc6
serialize hsm tests (#13632) 2022-06-18 00:02:45 +00:00
Forrest Marshall 31f258fec9 inventory control stream & certs 2022-06-15 22:26:24 -07:00
Nic Klaassen 77a90c1f8e
improve HSM test reliability (#13504) 2022-06-15 18:30:13 +00:00
Trent Clarke 3ff6889389
Split integration test fixtures into a package (#13465)
As a prelude to breaking individual integration test suites out into
their own packages (in order to make them more amenable to running
in parallel), this patch extracts the common test fixtures and places
them in a common `helpers` package.

This will allow the integration test package to share common
infrastructure and vocabulary once they are split out.
2022-06-15 17:07:26 +10:00
rosstimothy e5c745f331
Add manual tracing instrumentation to tsh (#13204)
Create spans for all public facing TeleportClient,
ProxyClient, and NodeClient methods. This makes
correlating spans easier to reason about when
looking at `tsh` traces. As a result of creating
spans, some additional context propagation is
required as well to ensure that spans are linked
properly.

This also removes the unused `quiet` argument from
`ConnectToCluster`. Its usage was not consistent
by existing callers, and it was ignored, so in order
to avoid confusion in future calls, it was removed.

#12241
2022-06-11 15:34:40 +00:00
Andrew Burke 9af04f4502
Fix dependencies in integration tests (#13321)
This change moves some type/function definitions in integration tests to fix compilation.
2022-06-10 22:41:29 +00:00
Alex McGrath 502b001130
Add sudoers provisioning support (#12061)
* Add sudoers provisioning support

* Add a fix for macos tests
2022-06-09 16:06:18 +00:00
Przemko Robakowski 951aff47ed
IP-based validation for SSH (#13243)
This change adds IP-based validation for SSH certificates.
There's a new option in the role definition:

kind: role
metadata:
  name: dev
spec:
  options:
    pin_source_ip: true
When that is set to true, the client IP must be the same when generating certificates and when using them. It uses the source_address critical option, which should be supported by both teleport and sshd, and only applies to certificates we send to the user (like in tsh login); we don't pin the IP in certificates issued for the web UI as they can't leak.
This change also omits machine ID (it uses a different code path); it will be added in a separate PR.

Most of the lines changed are from regenerating types.proto; the change itself is not that big.

Relates #11719
2022-06-08 22:49:37 +00:00
Jakub Nyckowski c30eee366e
Move SetTestTimeouts() to TestMain (#13312) 2022-06-08 17:05:06 -04:00
Andrew Burke 870ac4ca9b
tsh list resources across proxies and clusters (#12934)
This change adds the --all/-R flag to tsh ls, tsh apps ls, tsh db ls, and tsh kube ls, which lets tsh list resources from across all clusters and logged in proxies.
2022-06-08 18:42:25 +00:00
Brian Joerger 2717c1d2e0
Security fixes (#13298)
* Add CSRF mitigations

This commit includes two fixes:

1. Enforce an application/json Content-Type server-side.
2. When checking the bearer token, verify that the user
   associated with the token matches the user associated
   with the cookie.

* Fix TEL-Q122-13: Access Requests Denial Of Service Via Request Reason (#125) (#127)

* Ignore input when data flow is off in TermManager

When data flow is disabled in TermManager (at the beginning or when TermManager.Off was called) we should ignore all input we receive (currently we buffer it)

* Agent forwarding socket security fix.

Co-authored-by: Lisa Kim <lisa@goteleport.com>
Co-authored-by: Joel <jwejdenstal@icloud.com>
Co-authored-by: Przemko Robakowski <przemko@przemko-robakowski.pl>
2022-06-08 18:12:45 +00:00
Alex McGrath 581efdc60f
Add support for automatic user provisioning (#11830)
* Add support for automatic user provisioning

* Add UID parker to reexec

* Add a `teleport park` subcommand that does nothing

Co-authored-by: Edoardo Spadolini <edoardo.spadolini@goteleport.com>
2022-06-08 12:24:13 +00:00
Andrew Burke 22c0fccba7
Restore HTTP_PROXY for multi-port mode (#13048)
This change undoes the changes in #11990 and #12335 for Teleport going forward.
2022-06-07 11:57:16 -07:00
Gabriel Corado c459ddbbe6
SSH Session recording modes (#12916) 2022-06-06 20:29:35 +00:00
Marco André Dinis 306d011151
Deprecate ca_signature_algo config (#13033)
After the merge of https://github.com/gravitational/teleport/pull/12674 we no longer use the following configuration:
```yaml
teleport:
    ca_signature_algo: "rsa-sha2-512"
```
As we now rely upon the `x/crypto` package to choose the signing algorithm (it defaults to `rsa-sha2-512`)

**Demo**
If we set `ca_signature_algo` (the value is irrelevant) and start `teleport` we get:
```shell
root@marco:/workspace# teleport start --debug
2022-06-02T09:33:58Z WARN             ca_signing_algo config option is deprecated and will be ignored, we'll always default to rsa-sha2-512. config/configuration.go:348
2022-06-02T09:33:58Z INFO             Generating new host UUID: b001159a-10e0-49a7-b4dc-61c73fbe9e42. service/service.go:726
...
```

Fixes #12905
2022-06-06 16:18:15 +01:00
Marek Smoliński f22d8e9723
Fix Audit Event max event size flow (#12352) 2022-06-06 09:58:48 +02:00
rosstimothy 25ec2c8a39
Add client side circuit breaker to auth clients (#10282)
* Add client side circuit breaker to auth clients

In order to apply back pressure we can utilize a circuit breaker that
monitors error responses from auth server. When tripped it will prevent
all outbound requests to auth for a period of time. This can also help
prevent a potential thundering herd when auth is in an unhealthy state.
By default the circuit breaker will only be tripped if 90% of the
requests made in the monitoring interval fail.
2022-06-03 11:55:56 -04:00
Andrew Burke 7f730d2a58
Add disabled imds client by default for integration tests (#13109)
The instance metadata client added in #12593 significantly slows down integration tests. This change adds a disabled client to integration tests to improve performance.
2022-06-02 12:52:41 -07:00
David Boslee 32695a2f05
Add proxy peering support (#12359)
This adds proxy peering support. A configurable setting that allows for agents 
to connect to a subset of proxies and be reachable through any proxy in the
cluster. This is achieved by creating grpc connections between each proxy
server. Client connections can then be passed between proxies to the desired
agent.
2022-06-02 17:08:24 +00:00
Andrew Burke 230692f769
Fix EC2 labels concurrent write (#13072)
This change fixes a bug in EC2 labels (#12593) involving concurrent writes to the labels map. This is fixed by making EC2.Get() return a copy instead of the actual label map.
2022-06-01 21:26:28 +00:00
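The fix in #13072 above, returning a copy rather than the live label map, follows a common Go pattern. A minimal sketch, assuming a hypothetical `labelStore` type in place of the real EC2 label importer:

```go
package main

import "sync"

// labelStore is a hypothetical stand-in for the EC2 label importer.
type labelStore struct {
	mu     sync.RWMutex
	labels map[string]string
}

// Set replaces the stored labels with a private copy of the input.
func (s *labelStore) Set(labels map[string]string) {
	cp := make(map[string]string, len(labels))
	for k, v := range labels {
		cp[k] = v
	}
	s.mu.Lock()
	s.labels = cp
	s.mu.Unlock()
}

// Get returns a copy, so callers can modify the result without racing
// against a concurrent Set or against other readers.
func (s *labelStore) Get() map[string]string {
	s.mu.RLock()
	defer s.mu.RUnlock()
	out := make(map[string]string, len(s.labels))
	for k, v := range s.labels {
		out[k] = v
	}
	return out
}
```

The copy costs an allocation per call, which is usually acceptable for small label sets.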
Andrew Burke a8ed7bd1fd
Automatically import EC2 tags (#12593)
This change allows Teleport to automatically import EC2 tags when running in an EC2 instance.
2022-05-31 23:19:16 +00:00
Marco André Dinis ba7a3204f6
Improve error msg when client fails to auth in Teleport (#12677)
When the client connects to teleport with invalid credentials (eg
expired ones) it will retry multiple times until the context deadline is
reached.
When it happens, we receive the generic error: context deadline
exceeded.
However, we can ask for the latest connection error, one which will give
us more information on why it happened.
To ask for this extra error we need to add the following
grpc.DialOption: grpc.WithReturnConnectionError()

After doing this, we will get the errors that happened when trying to
connect to the grpc Server.

This should help us debug possible connection problems.

We had to refactor a little bit the way we handle the parallel
connection attempts to receive all the connection errors from the
multiple flows.
2022-05-31 15:24:57 +00:00
Marco André Dinis 2493448cbd
Bump x/crypto to 20220518 and remove custom algorithm signer (#12674)
This commit upgrades the version of x/crypto we use, to the current latest
`go get -u golang.org/x/crypto`

We also replaced the deprecated variables and updated the tests to match the
current default KEX Algos

The x/crypto didn't support RSA-SHA2 algos, so we developed our own algorithm
signer. This is no longer the case, and after upgrading x/crypto to 20220518 we
can safely remove the custom code we have.


From OpenSSH 8.8+, it works if we explicitly add the older algorithm.
Something like this: `./ssh -vvv -oPubkeyAcceptedAlgorithms=+ssh-rsa-cert-v01@openssh.com teleportadmin@moon.marco.mydemo`
2022-05-25 14:47:00 +01:00
rosstimothy 9f094aaef6
Add tracing instrumentation for ssh clients/servers (#12434)
* Add tracing instrumentation for ssh clients/servers

Add tracing context to the existing ProxyHelloSignature to provide
span information across ssh connections. To add span context per
ssh session on top of new connections, the same tracing context is
passed in the first global request of the session.

In order to ensure that tracing context is pulled from and inserted
into the proper context.Context, some interfaces and methods were
changed to take one as the first argument.
2022-05-25 12:24:02 +00:00
Noah Stride 2f1675e480
Run HSM integration tests in parallel (#12470)
* run HSM tests in parallel

* add missing punctuation to commit

Co-authored-by: STeve (Xin) Huang <xin.huang@goteleport.com>
2022-05-19 13:41:34 +00:00
Marek Smoliński 275a443f19
Upgrade MySQL driver to v1.5.0 (#12667) 2022-05-18 11:27:10 +02:00
rosstimothy 1ac0957d0e
Improve CertAuthorityWatcher (#10403)
* Improve CertAuthorityWatcher

CertAuthorityWatcher and its usage are refactored to allow for
all the following:
 - eliminate retransmission of the same CAs
 - reduce memory usage by having one local watcher per proxy
 - adds the ability to filter only the CAs that are desired
 - reduce the time required to send the first CAs

watchCertAuthorities now compares all CAs it receives from the
watcher with the previous CA of the same type and only sends to
the remote site if they are not identical. This is to reduce
unnecessary network traffic which can be problematic for a
root cluster with a larger number of leafs.

The CertAuthorityWatcher is refactored to leverage a fanout
to emit events to any number of watchers, each subscription
can be for a subset of the configured CA types. The proxy
now has only one CertAuthorityWatcher that is passed around
similarly to the LockWatcher. This reduces the memory usage
for proxies, which prior to this had one local CAWatcher per
remote site.

updateCertAuthorities no longer waits on the utils.Retry it
is provided with before starting to watch CAs. By doing this
the proxy no longer has to wait ~8 minutes before it even
starts to watch CAs.
2022-05-17 19:06:41 +00:00
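The deduplication step described in #10403 above (compare each CA with the previous CA of the same type and forward it only when it differs) can be sketched like this. The `certAuthority` and `caFilter` types are hypothetical; a single `Fingerprint` string stands in for whatever comparison the real watcher performs.

```go
package main

// certAuthority is a hypothetical, pared-down CA record.
type certAuthority struct {
	Type        string // e.g. "host" or "user"
	Fingerprint string // stands in for the CA's full contents
}

// caFilter remembers the last CA forwarded for each CA type.
type caFilter struct {
	last map[string]certAuthority
}

// shouldSend reports whether ca differs from the previous CA of the same
// type, and records it as the latest value when it does.
func (f *caFilter) shouldSend(ca certAuthority) bool {
	if f.last == nil {
		f.last = make(map[string]certAuthority)
	}
	if prev, ok := f.last[ca.Type]; ok && prev == ca {
		return false
	}
	f.last[ca.Type] = ca
	return true
}
```

Only CAs for which `shouldSend` returns true would be transmitted to the remote site, which avoids retransmitting identical CAs to every leaf.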
Andrew Burke e1e6437879
Ignore HTTP_PROXY in reverse tunnels, part 2 (#12335)
This change disables HTTP_PROXY in a few places that were missed in #11990.
2022-05-11 23:00:58 +00:00
Roman Tkachenko 0b6fe7257d
App access JWT header improvements (#12567) 2022-05-11 22:15:11 +00:00
Zac Bergquist a7ab44f15b
Fix linter after Go 1.18 upgrade (#12585)
* Update golangci-lint

To accommodate the recent Go 1.18 upgrade

* Fix new lint warnings as a result of linter upgrade

* Set golangci-lint to Go 1.18 mode

golangci-lint will automatically skip linters that don't have support
for Go 1.18.

See: https://github.com/golangci/golangci-lint/issues/2649
2022-05-11 21:53:37 +00:00
Edoardo Spadolini 9d91466a0e
Proxy restart fixes (#11802)
* Remove unused backend wrapper from Cache

* Remove double printShutdownStatus

* Fix readyz race condition

* Test coverage for the readyz.monitor fix

* Close listeners immediately in proxy.shutdown

* Use and handle net.ErrClosed correctly

This adapts utils.IsUseOfClosedNetworkError to check for net.ErrClosed
even inside trace.Aggregate errors, makes it so that we always return
something that would pass errors.Is(err, net.ErrClosed) when returning
from a (net.Listener).Accept(), and handles closed listeners within our
various Serve() loops so that we don't hit spurious backoff waits while
shutting down.

* Close listeners early and emitters late

* Test coverage for the proxy listener changes

* Revert some errors back to trace.ConnectionProblem

* Reduce PR scope to just the proxy, add comments

* Improve error logging.
2022-05-06 18:12:11 +02:00
Marek Smoliński 158a70a7d5
Fix flaky integration test: increase deadline (#12449) 2022-05-05 21:31:00 +00:00
Joel 652536f4e5
Don't enforce standard k8s and ssh auth mechanisms when joining sessions (#11144) 2022-05-05 19:42:57 +00:00
Joel 3120876aea
Only acquire semaphore lease if maxconnections is configured (#12462) 2022-05-05 17:42:07 +00:00
Joel 21ff6221ad
Limit Kubernetes connections (#12275) 2022-05-02 17:24:09 +02:00
Jakub Nyckowski d5d2a72ace
Advertise correct MySQL server version (#12196)
Teleport will now try to extract the MySQL server version from the initial handshake packet instead of sending `8.0.0-Teleport` every time. This string can be overridden by the new configuration option `mysql.server_version`. On DB service start, Teleport will also try to fetch the current version from the MySQL/MariaDB instance. After that, the server version will be updated on every successful connection to keep it up to date.

Co-authored-by: STeve (Xin) Huang <xin.huang@goteleport.com>
Co-authored-by: Paul Gottschling <paul.gottschling@goteleport.com>
2022-04-29 22:22:11 +00:00
Edoardo Spadolini 39ffa56766
Specify the NodeName in auth.ReRegister (#12272)
* Specify the NodeName in auth.ReRegister

* Make cleanup consistent
2022-04-29 18:05:08 +00:00
Roman Tkachenko d78f6925a4
Revert readyz changes (#12244)
* Revert "Make `PortList.Pop()` thread-safe (#11799)"

This reverts commit a17337d1a1.

* Revert "Ensure stateOK is reported only when all components have sent updates (#11249)"

This reverts commit b749302e2c.

* Revert "Throw startup error if `TeleportReadyEvent` is not emitted (#11725)"

This reverts commit 933e247287.

* Revert "Fix ProxyKube not reporting its readiness (#12150)"

This reverts commit 6cdcfe7721.
2022-04-26 22:16:55 +00:00
rosstimothy 71dea2df4c
Speed up TestAppServersHA (#12128)
* Speed up TestAppServersHA

Allow test cases to be run in parallel and allow app servers to
be spawned in parallel to reduce test time from ~99s to ~20s.
2022-04-26 15:05:24 -04:00
Joel 99116409d4
Remove needlessly complex key generation scheme (#12113) 2022-04-25 09:26:10 +00:00
Edoardo Spadolini 6cdcfe7721
Fix ProxyKube not reporting its readiness (#12150) 2022-04-21 15:32:24 +00:00
Brian Joerger 93f6f61386
Fix flaky test - TestAuditOn (#12101) 2022-04-21 01:20:26 +00:00
Gabriel Corado b9829d1b38
Delete app sessions on logout (#9873)
* feat: delete app web sessions during logout

* Apply suggestions from code review

Co-authored-by: Roman Tkachenko <roman@goteleport.com>

* refactor(auth): add VerbList action to delete user app sessions

Co-authored-by: Roman Tkachenko <roman@goteleport.com>
2022-04-14 17:31:51 +00:00
Zac Bergquist 663e3d04c5 Remove calls to deprecated pool.Subjects() method
This deprecation was kind of a pain, because x509.CertPool becomes
a black box - there is no public API to determine how many certs
have been added to the pool. To account for this, some of our method
signatures needed to be updated to report the number of certs that
were added.
2022-04-14 09:25:41 -06:00
Joel cfcc646762
Restrict moderated sessions users from accessing V8 kube cluster agents (#11691) 2022-04-06 15:31:24 +00:00
Vitor Enes 933e247287
Throw startup error if TeleportReadyEvent is not emitted (#11725)
* Throw startup error if `TeleportReadyEvent` is not emitted

Before this commit, the `TeleportReadyEvent` was only waited for when a
process reload occurred. Thus, if a bug exists in the code that emits
this event (as it's currently the case since the `MetricsReady` and
`WindowsDesktopReady` events are never emitted), such a bug may go
unnoticed for a while.

This commit ensures that the `TeleportReadyEvent` is always waited for
on startup, and throws an error if the event is not emitted (after some
timeout).

This commit also:
- removes the `MetricsReady` event (as this is not produced by a
  component that sends heartbeats, which is the case of every other
  event required by the `TeleportReadyEvent` event mapping)
- ensures that `WindowsDesktopReady` event is emitted
- refactors some of the code in `lib/service/supervisor.go`
- moves the event mapping registration to a new `registerTeleportReadyEvent` function
2022-04-06 16:09:59 +01:00
Jakub Nyckowski 1aa38f4bc5
Create Database CA (#9593)
Introduce Database Certificate Authority. New CA is used by Database Access to sign database certificates making them independent from Host CA. 

Co-authored-by: Marek Smoliński <marek@goteleport.com>
2022-04-05 19:44:46 +00:00
Andrew Burke e3a8fb7a0f
NO_PROXY port support + special case for proxying via localhost (#11403)
This change updates NO_PROXY handling to allow blocking specific host:port combinations, rather than just the host. It also adds a special case for downgrading requests to plain HTTP when --insecure is true and the request goes through a plain HTTP proxy at localhost (i.e. HTTP_PROXY=http://localhost).
2022-04-04 14:23:50 -07:00
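The host:port matching described in #11403 above might be sketched as below. This is a deliberately simplified illustration with a hypothetical `bypassProxy` helper: it does exact host comparison only and ignores wildcards, domain suffixes, CIDR ranges, and unbracketed IPv6 literals, all of which a real NO_PROXY implementation has to consider.

```go
package main

import (
	"net"
	"strings"
)

// bypassProxy reports whether target ("host:port") matches an entry in a
// comma-separated NO_PROXY-style list. An entry without a port matches
// every port on that host; an entry with a port matches only that port.
func bypassProxy(target, noProxy string) bool {
	host, port, err := net.SplitHostPort(target)
	if err != nil {
		host, port = target, "" // target had no port
	}
	for _, entry := range strings.Split(noProxy, ",") {
		entry = strings.TrimSpace(entry)
		if entry == "" {
			continue
		}
		entryHost, entryPort, err := net.SplitHostPort(entry)
		if err != nil {
			entryHost, entryPort = entry, "" // entry had no port
		}
		if entryHost == host && (entryPort == "" || entryPort == port) {
			return true
		}
	}
	return false
}
```

With this shape, `NO_PROXY=example.com:3080` excludes only one port from proxying, while `NO_PROXY=example.com` keeps the older host-wide behaviour.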
Joel 30630a1ecf
Pipe terminal stdin to session in kubernetes peer mode (#11288) 2022-04-01 17:48:40 +00:00
Edoardo Spadolini dafc7895d3
Always use in-memory caches (#11386)
* Always use in-memory caches

This also cleans up now-useless fields and constants related to on-disk
caches.

* Remove the cache tombstone mechanism

As we're never reopening the same cache backend twice, this is no longer
useful.

* Warn if a cache directory exists on disk

We can't remove it automatically because we might be in the middle of an
upgrade with a old version of Teleport still running.
2022-03-30 14:54:57 +00:00
Gabriel Corado 58ca1bdbb0
fix(db): send initial heartbeat when there is no static dbs (#11160) 2022-03-25 20:17:54 +00:00
Zac Bergquist 3f507dfd06 Remove uses of deprecated ioutil package 2022-03-16 15:05:42 -06:00
Joel 92543d9b3e
Moderated Sessions improvements (#10991) 2022-03-10 23:04:12 +00:00
rosstimothy 550d23d15d
Fix goroutine and memory leak in watchCertAuthorities (#10871)
* Fix goroutine and memory leak in watchCertAuthorities

The CA Watcher was blocking both on writing to a channel when the watcher
was closed and on HTTP calls that had no request timeout or context passed
to cause cancellation.

All resourceWatcher implementations that had a bug which may cause them to block
on writing to a channel forever were fixed by selecting on the write and ctx.Done.

Add context.Context to all Get/Put/Post/Delete methods on the auth HTTPClient to
force callers to propagate context. Previously, all calls used context.TODO, which
prevented requests from being properly cancelled.

Add context propagation to RotateCertAuthority, RotateExternalCertAuthority,
GetCertAuthority, GetCertAuthorities. This is needed to get the correct ctx
from the CertAuthorityWatcher all the way down to the HTTPClient that makes
the call.

Closes #10648
2022-03-10 11:05:39 -05:00
Lisa Kim 350ea5bb95
Updates tsh ls for node/app/db/kube to accept new filter flags (#10980)
* Also adds a search keyword parser that takes in different
  delimiters (comma is used for tsh, space is used for web UI)

part of RFD 55
2022-03-09 23:56:55 +00:00
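The search keyword parser mentioned in #10980 above takes a delimiter that differs by caller (comma for tsh, space for the web UI). A rough sketch of such a parser follows; the `parseSearchKeywords` name and the double-quote handling for phrases are assumptions made for illustration, not details stated in the commit.

```go
package main

import "strings"

// parseSearchKeywords splits s on delim. Text inside double quotes is kept
// together even if it contains the delimiter (an assumed convenience).
func parseSearchKeywords(s string, delim rune) []string {
	var out []string
	var cur strings.Builder
	inQuotes := false

	// flush appends the keyword collected so far, skipping empty ones.
	flush := func() {
		if kw := strings.TrimSpace(cur.String()); kw != "" {
			out = append(out, kw)
		}
		cur.Reset()
	}

	for _, r := range s {
		switch {
		case r == '"':
			inQuotes = !inQuotes
		case r == delim && !inQuotes:
			flush()
		default:
			cur.WriteRune(r)
		}
	}
	flush()
	return out
}
```

A tsh-style caller would pass `','` and a web UI-style caller `' '`, so both front ends can share one parser.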
Brian Joerger 600022b290
Change NewRole to use V5 by default, old consumers now use NewRoleV3. (#10884) 2022-03-08 21:11:53 +00:00
STeve (Xin) Huang 93a89a0906
fix flaky integration test: TestDatabaseAccessMongoConnectionCount (#10869) 2022-03-08 13:35:42 +00:00
Gabriel Corado 4c634a087b
feat(app): consider reverse tunnel errors in apps HA mechanism (#10734) 2022-03-04 16:17:51 +00:00
Marek Smoliński 1a7a667af3
Fix Mongo topology resource release (#10664) 2022-03-03 11:58:57 +01:00
Roman Tkachenko a480259d97
Improve HA behavior of database agents in leaf clusters (#10641) 2022-03-02 02:33:04 +00:00
Joel 278b3b7e05
Fix ineffective timeout break in integration test (#10130) 2022-02-25 17:12:33 +00:00
Andrew Burke 4e3bd6c647
Clear terminal when auth server is in FIPS mode (#10095)
This change clears the terminal at the end of a session when the auth server is in FIPS mode, even if tsh isn't.
2022-02-17 10:16:36 -08:00
Marek Smoliński 4285a6b074
Fix HSM flaky integration tests (#10390) 2022-02-17 10:10:21 +01:00
Brian Joerger eeef122954
Check for shell user's home directory as that user (#10321) 2022-02-16 23:51:02 +00:00
rosstimothy 886277af11
Fix Reverse Tunnels Not Properly reconnecting (#10368)
* Fix Reverse Tunnels Not Properly reconnecting

Nodes were not generating new agents which prevented reverse tunnels
from being re-established after a collapse. Ensuring we always
release the lease if the agent pool is unable to add a new agent.

* add tunnel collapse reconnect test
2022-02-15 18:11:08 -05:00
Jim Bishopp 22e043c430
Add TestModules (#10369)
Allows tests to set fake values to be returned from modules.GetModules()
2022-02-15 21:54:40 +00:00
Joel ea810d30d9
Implement Moderated Sessions (#8563)
* Implement Moderated Sessions
2022-02-15 17:02:10 +01:00
Jakub Nyckowski ed62fa17c6
Add missing DatabasesReady event to DB proxy (#10152)
* Add missing DatabasesReady

* Expect TeleportReadyEvent to be emitted when DB proxy run in tests.
2022-02-11 02:37:24 +00:00
Nic Klaassen bc441ef2cf
IAM Join Method (gRPC service) (#10087) 2022-02-10 00:41:34 +00:00
Russell Jones 0534d43939 Cleaned up NewClient in integration tests.
Updated "NewClient" in integration tests to not take testing.T and
return an error.
2022-02-08 16:29:48 -08:00
Russell Jones 34586dd4e8 Fixed TestSessionStartContainsAccessRequest.
Fixed issue with TestSessionStartContainsAccessRequest where the login
was not being injected into the user role.
2022-02-08 16:29:48 -08:00
Russell Jones 2b32277e09 Fixed TestDisconnection
Integration tests were creating the users SSH and x509 certificates at
the time of server creation, specifically at the time of the
"TeleInstance" creation.

If any component within the "TeleInstance" was slow to start, for example
if the proxy watcher had to be reset as can be seen below, the users
certificate would often be expired by the time the SSH request was issued
which would lead to the SSH server rejecting the connection and the test
failing with an unexpected error.

For most tests this is not an issue, because test certificates are valid
for 24 hours. However "TestDisconnection" scopes the TTL of certificates
down (often times to about 2 seconds) to test different certificate
lifetime disconnection scenarios which would lead to this test being flaky.

Updated integration test logic to instead "issue" the users certificate at
the time of client creation instead of server creation.
2022-02-08 16:29:48 -08:00
Nic Klaassen e00ff42cb8
IAM Join Method (backend implementation) (#10085) 2022-02-08 18:48:13 +00:00
Russell Jones 26860e5e01 Fix. 2022-02-03 11:52:45 -08:00
Russell Jones 30c20d7be8 Removed TestProxyReverseTunnel.
`TestProxyReverseTunnel` has been consistently failing with the following errors.

--- FAIL: TestProxyReverseTunnel (11.76s)
    sshserver_test.go:1127:
        	Error Trace:	sshserver_test.go:1127
        	Error:      	Received unexpected error:
        	            	timeout waiting for announce to be sent
        	Test:       	TestProxyReverseTunnel
FAIL
FAIL	github.com/gravitational/teleport/lib/srv/regular	68.907s

--- FAIL: TestProxyReverseTunnel (23.43s)
    assertion_compare.go:332:
        	Error Trace:	sshserver_test.go:1144
        	Error:      	"5.775333527" is not less than "5"
        	Test:       	TestProxyReverseTunnel
        	Messages:   	[]
FAIL
FAIL	github.com/gravitational/teleport/lib/srv/regular	96.861s

These both appear to be timing related. By removing `t.Parallel()` or
increasing the timeouts it was possible to stabilize this test and have it
consistently pass. However, looking at what `TestProxyReverseTunnel` actually
tested, I think it can be removed completely. I've outlined
what it tested, why this is no longer necessary, and why we should remove it.

* Reverse tunnels can be established

This test was written prior to integration tests existing. Teleport now has
integration tests that are more extensive, robust, and stable which cover
reverse tunnel functionality like `TwoClustersProxy`, `TwoClustersTunnel`, and
`TrustedTunnelNode`. The only thing they were missing that
`TestProxyReverseTunnel` had was checking `LastConnected` time. This PR has
been updated to add that to integration tests.

* Connectivity can be established over reverse tunnels

Similar to the above, we have more extensive, robust, and stable integration
test coverage for establishing connectivity over a reverse tunnel now.

The only bit of functionality that integration tests don't appear to have is
connecting by DNS name _and_ IP address at sshserver_test.go#L1066-L1067.
However, we do now have a dedicated unit test for this in
`TestProxySubsys_getMatchingServer`.

* Labels are synchronized

While this test does synchronize dynamic labels, it never actually checks if
they were synchronized correctly. That functionality was removed many years
ago in https://github.com/gravitational/teleport/pull/250.

We do now have unit test coverage for dynamic labels at
lib/labels/labels_test.go.
2022-02-03 11:52:45 -08:00
rosstimothy 6cb13715ba
Dynamically resolve reverse tunnel address (#9958)
* Dynamically resolve reverse tunnel address

The reverse tunnel address is currently a static string that is
retrieved from config and passed around for the duration of a
service's lifetime. When the `tunnel_public_address` is changed
on the proxy and the proxy is then restarted, all established
reverse tunnels over the old address will fail indefinitely.
As a means to get around this, #8102 introduced a mechanism
that would cause nodes to restart if their connection to the
auth server was down for a period of time. While this did
allow the nodes to pick up the new address after the nodes
restarted, it was meant to be a stopgap until a more robust
solution could be applied.

Instead of using a static address, the reverse tunnel address
is now resolved via a `reversetunnel.Resolver`. Anywhere that
previously relied on the static proxy address now will fetch
the actual reverse tunnel address via the webclient by using
the Resolver. In addition, this builds on the refactoring done
in #4290 to further simplify the reversetunnel package. Since
we no longer track multiple proxies, all the leftover bits
that did so have been removed to accommodate using a dynamic
reverse tunnel address.
2022-02-03 16:24:48 +00:00
Naji Obeid 1d79f0472b remove unnecessary file 2022-02-02 14:31:30 -08:00
Naji Obeid 1d195761de unfix test case 2022-02-02 14:31:30 -08:00
Naji Obeid 105eb22c01 tests 2022-02-02 14:31:30 -08:00
Marek Smoliński fbd5a2aafd
Fix tsh tctl do not load all CAS (#9357) 2022-01-31 13:35:15 +01:00
Joel 62173e096b
use google/uuid instead of pborman/uuid (#9793)
* replace imports

* use google/uuid

* fix test

* reverse changelog changes

* update gomod

* zac steps

* tidy

Co-authored-by: Zac Bergquist <zac.bergquist@goteleport.com>
2022-01-19 23:44:48 +00:00
rosstimothy 8932ed4e03
Replace cluster periodics with watchers (#9609)
* Replace cluster periodics with watchers
Remove periodically sending locks and certificate authorities to leaf clusters. Instead
we can rely on the watcher system to only deliver resources to leaf clusters when changes
occur.

Fixes #8817
2022-01-19 16:53:45 -05:00
NajiObeid b4e3427bdb
Naji/force http2 kubernetes (#9294)
* [cloud#1043] force kubernetes proxy to use http2

* typo

* typo x2

* test k8 proxy http2 capabilities

* linting
2022-01-14 20:03:40 +00:00
Gabriel Corado 86e92be253
feat: app server requests failover (#9288) 2022-01-13 18:24:33 +00:00
Trent Clarke ea176c2b3c
Attempts to make CI integration test logs more useful (#9626)
Actually tracking down the cause of a failure in the integration tests can 
be hard:

* It's hard to get an overall summary of what failed
* The tests sometimes emit no output before timing out, meaning any 
  diagnostic info is lost
* The emitted logs are too voluminous for a human to parse
* The emitted logs can present information out of order
* It's often hard to tell where the output from one test ends 
  and the next one begins

This patch attempts to address these concerns without attempting to rewrite 
any of the underlying teleport logging.

 * It improves the render-tests script to (optionally) report progress per-
   test, rather than on a per-package basis. My working hypothesis on the
   tests that time out with no output is that go test ./integration is
   waiting for the entire set of integration tests to be complete
   before reporting success or failure. Reporting on a per-test cycle gives
   faster feedback and means that any timed-out builds should give at least
   some idea of where they are stuck.

 * Adds the render-tests filter to the integration and integration-root make
   targets. This will show an overall summary of test results, as well as
    - Discarding log output from passing tests to increase signal-to-noise 
      ratio, and
    - Strongly delimiting the output from each failed test, making failures 
      easier to find.

 * Removes the notion of a failure-only logger in favour of post-processing
   the log events with render-tests. The failure-only logger catches log
   output from the tests and only forwards it to the console if the test 
   fails. Unfortunately, not all log output is guaranteed to pass through
   this logger (some teleport packages do not honour the configured logger,
   and reports from the go race detector certainly don't), meaning some 
   output is presented at the time it happens, and other output is batched
   and displayed at the end of the test. This makes working out what 
   happened where harder than it need be.

In addition, this patch also promotes the render-tests script into a fully-
fledged program, with appropriate makefile targets, make clean support, etc. 
It is now also more robust in the face of non-JSON output from go test
(which happens if a package fails to compile).
2022-01-05 10:42:07 +11:00
Marek Smoliński 5afd0e6204
Update API client: dial auth service with TLS Routing (#9498) 2022-01-03 11:32:45 +01:00
Zac Bergquist 383bf998d3 Improve TestTwoClustersTunnel troubleshooting
This PR likely won't fix any of the flakiness, but should help in
debugging:

- Wait for TeleInstance's process to close
- Don't delete the data directory before the process is shut down
- Include site name in logging to make it easier to distinguish which
  site is shutting down
- Don't call StopAll twice (it was deferred and run manually)
- Include cluster name in log output
- Remove unused TestWrapper left over from the Check framework
2021-12-31 14:53:08 -07:00
Zac Bergquist d35da059a8 Emit the correct session ID for SessionLeave events
Fixes #9574
2021-12-28 09:31:58 -07:00
Forrest Marshall ddd4ab673d tweak test timeout 2021-12-23 10:44:10 -08:00
Marek Smoliński 95547a277b
Fix initKube: broadcast KubeReady event (#9418) 2021-12-20 19:42:43 +00:00
rosstimothy ab857001de
Add jitter and backoff to prevent thundering herd on auth (#9133)
Move cache and resourceWatcher watchers from a 10s retry to a jittered backoff retry up to ~1min. Replace the
reconnectToAuthService interval with a retry to add jitter and backoff there as well for when a node restarts due to
changes introduced in #8102.

Fixes #6889.
2021-12-16 11:41:08 -05:00
Marek Smoliński 141193bd56
Fix NO_PROXY addr logic (#9287) 2021-12-15 15:22:51 +00:00
Marek Smoliński f906831e58
Add ability to run Mongo proxy on separate listener (#9194) 2021-12-14 14:26:14 +01:00
Marek Smoliński d24ae5b1ce
Add ability to run Postgres proxy on separate listener (#8323) 2021-12-10 11:05:19 +01:00
Edoardo Spadolini c3dee235a2
Ensure we don't miss the resolution of an access request (#9193)
This makes it so that tsh will watch for access request resolution on the
correct (root) cluster, and it will not create access requests before the event
watcher is ready.


Fixes #9003 and #9244.
2021-12-10 08:09:36 +00:00
Zac Bergquist 53562aadb0
Use t.Setenv in tests (#9154)
This new feature in Go 1.17 automatically restores the environment
variable to its previous value when a test ends, making it simpler
to set up the environment for tests and less likely that we accidentally
leave behind global state.

Also convert some of the remaining uses of check to standard Go tests.
2021-12-01 10:43:12 -07:00
Trent Clarke cce6db2e5f
Google CloudBuild support (#9090)
Part of this change is implementing a "no secrets" policy for CI. Given that

    we have to support CI for arbitrary external contributors, and
    it is easy to craft a malicious PR that exfiltrates secrets during a CI build

any test that runs under CI must be able to do so without any injected secrets.

This means that several of the tests we currently run under Drone will not be run on GCB, at least as part of the regular CI. The plan is to create a separate task that periodically runs tests that require external credentials (e.g. Kube tests, various backend data stores, etc.) in a more secure way and reports failures asynchronously. And while these tests will not run under CI, they should still be built under CI so that required changes are caught during review.
2021-11-30 12:12:16 +11:00
Alan Parra 64679d2db8
Implement where conditions for active sessions (#9040)
Implements RFD 45 / "where" conditions for active sessions[1].

In a few words, the purpose of the RFD is to allow the creation of roles that
permit users to join only a subset of active sessions (for example, only their
own sessions).

The implementation goes a bit further than the RFD, allowing the conditions to be
applied to `update` and `delete` verbs as well.

Originally implemented by @andrejtokarcik (#8568), tweaks by @codingllama.

[1] https://github.com/gravitational/teleport/blob/master/rfd/0045-ssh_session-where-condition.md


* Implement where conditions for active sessions list/read
* actionWithConditionForList => actionForListWithCondition
* Make Context-exposed sessions follow the RFD API
* Add tests for "where" conditions on active sessions
* Fix typos
* Fix typos and spacing
* Rename "parties" to "participants" in the context session
* Update RFD to reflect PR changes

Update RFD to reflect PR changes

Specifically, mark as implemented and rename `parties` to `participants`.

* Push list authz logic to ServerWithRoles, obsolete cond
* Remove cond from GetSessions signature
* Simplify cast in lib.utils.Fields.GetString
* Add TODO to refactor SearchSessionEvents / stored sessions

Co-authored-by: Andrej Tokarčík <andrej@goteleport.com>
2021-11-18 15:05:13 -08:00
Gabriel Corado df588df526
Add app metadata to app audit events (#8930) 2021-11-18 11:40:04 -03:00
Trent Clarke 97c18fa1a9
Restart entire node on tunnel collapse (#8102)
Fixes #7606, where a node doesn't notice when the tunnel port changes. 

Imagine you have a cluster with a node connected in via a tunnel through a proxy `proxy.example.com` on port `3024`

Now change the proxy config so that `tunnel_public_address` is `proxy.example.com:4024`. You either restart the proxy, or reload the proxy config with a `SIGHUP`.

...and then the node
  a) loses its connection to auth (because the tunnel is gone), and
  b) _doesn't reconnect_, because even though the proxy address hasn't changed,
     the node has cached the old tunnel_public_address and keeps trying to connect
     to that.

You can always manually restart the node to have it reconnect, but that would be a pain if you have thousands of nodes.

To avoid having to manually restart all nodes, this change implements a check for connection failures to the auth server, and restarts the node if there are multiple connection failures in a given period of time. The check as implemented piggybacks on the node's "common.rotate" service, which can already restart the node in certain circumstances, and uses the success of the periodic rotation sync as a proxy for the health of the node's connection to the auth server.

See-Also: #7606
2021-11-17 12:01:48 +11:00
Marek Smoliński afab1aa3dd
Fix dialing kube trusted cluster in v2 teleport config (#8993) 2021-11-16 03:15:09 -08:00
Trent Clarke 3956ed27a6
Fix race condition in integration tests. (#8888)
Some integration tests modify global "constants" to speed up test
execution (e.g. shortening polling intervals). This is occasionally
tripping the Go data race detector, so I have added explicit
serialisation to reading and writing these global settings.

These values are only ever changed in a test environment, and there
should be zero contention for them in a non-test environment.
2021-11-10 11:34:34 +11:00
Zac Bergquist 7535fb0880 integration: name our subtests
Stop using t.Run("", ..), as it makes it impossible to tell which
subtest failed.
2021-11-04 15:50:25 -06:00
Roman Tkachenko d87ee8f640
Fix mongo access with mfa and add tests (#8799) 2021-11-02 12:06:58 -07:00
Trent Clarke 5463c799ea
Fix race condition in PipeNetCon (#8643)
The race condition detector is being tripped by a concurrent `Write` and
`Close` in the `PipeNetCon` in several integration tests. This is a naive
fix to serialize the write and close operations to resolve the race
condition.

The affected tests were also not handling asynchronous error reporting
correctly (i.e. it's not legal to call `require.XYZ()` from a goroutine
other than the one executing the test function.). This patch introduces
some plumbing to marshal asynchronous errors back into the main test
routine before failing the test.
2021-10-28 09:38:51 +11:00
Forrest Marshall babd6b07dd remove OnlyRecent behavior 2021-10-22 16:42:33 -07:00
Marek Smoliński 17a5cadabb
Add Proxy listener mode and proxy v2 configuration (#8511) 2021-10-21 14:45:47 +02:00
Marek Smoliński 7606d330e9
AWS CLI access (#8151) 2021-10-19 10:43:53 +02:00
Zac Bergquist 01ced111f4
Add RBAC for Windows desktop access (#8520)
* Add RBAC for Windows desktop access

This commit adds RBAC checks for Windows Desktops as described in
RFD 33 and RFD 34:

- add Windows desktop logins & labels to role definition
- introduce new file config for host labels based on a regexp match
- auth server API performs access checking for Windows desktop resources
- add RDP client callback to authorize the user
- support user/role locks
- respect the client idle timeout setting

Note: in cases where a connection is terminated due to RBAC, the web UI
currently displays "websocket connection failed" because the connection
is closed from the server. We'll need to follow up with a nice error
message for the client side to improve the UX here.

Other changes:

* Remove OSS RBAC migration marked for deletion
* Stop creating a default admin role
* add wildcard desktop access to the preset access role

Updates #7761
2021-10-12 14:52:59 -06:00
Nic Klaassen 2d10515f19
Implement Simplified Node Joining (#8250) 2021-10-08 10:41:28 -07:00
Marek Smoliński 56c536e61f
ALPN DB Proxy fix insecure flag (#8440) 2021-10-08 14:38:51 +02:00
Brian Joerger 2c8342c9de
Remove RoleConditions type alias from lib/services. (#8441) 2021-10-05 14:04:18 -07:00
Roman Tkachenko 8c3fac832c
Set flush interval when forwarding application http requests (#8359)
Signed-off-by: Roman Tkachenko <roman@gravitational.com>

Co-authored-by: Stefan Sedich <stefan.sedich@gmail.com>
2021-09-28 15:16:57 -07:00
Nic Klaassen 99cc8eb5ef
Require enterprise license for HSM support (#8370) 2021-09-27 10:40:47 -07:00
Marek Smoliński e8f9220fe7
Fix ALPN SNI Proxy TLS termination for DB connections (#8303) 2021-09-24 09:42:13 +02:00
Zac Bergquist 839cdcfa97
Convert GenerateServerKeys to GRPC (#8193)
This commit contains 2 changes:

1. Rename GenerateServerKeys to GenerateHostCerts.
   This is a more accurate name and consistent with the existing
   GenerateUserCerts endpoint.
2. Change the request type to include a single role, rather than a
   list of roles. We only ever allowed a single role in the list
   anyway, so this change will prevent future mis-use of the API.

Note: a side effect of this change is we now have two similar endpoints:
- GenerateHostCert: old API that generates SSH cert only
- GenerateHostCerts: a newer API that generates SSH and TLS certs

To avoid making this change too big, we'll aim to deprecate
GenerateHostCert in the future.
2021-09-13 14:37:28 -07:00
Marek Smoliński c92b7dc435
Fix linter: remove unused code (#8214) 2021-09-13 20:23:19 +02:00
rosstimothy 7c327ee296
Fix interactive sessions always exiting with code 0 (#8081)
* propagate error codes from interactive ssh sessions correctly (#3202)
2021-09-13 13:41:59 -04:00
Marek Smoliński c142b656c8
ALPN SNI Proxy (#7524) 2021-09-13 11:54:49 +02:00
Zac Bergquist 8a15c9a3a6
Require that public TLS and SSH keys are provided to register via token (#8135)
* Require that public TLS and SSH keys are provided to register via token

The original behavior attempted to make providing public keys optional,
and would generate keys if they were not provided. This had several
problems:

- The auth server is generating private keys for nodes and is
  potentially able to share them over the network.
- The return value for keys.Key would sometimes be set and sometimes
  be empty (the key is only set if the auth server generated it and
  knows what the key is)
- We only ever relied on this behavior as a shortcut in test code.
  In the production code this behavior was never used (and actually
  never worked due to a bug that would overwrite and discard the
  generated private key)

This commit requires that public keys are always provided, ensuring
that the private key is generated locally and never known by the
auth server.

It also results in a cleaner error message when either or both of the
public keys are missing from the request.

* Address review comments

* Fix tests that relied on certs being generated
2021-09-08 10:17:37 -07:00
Roman Tkachenko 3410bc8594
Dynamically register/unregister database resources (#7957) 2021-09-01 15:27:02 -07:00
Andrew Lytvynov 5b090f8633
Connect proxy <-> windows_desktop_service <-> RDP server (#7990)
* Connect proxy <-> windows_desktop_service <-> RDP server

Link together the proxy (websocket), service (mTLS) and RDP client. Pass
target desktop UUID via SNI on the TLS connection from the proxy.

* Use client CAs to validate incoming desktop_service connections

* Send binary frames on desktop websocket
2021-08-30 11:22:39 -07:00
Alan Parra dba49bfad6
Lint and fix missing license headers (#8075)
Introduce new make targets to check and add license headers to files
("make lint-license" and "make fix-license"). License checking is now a part of
"make lint" as well.

Initial attempts used goheader, but it caused "make lint-go" to become about 9x
slower (if not more), plus it only targets go files. Google's addlicense is fast
enough and targets however many file types we want.

Existing files that were missing licenses got the header added, using the
current year as the license date.

* Introduce lint-license and fix-license make targets
* Ignore generated files
* Add license to go files
* Replace irregular licenses with standard copyright/license
* Add license to proto files
* Install addlicense in build.assets Dockerfile
2021-08-30 09:44:09 -07:00
Nic Klaassen da951723f6
Add file configuration for HSMs (#7959) 2021-08-18 21:58:05 -07:00
Nic Klaassen c48ee9f062
Add support for HSM CA rotation (#7862) 2021-08-18 21:21:43 -07:00
Brian Joerger 25c9c982db
API client tunnel address discovery fix (#7533) 2021-08-11 14:34:50 -07:00
Trent Clarke 1d37ede936
Do not exit teleport when unable to enumerate k8s cluster (#7523)
Teleport will fail to start if a k8s cluster is unavailable when
using the kubeconfig in a `kubernetes_service` configuration. This means
that a single missing cluster can disrupt _all_ of the configured
clusters, even if the others are online.

This change makes failing the cluster credential enumeration a
per-k8s-cluster warning, rather than a stop-the-world error.

It also expands the testing shims inside the k8s proxy to allow more
sophisticated mocked scenarios, in order to test the above.

See-Also: #7215
2021-08-10 11:04:26 +10:00
Joel 9dffd4dc32
Fix soundness issues in uacc (#7785) 2021-08-09 11:15:52 +02:00
Roman Tkachenko 629042ed30
Decouple database server from database (#7771) 2021-08-05 01:50:21 -07:00
Zac Bergquist e5b6ca7d9b
Ensure defaults are set for DB integration tests (#7787)
setDefaultIfNotSet needs to have a pointer receiver, otherwise
it will be setting defaults on a copy of the options, leaving
the original unset.
2021-08-04 07:13:41 -07:00
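The receiver problem fixed above is easy to reproduce in isolation. In this sketch the `options` type, its `address` field and the default value are hypothetical; only the method name `setDefaultIfNotSet` comes from the commit message.

```go
package main

import "fmt"

type options struct {
	address string
}

// setDefaultByValue has a value receiver: it fills in the default on a
// copy of the struct, so the caller's options stay unset.
func (o options) setDefaultByValue() {
	if o.address == "" {
		o.address = "localhost:5432"
	}
}

// setDefaultIfNotSet has a pointer receiver: it updates the caller's struct.
func (o *options) setDefaultIfNotSet() {
	if o.address == "" {
		o.address = "localhost:5432"
	}
}

func main() {
	var a, b options
	a.setDefaultByValue()  // a.address stays empty
	b.setDefaultIfNotSet() // b.address now holds the default
	fmt.Printf("value receiver:   %q\n", a.address)
	fmt.Printf("pointer receiver: %q\n", b.address)
}
```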
Nic Klaassen a8db09fe1e
Use KeyStore instead of raw keys with CAs (#7615) 2021-08-03 10:13:08 -07:00
Brian Joerger 9b8b9d6d0c
rollback - Upgrade api version. (#7751) 2021-07-30 15:34:19 -07:00
Brian Joerger c040aca4c1
Upgrade api version. (#7609) 2021-07-28 13:51:21 -07:00
Nic Klaassen ffd401a98e
Replace GenerateSelfSignedCAWithPrivateKey with GenerateSelfSignedCAWithSigner (#7612) 2021-07-23 12:07:08 -07:00
Andrej Tokarčík d5ca862280
Apply locks to connections tracked by srv.Monitor (#7506) 2021-07-23 14:11:50 +02:00
Eugene Yakubovich 67c0eb3b4c Add restricted session
Adds the ability to block network traffic on SSH sessions.
The deny/allow lists of IPs are specified in teleport.yaml file.
Supports both IPv4 and IPv6 communication.

This feature currently relies on enhanced recording for
cgroup management so that needs to be enabled as well.

-- Design rationale:
This patch uses Linux Security Module (LSM) hooks, specifically
security_socket_connect and security_socket_sendmsg, to control
egress traffic. The LSM provides two advantages over socket filtering
program types.
- It's executed early enough that the task information is available.
  This makes it easy to report PID, COMM, etc.
- It becomes a model for extending restrictions beyond networking.

The set of enforced cgroups is stored in a BPF hash map and the
deny/allow lists are stored in BPF trie maps. An IP address is
first checked against the allow list. If found, it's checked for
an override in the deny list. The policy is default deny. However,
the absence of the NetworkRestrictions API object is allow all.

IPv4 addresses are additionally registered in IPv6 trie (as mapped)
to account for dual stacks. However it is unclear if this is sufficient
as 4-to-6 transition methods utilize a multitude of translation and
tunneling methods.
2021-07-16 16:49:04 -07:00
Joel 98d51c529d
Implement an API for exporting session events (#7360) 2021-07-14 01:29:00 +02:00
Joel 01dd8068aa
Allow querying for audit events in either an ascending or descending order (#7425) 2021-07-12 13:14:05 +02:00
Nic Klaassen c50f4465f4
Use ssh.Signer instead of raw private keys (#7438)
* use ssh.Signer for (Host|User)CertParams
2021-07-06 10:13:09 -07:00
Roman Tkachenko 6b9726f961
Add MongoDB database access support (#7213) 2021-06-21 22:54:05 -07:00
Brian Joerger bd07d7be20
CheckAndSetDefaults sets all defaults. (#6846) 2021-06-18 12:57:29 -07:00
Andrej Tokarčík d63d144e8e
Move ClusterID field from ClusterConfig to ClusterName (#7050) 2021-06-18 18:42:09 +02:00
Trent Clarke 52fb813390
Adds per-node ability to disable ssh TCP forwarding (#6989)
Prior to this change, TCP forwarding over SSH could only be disallowed
by user-based rules, rather than by individual target nodes.

This change adds:
  * the `port_forwarding` key to the yaml SSH config block, with a boolean value
  * Plumbing to pipe the resulting config value through to the SSH server
  * A predicate check in the SSH server to [dis]allow port forwarding based on the setting.

This change also:
    * adds a common way for integration tests to await the establishment of an SSH session
    * refactors several integration tests to use this new method rather than manually waiting
    * adds some marshaling code to move errors from spawned goroutines back into the 
      main test routine in verifySessionJoin()

See-Also: Issue #6783
2021-06-16 20:17:26 -05:00
Andrew Lytvynov d4247cb150
hsm: migrate CA storage schema (#7245)
* hsm: migrate CA storage schema

Migrate types.CertAuthorityV2 schema according to
https://github.com/gravitational/teleport/blob/master/rfd/0025-hsm.md#backend-storage

Includes proto changes, types.CertAuthority wrapper changes and data
migration.

Note that we keep and update the old fields for backwards-compatibility.
If a cluster is upgraded to v7 and then downgraded back to v6,
everything should keep working.

* Address review feedback
2021-06-16 12:17:03 -05:00
Andrej Tokarčík 3d22eaac0e
Turn AuditConfig into a standalone resource (#6997) 2021-06-14 15:49:22 -05:00
Nic Klaassen 109fa7443d
Add V4 Roles (#7118) 2021-06-10 11:52:10 -07:00
Brian Joerger 4d36870ff0
Remove remaining API aliases (#7137) 2021-06-08 12:08:55 -07:00
Andrej Tokarčík 9410346b8d
Make SessionRecordingConfig resource dynamically configurable (#7054) 2021-06-08 08:30:44 -07:00
Andrej Tokarčík 2747cc75bf
Move ClusterConfig auth fields into ClusterAuthPreference (#6876) 2021-06-07 11:07:02 -07:00
Brian Joerger 7bff7c41bd
Remove API aliases (#6983) 2021-06-04 13:29:31 -07:00
Andrej Tokarčík 3ca21aca9a
Make ClusterNetworkingConfig resource dynamically configurable (#7013) 2021-06-04 19:42:50 +02:00
jane quin 5c78f1f756
Improve Access Request Events (#6863) 2021-06-03 14:28:38 -07:00
Marek Smoliński 24d5bbd949
Add delay in TestRootLeafIdleTimeout test (#7116) 2021-06-03 21:58:37 +02:00
Andrew Lytvynov cd2f4fceb7
Remove JSON schema validation (#6685)
* Remove JSON schema validation

Removing JSON schema validation from all resource unmarshalers.

--- what JSON schema gets us

Looking at the JSON schema spec and our usage, here are the supposed benefits:
- type validation - make sure incoming data uses the right types for the right fields
- required fields - make sure that mandatory fields are set
- defaulting - set defaults for fields
- documentation - schema definition for our API objects

Note that it does _not_ do:
- fail on unknown fields in data
- fail on a required field with an empty value

--- what replaces it

Based on the above, it may seem like JSON schema provides value.
But it's not the case, let's break it down one by one:
- type validation - unmarshaling JSON into a typed Go struct does this
- required fields - only checks that the field was provided, doesn't actually check that a value is set (e.g. `"name": ""` will pass the `required` check)
  - so it's pretty useless for any real validation
  - and we already have a separate place for proper validation - `CheckAndSetDefaults` methods
- defaulting - done in `CheckAndSetDefaults` methods
  - `Version` is the only annoying field, had to add it in a bunch of objects
- documentation - protobuf definitions are the source of truth for our API schema

--- the benefits

- performance - schema validation does a few rounds of `json.Marshal/Unmarshal` in addition to actual validation; now we simply skip all that
- maintenance - no need to keep protobuf and JSON schema definitions in sync anymore
- creating new API objects - one error-prone step removed
- (future) fewer dependencies - we can _almost_ remove the Go libraries for schema validation (one transient dependency keeping them around)

* Remove services.SkipValidation

No more JSON schema validation so this option is a noop.
2021-06-01 15:27:20 -07:00
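The replacement described above, typed unmarshaling for type validation plus `CheckAndSetDefaults` for required fields and defaults, can be sketched like this. The `resource` type, its fields, the `unmarshalResource` helper and the `"v1"` default are hypothetical; only the `CheckAndSetDefaults` method name comes from the commit message.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// resource is a hypothetical API object used only for this illustration.
type resource struct {
	Version string `json:"version"`
	Name    string `json:"name"`
	TTL     int    `json:"ttl_seconds"`
}

// CheckAndSetDefaults validates required fields and fills in defaults.
// Unlike a schema "required" check, it rejects an empty name as well.
func (r *resource) CheckAndSetDefaults() error {
	if r.Name == "" {
		return errors.New("missing resource name")
	}
	if r.Version == "" {
		r.Version = "v1"
	}
	return nil
}

// unmarshalResource relies on the typed struct for type validation and
// on CheckAndSetDefaults for everything else.
func unmarshalResource(data []byte) (*resource, error) {
	var r resource
	if err := json.Unmarshal(data, &r); err != nil {
		return nil, err // e.g. a string where an integer is expected
	}
	if err := r.CheckAndSetDefaults(); err != nil {
		return nil, err
	}
	return &r, nil
}

func main() {
	inputs := []string{
		`{"name":"example"}`,
		`{"name":""}`,
		`{"name":"example","ttl_seconds":"soon"}`,
	}
	for _, in := range inputs {
		r, err := unmarshalResource([]byte(in))
		fmt.Printf("%-42s resource=%+v err=%v\n", in, r, err)
	}
}
```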
Marek Smoliński eb7bb01d34
Support disconnect_expired_cert for database access (#6857) 2021-05-31 10:26:50 +02:00
Brian Joerger 5fbffaab80
keypaths package (#6848) 2021-05-27 10:31:05 -07:00
Trent Clarke d710424ae7
Ports some integration tests to Testify/Subtests (#6884)
In an attempt to make it easier to
 1) navigate the integration test output,
 2) find the cause of test failures, and
 3) run individual integration tests from the command line,

...this change ports some of the OSS integration tests away from
GoCheck and implements them in terms of the standard `testing`
package.

The main changes are:
 * Test suites are now constructed as a normal Test function
   with many subtests.
 * The GoCheck assertions have been replaced with equivalent
   assertions from `testify/require`, for example:
     `c.Assert(err, check.IsNil)`
   becomes
     `require.NoError(t, err)`

   ... and so on
2021-05-26 19:05:46 -07:00
Nic Klaassen f268ba173e
Stop registering a Kubernetes cluster named after the Teleport cluster (#6786) 2021-05-25 17:50:35 -07:00
Andrej Tokarčík 555695dfdd
Introduce SessionRecordingConfig extracting fields from ClusterConfig (#6708) 2021-05-19 12:01:37 -07:00
a-palchikov ee6e2c85d8
AuditLog/grpc server data race (#6170)
* Avoid test flake by ensuring the gRPC server is shutdown gracefully before closing the audit log

* Fix lint warnings. Move tunnel server's Close earlier to close the proxy watcher and release grpc traffic

* Use graceful shutdown selectively until all tests have improved support for it

* Move session recorder clean up to session.Close

* Always use graceful shutdown for TLS.
2021-05-18 17:57:57 -07:00
Joel b68c519b4c
Implement RFD 19: Event Iteration API (#6731) 2021-05-18 16:46:01 +02:00
Andrej Tokarčík ad00c6c789
Introduce ClusterNetworkingConfig extracting fields from ClusterConfig (#6638) 2021-05-07 13:54:08 +02:00
Trent Clarke 769b4b5eec
Implements RFD-0022 - OpenSSH-compatible Agent Forwarding (#6525)
Prior to this change, `tsh` would only ever forward the internal key
agent managed by `tsh` to a remote machine.

This change allows a user to specify that `tsh` should forward either
the `tsh`-internal keystore, or the system key agent at `$SSH_AUTH_SOCK`.

This change also brings the `-A` command-line option into line with
OpenSSH.

For more info refer to RFD-0022.

See-Also: #1571
2021-05-06 17:17:50 -07:00
Roman Tkachenko db6fb57dae
Add app access headers rewrite (#6601) 2021-05-06 11:24:49 -07:00
Roman Tkachenko 7f01f2d4b6
Propagate external traits to leaf clusters (#6540) 2021-04-29 09:39:43 -07:00
Brian Joerger 9def18cb9f
gRPC conversions - Nodes (#6535) 2021-04-28 18:27:12 -07:00
Andrej Tokarčík 8e5ff95014
Provide a dedicated API endpoint for app FQDN resolving (#6449)
Currently, an app's target FQDN can be obtained only using the endpoint
for creating new app sessions.  The OAuth-style back-and-forth redirects
between the app launcher and the app itself are therefore forced to
generate an unnecessary additional app session just to resolve the FQDN.

The new endpoint introduced here allows such FQDNs to be resolved
directly, without creating an additional app session.
2021-04-26 13:31:56 -07:00
Andrew Lytvynov 5265688fc8 Revert "Node session race (#6195)"
This reverts commit 4acf50902c.
2021-04-26 17:24:06 +00:00
a-palchikov 4acf50902c
Node session race (#6195)
* Attempt to isolate and improve state handling of a NodeSession.

* Add terminal close for kube terminal tests

* Address review comments

* Small tweaks

Co-authored-by: Andrew Lytvynov <andrew@goteleport.com>
2021-04-22 17:16:28 -07:00
Brian Joerger d830ed6db7
Refactor api package and docs to use pkg.go.dev effectively. (#6388) 2021-04-20 16:44:17 -07:00