* Add integration test for daemon.Service.AddCluster
* Call SaveProfile on clusterClient rather than cfg
This way we don't have to explicitly set ClientStore, since
client.NewClient(cfg) does that for us.
Back in the day, the ping response didn't include the actual name of
the cluster. So instead we had to make a separate request for /web/config.js.
However, the ClusterName field was added to /webapi/ping in v10 (#12848).
Since we already ping the server, let's use the name from the ping response
instead.
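A minimal sketch of the idea, assuming the ping payload is JSON with a `cluster_name` key. The `pingResponse` struct and `clusterNameFromPing` helper are made up for illustration and are not the real webclient types.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// pingResponse is a hypothetical, trimmed-down view of the /webapi/ping
// payload. The cluster_name key is an assumption made for this sketch.
type pingResponse struct {
	ClusterName string `json:"cluster_name"`
}

// clusterNameFromPing extracts the cluster name from a ping response body
// that the caller has already fetched, so no second request is needed.
func clusterNameFromPing(body []byte) (string, error) {
	var resp pingResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return "", fmt.Errorf("decoding ping response: %w", err)
	}
	if resp.ClusterName == "" {
		// Older servers would not send the field at all.
		return "", errors.New("ping response does not include a cluster name")
	}
	return resp.ClusterName, nil
}

func main() {
	body := []byte(`{"cluster_name": "example.teleport.sh", "server_version": "10.0.0"}`)
	name, err := clusterNameFromPing(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(name)
}
```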
- Add a generalized client store made up of a key, profile, and trusted certs store. Each sub store can support different backends (~/.tsh, identity_file, in-memory).
- Replace custom identity file handling with in-memory client store.
- Fix issues with trusted certs handling.
Wire device authentication into `tsh`, so it attempts to acquire device
certificates after user login. This affects direct logins (`tsh login`),
indirect logins (RetryWithRelogin) and Connect.
If authentication fails (non-Enterprise cluster, device not enrolled, etc.), `tsh`
proceeds as usual, but the final user certificate won't contain device
extensions.
gravitational/teleport.e#514
* Add TTL field to integration/helpers.UserCredsRequest
This will let us create expired user certs by providing a negative TTL.
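To illustrate why a negative TTL is enough, here is a rough sketch using a plain self-signed x509 cert. `issueTestCert` and `isExpired` are hypothetical helpers; the real UserCredsRequest flow issues Teleport user certs and is not shown here.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// issueTestCert creates a self-signed cert that expires ttl from now.
// A negative ttl yields a cert that is already expired.
func issueTestCert(ttl time.Duration) (*x509.Certificate, error) {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		return nil, err
	}
	notAfter := time.Now().Add(ttl)
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "test-user"},
		// Keep NotBefore earlier than NotAfter even for negative TTLs.
		NotBefore: notAfter.Add(-24 * time.Hour),
		NotAfter:  notAfter,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, pub, priv)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

// isExpired reports whether the cert's validity window has already ended.
func isExpired(cert *x509.Certificate) bool {
	return time.Now().After(cert.NotAfter)
}

func main() {
	for _, ttl := range []time.Duration{time.Hour, -time.Hour} {
		cert, err := issueTestCert(ttl)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("ttl=%v expired=%v\n", ttl, isExpired(cert))
	}
}
```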
* Reissue gateway cert if middleware detects it expired
* Add integration test for gateway cert renewal
Many of our tests (db package, I'm looking at you) generate a lot of RSA keys. This has two main side effects: the tests are slow, and they are flaky because CPU usage spikes at random moments when the tests run in parallel.
This change pre-generates RSA keys at the beginning of each test module and reuses them in randomized order, which makes it less likely that the same key is used multiple times within one test.
I had to move a few files to avoid circular dependencies.
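The approach can be sketched roughly as follows. `keyPool`, `newKeyPool` and `get` are hypothetical names, and the pool size and shuffling strategy are assumptions rather than what the change actually uses.

```go
package main

import (
	"crypto/rand"
	"crypto/rsa"
	"fmt"
	mathrand "math/rand"
	"sync"
)

// keyPool hands out pre-generated RSA keys so that individual tests
// don't pay the cost of key generation.
type keyPool struct {
	mu   sync.Mutex
	keys []*rsa.PrivateKey
	next int
}

// newKeyPool generates size keys up front, e.g. from TestMain.
func newKeyPool(size, bits int) (*keyPool, error) {
	if size <= 0 {
		return nil, fmt.Errorf("pool size must be positive, got %d", size)
	}
	keys := make([]*rsa.PrivateKey, 0, size)
	for i := 0; i < size; i++ {
		key, err := rsa.GenerateKey(rand.Reader, bits)
		if err != nil {
			return nil, err
		}
		keys = append(keys, key)
	}
	return &keyPool{keys: keys}, nil
}

// get returns the next key. The pool is reshuffled at the start of every
// pass, so a key repeats only after all other keys have been handed out.
func (p *keyPool) get() *rsa.PrivateKey {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.next == 0 {
		mathrand.Shuffle(len(p.keys), func(i, j int) {
			p.keys[i], p.keys[j] = p.keys[j], p.keys[i]
		})
	}
	key := p.keys[p.next]
	p.next = (p.next + 1) % len(p.keys)
	return key
}

func main() {
	pool, err := newKeyPool(3, 2048)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	a, b := pool.get(), pool.get()
	fmt.Println("first two keys differ:", a != b)
}
```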
* Log gateway.Close errors during test cleanups
Unless gateway.Close is called on a gateway that was already closed, it
shouldn't return an error.
However, when working on handling expired certs in Connect I ran into a
buggy test where that error from gateway.Close provided a crucial clue
in fixing the bug. But because initially it was simply ignored, it took
me a while to figure out what was going on.
That's why this commit adds logging around those errors.
Two of those three places are helper functions used in a variety of
tests, which is why they call t.Cleanup. The third place does call
gateway.Close eventually, but we still use t.Cleanup in case the
execution doesn't get to that point.
* Automatically add useful fields to gateway loggers
It's useful to see what resource the gateway is targeting and what is
the URI of the gateway.
Previously the field with URI was hardcoded in cluster_gateways.go or
added only when cfg.Log was nil, meaning that we weren't able to benefit
from it in places such as gateway_test.go.
This commit makes it so that the `resource` and `gateway` fields are
added to any logger that is passed through gateway.Config.
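A rough sketch of the pattern. It uses the standard library's log/slog instead of the logger used in lib/teleterm, and `config`, `checkAndSetDefaults` and the example URIs are stand-ins made up for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"log/slog"
	"strings"
)

// config is a stand-in for gateway.Config with only the relevant fields.
type config struct {
	URI       string
	TargetURI string
	Log       *slog.Logger
}

// checkAndSetDefaults decorates the logger whether or not the caller
// supplied one, so the fields are never missing.
func (c *config) checkAndSetDefaults() {
	if c.Log == nil {
		c.Log = slog.Default()
	}
	c.Log = c.Log.With("resource", c.TargetURI, "gateway", c.URI)
}

// logWithConfig returns the line emitted by a caller-supplied logger
// after it has passed through checkAndSetDefaults.
func logWithConfig(uri, targetURI, msg string) string {
	var buf bytes.Buffer
	cfg := config{
		URI:       uri,
		TargetURI: targetURI,
		Log:       slog.New(slog.NewTextHandler(&buf, nil)),
	}
	cfg.checkAndSetDefaults()
	cfg.Log.Info(msg)
	return buf.String()
}

// hasField reports whether a text log line carries key=value.
func hasField(line, key, value string) bool {
	return strings.Contains(line, key+"="+value)
}

func main() {
	line := logWithConfig("/gateways/1234", "/clusters/foo/dbs/bar", "gateway opened")
	fmt.Print(line)
	fmt.Println("has resource field:", hasField(line, "resource", "/clusters/foo/dbs/bar"))
}
```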
* Remove copylocks warning from Gateway.NewWithLocalPort
gateway.Gateway holds a mutex in one of its fields. NewWithLocalPort
accepted gateway by value so vet was issuing a warning about copying a
lock.
NewWithLocalPort doesn't actually use the copied lock. But it makes
sense to get rid of the warning anyway.
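For illustration, a simplified sketch of the fix. The `gateway` struct and its fields are stand-ins rather than the real type. Had the parameter been declared as `g gateway`, `go vet` would report that the call copies a lock.

```go
package main

import (
	"fmt"
	"sync"
)

// gateway is a simplified stand-in for gateway.Gateway.
type gateway struct {
	mu         sync.RWMutex
	localPort  string
	targetName string
}

// newWithLocalPort accepts a pointer, so the mutex inside gateway is
// never copied. The new gateway starts with its own zero-value mutex.
func newWithLocalPort(g *gateway, port string) *gateway {
	g.mu.RLock()
	defer g.mu.RUnlock()
	return &gateway{
		localPort:  port,
		targetName: g.targetName,
	}
}

func main() {
	old := &gateway{localPort: "1000", targetName: "postgres"}
	updated := newWithLocalPort(old, "2000")
	fmt.Println(old.localPort, updated.localPort, updated.targetName)
}
```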
* Make ReissueDBCerts accept tlsca.RouteToDatabase as arg
ReissueDBCerts used to accept a full-blown types.Database object just to
read a couple of fields from it. In the context of Connect, such an object
is obtainable only by making a request to the cluster.
However, in the upcoming PR we want to be able to reissue the cert
without having to perform an unnecessary request to the cluster.
gateway.Gateway already holds all data we need to reissue the cert, so
let's make ReissueDBCerts accept tlsca.RouteToDatabase instead of
types.Database to avoid making that extra request.
* Add Gateway.ReloadCert
In the upcoming PR, after we reissue the db cert, we need to be able to
update the cert used by the running alpn.LocalProxy. This commit exposes
exactly that functionality.
Also, this commit adds RWMutex to Gateway to avoid a situation where
multiple goroutines attempt to reload the cert. This shouldn't happen
under normal circumstances but better safe than sorry.
RWMutex is also used for any field on Gateway that has a setter.
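A minimal sketch of the locking scheme, with a byte slice standing in for the TLS cert and hypothetical method names. The real ReloadCert also has to update the running alpn.LocalProxy, which is left out here.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errReissueFailed = errors.New("reissue failed")

// gateway is a simplified stand-in for gateway.Gateway.
type gateway struct {
	mu        sync.RWMutex
	cert      []byte // stand-in for the cert used by the local proxy
	localPort string
}

// reloadCert swaps the cert under the write lock, so concurrent callers
// are serialized. On failure the old cert stays in place.
func (g *gateway) reloadCert(load func() ([]byte, error)) error {
	g.mu.Lock()
	defer g.mu.Unlock()
	cert, err := load()
	if err != nil {
		return fmt.Errorf("reloading cert: %w", err)
	}
	g.cert = cert
	return nil
}

// currentCert only needs the read lock.
func (g *gateway) currentCert() []byte {
	g.mu.RLock()
	defer g.mu.RUnlock()
	return g.cert
}

// setLocalPort shows the same lock guarding a field with a setter.
func (g *gateway) setLocalPort(port string) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.localPort = port
}

func main() {
	g := &gateway{cert: []byte("old")}
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = g.reloadCert(func() ([]byte, error) { return []byte("new"), nil })
		}()
	}
	wg.Wait()
	g.setLocalPort("2000")
	fmt.Println(string(g.currentCert()))
}
```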
* Add basic implementation of LocalProxyMiddleware
* Add OnExpiredCert callback to gateway.Config
This callback will let the layer above gateway.Gateway handle a
situation in which the cert used by the gateway has expired but there's
a client that tries to make a connection through the gateway.
gateway.Gateway doesn't have the ability to reissue the cert by itself,
so we need to accept a callback from above.
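Roughly, the callback could be wired like this. All names besides OnExpiredCert are hypothetical, and the callback signature is an assumption rather than the one used by gateway.Config.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// config mirrors the idea behind gateway.Config: the gateway cannot
// reissue certs itself, so the layer above supplies a callback.
type config struct {
	// CertTTL is the validity of the initial cert. Negative means expired.
	CertTTL time.Duration
	// OnExpiredCert is called when a client connects through the gateway
	// and the cert has expired.
	OnExpiredCert func(g *gateway) error
}

type gateway struct {
	cfg          config
	certNotAfter time.Time
}

func newGateway(cfg config) *gateway {
	return &gateway{cfg: cfg, certNotAfter: time.Now().Add(cfg.CertTTL)}
}

// setCertTTL stands in for loading a freshly issued cert.
func (g *gateway) setCertTTL(ttl time.Duration) {
	g.certNotAfter = time.Now().Add(ttl)
}

// handleConnection is what a middleware hook could run for each new
// client connection.
func (g *gateway) handleConnection() error {
	if time.Now().Before(g.certNotAfter) {
		return nil
	}
	if g.cfg.OnExpiredCert == nil {
		return errors.New("gateway cert expired and no OnExpiredCert callback is set")
	}
	return g.cfg.OnExpiredCert(g)
}

func main() {
	g := newGateway(config{
		CertTTL: -time.Minute,
		OnExpiredCert: func(g *gateway) error {
			fmt.Println("cert expired, reissuing")
			g.setCertTTL(time.Hour)
			return nil
		},
	})
	fmt.Println("first connection:", g.handleConnection())
	fmt.Println("second connection:", g.handleConnection())
}
```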
* Refactor error handling for GetConfigForClient
If GetConfigForClient returns an error, the error is not visible in any
logs or by the client making the request.
Instead of failing, we return a config that doesn't let any client
through. It has ClientAuth set to RequireAndVerifyClientCert, but the
config lacks ClientCAs to verify the client cert against.
This also means that when a client with an invalid cert dials the server,
it's going to fail on net.Conn.Read and not tls.Dialer.DialContext. This
will help us add uniform tests in the next commit.
This use of GetConfigForClient is more similar to what we do in other
parts of the codebase, for example in lib/srv/db/proxyserver.go.
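A rough sketch of this shape of GetConfigForClient. The helper names and the loader callback are hypothetical. Note one deliberate difference from the description above: the sketch sets ClientCAs to an explicitly empty pool instead of leaving it unset, so that no CA can ever match.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
	"log"
)

// denyAllConfig requires a client cert but trusts no CA, so every
// client fails verification during the handshake.
func denyAllConfig(base *tls.Config) *tls.Config {
	cfg := base.Clone()
	cfg.ClientAuth = tls.RequireAndVerifyClientCert
	cfg.ClientCAs = x509.NewCertPool() // empty on purpose
	return cfg
}

// newGetConfigForClient builds a tls.Config.GetConfigForClient callback.
// Errors from loadClientCAs are logged and answered with a deny-all
// config instead of being returned.
func newGetConfigForClient(base *tls.Config, loadClientCAs func() (*x509.CertPool, error)) func(*tls.ClientHelloInfo) (*tls.Config, error) {
	if base == nil {
		base = &tls.Config{}
	}
	return func(*tls.ClientHelloInfo) (*tls.Config, error) {
		pool, err := loadClientCAs()
		if err != nil {
			log.Printf("could not load client CAs, denying all clients: %v", err)
			return denyAllConfig(base), nil
		}
		cfg := base.Clone()
		cfg.ClientAuth = tls.RequireAndVerifyClientCert
		cfg.ClientCAs = pool
		return cfg, nil
	}
}

// deniesAllClients reports whether cfg is the deny-all config.
func deniesAllClients(cfg *tls.Config) bool {
	return cfg.ClientAuth == tls.RequireAndVerifyClientCert &&
		cfg.ClientCAs != nil &&
		cfg.ClientCAs.Equal(x509.NewCertPool())
}

// failingLoader simulates the client cert not being readable yet.
func failingLoader() (*x509.CertPool, error) {
	return nil, errors.New("client cert not found")
}

func main() {
	getConfig := newGetConfigForClient(nil, failingLoader)
	cfg, err := getConfig(nil)
	fmt.Println("error:", err, "denies all clients:", deniesAllClients(cfg))
}
```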
* Add tests for teleterm.Serve with TCP address
Previously we'd only test teleterm.Serve with a unix socket, meaning our
whole TCP setup would not be tested.
Testing the TCP server means that we need to set up proper TLS configs
for clients in tests.
We were inconsistent throughout the codebase and would sometimes
use the slices package and other times use our own equivalents
in api/.
This removes our versions in favor of the golang.org/x package that
does the same, which has the added benefit of reducing the surface
area of the public API module.
Note: despite existing uses of the slices package, for some reason
it didn't show up in go.mod or go.sum. Fixed that too.
To enable feature detection in the Connect application, we need to
ping the auth server to understand which features are enabled.
Previously, we could get away with any cluster information stored in the
cluster profile but a proxy dial is necessary now to get an auth ping response.
* Connect: Support creating gRPC client creds from tshd key pair
For tshd-initiated communication, the tshd process will need to create a
client that will connect to a gRPC server operated by the renderer
process of the Electron app.
On Windows, we use gRPC over TCP with mTLS. Each process creates its own
keypair and saves the public key to a predetermined location.
The previous code assumes that tshd is only going to need server
credentials. This commit makes it possible to create client credentials
from the same key pair.
* Refactor server options
* Expand the comment for createServerCredentials
* Remove unnecessary filepath.Join
* generateAndSaveCert: Use os.CreateTemp
Moves from the github.com/golang/protobuf protoc-gen-go plugin to the google.golang.org/
plugins.
This change was a long time coming, but it is now possible since our
dependencies are up to date.
* Move away from deprecated protoc-gen-go plugin
* Embed unimplemented server in handler.Handler
* Embed unimplemented server in multiplexer_test.go
* Update generated protos
* access requests for teleterm
* removed unused imports and named returns
* remove comment
* using timestamppb instead of string for access requests
* updated proto with some more comments
* updated protos with comments
* using clusterClient, comments, and moving validation to daemon for access request delete
* separated GetAccessRequests into separate RPCs
* protobuf updates
* moved requestid check before resolving cluster
* fullstops in comments
* used standard access_request_id through rpc messages
* updated protofiles
* updated daemon service types to match grpc
* added kube advanced search support
* updated protos for kubes in access requests
* testing tag build
* fix detached head
* new tag build
* protobuf update
* lint fixes
* allow drone windows Connect build to include webapps.e
* protobuf files
* remove drone changes and updated comment
* proto changes with comment fixes and changed field order
* protobuf updates
Update metalinter, fix a few lint warnings and replace deprecated linters.
`deadcode`, `structcheck` and `varcheck` are abandoned and now replaced by [`unused`][1].
Since 1.19, `go fmt` reformats godocs according to https://go.dev/doc/comment. I've done a bulk-reformatting of the codebase to keep the linter happy. Backporting is mostly harmless (the exception being `lib/services/role_test.go`, which for some reason breaks the _old_ linter when using the new format).
[1]: https://golangci-lint.run/usage/linters/
* Bump golangci-lint version
* Replace abandoned linters
* Fix bodyclose on lib/auth/github.com
* Fix bodyclose on lib/kube/proxy/streamproto/proto_test.go
* Fix bodyclose on lib/srv/alpnproxy/proxy_test.go
* Fix bodyclose on lib/web/conn_upgrade_test.go
* Silence staticcheck on lib/kube/proxy/forwarder_test.go
* Silence staticcheck on lib/utils/certs_test.go
* Address BuildNameToCertificate deprecation warnings
* Run `go fmt ./...`
* Run `go fmt ./...` on api/
* Ignore formatting in role_test.go
* Remove redundant initializers in lib/srv/uacc/
* Update e/
Primary Changes:
- Remove reliance on Private Key PEM:
- Update native and keygen packages to return PrivateKey instead of PEM key
- Add new PrivateKey interface which implements crypto.Signer
- Replace PEM encoded private key usage where possible
- Replace calls to tls.(Load)X509KeyPair with keys.(Load)X509KeyPair in
client packages
Minor Changes:
- Remove unused agent.AddedKey return from LoadKey
- Simplify sshutils and remove unused code paths
- Add ecdsa and ed25519 key support
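To show why working with crypto.Signer instead of a PEM-encoded RSA key makes other key types easy to support, here is a hedged sketch. It is not the keys package itself: every function name is hypothetical, and PKCS#8 is assumed as the encoding.

```go
package main

import (
	"crypto"
	"crypto/ecdsa"
	"crypto/ed25519"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/sha256"
	"crypto/x509"
	"encoding/pem"
	"errors"
	"fmt"
)

// marshalPrivateKeyPEM encodes any supported key as a PKCS#8 PEM block.
func marshalPrivateKeyPEM(key crypto.Signer) ([]byte, error) {
	der, err := x509.MarshalPKCS8PrivateKey(key)
	if err != nil {
		return nil, err
	}
	return pem.EncodeToMemory(&pem.Block{Type: "PRIVATE KEY", Bytes: der}), nil
}

// parsePrivateKeyPEM returns a crypto.Signer regardless of key type.
func parsePrivateKeyPEM(keyPEM []byte) (crypto.Signer, error) {
	block, _ := pem.Decode(keyPEM)
	if block == nil {
		return nil, errors.New("no PEM block found")
	}
	key, err := x509.ParsePKCS8PrivateKey(block.Bytes)
	if err != nil {
		return nil, err
	}
	signer, ok := key.(crypto.Signer)
	if !ok {
		return nil, fmt.Errorf("unsupported key type %T", key)
	}
	return signer, nil
}

// signAndVerify signs msg through the Signer interface and verifies the
// signature with the matching public key.
func signAndVerify(signer crypto.Signer, msg []byte) (bool, error) {
	switch pub := signer.Public().(type) {
	case ed25519.PublicKey:
		sig, err := signer.Sign(rand.Reader, msg, crypto.Hash(0))
		if err != nil {
			return false, err
		}
		return ed25519.Verify(pub, msg, sig), nil
	case *ecdsa.PublicKey:
		digest := sha256.Sum256(msg)
		sig, err := signer.Sign(rand.Reader, digest[:], crypto.SHA256)
		if err != nil {
			return false, err
		}
		return ecdsa.VerifyASN1(pub, digest[:], sig), nil
	default:
		return false, fmt.Errorf("unsupported public key type %T", pub)
	}
}

// newTestKeys generates one ed25519 and one ECDSA key.
func newTestKeys() ([]crypto.Signer, error) {
	_, edKey, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		return nil, err
	}
	ecKey, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	return []crypto.Signer{edKey, ecKey}, nil
}

func main() {
	keys, err := newTestKeys()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, key := range keys {
		keyPEM, _ := marshalPrivateKeyPEM(key)
		parsed, _ := parsePrivateKeyPEM(keyPEM)
		ok, err := signAndVerify(parsed, []byte("hello"))
		fmt.Printf("%T verified=%v err=%v\n", parsed, ok, err)
	}
}
```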
## What
First part of the Kubernetes [Discovery RFD](https://github.com/gravitational/teleport/pull/13376/) to introduce a Kubernetes server per cluster.
This PR introduces a separate Kubernetes server that uses the already introduced `KubernetesClusterV3`.
## Compatibility
In previous versions, Kubernetes clusters were part of the regular `ServerV2` resource. This refactoring deprecates that use of `ServerV2` but keeps it for compatibility with previous versions.
Everything is backward compatible, so v10 Kubernetes agents and trusted clusters can connect fine.
## Next steps
Once this is merged, a new PR will introduce dynamic registration for Kubernetes Clusters discovered through EKS Discovery.
* Set default shutdown signals for Teleterm
The server in tests was actually shutting down immediately because
signal.Notify, called without an explicit list of signals, relayed all
signals to it and thus closed the server prematurely.
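A small sketch of the defaulting. The `config` struct and the choice of SIGINT and SIGTERM are assumptions for illustration. The relevant detail of os/signal is that Notify relays all incoming signals when no signals are listed.

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
)

// defaultShutdownSignals is an assumed default for this sketch.
var defaultShutdownSignals = []os.Signal{os.Interrupt, syscall.SIGTERM}

// config is a stand-in for the server config.
type config struct {
	ShutdownSignals []os.Signal
}

// checkAndSetDefaults makes sure the list passed to signal.Notify is
// never empty.
func (c *config) checkAndSetDefaults() {
	if len(c.ShutdownSignals) == 0 {
		c.ShutdownSignals = defaultShutdownSignals
	}
}

// notifyOnShutdown subscribes to the configured signals only. With an
// empty list, signal.Notify would relay every incoming signal.
func notifyOnShutdown(cfg config) (<-chan os.Signal, func()) {
	c := make(chan os.Signal, 1)
	signal.Notify(c, cfg.ShutdownSignals...)
	return c, func() { signal.Stop(c) }
}

func main() {
	cfg := config{}
	cfg.checkAndSetDefaults()
	_, stop := notifyOnShutdown(cfg)
	defer stop()
	fmt.Println("waiting for:", cfg.ShutdownSignals)
}
```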
* Remove time.Sleep in teleterm tests
Also, rename Start() to Serve().
* daemon: Put gateway-related methods next to each other
* Remove unused fields from daemon.Config
* Make Config a private field instead of embedding it
* Add tests for gateway CRUD
* Remove unnecessary ctx and error from daemon.Service gateway methods
* Refactor daemon.Service.gateways to a hash map
* Add comment explaining error handling in removeGateway
Edoardo asked about it some time ago. I forgot to add the explanation to
the code.
https://github.com/gravitational/teleport/pull/14135#discussion_r914687662
* Do not return pointers from ListGateways()
* Remove FindGateway, fix lock issues in RestartGateway() and RemoveGateway()
In the previous version, the proxy client would be closed immediately
after addMetadataToRetryableError. This commit makes it so that the proxy
client is closed only after GetAllowedDatabaseUsers finishes.
When running Connect on Windows, Grzegorz ran into a problem where fetching
db users for MSSQL would fail but only on Windows and only for MSSQL:
Failed to fetch current user information: connection error:
desc = "transport: Error while dialing failed to dial: read tcp
10.211.55.4:55519->52.14.45.73:3023: use of closed network
connection". services\role.go:764
Other times the error would be
connection error: desc = "transport: Error while dialing failed
to dial: ssh: unexpected packet in response to channel open:
<nil>"] apiserver\middleware.go:39
Surprisingly, `tsh db ls` didn't have this problem. So when thinking about
what we're doing differently from tsh and how it might be related to
a closed connection, I noticed that I had introduced a bug in the code that
closes the proxy client.
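The class of bug can be sketched like this. It is a guess at the general shape, not the actual code: `proxyClient`, `withRetry` (a stand-in for a wrapper such as addMetadataToRetryableError) and both fetch functions are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

var errClosed = errors.New("use of closed network connection")

// proxyClient is a stand-in for the real proxy client.
type proxyClient struct{ closed bool }

func (c *proxyClient) Close() { c.closed = true }

func (c *proxyClient) allowedDatabaseUsers() ([]string, error) {
	if c.closed {
		return nil, errClosed
	}
	return []string{"alice", "bob"}, nil
}

// withRetry is a hypothetical wrapper that runs fn.
func withRetry(fn func() error) error { return fn() }

// fetchUsersBuggy defers Close inside the wrapped closure, so the client
// is already closed by the time it is used.
func fetchUsersBuggy() ([]string, error) {
	var client *proxyClient
	err := withRetry(func() error {
		client = &proxyClient{}
		defer client.Close() // runs as soon as this closure returns
		return nil
	})
	if err != nil {
		return nil, err
	}
	return client.allowedDatabaseUsers()
}

// fetchUsersFixed keeps the client open until the fetch has finished.
func fetchUsersFixed() ([]string, error) {
	var users []string
	err := withRetry(func() error {
		client := &proxyClient{}
		defer client.Close()
		var err error
		users, err = client.allowedDatabaseUsers()
		return err
	})
	return users, err
}

func main() {
	_, err := fetchUsersBuggy()
	fmt.Println("buggy:", err)
	users, err := fetchUsersFixed()
	fmt.Println("fixed:", users, err)
}
```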
* Make it possible to test gateway opening/closing in Connect
Open() and Close() used to not return any error and Open() used to start
the gateway in a goroutine, making it rather hard to write tests for it.
This commit makes it so that Open() and Close() return errors and Open()
blocks.
Adjustments have been made to other places in lib/teleterm to account
for that missing goroutine and returned errors.
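A simplified sketch of why a blocking call that returns an error is easier to test. The `gateway` type here only wraps a TCP listener on a loopback address, and `serveInBackground` is a hypothetical test helper.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// gateway is a stand-in that only wraps a listener.
type gateway struct {
	listener net.Listener
}

func newGateway() (*gateway, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return nil, err
	}
	return &gateway{listener: l}, nil
}

// Serve blocks until the gateway is closed. An orderly Close results in
// a nil error. Anything else is returned to the caller.
func (g *gateway) Serve() error {
	for {
		conn, err := g.listener.Accept()
		if err != nil {
			if errors.Is(err, net.ErrClosed) {
				return nil
			}
			return err
		}
		conn.Close()
	}
}

// Close returns an error if the gateway was already closed.
func (g *gateway) Close() error {
	return g.listener.Close()
}

// serveInBackground lets the caller decide about the goroutine and
// collect the error from Serve later.
func serveInBackground(g *gateway) <-chan error {
	errC := make(chan error, 1)
	go func() { errC <- g.Serve() }()
	return errC
}

func main() {
	g, err := newGateway()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	errC := serveInBackground(g)
	fmt.Println("close:", g.Close())
	fmt.Println("serve:", <-errC)
}
```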
* Close httptest server in alpnproxy/local_proxy_test.go
While writing tests for the gateways, I was relying heavily on tests for
the local proxy. I noticed that it starts the server but doesn't close it
so I added an appropriate call to the cleanup function.
* Rename Gateway.Open to Gateway.Serve