* kube: emit audit events using process context
Using the request context can prevent audit events from being emitted
if the client disconnects and the request context is closed.
We shouldn't lose audit events like that.
Also, log all response errors from exec handler.
* kube: cleanup forwarder code
Rename a few config fields to be more descriptive.
Avoid embedding unless necessary, to keep the package API clean.
* kube: cache only user certificates, not the entire session
The expensive part that we need to cache is the client certificate.
Making a new one requires a round-trip to the auth server, plus entropy
for crypto operations.
The rest of clusterSession contains request-specific state, and only
adds problems if cached.
For example: clusterSession stores a reference to a remote teleport
cluster (if needed); caching requires extra logic to invalidate the
session when that cluster disappears (or its tunnels drop out). The same
problem happens with kubernetes_service tunnels.
Instead, the forwarder now picks a new target for each request from the
same user, providing a kind of "load-balancing".
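The per-request target selection can be sketched roughly like this; the type and function names are illustrative, not the forwarder's actual API:

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
)

// kubeTarget is a hypothetical stand-in for a registered kubernetes_service endpoint.
type kubeTarget struct {
	Addr string
}

// pickTarget chooses a fresh target for each request instead of pinning
// one inside a cached session, giving a simple form of load-balancing.
func pickTarget(targets []kubeTarget) (kubeTarget, error) {
	if len(targets) == 0 {
		return kubeTarget{}, errors.New("no kubernetes services registered")
	}
	return targets[rand.Intn(len(targets))], nil
}

func main() {
	targets := []kubeTarget{{Addr: "10.0.0.1:3026"}, {Addr: "10.0.0.2:3026"}}
	t, err := pickTarget(targets)
	if err != nil {
		panic(err)
	}
	fmt.Println("forwarding to", t.Addr)
}
```

Because nothing request-specific is cached, a target that disappears (or whose tunnel drops) simply stops being picked on subsequent requests.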
* Init session uploader in kubernetes service
It's started in all other services that upload sessions (app/proxy/ssh),
but was missing here. Because of this, the session storage directory for
async uploads wasn't created on disk, which caused interactive sessions
to fail.
* Update logrus package to fix data races
* Introduce a logger that uses the test context to log messages so they are output if a test fails, for improved troubleshooting.
* Revert introduction of test logger - simply leave logger configuration at debug level outputting to stderr during tests.
* Run integration test for e as well
* Use make with a cap and append to only copy the relevant roles.
* Address review comments
* Update integration test suite to use test-local logger that would only output logs iff a specific test has failed - no logs from other test cases will be output.
* Revert changes to InitLoggerForTests API
* Create a new logger instance when applying defaults or merging with file service configuration
* Introduce a local logger interface to be able to test file configuration merge.
* Fix kube integration tests w.r.t. logging
* Move goroutine profile dump into a separate func to handle parameters consistently for all invocations
This commit fixes #4695.
Teleport in async recording mode sends all events to disk,
and uploads them to the server later.
It uploads some events synchronously to the audit log so
they show up in the global event log right away.
However, if the auth server is slow, the fanout blocks the session.
This commit keeps the fanout of those events fast, but makes it
nonblocking and non-failing, so sessions will not hang unless the
disk writes hang.
It adds a backoff period and a timeout after which some events may be
lost, but the session continues without locking.
A proxy running in pre-5.0 mode (e.g. with local kubeconfig) should
register an entry in `tsh kube clusters`.
After upgrading to 5.0, without migration to kubernetes_service, all the
new `tsh kube` commands will work as expected.
* Add labels to KubernetesCluster resources
Plumb from config to the registered object, keep dynamic labels updated.
* Check kubernetes RBAC
Checks are in some CRUD operations on the auth server and in the
kubernetes forwarder (both proxy or kubernetes_service).
The logic is essentially copy-paste of the TAA version.
Updated storage configuration to apply not only to DynamoDB in the
backend package but also to DynamoDB in the events package. This allows
configuring continuous backups and auto scaling for the events table.
This change has several parts: cluster registration, cache updates,
routing and a new tctl flag.
> cluster registration
Cluster registration means adding `KubernetesClusters` to `ServerSpec`
for servers with `KindKubeService`.
`kubernetes_service` instances will parse their kubeconfig or local
`kube_cluster_name` and add them to their `ServerSpec` sent to the auth
server. They are effectively declaring that "I can serve k8s requests
for k8s cluster X".
> cache updates
This is just cache plumbing for `kubernetes_service` presence, so that
other teleport processes can fetch all kube services. It was missed
in the previous PR implementing CRUD for `kubernetes_service`.
> routing
Now the fun part - routing logic. This logic lives in
`/lib/kube/proxy/forwarder.go` and is shared by both `proxy_service`
(with kubernetes integration enabled) and `kubernetes_service`.
The target k8s cluster name is passed in the client cert, along with k8s
users/groups information.
`kubernetes_service` only serves requests for its direct k8s cluster
(from `Forwarder.creds`) and doesn't route requests to other teleport
instances.
`proxy_service` can serve requests:
- directly to a k8s cluster (the way it works pre-5.0)
- to a leaf teleport cluster (also same as pre-5.0, based on
`RouteToCluster` field in the client cert)
- to a `kubernetes_service` (directly or over a tunnel)
The last two modes require the proxy to generate an ephemeral client TLS
cert to do an outbound mTLS connection.
> tctl flag
A flag `--kube-cluster-name` for `tctl auth sign --format=kubernetes`
which allows generating client certs for non-default k8s cluster name
(as long as it's registered in a cluster).
I used this for testing, but it could be used for automation too.
Added support for an identity-aware, RBAC-enforcing, mutually
authenticated web application proxy to Teleport.
* Updated services.Server to support application servers.
* Updated services.WebSession to support application sessions.
* Added CRUD RPCs for "AppServers".
* Added CRUD RPCs for "AppSessions".
* Added RBAC support using labels for applications.
* Added JWT signer as a services.CertAuthority type.
* Added support for signing and verifying JWT tokens.
* Refactored dynamic label and heartbeat code into standalone packages.
* Added application support to web proxies and new "app_service" to
proxy mutually authenticated connections from proxy to an internal
application.
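The JWT signing and verification mentioned above can be sketched minimally. Teleport signs with a CA keypair; an HMAC-SHA256 token is used here only to keep the illustration self-contained with the standard library:

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"fmt"
	"strings"
)

// signJWT builds a minimal HS256 JWT from a JSON payload and a shared key.
func signJWT(payload string, key []byte) string {
	enc := base64.RawURLEncoding
	header := enc.EncodeToString([]byte(`{"alg":"HS256","typ":"JWT"}`))
	body := enc.EncodeToString([]byte(payload))
	mac := hmac.New(sha256.New, key)
	mac.Write([]byte(header + "." + body))
	return header + "." + body + "." + enc.EncodeToString(mac.Sum(nil))
}

// verifyJWT recomputes the signature and compares it in constant time.
func verifyJWT(token string, key []byte) bool {
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return false
	}
	mac := hmac.New(sha256.New, key)
	mac.Write([]byte(parts[0] + "." + parts[1]))
	want := base64.RawURLEncoding.EncodeToString(mac.Sum(nil))
	return hmac.Equal([]byte(want), []byte(parts[2]))
}

func main() {
	key := []byte("secret")
	tok := signJWT(`{"sub":"alice","app":"grafana"}`, key)
	fmt.Println(verifyJWT(tok, key))            // true
	fmt.Println(verifyJWT(tok, []byte("wrong"))) // false
}
```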
* Implement kubernetes_service registration and startup
The new service now starts, registers (locally or via a join token) and
heartbeats its presence to the auth server.
This service can handle k8s requests (like a proxy) but cannot route
them to remote teleport clusters. Proxies will be responsible for
routing those.
The client (tsh) will not yet go to this service until proxy routing is
implemented. I manually tweaked the server address in kubeconfig to test it.
You can also run `tctl get kube_service` to list all registered
instances. The self-reported info is currently limited - only listening
address is set.
* Address review feedback
This is a shorthand for the larger kubernetes section:
```
proxy_service:
  kube_listen_addr: "0.0.0.0:3026"
```
is equivalent to:
```
proxy_service:
  kubernetes:
    enabled: yes
    listen_addr: "0.0.0.0:3026"
```
This shorthand is meant to be used with the new `kubernetes_service`:
https://github.com/gravitational/teleport/pull/4455
It reduces confusion when both `proxy_service` and `kubernetes_service`
are configured in the same process.
This commit fixes #4598
Config with multiple event backends was crashing on 4.4:
```yaml
storage:
  audit_events_uri: ['dynamodb://streaming', 'stdout://', 'dynamodb://streaming2']
```
* Fix local etcd test failures when etcd is not running
* Add kubernetes_service to teleport.yaml
This plumbs config fields only, they have no effect yet.
Also, remove `cluster_name` from `proxy_service.kubernetes`. This field
will only exist under `kubernetes_service` per
https://github.com/gravitational/teleport/pull/4455
* Handle IPv6 in kubernetes_service and rename label fields
* Disable k8s cluster name defaulting in user TLS certs
Need to implement service registration first.
`require` is a sister package to `assert` that terminates the test on
failure. `assert` records the failure but lets the test proceed, which
is unintuitive.
Also update all existing tests to match.
The cluster name from this field plus all clusters from kubeconfig are
stored on the auth server via heartbeats.
This info will later be used to route k8s requests back to proxies.
Updates https://github.com/gravitational/teleport/issues/3952
This commit introduces GRPC API for streaming sessions.
It adds structured events and sync streaming
that avoids storing events on disk.
You can find design in rfd/0002-streaming.md RFD.
* Split remote cluster watching from reversetunnel.AgentPool
Separating the responsibilities:
- AgentPool takes a proxy (or LB) endpoint and manages a pool of agents
for it (each agent is a tunnel to a unique proxy process behind the
endpoint)
- RemoteClusterTunnelManager polls the auth server for a list of trusted
clusters and manages a set of AgentPools, one for each trusted cluster
Previously, AgentPool did both of the above.
Also, bundling some cleanup in the area:
- better error when dialing through tunnel and directly both fail
- rename RemoteKubeProxy to LocalKubernetes to better reflect the
meaning
- remove some dead code and simplify config structs
* reversetunnel: factor out track.Key
ClusterName is the same for all Agents in an AgentPool. track.Tracker
needs to only track proxy addresses.
* Always collect metrics about top backend requests
Previously, it was only done in debug mode, which left some tabs in
`tctl top` empty when the auth server was not in debug mode.
* backend: use an LRU cache for top requests in Reporter
This LRU cache tracks the most frequent recent backend keys. All keys in
this cache map to existing labels in the requests metric. Any evicted
keys are also deleted from the metric.
This will keep an upper limit on our memory usage while still always
reporting the most active keys.
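A minimal sketch of an LRU with an eviction callback follows; it is illustrative of the technique, not the actual Reporter code, and the key names are made up:

```go
package main

import (
	"container/list"
	"fmt"
)

// lru tracks the most recently used keys up to a fixed capacity and calls
// onEvict for each key pushed out, so the caller can delete the matching
// metric label and keep memory bounded.
type lru struct {
	capacity int
	order    *list.List // front = most recently used
	entries  map[string]*list.Element
	onEvict  func(key string)
}

func newLRU(capacity int, onEvict func(string)) *lru {
	return &lru{capacity: capacity, order: list.New(), entries: make(map[string]*list.Element), onEvict: onEvict}
}

// Touch marks a key as recently used, evicting the least recent on overflow.
func (l *lru) Touch(key string) {
	if el, ok := l.entries[key]; ok {
		l.order.MoveToFront(el)
		return
	}
	l.entries[key] = l.order.PushFront(key)
	if l.order.Len() > l.capacity {
		oldest := l.order.Back()
		l.order.Remove(oldest)
		k := oldest.Value.(string)
		delete(l.entries, k)
		l.onEvict(k)
	}
}

func main() {
	var evicted []string
	cache := newLRU(2, func(k string) { evicted = append(evicted, k) })
	cache.Touch("/roles")
	cache.Touch("/nodes")
	cache.Touch("/roles")  // refresh: /roles is now the most recent
	cache.Touch("/tokens") // evicts /nodes, the least recently used
	fmt.Println(evicted)   // [/nodes]
}
```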
Heartbeats are more frequent and result in a more up-to-date /readyz
status. Concretely, status updates go from ~10 minutes to under 1 minute.
Also, refactored the state tracking code to track the status of
individual teleport components (auth/proxy/node).
This allows users to manually switch to a different algorithm by:
- setting the config file field
- running "tctl auth rotate"
If the config file field is not set, the existing signing algorithm of
the CA is preserved.
Store the signing algorithm alongside the CA private key. When reading
old CAs that don't have it set, default to the UNKNOWN proto enum, which
corresponds to the old SHA1-based signing algorithm.
The only time you get a SHA2 signature is when creating a fresh cluster
and generating a new CA. This can be disabled in the config.
This allows users to override the SHA2 signing algorithms we default to
now for compatibility with the (very) old OpenSSH versions.
For host and user certs, use the CA signing algo for their own
handshakes. This allows us to propagate the signing algo from auth
server everywhere else.
List of fixed items:
```
integration/helpers.go:1279:2 gosimple S1000: should use for range instead of for { select {} }
integration/integration_test.go:144:5 gosimple S1009: should omit nil check; len() for nil slices is defined as zero
integration/integration_test.go:173:5 gosimple S1009: should omit nil check; len() for nil slices is defined as zero
integration/integration_test.go:296:28 gosimple S1019: should use make(chan error) instead
integration/integration_test.go:570:41 gosimple S1019: should use make(chan interface{}) instead
integration/integration_test.go:685:40 gosimple S1019: should use make(chan interface{}) instead
integration/integration_test.go:759:33 gosimple S1019: should use make(chan string) instead
lib/auth/init_test.go:62:2 gosimple S1021: should merge variable declaration with assignment on next line
lib/auth/tls_test.go:1658:22 gosimple S1024: should use time.Until instead of t.Sub(time.Now())
lib/backend/dynamo/dynamodbbk.go:420:5 gosimple S1004: should use !bytes.Equal(expected.Key, replaceWith.Key) instead
lib/backend/dynamo/dynamodbbk.go:656:12 gosimple S1039: unnecessary use of fmt.Sprintf
lib/backend/etcdbk/etcd.go:458:5 gosimple S1004: should use !bytes.Equal(expected.Key, replaceWith.Key) instead
lib/backend/firestore/firestorebk.go:407:5 gosimple S1004: should use !bytes.Equal(expected.Key, replaceWith.Key) instead
lib/backend/lite/lite.go:317:5 gosimple S1004: should use !bytes.Equal(expected.Key, replaceWith.Key) instead
lib/backend/lite/lite.go:336:6 gosimple S1004: should use !bytes.Equal(value, expected.Value) instead
lib/backend/memory/memory.go:365:5 gosimple S1004: should use !bytes.Equal(expected.Key, replaceWith.Key) instead
lib/backend/memory/memory.go:376:5 gosimple S1004: should use !bytes.Equal(existingItem.Value, expected.Value) instead
lib/backend/test/suite.go:327:10 gosimple S1024: should use time.Until instead of t.Sub(time.Now())
lib/client/api.go:1410:9 gosimple S1003: should use strings.ContainsRune(name, ':') instead
lib/client/api.go:2355:32 gosimple S1019: should use make([]ForwardedPort, len(spec)) instead
lib/client/keyagent_test.go:85:2 gosimple S1021: should merge variable declaration with assignment on next line
lib/client/player.go:54:33 gosimple S1019: should use make(chan int) instead
lib/config/configuration.go:1024:52 gosimple S1019: should use make(services.CommandLabels) instead
lib/config/configuration.go:1025:44 gosimple S1019: should use make(map[string]string) instead
lib/config/configuration.go:930:21 gosimple S1003: should use strings.Contains(clf.Roles, defaults.RoleNode) instead
lib/config/configuration.go:931:22 gosimple S1003: should use strings.Contains(clf.Roles, defaults.RoleAuthService) instead
lib/config/configuration.go:932:23 gosimple S1003: should use strings.Contains(clf.Roles, defaults.RoleProxy) instead
lib/service/supervisor.go:387:2 gosimple S1001: should use copy() instead of a loop
lib/tlsca/parsegen.go:140:9 gosimple S1034: assigning the result of this type assertion to a variable (switch generalKey := generalKey.(type)) could eliminate type assertions in switch cases
lib/utils/certs.go:140:9 gosimple S1034: assigning the result of this type assertion to a variable (switch generalKey := generalKey.(type)) could eliminate type assertions in switch cases
lib/utils/certs.go:167:40 gosimple S1010: should omit second index in slice, s[a:len(s)] is identical to s[a:]
lib/utils/certs.go:204:5 gosimple S1004: should use !bytes.Equal(certificateChain[0].SubjectKeyId, certificateChain[0].AuthorityKeyId) instead
lib/utils/parse/parse.go:116:45 gosimple S1003: should use strings.Contains(variable, "}}") instead
lib/utils/parse/parse.go:116:6 gosimple S1003: should use strings.Contains(variable, "{{") instead
lib/utils/socks/socks.go:192:10 gosimple S1025: should use String() instead of fmt.Sprintf
lib/utils/socks/socks.go:199:10 gosimple S1025: should use String() instead of fmt.Sprintf
lib/web/apiserver.go:1054:18 gosimple S1024: should use time.Until instead of t.Sub(time.Now())
lib/web/apiserver.go:1954:9 gosimple S1039: unnecessary use of fmt.Sprintf
tool/tsh/tsh.go:1193:14 gosimple S1024: should use time.Until instead of t.Sub(time.Now())
```
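As a concrete example, the S1004 items above replace a Compare-with-zero pattern with `bytes.Equal`; the function name here is made up for illustration:

```go
package main

import (
	"bytes"
	"fmt"
)

// keysDiffer shows the post-fix form: !bytes.Equal replaces the
// equivalent but clumsier bytes.Compare(a, b) != 0.
func keysDiffer(expected, replaceWith []byte) bool {
	return !bytes.Equal(expected, replaceWith)
}

func main() {
	fmt.Println(keysDiffer([]byte("/nodes/a"), []byte("/nodes/b"))) // true
	fmt.Println(keysDiffer([]byte("/nodes/a"), []byte("/nodes/a"))) // false
}
```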
TeleportProcess can have multiple listeners per type during graceful
restart. Return an error from address getters to avoid flaky behavior.
These getters should only get called from tests.
This is primarily for tests to fetch the actual listening port of these
endpoints when config has port 0.
But it's also convenient for callers to avoid digging into the config
fields.
If kube.public_addr is not set and kube.listen_addr uses a non-standard
port, the client can't discover it. Advertise kube.listen_addr same as
we advertise it for SSH.
Also, override client.TeleportClient.KubeProxyAddr with info from
proxy's Ping response. Existing value in that field comes from
~/.tsh/profile and can contain the wrong value.
This only matters for nodes. The new stateStarting will be in effect
until the node successfully joins the cluster. This means that /readyz
for nodes will return '400 Bad Request' instead of '200 OK' until it
joins.
Updates #3700
Fixed findings:
```
lib/sshutils/server_test.go:163:2: SA4006: this value of `clt` is never used (staticcheck)
clt, err := ssh.Dial("tcp", srv.Addr(), &cc)
^
lib/sshutils/server_test.go:91:3: SA5001: should check returned error before deferring ch.Close() (staticcheck)
defer ch.Close()
^
lib/shell/shell_test.go:33:2: SA4006: this value of `shell` is never used (staticcheck)
shell, err = GetLoginShell("non-existent-user")
^
lib/cgroup/cgroup_test.go:111:2: SA9003: empty branch (staticcheck)
if err != nil {
^
lib/cgroup/cgroup_test.go:119:2: SA5001: should check returned error before deferring service.Close() (staticcheck)
defer service.Close()
^
lib/client/keystore_test.go:138:2: SA4006: this value of `keyCopy` is never used (staticcheck)
keyCopy, err = s.store.GetKey("host.a", "bob")
^
lib/client/api.go:1604:3: SA4004: the surrounding loop is unconditionally terminated (staticcheck)
return makeProxyClient(sshClient, m), nil
^
lib/backend/test/suite.go:156:2: SA4006: this value of `err` is never used (staticcheck)
result, err = s.B.GetRange(ctx, prefix("/prefix/c/c1"), backend.RangeEnd(prefix("/prefix/c/cz")), backend.NoLimit)
^
lib/utils/timeout_test.go:84:2: SA1019: t.Dial is deprecated: Use DialContext instead, which allows the transport to cancel dials as soon as they are no longer needed. If both are set, DialContext takes priority. (staticcheck)
t.Dial = func(network string, addr string) (net.Conn, error) {
^
lib/utils/websocketwriter.go:83:3: SA4006: this value of `err` is never used (staticcheck)
utf8, err = w.encoder.String(string(data))
^
lib/utils/loadbalancer_test.go:134:2: SA4006: this value of `out` is never used (staticcheck)
out, err = Roundtrip(frontend.String())
^
lib/utils/loadbalancer_test.go:209:2: SA4006: this value of `out` is never used (staticcheck)
out, err = RoundtripWithConn(conn)
^
lib/srv/forward/sshserver.go:582:3: SA4004: the surrounding loop is unconditionally terminated (staticcheck)
return
^
lib/service/service.go:347:4: SA4006: this value of `err` is never used (staticcheck)
i, err = auth.GenerateIdentity(process.localAuth, id, principals, dnsNames)
^
lib/service/signals.go:60:3: SA1016: syscall.SIGKILL cannot be trapped (did you mean syscall.SIGTERM?) (staticcheck)
syscall.SIGKILL, // fast shutdown
^
lib/config/configuration_test.go:184:2: SA4006: this value of `conf` is never used (staticcheck)
conf, err = ReadFromFile(s.configFileBadContent)
^
lib/config/configuration.go:129:2: SA5001: should check returned error before deferring reader.Close() (staticcheck)
defer reader.Close()
^
lib/kube/kubeconfig/kubeconfig_test.go:227:2: SA4006: this value of `err` is never used (staticcheck)
tlsCert, err := ca.GenerateCertificate(tlsca.CertificateRequest{
^
lib/srv/sess.go:720:3: SA4006: this value of `err` is never used (staticcheck)
result, err := s.term.Wait()
^
lib/multiplexer/multiplexer_test.go:169:11: SA1006: printf-style function with dynamic format string and no further arguments should use print-style function instead (staticcheck)
_, err = fmt.Fprintf(conn, proxyLine.String())
^
lib/multiplexer/multiplexer_test.go:221:11: SA1006: printf-style function with dynamic format string and no further arguments should use print-style function instead (staticcheck)
_, err = fmt.Fprintf(conn, proxyLine.String())
^
```
All changes should be noop, except for
`integration/integration_test.go`.
The integration test was ignoring `recordingMode` test case parameter
and always used `RecordAtNode`. When switching to `recordingMode`, test
cases with `RecordAtProxy` fail with a confusing error about missing
user agent. Filed https://github.com/gravitational/teleport/issues/3606
to track that separately and unblock enabling `structcheck` linter.
The node first tries using the token with auth server. If that fails, it
tries the same address as a proxy server.
If both fail, user only sees the error from the latter attempt. If using
CA pin, this error will be "x509: certificate signed by unknown
authority", which is confusing.
Log both errors, and mention that a fallback is happening. The output
looks like:
ERRO [AUTH] Failed to register through auth server: "my-hostname" [3e53a982-afd1-4d2f-8864-54c25fbe5865] can not join the cluster with role Node, the token is not valid; falling back to trying the proxy server auth/register.go:123
ERRO [PROC:1] Node failed to establish connection to cluster: failed to register through proxy server: x509: certificate signed by unknown authority. time/sleep.go:149
If node hasn't fully initialized before getting stopped (such as when
join token isn't valid), most pointer vars in `initSSH` will be nil.
Handle that cleanly.
* Add monorepo
* Add reset/passwd capability for local users (#3287)
* Add UserTokens to allow password resets
* Pass context down through ChangePasswordWithToken
* Rename UserToken to ResetPasswordToken
* Add auto formatting for proto files
* Add common Marshaller interfaces to reset password token
* Allow enterprise "tctl" reuse OSS user methods (#3344)
* Pass localAuthEnabled flag to UI (#3412)
* Added LocalAuthEnabled prop to WebConfigAuthSetting struct in webconfig.go
* Added LocalAuthEnabled state as part of webCfg in apiserver.go
* update e-refs
* Fix a regression bug after merge
* Update tctl CLI output msgs (#3442)
* Use local user client when resolving user roles
* Update webapps ref
* Add and retrieve fields from Cluster struct (#3476)
* Set Teleport versions for node, auth, proxy init heartbeat
* Add and retrieve fields NodeCount, PublicURL, AuthVersion from Clusters
* Remove debug logging to avoid log pollution when getting public_addr of proxy
* Create helper func GuessProxyHost to get the public_addr of a proxy host
* Refactor newResetPasswordToken to use GuessProxyHost and remove publicUrl func
* Remove webapps submodule
* Add webassets submodule
* Replace webapps sub-module reference with webassets
* Update webassets path in Makefile
* Update webassets
1b11b26 Simplify and clean up Makefile (#62) https://github.com/gravitational/webapps/commit/1b11b26
* Retrieve cluster details for user context (#3515)
* Let GuessProxyHost also return proxy's version
* Unit test GuessProxyHostAndVersion & GetClusterDetails
* Update webassets
4dfef4e Fix build pipeline (#66) https://github.com/gravitational/webapps/commit/4dfef4e
* Update e-ref
* Update webassets
0647568 Fix OSS redirects https://github.com/gravitational/webapps/commit/0647568
* update e-ref
* Update webassets
e0f4189 Address security audit warnings Updates "minimist" package which is used by 7y old "optimist". https://github.com/gravitational/webapps/commit/e0f4189
* Add new attr to Session struct (#3574)
* Add fields ServerHostname and ServerAddr
* Set these fields on newSession
* Ensure webassets submodule during build
* Update e-ref
* Ensure webassets before running unit-tests
* Update E-ref
Co-authored-by: Lisa Kim <lisa@gravitational.com>
Co-authored-by: Pierre Beaucamp <pierre@gravitational.com>
Co-authored-by: Jenkins <jenkins@gravitational.io>
Spring cleaning!
A very mechanical cleanup using several linters (unused, deadcode,
structcheck). Build and tests still pass so no behavior should be
affected.
This commit resolves #3227
In IOT mode, 10K nodes are connecting back to the proxies, putting
a lot of pressure on the proxy cache.
Before this commit, the proxy's only cache option was a persistent
sqlite-backed cache. The advantage of such caches is that proxies can
keep working after reboots while auth servers are unavailable.
The disadvantage is that the sqlite backend breaks down under many
concurrent reads due to performance issues.
This commit introduces the new cache configuration option, 'in-memory':
```yaml
teleport:
  cache:
    # default value is sqlite;
    # the only supported values are sqlite or in-memory
    type: in-memory
```
This cache mode allows two m4.4xlarge proxies to handle 10K IOT mode connected
nodes with no issues.
The second part of the commit disables the timer-based cache reload,
which caused inconsistent view results for 10K displayed nodes, with
servers disappearing from the view.
The third part of the commit increases the buffering of the channels
carrying discovery requests 10x. The channels were overfilling with 10K
nodes, and nodes were being disconnected. The logic no longer treats
channel overflow as a reason to close the connection. This is possible
due to changes in the discovery protocol that allow target nodes to
handle missing entries, duplicate entries, or conflicting values.
If the user enabled enhanced session recording in file configuration but
the binary was built without BPF support (like on macOS), Teleport exits
right away with a message explaining that the operating system does not
support enhanced session recording.
* Make Teleport log its version upon service start #3145
This change implements a resolution to issue #3145. The version and Git ref are output when component start information is logged.
https://github.com/gravitational/teleport/issues/3145
* fix merge artifact
Added package cgroup to orchestrate cgroups. Only cgroup2 is supported,
because cgroup2 cgroups have unique IDs that can be correlated with BPF
events.
Added the bpf package, which contains three BPF programs: execsnoop,
opensnoop, and tcpconnect. The bpf package starts and stops these
programs, correlates their output with Teleport sessions, and emits the
results to the audit log.
Added support for Teleport to re-exec itself before launching a shell.
This allows Teleport to start a child process, capture its PID, place
the PID in a cgroup, and then continue the process. Once continued, the
process can be tracked by its cgroup ID.
Reduced the total number of connections to a host so Teleport does not
quickly exhaust all file descriptors. That happens very quickly when
disk events, which occur at a very high rate, are emitted to the audit
log.
Added tarballs for exec sessions. Updated session.start and session.end
events with additional metadata. Updated the format of session tarballs
to include enhanced events.
Added file configuration for enhanced session recording. Added code to
start enhanced session recording and pass the package to SSH nodes.
* Support resource-based bootstrapping for backend.
Outside of static configuration, most of the persistent state of an
auth server exists as a collection of resources, stored in its
backend. The resource API also forms the basis of Teleport's more
advanced dynamic configuration options.
This commit extends the usefulness of the resource API by adding
the ability to bootstrap backend state with a set of previously
exported resources. This allows the resource API to serve as a
rudimentary backup/migration tool.
Notes: This feature is a work in progress and very easy to misuse;
while it will prevent you from overwriting the state of an existing
auth server, it won't stop you from bootstrapping into a wildly
misconfigured state. In general, resource-based bootstrapping is
not a complete solution for backup or migration.
* update e-ref
Update utils.CertChecker to only check key and certificate algorithms
when in FIPS mode. Otherwise accept keys and certificates generated with
any algorithm.
This commit implements #2872.
Similarly to `file://`, the `stdout://` scheme can be used alongside an
existing external scheme to log audit events to stdout:
```yaml
audit_events_uri: ['dynamodb://events', 'stdout://']
```
Just like the `file://` scheme, the `stdout://` scheme can only be used
when an external events backend and session uploader are defined, so
that all audit upload and search features of teleport keep working.
When attempting to guess the IP address of a remote host to add to the
host certificate, always remove the port.
Improve logging, so it's clear when a nodes host certificate changes due
to the principals list being updated.
Don't heartbeat address for nodes connected to clusters over a reverse
tunnel. Print warning to users if listen_addr or public_addr are set as
these are not used.
Return the tunnel address in the following preference order:
1. Reverse Tunnel Public Address.
2. SSH Proxy Public Address.
3. HTTP Proxy Public Address.
4. Tunnel Listen Address.
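The preference order above can be sketched as a simple fallback chain; the parameter names are illustrative, not the actual config fields:

```go
package main

import (
	"errors"
	"fmt"
)

// tunnelAddr returns the first address that is set, in the preference
// order listed above.
func tunnelAddr(tunnelPublic, sshPublic, httpPublic, tunnelListen string) (string, error) {
	for _, addr := range []string{tunnelPublic, sshPublic, httpPublic, tunnelListen} {
		if addr != "" {
			return addr, nil
		}
	}
	return "", errors.New("no tunnel address configured")
}

func main() {
	// Tunnel public address is unset, so the SSH proxy public address wins.
	addr, _ := tunnelAddr("", "proxy.example.com:3023", "", "0.0.0.0:3024")
	fmt.Println(addr) // proxy.example.com:3023
}
```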
Added "--fips" flag to "teleport start" command which can start
Enterprise in FedRAMP/FIPS 140-2 mode.
In FIPS mode, Teleport configures the TLS and SSH servers with FIPS
compliant cryptographic algorithms. In FIPS mode, if non-compliant
algorithms are chosen, Teleport will fail to start. In addition,
Teleport checks if the binary was compiled against an approved
cryptographic module (BoringCrypto) and fails to start if it was not.
If a client, like tsh, tries to use non-FIPS encryption, like NaCl,
those requests are also rejected.
In the IOT case (whenever teleport nodes connect
to the proxy), there is no need
to create ReverseTunnel objects in the backend,
as there is always one reverse tunnel per node.
This commit removes the logic that created
reverse tunnel objects in the backend in IOT cases
and refactors some other parts of the code.
Instantiate agent pool (and agent) with a reference to the reverse
tunnel server.
Pass list of principals to agents when initiating a transport dial
request.
The above two changes allow the agent to look up principals in local
site when attempting to connect to a node within a trusted cluster.
Whenever many IOT-style nodes connect
back to the web proxy server, they all
call the /find endpoint to discover the configuration.
This new endpoint is designed to be fast and not
hit the database.
In addition, every proxy reverse tunnel
connection handler was fetching auth servers;
this commit adds caching for the auth servers
on the proxy side.
Updated services.ReverseTunnel to support type (proxy or node). For
proxy types, which represent trusted cluster connections, when a
services.ReverseTunnel is created, it's created on the remote side with
name /reverseTunnels/example.com. For node types, services.ReverseTunnel
is created on the main side as /reverseTunnels/{nodeUUID}.clusterName.
Updated services.TunnelConn to support type (proxy or node). For proxy
types, which represent trusted cluster connections, tunnel connections
are created on the main side under
/tunnelConnections/remote.example.com/{proxyUUID}-remote.example.com.
For nodes, tunnel connections are created on the main side under
/tunnelConnections/example.com/{proxyUUID}-example.com. This allows
searching for tunnel connections by cluster then allows easily creating
a set of proxies that are missing matching services.TunnelConn.
The reverse tunnel server has been updated to handle heartbeats from
proxies as well as nodes. Proxy heartbeat behavior has not changed.
Heartbeats from nodes now add remote connections to the matching local
site. In addition, the reverse tunnel server now proxies connections to
the Auth Server for requests that are already authenticated (a second
authentication to the Auth Server is required).
For registration, nodes try and connect to the Auth Server to fetch host
credentials. Upon failure, nodes now try and fallback to fetching host
credentials from the web proxy.
To establish a connection to an Auth Server, nodes first try and connect
directly, and if the connection fails, fallback to obtaining a
connection to the Auth Server through the reverse tunnel. If a
connection is established directly, node startup behavior has not
changed. If a node establishes a connection through the reverse tunnel,
it creates an AgentPool that attempts to dial back to the cluster and
establish a reverse tunnel.
When nodes heartbeat, they also heartbeat if they are connected directly
to the cluster or through a reverse tunnel. For nodes that are connected
through a reverse tunnel, the proxy subsystem now directs the reverse
tunnel server to establish a connection through the reverse tunnel
instead of directly.
When sending discovery requests, the domain field has been replaced with
tunnelID. The tunnelID field is either the cluster name (same as before)
for proxies, or {nodeUUID}.example.com for nodes.
Buffer fan-out used a simple prefix match
in a loop, which resulted in high CPU load
with many connected watchers.
This commit switches to radix trees for
prefix matching, which reduces CPU load
substantially with 5K+ connected watchers.
This commit expands the usage of the caching layer
for auth server API:
* Introduces in-memory cache that is used to serve all
Auth server API requests. This is done to achieve scalability
on 10K+ node clusters, where each node fetches certificate authorities,
roles, users and join tokens. It is not possible to scale
the DynamoDB backend or other backends to 10K reads per second
on a single shard or partition. The solution is to introduce
an in-memory cache of the backend state that is always used
for reads.
* In-memory cache has been expanded to support all resources
required by the auth server.
* Experimental `tctl top` command has been introduced to display
common single node metrics.
Replace SQLite Memory Backend with BTree
The SQLite in-memory backend was suffering from
high tail latencies under load (up to 8 seconds
at the 99.9th percentile in load configurations).
This commit replaces the SQLite in-memory caching
backend with an in-memory BTree backend, which
brought tail latencies down to 2 seconds (99.9th percentile)
and improved overall performance.
This commit hex encodes trusted cluster names
in target addresses for kubernetes SNI proxy.
For example, assuming public address of Teleport
Kubernetes proxy is main.example.com, and trusted
cluster is remote.example.com, resulting target
address added to kubeconfig will look like
k72656d6f74652e6578616d706c652e636f6d0a.main.example.com
And Teleport Proxy's DNS Name will include wildcard:
'*.main.example.com' in addition to 'main.example.com'
Note that no dots are in the SNI address thanks to hex encoding.
This allows administrators to avoid manually updating
the list of public_addr sections every time a trusted cluster is
added, and to use the wildcard DNS name instead.
The following address:
remote.example.com.main.example.com would not have matched
*.main.example.com per the DNS wildcard spec.
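The encoding above can be reproduced with the standard library. This is a sketch, not Teleport's actual function: the "k" prefix and trailing newline are taken from the worked example above (the hex string in it decodes to "remote.example.com\n").

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// kubeClusterAddr (hypothetical name) builds the SNI routing address:
// the trusted cluster name is hex encoded so the label contains no dots
// and therefore matches the *.main.example.com wildcard.
func kubeClusterAddr(clusterName, proxyPublicAddr string) string {
	// "k" prefix and trailing newline as in the example above
	return "k" + hex.EncodeToString([]byte(clusterName+"\n")) + "." + proxyPublicAddr
}

func main() {
	fmt.Println(kubeClusterAddr("remote.example.com", "main.example.com"))
	// k72656d6f74652e6578616d706c652e636f6d0a.main.example.com
}
```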
This commit switches Teleport proxy to use impersonation
API instead of the CSR API.
This allows Teleport to work on EKS, GKE and all
other CNCF-compatible clusters.
This commit updates helm chart RBAC as well.
It introduces an extra configuration flag in the proxy_service
configuration section:
```yaml
proxy_service:
  # kubeconfig_file is used for scenarios
  # when Teleport Proxy is deployed outside
  # of the kubernetes cluster
  kubeconfig_file: /path/to/kube/config
```
It deprecates the similar flag in auth_service:
```yaml
auth_service:
  # DEPRECATED. THIS FLAG IS IGNORED
  kubeconfig_file: /path/to/kube/config
```
Created *utils.TrackingConn that wraps the server side net.Conn and is
used to track how much data is transmitted and received over the
net.Conn. At the close of a connection (close of a *srv.ServerContext)
the total data transmitted and received is emitted to the Audit Log.
This commit allows additional configuration
for the `audit_sessions_uri` parameter:
`audit_sessions_uri: s3://example.com/path?region=us-east-1`
The additional query parameter `region`, if set, will override
the default `region` from the `audit` section.
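Parsing the override could look like the following sketch, built on the standard net/url package; the function name and signature are illustrative, not Teleport's actual code.

```go
package main

import (
	"fmt"
	"net/url"
)

// regionFromURI (hypothetical helper) parses an audit_sessions_uri value
// and returns the bucket host plus the effective region: the "region"
// query parameter, when present, overrides the default from the audit
// section.
func regionFromURI(uri, defaultRegion string) (bucket, region string, err error) {
	u, err := url.Parse(uri)
	if err != nil {
		return "", "", err
	}
	region = defaultRegion
	if r := u.Query().Get("region"); r != "" {
		region = r
	}
	return u.Host, region, nil
}

func main() {
	bucket, region, _ := regionFromURI("s3://example.com/path?region=us-east-1", "us-west-2")
	fmt.Println(bucket, region) // example.com us-east-1
}
```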
This commit introduces several key changes to
Teleport backend and API infrastructure
in order to achieve scalability improvements
on 10K+ node deployments.
Events and plain keyspace
--------------------------
The new backend interface supports events,
pagination and range queries,
and moves away from buckets to
a plain keyspace, which aligns better
with DynamoDB and Etcd, which feature similar
interfaces.
All backend implementations now
expose an Events API, allowing
multiple subscribers to consume the same
event stream and avoid polling the database.
Replacing BoltDB, Dir with SQLite
-------------------------------
The BoltDB backend does not support
two processes accessing the database at the
same time. This prevented Teleport deployments
using the BoltDB backend from being live-reloaded.
SQLite supports reads/writes by multiple
processes and makes Dir backend obsolete
as SQLite is more efficient on larger collections,
supports transactions and can detect data
corruption.
Teleport automatically migrates data from
Bolt and Dir backends into SQLite.
GRPC API and protobuf resources
-------------------------------
GRPC API has been introduced for
the auth server. The auth server now serves both GRPC
and JSON-HTTP API on the same TLS socket and uses
the same client certificate authentication.
All future API methods should use GRPC; the HTTP-JSON
API is considered obsolete.
In addition, some resources like
Server and CertificateAuthority are now
generated from protobuf service specifications in
a way that is fully backward compatible with
original JSON spec and schema, so the same resource
can be encoded and decoded from JSON, YAML
and protobuf.
All models should be refactored
into new proto specification over time.
Streaming presence service
--------------------------
In order to cut bandwidth, nodes
send full updates only when changes
to labels or spec have occurred; otherwise
new lightweight GRPC keep-alive updates are sent
to the presence service, reducing
bandwidth usage on multi-node deployments.
In addition, nodes no longer poll the
auth server for certificate authority rotation
updates; instead they subscribe to event updates
to detect changes as soon as they happen.
Since this is a new API, errors are inevitable,
which is why polling is retained, but
at a much slower rate.
This commit reduces traffic consumed
by the teleport cluster by polling for CA
status less frequently.
It also addresses a bug in cert regeneration
that checked for the wrong principals.
This commit improves performance of teleport with
hundreds of connected trusted clusters.
The TLS handshake protocol expects the server to send a
list of trusted certificate authorities to the client,
and the client must present a certificate signed by one of them.
In Teleport's current implementation, every remote cluster
client is signed by a local certificate authority and is not
cross-signed.
Auth server now expects clients to announce the
remote cluster they are connecting from using SNI.
Auth server will send only certificate authorities
of the cluster announced via SNI.
An alternative idea is to cross-sign the certificate
of the remote cluster client. We will explore
this idea in future releases.
This commit also removes unnecessary reads
from the database to check the remote server status
that slows down user interface and other clients.
This is done at the expense of proxies showing
servers as offline when an individual
proxy does not have the connection. This is
a small UI price to pay for not reading
the database, as the proxy will eventually
get the connection thanks to the discovery
protocol.
Fixes #1986
When deployed outside of the kubernetes cluster,
teleport now reads all configuration from a kubernetes
config file supplied via a parameter.
The auth server then passes information about the
target API server back to the proxy.
Whenever critical services in teleport exit
with errors, the system should shut down immediately
and exit with an error. This has not been the case
since the 2.7 release.
When many nodes join the cluster or rotate certificates,
the auth server was forced to generate many private/public
key pairs simultaneously, creating a bottleneck
on the auth server side.
This commit pushes the private/public key generation
logic back to clients, relieving the pressure on the
auth server.
This commit moves proxy kubernetes configuration
to a separate nested block to provide more fine
grained settings:
```yaml
auth:
  kubernetes_ca_cert_path: /tmp/custom-ca
proxy:
  enabled: yes
  kubernetes:
    enabled: yes
    public_addr: [custom.example.com:port]
    api_addr: kubernetes.example.com:443
    listen_addr: localhost:3026
```
1. The kubernetes config section is explicitly enabled
or disabled. It is disabled by default.
2. The public address in the kubernetes section
is propagated to the tsh profile.
The other part of the commit updates the Ping
endpoint to send proxy configuration back to
the client, including the kubernetes public address
and SSH listen address.
Clients update their profile according to the configuration
received from the proxy.
This commit fixes #1970.
The original process has started but failed to join the cluster
and repeatedly reconnects to it. This process is not
ready yet, but can still process HUP (reload) events.
1. HUP event is sent to a parent process.
2. The parent process forks a child process and
awaits a message from the child on the signal pipe.
3. If the child process fails to connect to the cluster
as well, it does not emit Ready event and a message
is never sent to the parent.
4. The parent process fails to receive a message and
assumes that the child process has failed to start.
As a result of this, there are two processes
both trying to connect to the cluster.
This commit changes the behavior by adding an extra step:
if the child process fails to enter the ready state and it is aware that
it was forked by the parent process, it initiates self-shutdown.
This commit fixes #1610.
A new readyz endpoint is added to the existing
/metrics and /healthz endpoints activated by the
--diag-addr flag:
`teleport start --diag-addr=127.0.0.1:1234`
The readyz endpoint will report 503 if the node or
proxy failed to connect to the cluster, and 200 OK
otherwise.
Additional prometheus gauges report connection
counts for trusted and remote clusters:
```
remote_clusters{cluster="one"} 1
remote_clusters{cluster="two"} 1
trusted_clusters{cluster="one",state="connected"} 0
trusted_clusters{cluster="one",state="connecting"} 0
trusted_clusters{cluster="one",state="disconnected"} 0
trusted_clusters{cluster="one",state="discovered"} 1
trusted_clusters{cluster="one",state="discovering"} 0
```
This issue updates #1986.
This is an initial, experimental implementation that will
be updated with tests and edge cases prior to the production 2.7.0 release.
Teleport proxy adds support for Kubernetes API protocol.
Auth server uses Kubernetes API to receive certificates
issued by Kubernetes CA.
Proxy intercepts and forwards API requests to the Kubernetes
API server and captures live session traffic, making
recordings available in the audit log.
Tsh login now updates kubeconfig configuration to use
Teleport as a proxy server.
Flaky tests in teleport integration suite uncovered a problem.
It is possible that main cluster rotates certificate authority,
and will try to dial to the remote cluster with new credentials
before the remote cluster could fetch the new CA to trust.
To fix this, the "update_clients" phase was split into two phases:
* Init and Update clients
The init phase does nothing on the main cluster except generating
new certificate authorities that are trusted but not yet used in the
cluster.
This phase exists to give remote clusters the opportunity
to update their list of trusted certificate authorities
of the main cluster before the main cluster reconnects with new clients
in the "Update clients" phase.
* Cache services.ClusterConfig within srv.ServerContext for the duration
of a connection.
* Create a single websocket between the browser and the proxy for all
terminal bytes and events.
This commit fixes #1741
* If the bolt backend was used as the default,
new teleport versions continue using it as the default to prevent
regressions on start.
* Otherwise, the dir backend is used as the default.
This commit fixes #1803, fixes #1889
* Adds support for public_addr for Proxy and Auth
* Parameter advertise_ip now supports host:port format
* Fixes incorrect output for tctl get proxies
* Fixes duplicate output of some error messages.
This commit implements #1860
During the rotation procedure, the issuing TLS and SSH
certificate authorities are re-generated and all internal
components of the cluster re-register to get new
credentials.
The rotation procedure is based on a distributed
state machine algorithm - certificate authorities have
explicit rotation state and all parts of the cluster sync
local state machines by following transitions between phases.
An operator can launch CA rotation in auto or manual mode.
In manual mode the operator moves the cluster between rotation states
and watches the states of the components to sync.
In auto mode state transitions are happening automatically
on a specified schedule.
The design documentation is embedded in the code:
lib/auth/rotate.go
This fixes the race with systemd reload.
P - parent, C - child
During live reload scenario,
the following happens:
P -> forks C
P -> blocks on pipe read
C -> writes to pipe
C -> writes pid file
P <- reads message from pipe
P <- shuts down
However, there is a race:
P -> forks C
P -> blocks on pipe read
C -> writes to pipe
P <- reads message from pipe
P <- shuts down
C -> writes pid file
In this case the parent process exited
before the child process wrote the new pid file,
which makes systemd think that the main process
is down and stop both processes.
This fix changes the sequence to:
P -> forks C
P -> blocks on pipe read
C -> writes pid file
C -> writes to pipe
P <- reads message from pipe
P <- shuts down
to make sure the race can't happen any more.
This commit allows teleport parent process to track
the status of the forked child process using os.Pipe.
The child process signals success to the parent process by writing
to the pipe.
This allows HUP and USR2 to be more intelligent as they
can now detect the failure or success of the process.
This PR improves session recording:
* Nodes and proxies always buffer recorded sessions
to disk during the session, which improves performance
and makes the recording more resilient to network failures.
* Async uploader running on proxy or node always uploads the
session tarball to the audit log server.
* Audit log server is the only component uploading
to the S3 or any other API.
fixes #1785, fixes #1776
This commit fixes several issues with output:
First, teleport start now prints output
matching the quickstart guide and sets default
console logging to ERROR.
The SIGCHLD handler now only collects PIDs of
processes forked during live restart,
to avoid confusing other wait calls that
no longer have process status to collect.
Updates #1755
Design
------
This commit adds support for pluggable events and
sessions recordings and adds several plugins.
If external session recording storage
is used, nodes or proxies (depending on configuration)
store the session recordings locally and
then upload the recordings in the background.
Non-print session events are always sent to the
remote auth server as usual.
If remote events storage is used, auth
servers download recordings from it during playback.
DynamoDB event backend
----------------------
Transient DynamoDB backend is added for events
storage. Events are stored with default TTL of 1 year.
External lambda functions should be used
to forward events from DynamoDB.
The audit_table_name parameter in the storage section
turns on the DynamoDB backend.
The table will be auto-created.
S3 sessions backend
-------------------
If audit_sessions_uri is set to s3://bucket-name,
the node or proxy (depending on recording mode)
will start uploading the recorded sessions
to the bucket.
If the bucket does not exist, teleport will
attempt to create a bucket with versioning and encryption
turned on by default.
Teleport will turn on bucket-side encryption for the tarballs
using aws:kms key.
File sessions backend
---------------------
If audit_sessions_uri is set to file:///folder,
teleport will start writing tarballs to this folder instead
of sending recordings to the auth server.
This is helpful for plugin writers who can use fuse or NFS
mounted storage to handle the data.
Working dynamic configuration.
Fixes #1698.
* Added a sync.Pool for gzip.Writer objects that were
allocating a lot of large objects on the heap.
* Reshuffled signal handling: SIGQUIT now triggers
graceful shutdown, just like in Nginx.
* Signal USR1 prints helpful diagnostic info to stderr.
* Removed gops endpoint and flags.
* Fixed logs in some places.
* Debug flag now adds extra pprof handlers to diagnostic
endpoint.
Fixes #1671
* Add notes about TOS agreements for AMI
* Use specific UID for Teleport instances
* Use encrypted EFS for session storage
* By default, scale up auto scaling groups to the number of AZs
* Move dashboard to local file
* Fix dynamo locking bug
* Move PID writing, fixing the enterprise pid-file
* Add reload method for teleport units
This commit introduces signal handling.
Parent teleport process is now capable of forking
the child process and passing listeners file descriptors
to the child.
The parent process can then gracefully shut down
by tracking the number of current connections and
closing listeners once that number goes to 0.
Here are the signals handled:
* USR2 signal will cause the parent to fork
a child process and pass listener file descriptors to it.
Child process will close unused file descriptors
and will bind to the used ones.
At this point two processes, the parent
and the forked child, will be serving requests.
After looking at the traffic and the log files,
administrator can either shut down the parent process
or the child process if the child process is not functioning
as expected.
* TERM, INT signals will trigger graceful process shutdown.
Auth, node and proxy processes will wait until the number
of active connections goes down to 0 and will exit after that.
* KILL, QUIT signals will cause immediate non-graceful
shutdown.
* HUP signal combines USR2 and TERM signals in a convenient
way: the parent process will fork a child process and
self-initiate graceful shutdown. This is more convenient
than the USR2/TERM sequence, but less agile and robust:
if the connection to the parent process drops and
the new process exits with an error, administrators
can lock themselves out of the environment.
Additionally, the boltdb backend has to be phased out,
as it does not support reads/writes by two concurrent
processes. This required refactoring of the dir
backend to use file locking to allow inter-process
collaboration on read/write operations.
* Do not log EOF errors, avoid polluting logs
* Trim space from tokens when reading from file
* Do not use dir based caching
The caching problem deserves a separate explanation.
The directory backend is not concurrency friendly - it has a
fundamental design flaw: multiple goroutines writing to the
same file corrupt cache data.
This requires either redesign of the backend or switching
to boltdb backend for caching.
Boltdb backend uses transactions and is safe for concurrent
access. This PR changes local cache to use boltdb instead
of the dir backend that is now used only in tests.
Add support for extra principals for the proxy.
The proxy section already supports the public_addr
property, which is used in tctl users add
output.
Use the value from this property to update
the host SSH certificate for the proxy service.
```yaml
proxy_service:
  public_addr: example.com:3024
```
With the configuration above, the proxy host
certificate will contain the example.com principal
in its SSH principals list.
Support configuration for web and reverse tunnel
proxies to listen on the same port.
* Default configs are not changed, for backwards compatibility.
* If an administrator configures the web and reverse tunnel
addresses to be on the same port, multiplexing is turned on.
* In trusted cluster configuration, reverse_tunnel_addr
defaults to web_addr.
* Session events are delivered in continuous
batches in a guaranteed order with every event
and print event ordered from session start.
* Each auth server writes to a separate folder
on disk to make sure that no two processes write
to the same file at a time.
* When retrieving sessions, auth servers fetch
and merge results recorded by each auth server.
* Migrations and compatibility modes are in place
for older clients not aware of the new format,
but compatibility mode is not NFS friendly.
* On disk migrations are launched automatically
during auth server upgrades.
This commit introduces mutual TLS authentication
for the auth server API.
The auth server multiplexes HTTP over SSH (the existing
protocol) and HTTP over TLS (the new protocol)
on the same listening socket.
Nodes and users authenticate with Teleport 2.5.0
using mutual TLS, except in backwards-compatibility
cases.
* Allow external audit log plugins
* Add support for auth API server plugins
* Add license file path configuration parameter (not used in open-source)
* Extend audit log with user login events
If the user running teleport is a member of the adm group,
create the directory and all subdirectories
accessible to admins.
Remove obsolete migrations required for pre 2.3 releases.
This is a fix for a file descriptor leak in the audit log server caused
by a design issue:
Session file descriptors in audit log were opened on demand
when the session event or byte stream chunk was reported.
AuditLog server relied on SessionEnd event to close the
file descriptors associated with the session.
However, when SessionEnd event does not arrive (e.g.
there is a timeout or disconnect), the file descriptors
were not closed. This commit adds periodic clean up
of inactive sessions.
SessionEnd is now used as an optimization measure
to close the files, but is not used as the only
trigger to close files.
Now, idle sessions will close file descriptors
after periods of inactivity and will reopen the file
descriptors when session activity resumes.
SessionLogger was not designed to open/close files
multiple times, as it was resetting offsets
every time the session files were opened. This
change fixes that condition as well.
This was fixed running the `misspell` linter in fix mode using
`gometalinter`. The exact command I ran was:
```
gometalinter --vendor --disable-all -E misspell --linter='misspell:misspell -w {path}:^(?P<path>.*?\.go):(?P<line>\d+):(?P<col>\d+):\s*(?P<message>.*)$' ./...
```
Some typos were fixed by hand on top of it.
Instead of quietly changing behavior because the `DEBUG` envar was set to
true, Teleport now explicitly requires the scary --insecure flag to enable
this behavior.
BoltDB backend is now compatible with how all backends should
initialize.
Also all BoltDB-specific code/constants have been consolidated inside of
`backend.boltbk` package.
Originally Teleport had facilities to configure events/recordings via two
separate backends.
In reality those two objects (session events and session recordings)
need each other, and currently there is only one implementation of it.
The old structures were unused. This commit is 100% dead code removal.
- Added ability to read AWS config from `~/.aws` directory for testing
- Fixed TTL bug in DynamoDB back-end
- Made FS back-end return similar error types as Boltdb does
- Cleaned up buggy tests for DynamoDB
- Removed unnecessary locks everywhere in code
Functionality:
The `teleport` binary now serves web assets from its own binary file,
unless the `DEBUG` environment variable is set to "1" or "true", in
which case it will look for ../web/dist (as located in the github repo),
which can be used for development.
Design:
To avoid accumulating 3rd-party dependencies with a ton of extra
features and licenses, this implementation uses a minimalistic
implementation of the http.FileSystem interface on top of an embedded ZIP
archive.
1. The assets are zipped into assets.zip during build process
2. assets.zip gets appended to the end of `teleport` binary
3. The resulting file is converted into a self-extracting ZIP
4. Teleport opens itself using the built-in zip unarchiver, and loads
the assets on demand.
Notes:
1. LOC is tiny (dozens)
2. RAM consumption is CONSTANT regardless of the ZIP size, about 500Kb
increase vs load-from-file, and most of it is linking zip archive
code from the standard library. Tested with a 20MB ZIP archive.
This backend can be enabled by optionally adding a new build flag.
See lib/backend/dynamo/README.md for details.
It should not affect default Teleport builds.
Instead of trying to achieve a full "offline" operation, this commit
honestly converts previous attempts to a "caching access point client"
behavior.
Closes #554
I know comments are very lacking right now. Once things are stable I will add
proper comments. Minimal manual testing of the U2F registration API was done
with a hardware U2F key. Some of the code may need to be cleaned up later to
remove excessively long variable names...
Currently we return an error right away if the username/password combo is wrong.
It's difficult to do U2F without revealing either whether a user exists or
whether the password is correct. Returning an error immediately reveals whether
the user/password combo is valid. Waiting until we get a signed response
from the U2F device to announce whether the user/pass combo is valid can reveal
which users exist, since we need to return a keyHandle in the U2F SignRequest,
and generating fake keyHandles for nonexistent users is difficult to get
right since there is no rigid format for keyHandle.
What works:
1. You have to start all 3: node, proxy and auth.
2. Login using 'tsh' (so it will create a cert)
3. Then you can shut 'auth' down.
4. Proxy and node will stay up and tsh will be able to login.
What doesn't work:
1. Auth updates are not visible to proxy/node (like new servers)
2. Not sure if "trusted clusters" will work.
At this stage I have an in-memory snapshot of a "cluster state" which
can be kept by nodes in-memory not requiring the auth connection to be
up 100% of the time.
Node and proxy are now both using this snapshot instead of a live
connection to the auth server.
Next steps:
- Make node and proxy continue to work after the auth is killed.
- Make the snapshot persistent.
- Make node & proxy use persistence and be able to restart with the auth
server down.
IMPORTANT:
Also found an interesting case where process identity is generated (on
first start). Previously there wasn't any kind of locking, and concurrent
identity initialization was possible. While it's not clear if this can
cause any real-world issue, I have refactored it into a separate
lock-protected function.
Teleport configuration now has a new field: NoAudit (false by default,
which means audit is always on).
When this option is set, Teleport will not record events and will not
record sessions.
It's implemented by adding a "DiscardLogger" which implements the same
interface as the real logger, and it's plugged into the system instead.
NOTE: this option is not exposed in teleport in any way: no config file,
no switch, etc. I quickly needed it for Telecast.
* Downgraded many messages from `Debug` to `Info`
* Edited messages so they're not verbose and not too short
* Added "context" to some
* Added logical teleport component as [COMPONENT] at the beginning of
many, making logs **vastly** easier to read.
* Added one more logging level option when creating Teleport (only
Teleconsole uses it for now)
The output with 'info' severity now looks extremely clean.
This is startup, for example:
```
INFO[0000] [AUTH] Auth service is starting on turing:32829 file=utils/cli.go:107
INFO[0000] [SSH:auth] listening socket: 127.0.0.1:32829 file=sshutils/server.go:119
INFO[0000] [SSH:auth] is listening on 127.0.0.1:32829 file=sshutils/server.go:144
INFO[0000] [Proxy] Successfully registered with the cluster file=utils/cli.go:107
INFO[0000] [Node] Successfully registered with the cluster file=utils/cli.go:107
INFO[0000] [AUTH] keyAuth: 127.0.0.1:56886->127.0.0.1:32829, user=turing file=auth/tun.go:370
WARN[0000] unable to load the auth server cache: open /tmp/cluster-teleconsole-client781495771/authservers.json: no such file or directory file=auth/tun.go:594
INFO[0000] [SSH:auth] new connection 127.0.0.1:56886 -> 127.0.0.1:32829 vesion: SSH-2.0-Go file=sshutils/server.go:205
INFO[0000] [AUTH] keyAuth: 127.0.0.1:56888->127.0.0.1:32829, user=turing.teleconsole-client file=auth/tun.go:370
INFO[0000] [AUTH] keyAuth: 127.0.0.1:56890->127.0.0.1:32829, user=turing.teleconsole-client file=auth/tun.go:370
INFO[0000] [Node] turing connected to the cluster 'teleconsole-client' file=service/service.go:158
INFO[0000] [AUTH] keyAuth: 127.0.0.1:56892->127.0.0.1:32829, user=turing file=auth/tun.go:370
INFO[0000] [SSH:auth] new connection 127.0.0.1:56890 -> 127.0.0.1:32829 vesion: SSH-2.0-Go file=sshutils/server.go:205
INFO[0000] [SSH:auth] new connection 127.0.0.1:56888 -> 127.0.0.1:32829 vesion: SSH-2.0-Go file=sshutils/server.go:205
INFO[0000] [Node] turing.teleconsole-client connected to the cluster 'teleconsole-client' file=service/service.go:158
INFO[0000] [Node] turing.teleconsole-client connected to the cluster 'teleconsole-client' file=service/service.go:158
INFO[0000] [SSH] received event(SSHIdentity) file=service/service.go:436
INFO[0000] [SSH] received event(ProxyIdentity) file=service/service.go:563
```
You can easily tell that auth, ssh node and proxy have successfully started.
We had this flag in the configuration forever, but apparently it was
being ignored.
It allows teleport proxy to start without HTTP UI enabled. This is
useful for proxies that strictly proxy and do nothing else.
I ran into this bug the first time I used this flag for Telecast; it
did not work, so I fixed it.
Teleport YAML config now has a new configuration variable for internal
use by Gravitational:
```yaml
teleport:
seed_config: true
```
If set to 'true', Teleport treats YAML configuration simply as a seed
configuration on first start.
If set to 'false' (default for OSS version), Teleport will throw away
its back-end config, treating YAML config as the only source of truth.
Specifically, for now, the following settings are thrown away if not
found in YAML:
- trusted authorities
- reverse tunnels
- Friendly error messages when parsing configuration and establishing
connection
- Bugs related to "first start" vs subsequent starts (reverse tunnels
added to the YAML file won't be seen upon restart)
- Nicer logging
1. tctl auth export now dumps both user & host keys if the --type flag is missing
2. created fixtures for testing key imports: they're in
fixtures/trusted_clusters
3. configuration parser reads "trusted_clusters" files expecting the
output of tctl auth export
1. data_dir is now a global setting in teleport.yaml (instead of being
inside of "storage" sub-section)
2. changing data_dir in one place causes all of teleport to use it,
not just bolt backends.
3. moving auth server to listen on non-default ports properly adjusts
the global auth_servers setting
4. `tctl` now accepts the -c flag just like Teleport, so you can pass
`teleport.yaml` to it.
Fixes #432, fixes #431, fixes #430
TunClient always tries to dial the statically configured auth server
first, before trying "discovered" ones.
The rationale is that --auth flag must override whatever dynamic auth
servers have been discovered (because sometimes their IPs are wrong, if
advertise-ip was misconfigured)
Closes #416, fixes #416