Commit graph

82 commits

Author SHA1 Message Date
Tiago Silva 037daad083
Introduce dedicated server type for Kubernetes resources (#14389)
## What

First part of the Kubernetes [Discovery RFD](https://github.com/gravitational/teleport/pull/13376/) to introduce a Kubernetes server per cluster. 

This PR introduces a separate Kubernetes server that uses the already introduced `KubernetesClusterV3`. 

## Compatibility

In previous versions, Kubernetes Clusters were part of regular `ServerV2` resource and this refactoring deprecates the `ServerV2` usage but keeps them for compatibility with previous version.

Everything is backward compatible, so v10 kubernetes agents and trusted clusters can connect fine.

## Next steps

Once this is merged, a new PR will introduce dynamic registration for Kubernetes Clusters discovered through EKS Discovery.
2022-08-04 14:21:11 +00:00
Noah Stride 02b4f8575f
Configure linter to catch British 🇬🇧 spellings 🇺🇸 🦅 📖 (#14363)
* configure golangci-lint misspell to check for anglicized spellings

* Americanize spellings

* fix aws constant value with british spelling 🇬🇧

* update api types with americanized spellings

* use american spellings .cloudbuild/scripts
2022-07-14 10:51:23 +00:00
Edoardo Spadolini 160df0086a
Support role bootstrapping in OSS (#11175)
* Add role bootstrapping

* Test coverage

* Better variable names

* Remove spurious log, add better error messages
2022-03-17 10:12:18 +00:00
rosstimothy c730778960
Replace golint with revive (#8613) 2021-10-19 14:00:24 -04:00
Nic Klaassen 2d10515f19
Implement Simplified Node Joining (#8250) 2021-10-08 10:41:28 -07:00
Roman Tkachenko 6502a12f1f
Add API and CLI for managing application resources (#8185) 2021-09-20 08:44:13 -07:00
Andrew Lytvynov ab062428b1 Windows desktop service boilerplate
Boilerplate for a new service and API objects:
- windows_desktop_service config section
- service registration and heartbeats
- static host registration and heartbeats
- caching, permissions, etc
- "tctl get" support

For new connections the service aborts after authentication, since the
RDP client implementation is not ready yet (pending in
https://github.com/gravitational/teleport/pull/7824).

Tested that the service starts, registers (both over a tunnel and
directly) and creates the API objects.
2021-08-18 18:44:41 +00:00
Roman Tkachenko 25aeeae3ac
Adding database resource API and tctl commands (#7792) 2021-08-09 07:14:39 -07:00
Brian Joerger 9b8b9d6d0c
rollback - Upgrade api version. (#7751) 2021-07-30 15:34:19 -07:00
Marek Smoliński 4de9e94a51
Add support for tctl get/rm DB resource (#7558) 2021-07-29 10:27:38 +02:00
Brian Joerger c040aca4c1
Upgrade api version. (#7609) 2021-07-28 13:51:21 -07:00
Eugene Yakubovich 67c0eb3b4c Add restricted session
Adds the ability to block network traffic on SSH sessions.
The deny/allow lists of IPs are specified in teleport.yaml file.
Supports both IPv4 and IPv6 communication.

This feature currently relies on enhanced recording for
cgroup management so that needs to be enabled as well.

-- Design rationale:
This patch uses Linux Security Module (LSM) hooks, specifically
security_socket_connect and security_socket_sendmsg, to control
egress traffic. The LSM provides two advantages over socket filtering
program types.
- It's executed early enough that the task information is available.
  This makes it easy to report PID, COMM, etc.
- It becomes a model for extending restrictions beyond networking.

The set of enforced cgroups is stored in a BPF hash map and the
deny/allow lists are stored in BPF trie maps. An IP address is
first checked against the allow list. If found, it's checked for
an override in the deny list. The policy is default deny. However,
the absence of the NetworkRestrictions API object is allow all.

IPv4 addresses are additionally registered in IPv6 trie (as mapped)
to account for dual stacks. However it is unclear if this is sufficient
as 4-to-6 transition methods utilize a multitude of translation and
tunneling methods.
2021-07-16 16:49:04 -07:00
Andrej Tokarčík 7c630ec960
Introduce Lock resource (#7430) 2021-07-07 18:20:53 +02:00
Brian Joerger bd07d7be20
CheckAndSetDefaults sets all defaults. (#6846) 2021-06-18 12:57:29 -07:00
Andrej Tokarčík 9410346b8d
Make SessionRecordingConfig resource dynamically configurable (#7054) 2021-06-08 08:30:44 -07:00
Brian Joerger 7bff7c41bd
Remove API aliases (#6983) 2021-06-04 13:29:31 -07:00
Andrej Tokarčík 3ca21aca9a
Make ClusterNetworkingConfig resource dynamically configurable (#7013) 2021-06-04 19:42:50 +02:00
Andrew Lytvynov cd2f4fceb7
Remove JSON schema validation (#6685)
* Remove JSON schema validation

Removing JSON schema validation from all resource unmarshalers.

--- what JSON schema gets us

Looking at the JSON schema spec and our usage, here are the supposed benefits:
- type validation - make sure incoming data uses the right types for the right fields
- required fields - make sure that mandatory fields are set
- defaulting - set defaults for fields
- documentation - schema definition for our API objects

Note that it does _not_ do:
- fail on unknown fields in data
- fail on a required field with an empty value

--- what replaces it

Based on the above, it may seem like JSON schema provides value.
But it's not the case, let's break it down one by one:
- type validation - unmarshaling JSON into a typed Go struct does this
- required fields - only checks that the field was provided, doesn't actually check that a value is set (e.g. `"name": ""` will pass the `required` check)
  - so it's pretty useless for any real validation
  - and we already have a separate place for proper validation - `CheckAndSetDefaults` methods
- defaulting - done in `CheckAndSetDefaults` methods
  - `Version` is the only annoying field, had to add it in a bunch of objects
- documentation - protobuf definitions are the source of truth for our API schema

--- the benefits

- performance - schema validation does a few rounds of `json.Marshal/Unmarshal` in addition to actual validation; now we simply skip all that
- maintenance - no need to keep protobuf and JSON schema definitions in sync anymore
- creating new API objects - one error-prone step removed
- (future) fewer dependencies - we can _almost_ remove the Go libraries for schema validation (one transient dependency keeping them around)

* Remove services.SkipValidation

No more JSON schema validation so this option is a noop.
2021-06-01 15:27:20 -07:00
Andrej Tokarčík 72c93ae86b
Handle tctl get's input ref more strictly (#5818)
For some resource kinds, `tctl get` used to always return the list of
all resources of that kind, irrespective of whether a specific resource
name was given in the argument to `tctl get`.

For instance, `tctl get node/node-id` would have always listed all the
nodes (just like mere `tctl get node`).

Here, all `tctl get` handlers are adjusted to filter by name when a name
is provided as part of the input ref.
2021-04-27 15:59:46 +02:00
a-palchikov 86908cc2f3
Web UI disconnects (#5276)
* Use fake clock consistently in units tests.
* Split web session management into two interfaces and implement them separately for clear separation
* Split session management into New/Validate to make it aparent where the sessions are created and where existing sessions are managed. Remove ttlmap in favor of a simple map and handle expirations
explicitly.
Add web session management to gRPC server for the cache.

* Reintroduce web sessions APIs under a getter interface.
* Add SubKind to WatchKind for gRPC and add conversions from/to protobuf. Fix web sessions unit tests.
* lib/web: create/insert session context in ValidateSession if the session has not yet been added to session cache.
lib/cache: add event filter for web session in auth cache.
lib/auth: propagate web session subkind in gRPC event.

* Add implicit migrations for legacy web session key path for queries.
* Integrate web token in lib/web
* Add a bearer token when upserting a web session
* Fix tests. Use fake clock wherever possible.

* Converge session cache handling in lib/web

* Clean up and add doc comments where necessary

* Use correct form of sessions/tokens controller for ServerWithRoles. Use fake time in web tests

* Converge the web sessions/tokens handling in lib/auth to match the old behavior w.r.t access checking (e.g. implicit handling of the local user identity).

* Use cached reads and waiters only when necessary. Query sessions/tokens using best-effort - first looking in the cache and falling back to a proxy client

* Properly propagate events about deletes for values with subkind.

* Update to retrofit changes after recent teleport API refactorings

* Update comment on removing legacy code to move the deadline to 7.x

* Do not close the resources on the session when it expires - this beats the purpose of this PR.
Also avoid a race between closing the cached clients and an existing reference to the session by letting the session linger for longer before removing it.

* Move web session/token request structs to the api client proto package

* Only set HTTP fs on the web handler if the UI is enabled

* Properly tear down web session test by releasing resources at the end. Fix the web UI assets configuration by removing DisableUI and instead use the presence of assets (HTTP file system) as an indicator that the web UI has been enabled.

* Decrease the expired session cache clean up threshold to 2m. Only log the expiration error message for errors other than not found

* Add test for terminal disconnect when using two proxies in HA mode
2021-02-04 16:50:18 +01:00
Brian Joerger ce87251ea0
api dependency reduction - marshalers (#5384)
Refactor Marshal logic on types, and move it into /lib/services to reduce dependencies in /api.
2021-02-01 10:26:50 -08:00
Brian Joerger 3c3ce160d9
Move API types and functionality from lib/services to api/types. (#5143) 2021-01-11 10:02:34 -08:00
Russell Jones 904b0d0488 Added Application Access.
Added support for an identity aware, RBAC enforcing, mutually
authenticated, web application proxy to Teleport.

* Updated services.Server to support an application servers.
* Updated services.WebSession to support application sessions.
* Added CRUD RPCs for "AppServers".
* Added CRUD RPCs for "AppSessions".
* Added RBAC support using labels for applications.
* Added JWT signer as a services.CertAuthority type.
* Added support for signing and verifying JWT tokens.
* Refactored dynamic label and heartbeat code into standalone packages.
* Added application support to web proxies and new "app_service" to
  proxy mutually authenticated connections from proxy to an internal
  application.
2020-11-03 14:32:13 -08:00
Andrew Lytvynov 5ec194cd0d
Implement kubernetes_service registration and startup (#4611)
* Implement kubernetes_service registration and sratup

The new service now starts, registers (locally or via a join token) and
heartbeats its presence to the auth server.

This service can handle k8s requests (like a proxy) but not to remote
teleport clusters. Proxies will be responsible for routing those.
The client (tsh) will not yet go to this service, until proxy routing is
implemented. I manually tweaked server addres in kubeconfig to test it.

You can also run `tctl get kube_service` to list all registered
instances. The self-reported info is currently limited - only listening
address is set.

* Address review feedback
2020-10-30 17:19:53 +00:00
Andrew Lytvynov bd974ef09a
golint: final batch of fixes (#4589)
And enable `golint` during `make lint`.
2020-10-22 00:13:09 +00:00
Forrest Marshall ae2336dfd0 concurrent session control
Adds support for Concurrent Session Control and a new
semaphore API.  Roles now support two new configuration
options, `max_ssh_connections` and `max_ssh_sessions`
which correspond to the total number of authenticated
ssh connections per cluster, and the number of ssh sessions
within a connection respectively.  Attempting to exceed
these limits generate variants of the `session.rejected`
audit event and cause the connection/session to be
rejected.
2020-09-17 11:02:35 -07:00
Sasha Klizhentas 0f4e82548f Initial work on semaphores 2020-09-17 11:02:35 -07:00
Forrest Marshall eefef4ddb7 improve label fmt enforcement 2020-07-23 22:51:05 -07:00
Andrew Lytvynov 2a93910fa2 gosimple: remove unnecessary blank identifiers in for loops 2020-05-15 16:32:45 +00:00
Forrest Marshall 4e9eed9ac0 cache event fanout & reversetunnel improvements
- cache now perforams in-memory fanout of events, eliminating
spurious event generation due to cache init/reset.

- removed old unused logic from reversetunnel agents.

- replaced seekpool with simpler ttl-cache and semaphore-like
lease system.

- add jittered backoff to agent connection attempts to
reduce "thundering herd" effect.

- improved reversetunnel logging.

- improved LB usage in tests.
2020-04-23 14:03:52 -07:00
Alexey Kontsevoy 3c670d5d58
Merge Teleport V4.3 UI branch to master (#3583)
* Add monorepo

* Add reset/passwd capability for local users (#3287)

* Add UserTokens to allow password resets

* Pass context down through ChangePasswordWithToken

* Rename UserToken to ResetPasswordToken

* Add auto formatting for proto files

* Add common Marshaller interfaces to reset password token

* Allow enterprise "tctl" reuse OSS user methods (#3344)

* Pass localAuthEnabled flag to UI (#3412)

* Added LocalAuthEnabled prop to WebConfigAuthSetting struct in webconfig.go
* Added LocalAuthEnabled state as part of webCfg in  apiserver.go

* update e-refs

* Fix a regression bug after merge

* Update tctl CLI output msgs (#3442)

* Use local user client when resolving user roles

* Update webapps ref

* Add and retrieve fields from Cluster struct (#3476)

* Set Teleport versions for node, auth, proxy init heartbeat
* Add and retrieve fields NodeCount, PublicURL, AuthVersion from Clusters
* Remove debug logging to avoid log pollution when getting public_addr of proxy
* Create helper func GuessProxyHost to get the public_addr of a proxy host
* Refactor newResetPasswordToken to use GuessProxyHost and remove publicUrl func

* Remove webapps submodule

* Add webassets submodule

* Replace webapps sub-module reference with webassets

* Update webassets path in Makefile

* Update webassets

1b11b26 Simplify and clean up Makefile (#62) https://github.com/gravitational/webapps/commit/1b11b26

* Retrieve cluster details for user context (#3515)

* Let GuessProxyHost also return proxy's version
* Unit test GuessProxyHostAndVersion & GetClusterDetails

* Update webassets

4dfef4e Fix build pipeline (#66) https://github.com/gravitational/webapps/commit/4dfef4e

* Update e-ref

* Update webassets

0647568 Fix OSS redirects https://github.com/gravitational/webapps/commit/0647568

* update e-ref

* Update webassets

e0f4189 Address security audit warnings Updates  "minimist" package which is used by 7y old "optimist". https://github.com/gravitational/webapps/commit/e0f4189

* Add new attr to Session struct (#3574)

* Add fields ServerHostname and ServerAddr
* Set these fields on newSession

* Ensure webassets submodule during build

* Update e-ref

* Ensure webassets before running unit-tests

* Update E-ref

Co-authored-by: Lisa Kim <lisa@gravitational.com>
Co-authored-by: Pierre Beaucamp <pierre@gravitational.com>
Co-authored-by: Jenkins <jenkins@gravitational.io>
2020-04-15 15:35:26 -04:00
Forrest Marshall 257274b26f Implement per-resource PluginData storage (#3286)
- Also addresses #3282 by adding retries for CompareAndSwap
on SetAccessRequestState and UpdatePluginData.
2020-01-30 14:27:40 -08:00
Forrest Marshall ec327b6e03 Implment access-request system (workflow API) 2019-12-02 14:05:51 -08:00
Forrest Marshall 05f3eeaf00 Support resource-based bootstrapping for backend. (#2871)
* Support resource-based bootstrapping for backend.

Outside of static configuration, most of the persistent state of an
auth server exists as a collection of resources, stored in its
backend.  The resource API also forms the basis of Teleport's more
advanced dynamic configuration options.

This commit extends the usefulness of the resource API by adding
the ability to bootstrap backend state with a set of previously
exported resources.  This allows the resource API to serve as a
rudimentary backup/migration tool.

Notes: This features is a work in progress, and very easy to misuse;
while it will prevent you from overwriting the state of an existing
auth server, it won't stop you from bootstrapping into a wildly
misconfigured state.  In general, resource-based bootstrapping is
not a complete solution for backup or migration.

* update e-ref
2019-08-29 16:16:03 -07:00
Forrest Marshall 3c93347470 Always allow usage of canonical resource names
Fixes a minor usability issue where some resources could not be
referred to by the name that appears in their `kind` field.
2019-07-03 15:29:48 -07:00
Gus Luxton 0b5bff73b6
Improve help text and error messages for tctl rm, fixes #2594 (#2724)
* Improve help text and error messages for tctl rm, fixes #2594
* Change 'kind' to 'type' for consistency
* Changed examples from role/admin to connector/github
* Added link to Teleport Enterprise
* Update e ref
2019-05-28 19:18:28 -03:00
Jérémy Clerc b2fd50b5e9 tctl: users add/ls and tokens ls json output 2019-04-25 14:22:49 -07:00
Sasha Klizhentas 8356ae6a74 Use in-memory cache for the auth server API.
This commit expands the usage of the caching layer
for auth server API:

* Introduces in-memory cache that is used to serve all
Auth server API requests. This is done to achieve scalability
on 10K+ node clusters, where each node fetches certificate authorities,
roles, users and join tokens. It is not possible to scale
DynamoDB backend or other backends on 10K reads per seconds
on a single shard or partition. The solution is to introduce
an in-memory cache of the backend state that is always used
for reads.

* In-memory cache has been expanded to support all resources
required by the auth server.

* Experimental `tctl top` command has been introduced to display
common single node metrics.

Replace SQLite Memory Backend with BTree

SQLite in memory backend was suffering from
high tail latencies under load (up to 8 seconds
in 99.9%-ile on load configurations).

This commit replaces the SQLite memory caching
backend with in-memory BTree backend that
brought down tail latencies to 2 seconds (99.9%-ile)
and brought overall performance improvement.
2019-04-12 14:23:09 -07:00
Sasha Klizhentas f40df845db Events and GRPC API
This commit introduces several key changes to
Teleport backend and API infrastructure
in order to achieve scalability improvements
on 10K+ node deployments.

Events and plain keyspace
--------------------------

New backend interface supports events,
pagination and range queries
and moves away from buckets to
plain keyspace, what better aligns
with DynamoDB and Etcd featuring similar
interfaces.

All backend implementations are
exposing Events API, allowing
multiple subscribers to consume the same
event stream and avoid polling database.

Replacing BoltDB, Dir with SQLite
-------------------------------

BoltDB backend does not support
having two processes access the database at the
same time. This prevented Teleport
using BoltDB backend to be live reloaded.

SQLite supports reads/writes by multiple
processes and makes Dir backend obsolete
as SQLite is more efficient on larger collections,
supports transactions and can detect data
corruption.

Teleport automatically migrates data from
Bolt and Dir backends into SQLite.

GRPC API and protobuf resources
-------------------------------

GRPC API has been introduced for
the auth server. The auth server now serves both GRPC
and JSON-HTTP API on the same TLS socket and uses
the same client certificate authentication.

All future API methods should use GRPC and HTTP-JSON
API is considered obsolete.

In addition to that some resources like
Server and CertificateAuthority are now
generated from protobuf service specifications in
a way that is fully backward compatible with
original JSON spec and schema, so the same resource
can be encoded and decoded from JSON, YAML
and protobuf.

All models should be refactored
into new proto specification over time.

Streaming presence service
--------------------------

In order to cut bandwidth, nodes
are sending full updates only when changes
to labels or spec have occured, otherwise
new light-weight GRPC keep alive updates are sent
over to the presence service, reducing
bandwidth usage on multi-node deployments.

In addition to that nodes are no longer polling
auth server for certificate authority rotation
updates, instead they subscribe to event updates
to detect updates as soon as they happen.

This is a new API, so the errors are inevitable,
that's why polling is still done, but
on a way slower rate.
2018-12-10 17:20:24 -08:00
Ev Kontsevoy 224c3bc148 Updated the docs (and UX a bit) for tctl get connectors
* Updated the docs (closes #2246)
* Added 'connector' as an alias to 'connectors' (closes #2425)
2018-12-09 14:46:32 -08:00
Roman Tkachenko f7424e5c95 Allow dash in resource labels. 2018-11-16 13:26:47 -08:00
Russell Jones ac22e588d9 Added support for "tctl get connectors". 2018-10-29 10:29:20 -07:00
Alexey Kontsevoy ab86a567ec add new resource - License 2018-09-12 16:17:28 -04:00
Russell Jones e1402dc009 Marshal bool as bool and fix schema validation. 2018-08-08 14:33:52 -07:00
Sasha Klizhentas 1f3b4e2c96 Kubernetes configuration, fetch proxy settings.
This commit moves proxy kubernetes configuration
to a separate nested block to provide more fine
grained settings:

```yaml
auth:
  kubernetes_ca_cert_path: /tmp/custom-ca
proxy:
  enabled: yes
  kubernetes:
    enabled: yes
    public_addr: [custom.example.com:port]
    api_addr: kuberentes.example.com:443
    listen_addr: localhost:3026
```

1. Kubernetes config section is explicitly enabled
and disabled. It is disabled by default.

2. Public address in kubernetes section
is propagated to tsh profile

The other part of the commit updates Ping
endpoint to send proxy configuration back to
the client, including kubernetes public address
and ssh listen address.

Clients updates profile accordingly to configuration
received from the proxy.
2018-08-06 11:57:36 -07:00
Russell Jones ce1c7476b9 Updated dir backend to a flat keyspace. Added UpsertItems endpoint to
all backends to support bulk insertion. Added UpsertNodes endpoint,
which is used by the state cache to speed up GetNodes.
2018-07-13 20:12:34 +00:00
Sasha Klizhentas 3e144cb900 Teleport certificate authority rotation.
This commit implements #1860

During the the rotation procedure issuing TLS and SSH
certificate authorities are re-generated and all internal
components of the cluster re-register to get new
credentials.

The rotation procedure is based on a distributed
state machine algorithm - certificate authorities have
explicit rotation state and all parts of the cluster sync
local state machines by following transitions between phases.

Operator can launch CA rotation in auto or manual modes.

In manual mode operator moves cluster bewtween rotation states
and watches the states of the components to sync.

In auto mode state transitions are happening automatically
on a specified schedule.

The design documentation is embedded in the code:

lib/auth/rotate.go
2018-04-30 12:58:57 -07:00
Sasha Klizhentas e114fbd46c Add support for remote_cluster, implements #1526
This commit adds remote cluster resource that specifies
connection and trust of the remote trusted cluster to the local
cluster. Deleting remote cluster resource deletes trust
established between clusters on the local cluster side
and terminates all reverse tunnel connections.

Migrations make sure that remote cluster resources exist
after upgrade of the auth server.
2017-12-28 17:48:30 -08:00
Roman Tkachenko e94675a94e Fix typos and some review comments 2017-12-14 17:19:57 -08:00
Roman Tkachenko c0cf7df7c9 Github connector 2017-12-14 13:41:38 -08:00