Part of https://github.com/gravitational/teleport/pull/18274
This commit introduces a new hidden `wait` CLI subcommand:
- `teleport wait no-resolve <domain-name>` resolves a domain name and exits only when no IPs are resolved. This CLI command should be used in the Helm chart, as an init-container, to block proxies from rolling out until all auth pods have been successfully rolled-out.
- `teleport wait duration 30s` has the same behaviour as `sleep 30`. Due to image hardening we won't have `sleep` available, but waiting 30 seconds in a preStop hook is required to ensure a 100% seamless pod rollout on kube-proxy-based clusters.
Add support for device extensions for TLS and SSH issued certificates.
This is a first step in issuing certificates augmented with device extensions.
gravitational/teleport.e#514
* [draft] Add a new usage reporter
This adds a new usage reporter service to the auth server. It's
disabled by default in OSS and can only be turned on via startup hook
in Cloud / Enterprise. In OSS, the audit log wrapper is never
configured and any usage events are sent to a no-op discard reporter.
Usage events are defined in prehog and can be sent to the new
UsageReporter Service on the auth server. An audit event wrapper is
used to capture certain events that are otherwise difficult to hook.
Events are anonymized before submission, then held in a non-blocking
queue for batching and submission purposes.
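As a rough sketch of the anonymize-then-queue flow described above: the keyed HMAC-SHA256 anonymizer, the `reporter` type, and the drop-when-full policy below are assumptions for illustration and are not taken from the actual implementation.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"fmt"
)

// event is a hypothetical usage event with one identifying field.
type event struct{ Name, User string }

// anonymize returns a keyed hash, so the same input always maps to the same
// opaque token without revealing the original value.
func anonymize(key []byte, s string) string {
	mac := hmac.New(sha256.New, key)
	mac.Write([]byte(s))
	return base64.StdEncoding.EncodeToString(mac.Sum(nil))
}

type reporter struct {
	key   []byte
	queue chan event
}

// submit anonymizes the event and enqueues it without blocking.
// It returns false when the queue is full and the event is dropped.
func (r *reporter) submit(e event) bool {
	e.User = anonymize(r.key, e.User)
	select {
	case r.queue <- e:
		return true
	default:
		return false
	}
}

// drainBatch removes up to max queued events for a single submission.
func (r *reporter) drainBatch(max int) []event {
	batch := make([]event, 0, max)
	for len(batch) < max {
		select {
		case e := <-r.queue:
			batch = append(batch, e)
		default:
			return batch
		}
	}
	return batch
}

func main() {
	r := &reporter{key: []byte("cluster-secret"), queue: make(chan event, 2)}
	for _, u := range []string{"alice", "bob", "carol"} {
		fmt.Println(u, "queued:", r.submit(event{Name: "login", User: u}))
	}
	fmt.Println("batch size:", len(r.drainBatch(10)))
}
```

Dropping on a full queue is one possible policy; it keeps callers from stalling on telemetry at the cost of losing events under load.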
* Remove dead code
* Add SubmitUsageEvent RPC to Auth.
This adds a new SubmitUsageEvent RPC to the Auth API that external
clients (e.g. the UI) can use to submit usage events externally.
* Slight refactor for unit testing
* Add Prometheus metrics and add initial working prehog submitter
* Add more metrics, tweak prehog client, and add unit tests
* Further tweak http transport settings based on Teleport defaults
* Add missing metrics
* Fix goimports
* Add new UI usage events
* Update e ref
* Add prehog directly for now. Improve logging.
* update prehog
* Add new prehog events; use username from request identity
* add HTTP server for user events
* Add username back to pre-onboard events
* unauthenticated user events
* Fix userevent build error
* Use event-provided username where appropriate
* Move barebones prehog reqs to lib/prehog and generate here.
Also, use prod tunable values.
* Fix license lints
* De-flake tests by adding unfortunate amounts of synchronization.
* Add missing license header
* Misc PR cleanup for review
* Update lib/events/usageevents/usageevents.go
Co-authored-by: Edoardo Spadolini <edoardo.spadolini@goteleport.com>
* Address a batch of review comments
Adds `anonymizer.AnonymizeString` and parent loggers
* Update e ref
* Clean up comments
* Remove onboard prefix from recovery code event
* Address another batch of feedback
* Use defaults.HTTPClient()
* Remove a noisy log message
* Demote noisy log message to debug
* Temporarily revert e ref for merge
Co-authored-by: Michelle Bergquist <michelle.bergquist@goteleport.com>
Co-authored-by: Edoardo Spadolini <edoardo.spadolini@goteleport.com>
- Fix an incorrect link
- Be more specific about what roles/permissions are required
- Remove some text and an image that didn't meaningfully contribute
* switch underlying protocol used for 'tsh scp' to SFTP
* address TODO
* appease linter
* add method to make it easier for other callers to transfer files
* add tests
* print transfer progress with progress bar by default
Also allow a SIGINT to gracefully stop the SFTP connection. This is
necessary because the progress bar will ignore signals and prevent the
process from exiting.
* address SFTP fork issues
* make tests less flakey
* fix specifying dir for dst not copying files to correct paths
* make tests less flakey (again)
* don't check file access times, often differs when run in CI
* few small fixes from review, simplify Create method now that HTTP FS isn't needed
* create dst files and dirs with src mode
* improved error messages when doing file operations
* expand home dirs in remote paths
* addressed more feedback
* add license to get_home_dir.go
* address minor feedback of tests, add home dir expansion test
* update sftp fork to point to latest commit on master branch
* addressed feedback
* don't cache home dir lookups, only one remote path can ever be used
* Allow tsh to retrieve cluster details in one request
Prior to connecting to a node via `tsh ssh` we need two bits of
information about the cluster:
1) The session recording mode
2) Whether FIPS is enabled
Previously, to retrieve this information, `tsh` would first send the
global ssh request `RecordingProxyReqType` to determine the
session recording mode. Later on `tsh` would Ping the auth
server to determine if the cluster was running in FIPS mode.
In an effort to reduce the number of round trips to retrieve
this data, a new global ssh request `ClusterDetailsReqType` is
introduced that returns both the session recording mode and
whether FIPS is enabled. This allows `tsh` to get all the information
it needs in one request, which helps reduce latency for `tsh ssh`, and
the request is extensible in case more information is needed later.
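A combined request of this kind might look like the sketch below. The request name, the JSON payload encoding, the `sendFunc` abstraction, and the "unsupported" fallback signal are all hypothetical; the real wire format may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// clusterDetailsReq is a placeholder request name, not the real constant.
const clusterDetailsReq = "cluster-details@example.com"

// clusterDetails is a hypothetical reply payload carrying both values.
type clusterDetails struct {
	RecordingProxy bool `json:"recording_proxy"`
	FIPSEnabled    bool `json:"fips_enabled"`
}

// sendFunc models sending a global SSH request: it reports whether the
// server accepted the request, plus the reply payload.
type sendFunc func(name string) (ok bool, payload []byte, err error)

// serve builds a fake server that answers only the combined request.
func serve(d clusterDetails) sendFunc {
	return func(name string) (bool, []byte, error) {
		if name != clusterDetailsReq {
			return false, nil, nil
		}
		payload, err := json.Marshal(d)
		return err == nil, payload, err
	}
}

// fetchClusterDetails gets both values in one round trip. The boolean is
// false when the server rejects the request, in which case a caller could
// fall back to separate lookups.
func fetchClusterDetails(send sendFunc) (clusterDetails, bool, error) {
	var d clusterDetails
	ok, payload, err := send(clusterDetailsReq)
	if err != nil || !ok {
		return d, false, err
	}
	if err := json.Unmarshal(payload, &d); err != nil {
		return d, false, err
	}
	return d, true, nil
}

func main() {
	d, supported, err := fetchClusterDetails(serve(clusterDetails{RecordingProxy: true}))
	fmt.Println(d.RecordingProxy, d.FIPSEnabled, supported, err)
}
```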
While looking up github.com/gokyle/hotp I found some old deprecation warnings
and decided to address them.
* Remove HOTP support
* Update comment on checkOTP
* Remove OTPType
* Remove a few more HOTP references
* Add initial version of installer
* Resolve comments
- Use aws waiters when checking commands
- Use SSMRunRequest rather than passing instances
- General comments
* Resolve comments, (rebase) pass scriptname parameter
This resolves comments regarding running on multiple EC2 instances at
once by adding state to the instances cache to check whether the instance
is already known and how far into installation it is.
* Revert cache
* Don't cache on non-discovery nodes
* Resolve some comments
* Move discovery out to its own service
* Add a `discovery_service` section
* Fix messed up conflict merge
* Make starting a standalone discovery agent work
* Resolve comments
* Resolve comments
- use a regular events.Emitter
- resolve a thousand typos :)
* Resolve comments
* resolve comments, fix a bad merge
* Fail when a non ec2 matcher type is configured
* fix lint-go
* Resolve comments
* Resolve comments, add initial test (currently broken)
* Fix log string so only 1 pair of [] are used
* Chunk instances for sending commands
* add 'isInitialized' to watchers
* Add test for chunked discovery, log output
* lints
* explicitly set matcher.Tags to "*":"*" if it's unset
This PR implements the SSH Tester for ConnectionDiagnostic feature.
This feature is also known as Test Connection, part of Teleport
Discover.
The goal here is to provide immediate feedback about a newly added
resource. Can the user connect to it?
We are targeting SSH Nodes as a first ResourceKind.
To test the access to an SSH Node we require the ResourceName and a
login username (ssh principal).
Then a series of checks will occur in two places:
- SSH client in the Web server
- SSH server in the SSH agent
The ssh client creates a new Connection Diagnostic with some initial
state.
Then it tries to build up the necessary SSH config.
This already gives us a couple of things to check for:
- does the node exist, and can the current user (inherited from the
web session) access it?
- is the node accepting TCP connections (in the specific port)?
- is the node accepting SSH protocol on top of the TCP connection?
Then, the ConnectionDiagnosticID is injected into the certificate, and the
SSH Server receives it and appends its own traces to it:
- is the requested principal allowed for the current user?
- does the requested principal exist in the target node?
This is not an exhaustive list of checks.
For a complete list of which checks are verified please see the
TestDiagnoseSSHConnection test.
After all those checks, it returns whether the connection was successful,
along with all of the traces generated along the way.
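The trace-accumulation idea can be illustrated with a small sketch. The type and field names below are hypothetical, and the rule that the diagnostic succeeds only when every recorded trace passed is an assumption made for the example.

```go
package main

import "fmt"

// trace records the outcome of one check (hypothetical shape).
type trace struct {
	Check   string
	OK      bool
	Details string
}

// connectionDiagnostic collects traces from both the client and server side.
type connectionDiagnostic struct {
	ID     string
	Traces []trace
}

// appendTrace records the result of a single check.
func (d *connectionDiagnostic) appendTrace(check string, ok bool, details string) {
	d.Traces = append(d.Traces, trace{Check: check, OK: ok, Details: details})
}

// success reports whether at least one trace exists and none of them failed.
func (d *connectionDiagnostic) success() bool {
	if len(d.Traces) == 0 {
		return false
	}
	for _, t := range d.Traces {
		if !t.OK {
			return false
		}
	}
	return true
}

func main() {
	d := &connectionDiagnostic{ID: "diag-1"}
	// Checks made by the SSH client in the web server.
	d.appendTrace("node exists and is accessible", true, "")
	d.appendTrace("node accepts TCP connections", true, "")
	// Check made by the SSH server in the agent.
	d.appendTrace("principal allowed for user", false, "login not permitted by role")
	fmt.Println("success:", d.success(), "traces:", len(d.Traces))
}
```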
Demo:
![image](https://user-images.githubusercontent.com/689271/187976940-55522fd9-f581-4c6d-9bfc-f6e501c1ed72.png)
![image](https://user-images.githubusercontent.com/689271/187976957-35075112-2b42-4726-8d50-19d02fab2464.png)
![image](https://user-images.githubusercontent.com/689271/187976967-81406e2c-0517-474b-b323-dad1f8be1571.png)
* WebAPI: update user traits
The Web API currently only supports updating the roles property for a given User.
This PR adds the ability to update the following user traits:
- Logins
- DB Users
- DB Names
- Kube Users
- Kube Groups
- Windows Logins
- AWS Role ARNs
A trait is only updated if the request contains a non-nil value for that
trait's list.
Each trait's list is deduplicated before it is applied.
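The update rule above (skip nil lists, deduplicate non-nil ones) could be sketched like this. The helper names and the use of a pointer to distinguish "not provided" from "empty" are assumptions for illustration only.

```go
package main

import "fmt"

// dedup removes repeated values, keeping the first occurrence of each in order.
func dedup(in []string) []string {
	seen := make(map[string]struct{}, len(in))
	out := make([]string, 0, len(in))
	for _, v := range in {
		if _, ok := seen[v]; ok {
			continue
		}
		seen[v] = struct{}{}
		out = append(out, v)
	}
	return out
}

// applyTrait overwrites traits[key] only when vals is non-nil, so a request
// that omits a trait leaves the stored value untouched.
func applyTrait(traits map[string][]string, key string, vals *[]string) {
	if vals == nil {
		return
	}
	traits[key] = dedup(*vals)
}

func main() {
	traits := map[string][]string{"logins": {"root"}, "db_users": {"postgres"}}
	newLogins := []string{"alice", "root", "alice"}
	applyTrait(traits, "logins", &newLogins) // replaced and deduplicated
	applyTrait(traits, "db_users", nil)      // left unchanged
	fmt.Println(traits["logins"], traits["db_users"])
}
```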
* Add support for automatic user provisioning
* Add UID parker to reexec
* Add a `teleport park` subcommand that does nothing
Co-authored-by: Edoardo Spadolini <edoardo.spadolini@goteleport.com>
This adds proxy peering support, a configurable setting that allows agents
to connect to a subset of proxies and still be reachable through any proxy in the
cluster. This is achieved by creating gRPC connections between the proxy
servers. Client connections can then be passed between proxies to reach the
desired agent.
* Add tracing service and configuration
Provides a new tracing configuration block, which can be
used to configure if and how spans are exported to a
telemetry backend. In the example below, the tracing
service is enabled and will export spans to
`collector.example.com:4317` via gRPC with mTLS enabled.
```yaml
tracing_service:
  enabled: yes
  exporter_url: collector.example.com:4317
  sampling_rate_per_million: 1000000
  ca_certs:
    - /certs/rootCA.pem
  keypairs:
    - key_file: /certs/example.com-client-key.pem
      cert_file: /certs/example.com-client.pem
```
This configuration ends up being consumed by the `TeleportProcess`
and passed to `tracing.NewTraceProvider` which sets up the OpenTelemetry
Exporter, TracerProvider, Propagator and Sampler. In order for spans to
be exported, the `tracing_service` must be enabled **and** have a
`sampling_rate_per_million` value > 0.
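To make the export condition concrete, here is a small sketch of how a per-million rate could map to a sampling ratio. The function names are hypothetical and the clamping of values above one million is an assumption; the real provider setup is not shown.

```go
package main

import "fmt"

const perMillion = 1_000_000

// samplingRatio converts a per-million rate into a fraction in [0, 1].
// Values above one million are clamped to 1 (an assumption of this sketch).
func samplingRatio(ratePerMillion uint64) float64 {
	if ratePerMillion >= perMillion {
		return 1
	}
	return float64(ratePerMillion) / perMillion
}

// spansExported mirrors the rule stated above: the service must be enabled
// and the sampling rate must be greater than zero.
func spansExported(enabled bool, ratePerMillion uint64) bool {
	return enabled && ratePerMillion > 0
}

func main() {
	for _, rate := range []uint64{0, 500_000, 1_000_000} {
		fmt.Println(rate, samplingRatio(rate), spansExported(true, rate))
	}
}
```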
Using the OIDC connector with Okta would fail due to an issue in our
fork of go-oidc. Update this dependency to get the fix.
Additionally, clean up the logic for syncing the connector
configuration, which was using a context.Context in order to implement
a timeout. This can be expressed in a simpler way with time.After().
* Add certificate renewal bot
This adds a new `tbot` tool to continuously renew a set of
certificates after registering with a Teleport cluster using a
similar process to standard node joining.
This makes some modifications to user certificate generation to allow
for certificates that can be renewed beyond their original TTL, and
exposes new gRPC endpoints:
* `CreateBotJoinToken` creates a join token for a bot user
* `GenerateInitialRenewableUserCerts` exchanges a token for a set of
certificates with a new `renewable` flag set
A new `tctl` command, `tctl bots add`, creates a bot user and calls
`CreateBotJoinToken` to issue a token. A bot instance can then be
started using a provided command.
* Cert bot refactoring pass
* Use role requests to split renewable certs from end-user certs
* Add bot configuration file
* Use `teleport.dev/bot` label
* Remove `impersonator` flag on initial bot certs
* Remove unnecessary `renew` package
* Misc other cleanup
* Do not pass through `renewable` flag when role requests are set
This adds additional restrictions on when a certificate's `renewable`
flag is carried over to a new certificate. In particular, it now also
denies the flag when either role requests are present, or the
`disallowReissue` flag has been previously set.
In practice `disallow-reissue` would have prevented any undesired
behavior but this improves consistency and resolves a TODO.
* Various tbot UX improvements; render SSH config
* Fully flesh out config template rendering
* Fix rendering for SSH configuration templates
* Added `String()` impls for destination types
* Improve certificate renewal logging; show more detail
* Properly fall back to default (all) roles
* Add mode hints for files
* Add/update copyright headers
* Add stubs for tbot init and watch commands
* Add gRPC endpoints for managing bots
* Add `CreateBot`, `DeleteBot`, and `GetBotUsers` gRPC endpoints
* Replace `tctl bot (add|rm|ls)` implementations with gRPC calls
* Define a few new constants, `DefaultBotJoinTTL`, `BotLabel`,
`BotGenerationLabel`
* Fix outdated destination flag in example tbot command
* Bugfix pass for demo
* Fixed a few nil pointer derefs when using config from CLI args
* Properly create destination if `--destination-dir` flag is used
* Remove improper default on CLI flag
* `DestinationConfig` is now a list of pointers
* Address first wave of review feedback
Fixes the majority of smaller issues caught by reviewers, thanks all!
* Add doc comments for bot.go functions
* Return the token TTL from CreateBot
* Split initial user cert issuance from `generateUserCerts()`
Issuing initial renewable certificates ended up requiring a lot of
hacks to skip checks that prevented anonymous bots from getting
certs even though we'd verified their identity elsewhere (via token).
This reverts all those hacks and splits initial bot cert logic into a
dedicated `generateInitialRenewableUserCerts()` function which should
make the whole process much easier to follow.
* Set bot traits to silence log messages
* tbot log message consistency pass
* Resolve lints
* Add config tests
* Remove CreateBotJoinToken endpoint
Users should instead use the CreateBot/DeleteBot endpoints.
* Create a fresh private key for every impersonated identity renewal
* Hide `config` subcommand
* Rename bot label prefix to `teleport.internal/`
* Use types.NewRole() to create bot roles
* Clean up error handling in custom YAML unmarshallers
Also, add notes about the supported YAML shapes.
* Fetch proxy host via gRPC Ping() instead of GetProxies()
* Update lib/auth/bot.go
Co-authored-by: Zac Bergquist <zmb3@users.noreply.github.com>
* Fix some review comments
* Add renewable certificate generation checks (#10098)
* Add renewable certificate generation checks
This adds a new validation check for renewable certificates that
maintains a renewal counter as both a certificate extension and a
user label. This counter is used to ensure only a single certificate
lineage can exist: for example, if a renewable certificate is stolen,
only one copy of the certificate can be renewed as the generation
counter will not match.
When renewing a certificate, first the generation counter presented
by the user (via their TLS identity) is compared to a value stored
with the associated user (in a new `teleport.dev/bot-generation`
label field). If they aren't equal, the renewal attempt fails.
Otherwise, the generation counter is incremented by 1, stored to the
database using a `CompareAndSwap()` to ensure atomicity, and set on
the generated certificate for use in future renewals.
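The generation check can be illustrated with an in-memory sketch. The `userStore` type and its methods are stand-ins invented for this example; the real code stores the counter in a user label and uses the backend's `CompareAndSwap()`.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errGenerationMismatch = errors.New("renewable cert generation mismatch")

// userStore is an in-memory stand-in for the backend holding each bot
// user's current generation.
type userStore struct {
	mu  sync.Mutex
	gen map[string]uint64
}

func (s *userStore) current(user string) uint64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.gen[user]
}

// compareAndSwap updates the stored generation only if it still equals expected.
func (s *userStore) compareAndSwap(user string, expected, next uint64) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.gen[user] != expected {
		return false
	}
	s.gen[user] = next
	return true
}

// renew validates the generation presented by the caller's certificate and
// returns the generation to embed in the newly issued certificate.
func renew(s *userStore, user string, presented uint64) (uint64, error) {
	if s.current(user) != presented {
		return 0, errGenerationMismatch
	}
	next := presented + 1
	if !s.compareAndSwap(user, presented, next) {
		return 0, errGenerationMismatch // lost a race with a concurrent renewal
	}
	return next, nil
}

func main() {
	s := &userStore{gen: map[string]uint64{"bot-ci": 1}}
	fmt.Println(renew(s, "bot-ci", 1)) // legitimate holder renews
	fmt.Println(renew(s, "bot-ci", 1)) // stolen copy presents a stale generation
	fmt.Println(renew(s, "bot-ci", 2)) // legitimate holder renews again
}
```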
* Add unit tests for the generation counter
This adds new unit tests to exercise the generation counter checks.
Additionally, it fixes two other renewable cert tests that were
failing.
* Remove certRequestGeneration() function
* Emit audit event when cert generations don't match
* Fully implement `tctl bots lock`
* Show bot name in `tctl bots ls`
* Lock bots when a cert generation mismatch is found
* Make CompareFailed responses from validateGenerationLabel() more actionable
* Update lib/services/local/users.go
Co-authored-by: Nic Klaassen <nic@goteleport.com>
* Backend changes for tbot IoT and AWS joining (#10360)
* backend changes
* add token permission check
* pass ctx from caller
Co-authored-by: Roman Tkachenko <roman@goteleport.com>
* fix comment typo
Co-authored-by: Roman Tkachenko <roman@goteleport.com>
* use UserMetadata instead of Identity in RenewableCertificateGenerationMismatch event
* Client changes for tbot IoT joining (#10397)
* client changes
* delete replaced APIs
* delete unused tbot/auth.go
* add license header
* don't unnecessarily fetch host CA
* log fixes
* s/tunnelling/tunneling/
Co-authored-by: Zac Bergquist <zmb3@users.noreply.github.com>
* auth server addresses may be proxies
Co-authored-by: Zac Bergquist <zmb3@users.noreply.github.com>
* comment typo fix
Co-authored-by: Zac Bergquist <zmb3@users.noreply.github.com>
* move *Server methods out of auth_with_roles.go (#10416)
Co-authored-by: Tim Buckley <tim@goteleport.com>
Co-authored-by: Zac Bergquist <zmb3@users.noreply.github.com>
Co-authored-by: Roman Tkachenko <roman@goteleport.com>
Co-authored-by: Nic Klaassen <nic@goteleport.com>
* Address another batch of review feedback
* Address another batch of review feedback
Add `Role.SetMetadata()`, simplify more `trace.WrapWithMessage()`
calls, clear some TODOs and lints, and address other misc feedback
items.
* Fix lint
* Add missing doc comments to SaveIdentity / LoadIdentity
* Remove pam tag from tbot build
* Update note about bot lock deletion
* Another pass of review feedback
Ensure all requestable roles exist when creating a bot, adjust the
default renewable cert TTL down to 1 hour, and check types during
`CompareAndSwapUser()`
Co-authored-by: Zac Bergquist <zmb3@users.noreply.github.com>
Co-authored-by: Nic Klaassen <nic@goteleport.com>
Co-authored-by: Roman Tkachenko <roman@goteleport.com>
* go get google.golang.org/api
go get: upgraded cloud.google.com/go v0.60.0 => v0.100.2
go get: upgraded github.com/golang/snappy v0.0.1 => v0.0.3
go get: upgraded github.com/googleapis/gax-go/v2 v2.0.5 => v2.1.1
go get: upgraded go.opencensus.io v0.22.5 => v0.23.0
go get: upgraded golang.org/x/oauth2 v0.0.0-20200107190931-bf48bf16ab8d => v0.0.0-20211104180415-d3ed0bb246c8
go get: upgraded google.golang.org/api v0.29.0 => v0.65.0
* Optionally fetch transitive groups in the Google OIDC connector
* Refactor the google workspace parts of the OIDC code
* Further refactoring
This undoes the user account impersonation changes, and always requires
an admin account again.
* Test coverage
* Address review comments
* Minor refactor and name changes
* Allow domain filtering, tests now bypass addGoogleWorkspaceClaims
* Update `OIDCConnectorV2` to `OIDCConnectorV3`
* Backwards compatibility for OIDCConnector v2
This also removes the extra boolean flag that was added previously.
* Update e-ref
Enterprise builds will break unless gravitational/teleport.e#385
is included.
* Allow impersonation of roles without users
This adds the ability to impersonate one or more roles without
impersonating a particular user.
In Teleport today, when creating an impersonator role, both users and
roles must be specified as impersonation is fundamentally tied to an
existing Teleport user:
```yaml
allow:
  impersonate:
    users: ['jenkins']
    roles: ['jenkins']
```
This is inconvenient for two reasons:
1. A user must exist for each set of roles you'd like to
impersonate, creating a UX burden.
2. It makes it difficult to use impersonation to reduce one's
permissions as you always inherit all of the roles granted to the
target user.
For the [certificate bot][bot] we'd instead like to use impersonation
to generate end-user (impersonated) certificates with a reduced set
of permissions. For example, given the following role:
```yaml
allow:
  impersonate:
    roles: ['jenkins', 'deploy']
```
We can then use `GenerateUserCerts` to issue certificates for a subset
of the allowed roles, e.g. one set of certificates with only the
`jenkins` role attached, and another with only `deploy`.
To that end, this patch:
1. Removes the requirement that roles define both `users` and
`roles` in impersonate conditions
2. Introduces a new `RoleRequests` field in `UserCertsRequest`
3. Modifies `generateUserCerts` to gather `roles` from
`RoleRequests` if allowed by an `allow` (with no `users`)
[bot]: https://github.com/gravitational/teleport/pull/7986
* Add `determineDesiredRolesAndTraits`; audit log on role impersonation
This splits initial role and trait determination into a new function,
`determineDesiredRolesAndTraits`, to improve control flow and clarity
given the new branches introduced for role impersonation.
Additionally, this moves the call to `CheckRoleImpersonation` down
to match regular user impersonation's flow, and emits an audit log
event on failure.
* Formatting fix
* Unit testing for role requests
This adds a new set of unit tests for role requests.
Also discovered the `impersonator` field wasn't being set for
role impersonation, so it's now set to the user's own username.
In other words, role impersonation will appear (in the audit log and
elsewhere) as self-impersonation.
* Clean up testing users between runs
* Deny most reimpersonation cases and add tests
This attempts to deny most cases of reimpersonation, where an
impersonated certificate might be used to generate certificates for
other roles the user is allowed to impersonate.
One test case is currently failing pending a solution.
* Add new DisallowReissue certificate extension
This adds a new DisallowReissue certificate extension that, if set,
prevents that identity from interacting with `GenerateUserCerts`.
This flag is always set when RoleRequests are used to prevent
unintended privilege escalation (while avoiding breaking changes to
Teleport's existing certificate generation behavior).
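Putting the role request rules together, a simplified sketch might look like the following. The `identity` and `cert` types and the `issueWithRoleRequests` function are hypothetical; only the behaviours (subset check, self-impersonation, and the reissue block) come from the descriptions above.

```go
package main

import "fmt"

// identity is the caller's identity as read from its certificate (hypothetical).
type identity struct {
	Username        string
	DisallowReissue bool
}

// cert is a pared-down stand-in for an issued certificate.
type cert struct {
	Username        string
	Roles           []string
	Impersonator    string
	DisallowReissue bool
}

// issueWithRoleRequests issues a cert limited to roleRequests, provided each
// requested role is in allowedRoles and the caller may still reissue.
func issueWithRoleRequests(id identity, allowedRoles, roleRequests []string) (cert, error) {
	if id.DisallowReissue {
		return cert{}, fmt.Errorf("identity %q may not reissue certificates", id.Username)
	}
	allowed := make(map[string]bool, len(allowedRoles))
	for _, r := range allowedRoles {
		allowed[r] = true
	}
	for _, r := range roleRequests {
		if !allowed[r] {
			return cert{}, fmt.Errorf("role %q may not be impersonated by %q", r, id.Username)
		}
	}
	return cert{
		Username:        id.Username,
		Roles:           roleRequests,
		Impersonator:    id.Username, // appears as self-impersonation
		DisallowReissue: true,        // always set when role requests are used
	}, nil
}

func main() {
	bot := identity{Username: "bot"}
	c, err := issueWithRoleRequests(bot, []string{"jenkins", "deploy"}, []string{"jenkins"})
	fmt.Println(c.Roles, c.DisallowReissue, err)

	// The impersonated certificate cannot be used to request further certs.
	_, err = issueWithRoleRequests(identity{c.Username, c.DisallowReissue}, []string{"jenkins", "deploy"}, []string{"deploy"})
	fmt.Println(err)
}
```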
* Fix test lints
* Fix typo
* Fix test doc typo, add testcase for user impersonation misuse
* Apply suggestions from code review
Co-authored-by: rosstimothy <39066650+rosstimothy@users.noreply.github.com>
* Accept context in CreateRole per review feedback
* Fix misleading comment
Co-authored-by: rosstimothy <39066650+rosstimothy@users.noreply.github.com>
* updates endpoints
* Exposes an endpoint for fetching a single desktop by name
* Apply suggestions from code review
Co-authored-by: Zac Bergquist <zmb3@users.noreply.github.com>
* changes inaccurate desktopUUID to desktopName
Co-authored-by: Zac Bergquist <zmb3@users.noreply.github.com>
- Add --windows-logins flag to tctl users add command
- Support {{internal.windows_logins}} and external traits from IDP
This allows one to define a role allowing desktop access without
hard coding all allowed/denied Windows logins.
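As an illustration of how a templated login list could be expanded from user traits, consider the sketch below. The regular expression, the `expandLogins` helper, and the flat trait map keyed by the name after the dot are simplifying assumptions; the real template handling is more involved.

```go
package main

import (
	"fmt"
	"regexp"
)

// traitVar matches values such as {{internal.windows_logins}} or
// {{external.some_claim}}. (Simplified pattern for this sketch.)
var traitVar = regexp.MustCompile(`^\{\{\s*(internal|external)\.(\w+)\s*\}\}$`)

// expandLogins replaces each templated entry with the values of the matching
// trait and keeps literal entries as they are.
func expandLogins(roleLogins []string, traits map[string][]string) []string {
	var out []string
	for _, login := range roleLogins {
		m := traitVar.FindStringSubmatch(login)
		if m == nil {
			out = append(out, login) // literal login, keep as-is
			continue
		}
		out = append(out, traits[m[2]]...) // unknown traits expand to nothing
	}
	return out
}

func main() {
	roleLogins := []string{"Administrator", "{{internal.windows_logins}}"}
	traits := map[string][]string{"windows_logins": {"alice", "bob"}}
	fmt.Println(expandLogins(roleLogins, traits))
}
```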
Updates #7761
Fixes #8578
* Add RBAC for Windows desktop access
This commit adds RBAC checks for Windows Desktops as described in
RFD 33 and RFD 34:
- add Windows desktop logins & labels to role definition
- introduce new file config for host labels based on a regexp match
- auth server API performs access checking for Windows desktop resources
- add RDP client callback to authorize the user
- support user/role locks
- respect the client idle timeout setting
Note: in cases where a connection is terminated due to RBAC, the web UI
currently displays "websocket connection failed" because the connection
is closed from the server. We'll need to follow up with a nice error
message for the client side to improve the UX here.
Other changes:
* Remove OSS RBAC migration marked for deletion
* Stop creating a default admin role
* add wildcard desktop access to the preset access role
Updates #7761
Boilerplate for a new service and API objects:
- windows_desktop_service config section
- service registration and heartbeats
- static host registration and heartbeats
- caching, permissions, etc
- "tctl get" support
For new connections the service aborts after authentication, since the
RDP client implementation is not ready yet (pending in
https://github.com/gravitational/teleport/pull/7824).
Tested that the service starts, registers (both over a tunnel and
directly) and creates the API objects.