title | description |
---|---|
Teleport Admin Manual | Admin manual for how to configure identity-aware SSH, certificate-based SSH authentication, set up SSO for SSH, SSO for Kubernetes, and more. |
Teleport Admin Manual
This manual covers the installation and configuration of Teleport and the ongoing management of a Teleport cluster. It assumes that the reader has a good understanding of Linux administration.
Installing
Please visit our installation page for instructions on downloading and installing Teleport.
Definitions
Before diving into configuring and running Teleport, it helps to take a look at the Teleport Architecture and review the key concepts this document will be referring to:
Concept | Description |
---|---|
Node | Synonym for "server" or "computer", something one can "SSH to". A node must be running the teleport daemon with the "node" role/service turned on. |
Certificate Authority (CA) | A pair of public/private keys Teleport uses to manage access. A CA can sign a public key of a user or node, establishing their cluster membership. |
Teleport Cluster | A Teleport Auth Service contains two CAs. One is used to sign user keys and the other signs node keys. A collection of nodes connected to the same CA is called a "cluster". |
Cluster Name | Every Teleport cluster must have a name. If a name is not supplied via the teleport.yaml configuration file, a GUID will be generated. IMPORTANT: renaming a cluster invalidates its keys and all certificates it had created. |
Trusted Cluster | Teleport Auth Service can allow 3rd party users or nodes to connect if their public keys are signed by a trusted CA. A "trusted cluster" is a pair of public keys of the trusted CA. It can be configured via teleport.yaml file. |
Teleport Daemon
The Teleport daemon is called `teleport` and it supports the following commands:
Command | Description |
---|---|
start | Starts the Teleport daemon. |
configure | Dumps a sample configuration file in YAML format into standard output. |
version | Shows the Teleport version. |
status | Shows the status of a Teleport connection. This command is only available from inside of an active SSH session. |
help | Shows help. |
When experimenting, you can quickly start `teleport` with verbose logging by typing `teleport start -d`.
!!! danger "WARNING"
Teleport stores data in `/var/lib/teleport` . Make sure that
regular/non-admin users do not have access to this folder on the Auth
server.
Systemd Unit File
In production, we recommend starting the `teleport` daemon via an init system like `systemd`. Here's the recommended Teleport service unit file for systemd:
[Unit]
Description=Teleport SSH Service
After=network.target
[Service]
Type=simple
Restart=on-failure
ExecStart=/usr/local/bin/teleport start --config=/etc/teleport.yaml --pid-file=/run/teleport.pid
ExecReload=/bin/kill -HUP $MAINPID
PIDFile=/run/teleport.pid
[Install]
WantedBy=multi-user.target
Graceful Restarts
If using the systemd service unit file above, executing `systemctl reload teleport` will perform a graceful restart, i.e. the Teleport daemon will fork a new process to handle new incoming requests, leaving the old daemon process running until existing clients disconnect.
!!! warning "Version warning"
Graceful restarts only work if Teleport is
deployed using network-based storage like DynamoDB or etcd 3.3+. Future
versions of Teleport will not have this limitation.
You can also perform restarts/upgrades by sending `kill` signals to a Teleport daemon manually.
Signal | Teleport Daemon Behavior |
---|---|
USR1 | Dumps diagnostics/debugging information into syslog. |
TERM, INT or KILL | Immediate non-graceful shutdown. All existing connections will be dropped. |
USR2 | Forks a new Teleport daemon to serve new connections. |
HUP | Forks a new Teleport daemon to serve new connections and initiates the graceful shutdown of the existing process when there are no more clients connected to it. |
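The signal table can also be driven from a script. The following is a minimal Python sketch, assuming the PID file path `/run/teleport.pid` from the unit file above; the names `ACTIONS`, `read_pid` and `send` are hypothetical helpers, not part of Teleport.

```python
import os
import signal
from pathlib import Path

# Maps an operator intent to the signal listed in the table above.
ACTIONS = {
    "diagnostics": signal.SIGUSR1,      # dump debugging information to syslog
    "fork": signal.SIGUSR2,             # fork a new daemon for new connections
    "graceful-restart": signal.SIGHUP,  # fork, then drain the old process
    "shutdown": signal.SIGTERM,         # immediate, non-graceful shutdown
}


def read_pid(pid_file: str = "/run/teleport.pid") -> int:
    """Read the daemon PID written by `teleport start --pid-file=...`."""
    return int(Path(pid_file).read_text().strip())


def send(action: str, pid: int) -> None:
    """Send the signal that corresponds to `action` to process `pid`."""
    os.kill(pid, ACTIONS[action])
```

For example, `send("graceful-restart", read_pid())` has the same effect as `systemctl reload teleport` with the unit file shown earlier.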
Ports
Teleport services listen on several ports. This table shows the default port numbers.
Port | Service | Description |
---|---|---|
3022 | Node | SSH port. This is Teleport's equivalent of port #22 for SSH. |
3023 | Proxy | SSH port clients connect to. A proxy will forward this connection to port #3022 on the destination node. |
3024 | Proxy | SSH port used to create "reverse SSH tunnels" from behind-firewall environments into a trusted proxy server. |
3025 | Auth | SSH port used by the Auth Service to serve its API to other nodes in a cluster. |
3080 | Proxy | HTTPS connection to authenticate tsh users and web users into the cluster. The same connection is used to serve a Web UI. |
3026 | Kubernetes | HTTPS Kubernetes proxy (`proxy_service.kube_listen_addr`). |
3027 | Kubernetes | Kubernetes Service (`kubernetes_service.listen_addr`). |
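When debugging firewall rules, it can help to probe these ports from the machine that needs to connect. The sketch below uses only the Python standard library; the dictionary keys and function names are illustrative labels, and the port numbers are the defaults from the table above.

```python
import socket

# Default TCP ports from the table above, keyed by a short descriptive label.
DEFAULT_PORTS = {
    "node-ssh": 3022,
    "proxy-ssh": 3023,
    "proxy-tunnel": 3024,
    "auth-api": 3025,
    "proxy-kube": 3026,
    "kube-service": 3027,
    "proxy-web": 3080,
}


def is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check(host: str) -> dict:
    """Probe every default port on `host` and report its reachability."""
    return {name: is_open(host, port) for name, port in DEFAULT_PORTS.items()}
```

A successful TCP connection only shows that something is listening; it does not confirm that the listener is a Teleport service.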
Filesystem Layout
By default, a Teleport node has the following files present. The location of all of them is configurable.
Full path | Purpose |
---|---|
/etc/teleport.yaml | Teleport configuration file (optional). |
/usr/local/bin/teleport | Teleport daemon binary. |
/usr/local/bin/tctl | Teleport admin tool. It is only needed for auth servers. |
/var/lib/teleport | Teleport data directory. Nodes keep their keys and certificates there. Auth servers store the audit log and the cluster keys there, but the audit log storage can be further configured via the auth_service section in the config file. |
Configuration
You should use a configuration file to configure the `teleport` daemon. For simple experimentation, you can use command line flags with the `teleport start` command. Read about all the allowed flags in the CLI Docs or run `teleport start --help`.
Configuration File
Teleport uses the YAML file format for configuration. A sample configuration file is shown below; it is an expanded and commented version of the output of `teleport configure`.
The default path Teleport uses to look for a config file is `/etc/teleport.yaml`. You can override this path and set it explicitly using the `-c` or `--config` flag to `teleport start`:
$ teleport start --config=/etc/teleport.yaml
For a complete reference, see our Configuration Reference - teleport.yaml
!!! note "IMPORTANT"
When editing YAML configuration, please pay attention to how your
editor handles white space. YAML does not allow tab characters for
indentation, so indent with spaces consistently.
#
# Sample Teleport configuration file
# Creates a single proxy, auth and node server.
#
# Things to update:
#  1. ca_pin: Obtain the CA pin hash for joining more nodes by running 'tctl status'
#     on the auth server once Teleport is running.
#  2. license-if-using-teleport-enterprise.pem: If you are an Enterprise customer,
#     obtain this from https://dashboard.gravitational.com/web/login
#
teleport:
  # nodename allows to assign an alternative name this node can be reached by.
  # by default it's equal to hostname
  nodename: NODE_NAME
  data_dir: /var/lib/teleport
  # Invitation token used to join a cluster. it is not used on
  # subsequent starts
  auth_token: xxxx-token-xxxx
  # Optional CA pin of the auth server. This enables more secure way of adding new
  # nodes to a cluster. See "Adding Nodes" section below.
  ca_pin: "sha256:ca-pin-hash-goes-here"
  # list of auth servers in a cluster. you will have more than one auth server
  # if you configure teleport auth to run in HA configuration.
  # If adding a node located behind NAT, use the Proxy URL. e.g.
  #  auth_servers:
  #    - teleport-proxy.example.com:3080
  auth_servers:
    - 10.1.0.5:3025
    - 10.1.0.6:3025
  # Logging configuration. Possible output values to disk via '/var/lib/teleport/teleport.log',
  # 'stdout', 'stderr' and 'syslog'. Possible severity values are INFO, WARN
  # and ERROR (default).
  log:
    output: stderr
    severity: INFO
auth_service:
  enabled: "yes"
  # A cluster name is used as part of a signature in certificates
  # generated by this CA.
  #
  # We strongly recommend to explicitly set it to something meaningful as it
  # becomes important when configuring trust between multiple clusters.
  #
  # By default an automatically generated name is used (not recommended)
  #
  # IMPORTANT: if you change cluster_name, it will invalidate all generated
  # certificates and keys (may need to wipe out /var/lib/teleport directory)
  cluster_name: "teleport-aws-us-east-1"
  # IP and the port to bind to. Other Teleport nodes will be connecting to
  # this port (AKA "Auth API" or "Cluster API") to validate client
  # certificates
  listen_addr: 0.0.0.0:3025
  tokens:
    - proxy,node:xxxx-token-xxxx
  # license_file: /path/to/license-if-using-teleport-enterprise.pem
  authentication:
    # default authentication type. possible values are 'local' and 'github' for OSS
    # and 'oidc', 'saml' and 'false' for Enterprise.
    type: local
    # second_factor can be off, otp, or u2f
    second_factor: otp
ssh_service:
  enabled: "yes"
  labels:
    teleport: static-label-example
  commands:
    - name: hostname
      command: [/usr/bin/hostname]
      period: 1m0s
    - name: arch
      command: [/usr/bin/uname, -p]
      period: 1h0m0s
proxy_service:
  enabled: "yes"
  listen_addr: 0.0.0.0:3023
  web_listen_addr: 0.0.0.0:3080
  tunnel_listen_addr: 0.0.0.0:3024
  # Expose a k8s listening port on the proxy if using Kubernetes
  kube_listen_addr: 0.0.0.0:3026
  # The DNS name of the proxy HTTPS endpoint as accessible by cluster users.
  # Defaults to the proxy's hostname if not specified. If running multiple
  # proxies behind a load balancer, this name must point to the load balancer
  # (see public_addr section below)
  public_addr: TELEPORT_PUBLIC_DNS_NAME:3080
  # TLS certificate for the HTTPS connection. Configuring these properly is
  # critical for Teleport security.
  https_keypairs:
    - key_file: /var/lib/teleport/webproxy_key.pem
      cert_file: /var/lib/teleport/webproxy_cert.pem
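YAML forbids tab characters in indentation, so a quick pre-flight check on a config file can catch editor mistakes before the daemon is restarted. The sketch below is a plain text scan, not a YAML parser; the function name `tab_indented_lines` is hypothetical.

```python
def tab_indented_lines(text: str) -> list:
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Leading whitespace is everything before the first non-space, non-tab character.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            bad.append(number)
    return bad
```

A typical use would be to read `/etc/teleport.yaml` with `open(...).read()` and refuse to deploy it if the returned list is not empty.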
Public Addr
Notice that all three Teleport services (proxy, auth, node) have an optional `public_addr` property. The public address can take an IP or a DNS name. It can also be a list of values:
public_addr: ["proxy-one.example.com", "proxy-two.example.com"]
Specifying a public address for a Teleport service may be useful in the following use cases:
- You have multiple identical services, like proxies, behind a load balancer.
- You want Teleport to issue an SSH certificate for the service with additional principals, e.g. host names.
Authentication
Teleport uses the concept of "authentication connectors" to authenticate users when they execute the `tsh login` command. There are four types of authentication connectors:
Local Connector
Local authentication is used to authenticate against a local Teleport user database. This database is managed by the `tctl users` command. Teleport also supports second factor authentication (2FA) for the local connector. There are three possible values (types) of 2FA:
- `otp` is the default. It implements the TOTP standard. You can use Google Authenticator, Authy or any other TOTP client.
- `u2f` implements the U2F standard for utilizing hardware (USB) keys for second factor. You can use YubiKeys, SoloKeys or any other hardware token which implements the FIDO U2F standard.
- `off` turns off second factor authentication.
Here is an example of this setting in `teleport.yaml`:
auth_service:
  authentication:
    type: local
    second_factor: off
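To make the `otp` option more concrete, here is a minimal sketch of how a TOTP code is derived according to RFC 6238 (HMAC-SHA1 variant). It illustrates the standard that authenticator apps implement and is not Teleport's own code; the function name `totp` and its parameters are chosen for this example.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, now: float, step: int = 30, digits: int = 6) -> str:
    """Compute a TOTP code (RFC 6238, HMAC-SHA1) for the Unix time `now`."""
    # The moving factor is the number of `step`-second intervals since the epoch.
    counter = int(now) // step
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation (RFC 4226): the low nibble of the last byte picks an offset.
    offset = digest[-1] & 0x0F
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)
```

Both sides share the secret and a clock, so the server can recompute the code for the current time step and compare it with what the user typed. Authenticator apps usually receive the secret as a base32 string, which would need to be decoded to bytes first.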
Github OAuth 2.0 Connector
This connector implements Github OAuth 2.0 authentication flow. Please refer to Github documentation on Creating an OAuth App to learn how to create and register an OAuth app.
Here is an example of this setting in `teleport.yaml`:
auth_service:
  authentication:
    type: github
See Github OAuth 2.0 for details on how to configure it.
SAML
This connector type implements SAML authentication. It can be configured against any external identity manager like Okta or Auth0. This feature is only available for Teleport Enterprise.
Here is an example of this setting in `teleport.yaml`:
auth_service:
  authentication:
    type: saml
OIDC
Teleport implements OpenID Connect (OIDC) authentication, which is similar to SAML in principle. This feature is only available for Teleport Enterprise.
Here is an example of this setting in `teleport.yaml`:
auth_service:
  authentication:
    type: oidc
Hardware Keys - YubiKey FIDO U2F
Teleport supports FIDO U2F hardware keys as a second authentication factor. By default U2F is disabled. To start using U2F:
- Enable U2F in the Teleport configuration `/etc/teleport.yaml`.
- For CLI-based logins, you have to install the u2f-host utility.
- For web-based logins, you have to use Google Chrome or Firefox 67 or greater; these are the only supported U2F browsers at this time.
# snippet from /etc/teleport.yaml to show an example configuration of U2F:
auth_service:
  authentication:
    type: local
    second_factor: u2f
    # this section is needed only if second_factor is set to 'u2f'
    u2f:
      # app_id must point to the URL of the Teleport Web UI (proxy) accessible
      # by the end users
      app_id: https://localhost:3080
      # facets must list all proxy servers if there are more than one deployed
      facets:
        - https://localhost:3080
For single-proxy setups, the `app_id` setting can be equal to the domain name of the proxy, but this will prevent you from adding more proxies without changing the `app_id`. For multi-proxy setups, the `app_id` should be an HTTPS URL pointing to a JSON file that mirrors `facets` in the auth config.
!!! warning "Warning"
The `app_id` must never change in the lifetime of the
cluster. If the App ID changes, all existing U2F key registrations will
become invalid and all users who use U2F as the second factor will need to
re-register. When adding a new proxy server, make sure to add it to the list
of "facets" in the configuration file, but also to the JSON file referenced
by `app_id`.
Logging in with U2F
For logging in via the CLI, you must first install u2f-host. Installing:
# OSX:
$ brew install libu2f-host
# Ubuntu 16.04 LTS:
$ apt-get install u2f-host
Then invoke `tsh ssh` as usual to authenticate:
$ tsh --proxy <proxy-addr> ssh <hostname>
!!! tip "Version Warning"
External user identities are only supported in [Teleport Enterprise](enterprise/introduction.md).
Please reach out to [sales@gravitational.com](mailto:sales@gravitational.com) for more information.
Adding and Deleting Users
This section covers internal user identities, i.e. user accounts created and stored in Teleport's internal storage. Most production users of Teleport use external users via Github or Okta or any other SSO provider (Teleport Enterprise supports any SAML or OIDC compliant identity provider).
A user identity in Teleport exists in the scope of a cluster. The member nodes of a cluster have multiple OS users on them. A Teleport administrator creates Teleport user accounts and maps them to the allowed OS user logins they can use.
Let's look at this table:
Teleport User | Allowed OS Logins | Description |
---|---|---|
joe | joe, root | Teleport user 'joe' can log in to member nodes as OS user 'joe' or 'root'. |
bob | bob | Teleport user 'bob' can log in to member nodes only as OS user 'bob'. |
ross | | If no OS login is specified, it defaults to the same name as the Teleport user - 'ross'. |
To add a new user to Teleport, you have to use the `tctl` tool on the same node where the auth server is running, i.e. `teleport` was started with `--roles=auth`.
$ tctl users add joe joe,root
Teleport generates an auto-expiring token (with a TTL of 1 hour) and prints the token URL which must be used before the TTL expires.
Signup token has been created. Share this URL with the user:
https://<proxy>:3080/web/newuser/xxxxxxxxxxxx
NOTE: make sure the <proxy> host is accessible.
The user completes registration by visiting this URL in their web browser, picking a password and configuring the 2nd factor authentication. If the credentials are correct, the auth server generates and signs a new certificate and the client stores this key and will use it for subsequent logins. The key will automatically expire after 12 hours by default, after which the user will need to log back in with their credentials. This TTL can be configured to a different value. Once authenticated, the account will become visible via `tctl`:
$ tctl users ls
User Allowed Logins
---- --------------
admin admin,root
ross ross
joe joe,root
Joe would then use the `tsh` client tool to log in to member node "luna" via bastion "work" as root:
$ tsh --proxy=work --user=joe ssh root@luna
To delete this user:
$ tctl users rm joe
Editing Users
User entries can be manipulated using the generic resource commands via `tctl`. For example, to see the full list of user records, an administrator can execute:
$ tctl get users
To edit the user "joe":
# dump the user definition into a file:
$ tctl get user/joe > joe.yaml
# ... edit the contents of joe.yaml
# update the user record:
$ tctl create -f joe.yaml
Some fields in the user record are reserved for internal use. Some of them will be finalized and documented in future versions. Fields like `is_locked` or `traits/logins` can be used starting in version 2.3.
Adding Nodes to the Cluster
Teleport is a "clustered" system, meaning it only allows access to nodes (servers) that have previously been granted cluster membership.
A cluster membership means that a node receives its own host certificate signed by the cluster's auth server. To receive a host certificate upon joining a cluster, a new Teleport host must present an "invite token". An invite token also defines which role a new host can assume within a cluster: `auth`, `proxy` or `node`.
There are two ways to create invitation tokens:
- Static Tokens are easy to use and somewhat less secure.
- Short-lived Dynamic Tokens are more secure but require more planning.
Static Tokens
Static tokens are defined ahead of time by an administrator and stored in the auth server's config file:
# Config section in `/etc/teleport.yaml` file for the auth server
auth_service:
  enabled: true
  tokens:
    # This static token allows new hosts to join the cluster as "proxy" or "node"
    - "proxy,node:secret-token-value"
    # A token can also be stored in a file. In this example the token for adding
    # new auth servers is stored in /path/to/tokenfile
    - "auth:/path/to/tokenfile"
Short-lived Dynamic Tokens
A more secure way to add nodes to a cluster is to generate tokens as they are needed. Such a token can be used multiple times until its time to live (TTL) expires.
Use the `tctl` tool to register a new invitation token (it can also generate a new token for you). In the following example, a new token is created with a TTL of 5 minutes:
$ tctl nodes add --ttl=5m --roles=node,proxy --token=secret-value
The invite token: secret-value
If `--token` is not provided, `tctl` will generate one:
# generate a short-lived invitation token for a new node:
$ tctl nodes add --ttl=5m --roles=node,proxy
The invite token: e94d68a8a1e5821dbd79d03a960644f0
# you can also list all generated non-expired tokens:
$ tctl tokens ls
Token Type Expiry Time
--------------- ----------- ---------------
e94d68a8a1e5821dbd79d03a960644f0 Node 25 Sep 18 00:21 UTC
# ... or revoke an invitation before it's used:
$ tctl tokens rm e94d68a8a1e5821dbd79d03a960644f0
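If you prefer to supply your own value through `--token`, it should be unpredictable. The sketch below generates a random 32-character hex string with Python's `secrets` module, which resembles the shape of the generated token shown above; the function name is hypothetical and Teleport does not require this exact format.

```python
import secrets


def new_invite_token(num_bytes: int = 16) -> str:
    """Return a random hex token; 16 random bytes give 32 hex characters."""
    return secrets.token_hex(num_bytes)
```

The result can then be passed to the command shown earlier, e.g. `tctl nodes add --ttl=5m --roles=node,proxy --token=<generated value>`.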
Using Node Invitation Tokens
Both static and short-lived dynamic tokens are used the same way. Execute the following command on a new node to add it to a cluster:
# adding a new regular SSH node to the cluster:
$ teleport start --roles=node --token=secret-token-value --auth-server=10.0.10.5
# adding a new regular SSH node using Teleport Node Tunneling:
$ teleport start --roles=node --token=secret-token-value --auth-server=teleport-proxy.example.com:3080
# adding a new proxy service on the cluster:
$ teleport start --roles=proxy --token=secret-token-value --auth-server=10.0.10.5
As new nodes come online, they start sending ping requests every few seconds to the CA of the cluster. This allows users to explore cluster membership and size:
$ tctl nodes ls
Node Name Node ID Address Labels
--------- ------- ------- ------
turing d52527f9-b260-41d0-bb5a-e23b0cfe0f8f 10.1.0.5:3022 distro:ubuntu
dijkstra c9s93fd9-3333-91d3-9999-c9s93fd98f43 10.1.0.6:3022 distro:debian
Untrusted Auth Servers
Teleport nodes use the HTTPS protocol to offer the join tokens to the auth server running on `10.0.10.5` in the example above. In a zero-trust environment, you must assume that an attacker can hijack the IP address of the auth server, e.g. `10.0.10.5`.
To prevent this from happening, you need to supply every new node with an additional bit of information about the auth server. This technique is called "CA Pinning". It works by asking the auth server to produce a "CA Pin", which is a hash of its public key. An attacker cannot forge a private key that matches the pinned public key.
On the auth server:
$ tctl status
Cluster staging.example.com
User CA never updated
Host CA never updated
CA pin sha256:7e12c17c20d9cb504bbcb3f0236be3f446861f1396dcbb44425fe28ec1c108f1
The "CA pin" at the bottom needs to be passed to the new nodes when they're starting for the first time, i.e. when they join a cluster:
Via CLI:
$ teleport start \
--roles=node \
--token=1ac590d36493acdaa2387bc1c492db1a \
--ca-pin=sha256:7e12c17c20d9cb504bbcb3f0236be3f446861f1396dcbb44425fe28ec1c108f1 \
--auth-server=10.12.0.6:3025
or via `/etc/teleport.yaml` on a node:
teleport:
  auth_token: "1ac590d36493acdaa2387bc1c492db1a"
  ca_pin: "sha256:7e12c17c20d9cb504bbcb3f0236be3f446861f1396dcbb44425fe28ec1c108f1"
  auth_servers:
    - "10.12.0.6:3025"
!!! warning "Warning"
If a CA pin is not provided, a Teleport node will still join a
cluster but it will print a `WARN` message (warning) into its standard
error output.
!!! warning "Warning"
The CA pin becomes invalid if a Teleport administrator
performs the CA rotation by executing
[ `tctl auth rotate` ](cli-docs.md#tctl-auth-rotate) .
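The idea behind CA pinning can be sketched in a few lines of Python. This is an illustration only: it assumes the pin is the hex SHA-256 digest of the CA's encoded public key with a `sha256:` prefix, which matches the format printed by `tctl status`. Exactly which bytes Teleport hashes is not covered here, and the function names are hypothetical.

```python
import hashlib
import hmac


def make_pin(public_key_bytes: bytes) -> str:
    """Hash encoded public-key bytes into the `sha256:<hex>` form."""
    return "sha256:" + hashlib.sha256(public_key_bytes).hexdigest()


def pin_matches(expected_pin: str, presented_key_bytes: bytes) -> bool:
    """Check the key presented by a server against the configured pin."""
    actual = make_pin(presented_key_bytes)
    # compare_digest avoids leaking where the two strings first differ.
    return hmac.compare_digest(expected_pin.encode(), actual.encode())
```

A node configured with a pin would refuse to send its join token if `pin_matches` returned `False`, because the server it reached does not hold the expected CA key.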
Revoking Invitations
As you have seen above, Teleport uses tokens to invite users to a cluster (sign-up tokens) or to add new nodes to it (provisioning tokens).
Both types of tokens can be revoked before they can be used. To see a list of outstanding tokens, run this command:
$ tctl tokens ls
Token Role Expiry Time (UTC)
----- ---- -----------------
eoKoh0caiw6weoGupahgh6Wuo7jaTee2 Proxy never
696c0471453e75882ff70a761c1a8bfa Node 17 May 16 03:51 UTC
6fc5545ab78c2ea978caabef9dbd08a5 Signup 17 May 16 04:24 UTC
In this example, the first token has a "never" expiry date because it is a static token configured via a config file.
The 2nd token with "Node" role was generated to invite a new node to this cluster. And the 3rd token was generated to invite a new user.
The latter two tokens can be deleted (revoked) via the `tctl tokens del` command:
$ tctl tokens del 696c0471453e75882ff70a761c1a8bfa
Token 696c0471453e75882ff70a761c1a8bfa has been deleted
Adding a node located behind NAT
!!! note
This feature is sometimes called "Teleport IoT" or node tunneling.
With the current setup, you've only been able to add nodes that have direct access to the auth server and within the internal IP range of the cluster. We recommend setting up a Trusted Cluster if you have workloads split across different networks/clouds.
Teleport Node Tunneling lets you add a remote node to an existing Teleport Cluster via tunnel. This can be useful for IoT applications, or for managing a couple of servers in a different network.
Similar to Adding Nodes to the Cluster, use `tctl` to create a single-use token for a node, but this time you'll replace the auth server IP with the URL of the proxy server. In the example below, we've replaced the auth server IP with the proxy web endpoint `teleport-proxy.example.com:3080`.
$ sudo tctl nodes add
The invite token: n92bb958ce97f761da978d08c35c54a5c
Run this on the new node to join the cluster:
teleport start --roles=node --token=n92bb958ce97f761da978d08c35c54a5c --auth-server=teleport-proxy.example.com:3080
Using the ports in the default configuration, the node needs to be able to talk to ports 3080 and 3024 on the proxy. Port 3080 is used to initially fetch the credentials (SSH and TLS certificates) and for discovery (where is the reverse tunnel running, in this case 3024). Port 3024 is used to establish a connection to the auth server through the proxy.
To enable multiplexing so only one port is used, set `tunnel_listen_addr` to the same value as `web_listen_addr` within the `proxy_service` section. Teleport will automatically recognize that both listeners use the same port and enable multiplexing. If the log severity is set to DEBUG, you will see multiplexing enabled in the server log.
DEBU [PROC:1] Setup Proxy: Reverse tunnel proxy and web proxy listen on the same port, multiplexing is on. service/service.go:1944
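The condition described above can be checked against a parsed `proxy_service` section before deploying it. The sketch below assumes the section has already been loaded into a Python dict and that comparing the port numbers is sufficient, as the log message suggests; the helper names are hypothetical.

```python
def port_of(addr: str) -> int:
    """Extract the port from a `host:port` listen address."""
    return int(addr.rsplit(":", 1)[1])


def multiplexing_expected(proxy_service: dict) -> bool:
    """True when the reverse tunnel and web listeners share one port."""
    tunnel = port_of(proxy_service["tunnel_listen_addr"])
    web = port_of(proxy_service["web_listen_addr"])
    return tunnel == web
```

With the default configuration (3024 for the tunnel and 3080 for the web listener) this returns `False`; setting both addresses to `0.0.0.0:3080` makes it return `True`.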
!!! tip "Load Balancers"
The setup above also works even if the cluster uses multiple proxies behind
a load balancer (LB) or a DNS entry with multiple values. This works by
the node establishing a tunnel to _every_ proxy. This requires that an LB
uses round-robin or a similar balancing algorithm. Do not use sticky load
balancing algorithms (a.k.a. "session affinity") with Teleport proxies.
Labeling Nodes and Applications
In addition to specifying a custom nodename, Teleport also allows for the application of arbitrary key:value pairs to each node or app, called labels. There are two kinds of labels:
- Static labels do not change over time while the `teleport` process is running. Examples of static labels are the physical location of nodes, the name of the environment (staging vs production), etc.
- Dynamic labels, also known as "label commands", are generated at runtime. Teleport will execute an external command on a node at a configurable frequency and the output of the command becomes the label value. Examples include reporting load averages, presence of a process, time after last reboot, etc.
There are two ways to configure node labels.
- Via command line, by using the `--labels` flag to the `teleport start` command.
- Using the `/etc/teleport.yaml` configuration file on the nodes.
To define labels as command line arguments, use the `--labels` flag as shown below. This method works well for static labels or simple commands:
$ teleport start --labels uptime=[1m:"uptime -p"],kernel=[1h:"uname -r"]
Alternatively, you can update `labels` via a configuration file:
ssh_service:
  enabled: "yes"
  # ...
  # Static labels are simple key/value pairs:
  labels:
    environment: test
app_service:
  # ..
  labels:
    environment: test
To configure dynamic labels via a configuration file, define a `commands` array as shown below:
ssh_service:
  enabled: "yes"
  # Dynamic labels AKA "commands":
  commands:
    - name: hostname
      command: [hostname]
      period: 1m0s
    - name: arch
      command: [uname, -p]
      # this setting tells teleport to execute the command above
      # once an hour. this value cannot be less than one minute.
      period: 1h0m0s
app_service:
  enabled: "yes"
  # ...
  # Dynamic labels (historically called "commands"):
  commands:
    - name: hostname
      command: [hostname]
      period: 1m0s
`/path/to/executable` must be a valid executable command (i.e. executable bit must be set) which also includes shell scripts with a proper shebang line.
Important: notice that the `command` setting is an array where the first element is a valid executable and each subsequent element is an argument, i.e.:
# valid syntax:
command: ["/bin/uname", "-m"]
# INVALID syntax:
command: ["/bin/uname -m"]
# if you want to pipe several bash commands together, here's how to do it:
# notice how ' and " are interchangeable and you can use it for quoting:
command: ["/bin/sh", "-c", "uname -a | egrep -o '[0-9]+\\.[0-9]+\\.[0-9]+'"]
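The difference between the valid and invalid forms is the same as the difference between an argument vector and a shell command line. The Python sketch below demonstrates it with `subprocess`, which also runs a list without a shell. It assumes Teleport treats the `command` array in the same way, which is consistent with the syntax rules above; the function name is hypothetical, and the Python interpreter stands in for `/bin/uname` so the example runs anywhere.

```python
import subprocess
import sys


def run_label_command(command: list) -> str:
    """Run an argv-style command without a shell and return its trimmed stdout."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout.strip()


# Valid: the executable and each argument are separate list elements.
valid = [sys.executable, "-c", "print('x86_64')"]

# Invalid: the whole command line is a single element, so the OS looks for a
# program whose file name includes the spaces and arguments, and fails.
invalid = [sys.executable + " -c print('x86_64')"]
```

Calling `run_label_command(valid)` returns the printed text, while `run_label_command(invalid)` raises an `OSError` because no such executable exists. Pipes and other shell features only work when a shell is invoked explicitly, as in the `/bin/sh -c` example above.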
Audit Log
Teleport logs every SSH event into its audit log. There are two components of the audit log:
- SSH Events: Teleport logs events like successful user logins along with the metadata like remote IP address, time and the session ID.
- Recorded Sessions: Every SSH shell session is recorded and can be replayed later. The recording is done by the nodes themselves, by default, but can be configured to be done by the proxy.
- Optional: Enhanced Session Recording
Refer to the "Audit Log" chapter in the Teleport Architecture to learn more about how the audit log and session recording are designed.
Events
Teleport supports multiple storage back-ends for storing the SSH, Application and Kubernetes events.
The section below uses the `dir` backend as an example. The `dir` backend uses the local filesystem of an auth server using the configurable `data_dir` directory.
For highly available (HA) configurations, users can refer to our DynamoDB or Firestore chapters for information on how to configure the SSH events and recorded sessions to be stored on network storage. It is even possible to store the audit log in multiple places at the same time - see the `audit_events_uri` setting in the Configuration Reference for how to do that.
Let's examine the Teleport audit log using the `dir` backend. The event log is stored in `data_dir` under the `log` directory, usually `/var/lib/teleport/log`.
Each day is represented as a file:
$ ls -l /var/lib/teleport/log/
total 104
-rw-r----- 1 root root 31638 Jan 22 20:00 2017-01-23.00:00:00.log
-rw-r----- 1 root root 91256 Jan 31 21:00 2017-02-01.00:00:00.log
-rw-r----- 1 root root 15815 Feb 2 22:54 2017-02-03.00:00:00.log
The log files use JSON format. They are human-readable but can also be programmatically parsed. Each line represents an event and has the following format:
{
  // Event type. See below for the list of all possible event types
  "event": "session.start",
  // uid: A unique ID for the event log. Useful for deduplication.
  "uid": "59cf8d1b-7b36-4894-8e90-9d9713b6b9ef",
  // Teleport user name
  "user": "ekontsevoy",
  // OS login
  "login": "root",
  // Server namespace. This field is reserved for future use.
  "namespace": "default",
  // Unique server ID.
  "server_id": "f84f7386-5e22-45ff-8f7d-b8079742e63f",
  // Server Labels.
  "server_labels": {
    "datacenter": "us-east-1",
    "label-b": "x"
  },
  // Session ID. Can be used to replay the session.
  "sid": "8d3895b6-e9dd-11e6-94de-40167e68e931",
  // Address of the SSH node
  "addr.local": "10.5.1.15:3022",
  // Address of the connecting client (user)
  "addr.remote": "73.223.221.14:42146",
  // Terminal size
  "size": "80:25",
  // Timestamp
  "time": "2017-02-03T06:54:05Z"
}
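Because each line of a log file is one JSON object, the log can be processed with a few lines of code. The sketch below assumes that real log lines are single-line JSON without the explanatory `//` comments used in the sample above; the function names are hypothetical, and only the `event`, `user` and `sid` fields from the sample are used.

```python
import json
from collections import Counter
from typing import Iterable, Iterator


def read_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one event dict per non-empty JSON line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def sessions_started_by(lines: Iterable[str], user: str) -> list:
    """Return the session IDs of `session.start` events for `user`."""
    return [
        event["sid"]
        for event in read_events(lines)
        if event.get("event") == "session.start" and event.get("user") == user
    ]


def count_by_type(lines: Iterable[str]) -> Counter:
    """Count events by the value of their `event` field."""
    return Counter(event.get("event") for event in read_events(lines))
```

An open file object is an iterable of lines, so `sessions_started_by(open(path), "ekontsevoy")` works directly on one of the daily log files.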
The possible event types are:
Event Type | Description |
---|---|
auth | Authentication attempt. Adds the following fields: {"success": "false", "error": "access denied"} |
session.start | Started an interactive shell session. |
session.end | An interactive shell session has ended. |
session.join | A new user has joined the existing interactive shell session. |
session.leave | A user has left the session. |
session.disk | A list of files opened during the session. Requires Enhanced Session Recording. |
session.network | A list of network connections made during the session. Requires Enhanced Session Recording. |
session.command | A list of commands run during the session. Requires Enhanced Session Recording. |
exec | Remote command has been executed via SSH, like tsh ssh root@node ls / . The following fields will be logged: {"command": "ls /", "exitCode": 0, "exitError": ""} |
scp | Remote file copy has been executed. The following fields will be logged: {"path": "/path/to/file.txt", "len": 32344, "action": "read" } |
resize | Terminal has been resized. |
user.login | A user logged into web UI or via tsh. The following fields will be logged: {"user": "alice@example.com", "method": "local"} . |
app.session.start | A user accessed an application |
app.session.chunk | A record of activity during an app session |
Recorded Sessions
In addition to logging `session.start` and `session.end` events, Teleport also records the entire stream of bytes going to/from standard input and standard output of an SSH session.
Teleport can store the recorded sessions in an AWS S3 bucket or in a local filesystem (including NFS).
The recorded sessions are stored as raw bytes in the `sessions` directory under `log`. Each session consists of two files, both named after the session ID:
- The `.bytes` file, or `.chunks.gz` in compressed format, represents the raw session bytes and is somewhat human-readable, although you are better off using `tsh play` or the Web UI to replay it.
- The `.log` file, or `.events.gz` in compressed format, contains copies of the event log entries that are related to this session.
$ ls /var/lib/teleport/log/sessions/default
-rw-r----- 1 root root 506192 Feb 4 00:46 4c146ec8-eab6-11e6-b1b3-40167e68e931.session.bytes
-rw-r----- 1 root root 44943 Feb 4 00:46 4c146ec8-eab6-11e6-b1b3-40167e68e931.session.log
To replay this session via CLI:
$ tsh --proxy=proxy play 4c146ec8-eab6-11e6-b1b3-40167e68e931
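To find replayable session IDs in a large directory, the file names can be grouped by session ID. This is only a sketch: the `.session.bytes` and `.session.log` suffixes are taken from the sample listing above, the compressed suffixes from the text, and `group_session_files` is a hypothetical helper.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Suffixes taken from the listing and text above (an assumption about
# the exact on-disk naming, which may differ between versions).
KNOWN_SUFFIXES = (".session.bytes", ".session.log", ".chunks.gz", ".events.gz")


def group_session_files(filenames):
    """Map each session ID to the sorted list of suffixes found for it."""
    sessions = defaultdict(list)
    for name in filenames:
        base = PurePosixPath(name).name
        for suffix in KNOWN_SUFFIXES:
            if base.endswith(suffix):
                sessions[base[: -len(suffix)]].append(suffix)
                break
    return {sid: sorted(found) for sid, found in sessions.items()}


listing = [
    "4c146ec8-eab6-11e6-b1b3-40167e68e931.session.bytes",
    "4c146ec8-eab6-11e6-b1b3-40167e68e931.session.log",
]
for session_id in group_session_files(listing):
    print("tsh --proxy=proxy play", session_id)
```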
Resources
A Teleport administrator has two tools to configure a Teleport cluster:
- The configuration file is used for static configuration like the cluster name.
- The `tctl` admin tool is used for manipulating dynamic records like Teleport users.
`tctl` has convenient subcommands for dynamic configuration, like `tctl users` or `tctl nodes`. However, for dealing with more advanced topics, like connecting clusters together or troubleshooting trust, `tctl` offers the more powerful, although lower-level, CLI interface called `resources`.
The concept is borrowed from the REST programming pattern. A cluster is composed of different objects (aka resources) and there are just three common operations that can be performed on them: `get`, `create`, `remove`.
A resource is defined as a YAML file. Every resource in Teleport has three required fields:
- `Kind` - The type of resource
- `Name` - A required field in the `metadata` to uniquely identify the resource
- `Version` - The version of the resource format
Everything else is resource-specific and any component of a Teleport cluster can be manipulated with just 3 CLI commands:
Command | Description | Examples |
---|---|---|
tctl get | Get one or multiple resources | tctl get users or tctl get user/joe |
tctl rm | Delete a resource by type/name | tctl rm user/joe |
tctl create | Create a new resource from a YAML file. Use -f to override / update | tctl create -f joe.yaml |
!!! warning "YAML Format"
By default Teleport uses [YAML format](https://en.wikipedia.org/wiki/YAML)
to describe resources. YAML is a
wonderful and very human-readable alternative to JSON or XML, but it's
sensitive to white space. Pay attention to spaces vs tabs!
Here's an example of how the YAML resource definition for a user Joe might look. It can be retrieved by executing `tctl get user/joe`:
kind: user
version: v2
metadata:
  name: joe
spec:
  roles: admin
  status:
    # users can be temporarily locked in a Teleport system, but this
    # functionality is reserved for internal use for now.
    is_locked: false
    lock_expires: 0001-01-01T00:00:00Z
    locked_time: 0001-01-01T00:00:00Z
  traits:
    # these are "allowed logins" which are usually specified as the
    # last argument to `tctl users add`
    logins:
    - joe
    - root
  # any resource in Teleport can automatically expire.
  expires: 0001-01-01T00:00:00Z
  # for internal use only
  created_by:
    time: 0001-01-01T00:00:00Z
    user:
      name: builtin-Admin
!!! tip "Note"
Some of the fields you will see when printing resources are used
only internally and are not meant to be changed. Others are reserved for
future use.
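The three required fields can be checked mechanically before a file is handed to `tctl create`. The sketch below is illustrative only: it assumes the YAML has already been parsed into a Python dictionary (for example by a YAML library of your choice), and `missing_required_fields` is a hypothetical helper rather than anything shipped with Teleport.

```python
def missing_required_fields(resource):
    """Return the required fields (kind, version, metadata.name) that are absent."""
    missing = []
    if not resource.get("kind"):
        missing.append("kind")
    if not resource.get("version"):
        missing.append("version")
    metadata = resource.get("metadata")
    if not isinstance(metadata, dict) or not metadata.get("name"):
        missing.append("metadata.name")
    return missing


# The user resource from the example above, reduced to a few fields.
joe = {
    "kind": "user",
    "version": "v2",
    "metadata": {"name": "joe"},
    "spec": {"roles": "admin", "traits": {"logins": ["joe", "root"]}},
}
print(missing_required_fields(joe))
print(missing_required_fields({"kind": "user"}))
```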
Here's the list of resources currently exposed via `tctl`:
Resource Kind | Description |
---|---|
user | A user record in the internal Teleport user DB. |
node | A registered SSH node. The same record is displayed via tctl nodes ls |
cluster | A trusted cluster. See here for more details on connecting clusters together. |
role | A role assumed by users. The open source Teleport only includes one role: "admin", but Enterprise teleport users can define their own roles. |
connector | Authentication connectors for single sign-on (SSO) for SAML, OIDC and Github. |
Examples:
# list all connectors:
$ tctl get connectors
# dump a SAML connector called "okta":
$ tctl get saml/okta
# delete a SAML connector called "okta":
$ tctl rm saml/okta
# delete an OIDC connector called "gsuite":
$ tctl rm oidc/gsuite
# delete a github connector called "myteam":
$ tctl rm github/myteam
# delete a local user called "admin":
$ tctl rm users/admin
!!! note
Although `tctl get connectors` will show you every connector, when working with an individual connector you must use the correct `kind`, such as `saml` or `oidc`. You can see each connector's `kind` at the top of its YAML output from `tctl get connectors`.
Trusted Clusters
As explained in the architecture document, Teleport can partition compute infrastructure into multiple clusters. A cluster is a group of nodes connected to the cluster's auth server, acting as a certificate authority (CA) for all users and nodes.
To retrieve an SSH certificate, users must authenticate with a cluster through a
proxy server. So, if users want to connect to nodes belonging to different
clusters, they would normally have to use a different --proxy
flag for each
cluster. This is not always convenient.
The concept of trusted clusters allows Teleport administrators to connect multiple clusters together and establish trust between them. Trusted clusters allow users of one cluster to seamlessly SSH into the nodes of another cluster without having to "hop" between proxy servers. Moreover, users don't even need to have a direct connection to other clusters' proxy servers. Trusted clusters also have their own restrictions on user access.
To learn more about Trusted Clusters, please visit our Trusted Cluster Guide.
Github OAuth 2.0
Teleport supports authentication and authorization via external identity providers such as Github. You can watch the video for how to configure Github as an SSO provider, or you can follow the documentation below.
First, the Teleport auth service must be configured to use Github for authentication:
# snippet from /etc/teleport.yaml
auth_service:
  authentication:
    type: github
The next step is to define a Github connector:
# Create a file called github.yaml:
kind: github
version: v3
metadata:
  # connector name that will be used with `tsh --auth=github login`
  name: github
spec:
  # client ID of Github OAuth app
  client_id: <client-id>
  # client secret of Github OAuth app
  client_secret: <client-secret>
  # connector display name that will be shown on web UI login screen
  display: Github
  # callback URL that will be called after successful authentication
  redirect_url: https://<proxy-address>/v1/webapi/github/callback
  # mapping of org/team memberships onto allowed logins and roles
  teams_to_logins:
    - organization: octocats # Github organization name
      team: admins # Github team name within that organization
      # allowed logins for users in this org/team
      logins:
        - root
      # List of Kubernetes groups this Github team is allowed to connect to
      # (see Kubernetes integration for more information)
      kubernetes_groups: ["system:masters"]
!!! note
For open-source Teleport the `logins` field contains a list of allowed
OS logins. For the commercial Teleport Enterprise offering, which supports
role-based access control, the same field is treated as a list of _roles_
that users from the matching org/team assume after going through the
authorization flow.
To obtain the client ID and client secret, please follow the Github documentation on how to create and register an OAuth app. Be sure to set the "Authorization callback URL" to the same value as `redirect_url` in the resource spec. Teleport will request only the `read:org` OAuth scope; you can read more about Github OAuth scopes.
Finally, create the connector using the `tctl` resource management command:
$ tctl create github.yaml
!!! tip
When going through the Github authentication flow for the first time,
the application must be granted the access to all organizations that are
present in the "teams to logins" mapping, otherwise Teleport will not be
able to determine team memberships for these orgs.
HTTP CONNECT Proxies
Some networks funnel all connections through a proxy server where they can be audited and access control rules are applied. For these scenarios Teleport supports HTTP CONNECT tunneling.
To use HTTP CONNECT tunneling, simply set either the `HTTPS_PROXY` or `HTTP_PROXY` environment variable. When Teleport builds and establishes the reverse tunnel to the main cluster, it will funnel all traffic through the proxy. Specifically, if using the default configuration, Teleport will tunnel ports `3024` (SSH, reverse tunnel) and `3080` (HTTPS, establishing trust) through the proxy.
The value of `HTTPS_PROXY` or `HTTP_PROXY` should be in the format `scheme://host:port` where scheme is either `https` or `http`. If the value is `host:port`, Teleport will prepend `http`.
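The format rule for the proxy value can be illustrated with a small helper. This is a sketch of the rule as described here, not Teleport's actual parsing code, and `normalize_proxy` is a hypothetical name.

```python
from urllib.parse import urlsplit


def normalize_proxy(value):
    """Return the proxy value with an explicit scheme.

    A bare host:port gets "http://" prepended, mirroring the rule in the
    text. Any scheme other than http or https is rejected.
    """
    if "://" not in value:
        value = "http://" + value
    parts = urlsplit(value)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"unsupported proxy scheme: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("proxy value has no host")
    return value


print(normalize_proxy("proxy.example.com:8080"))
print(normalize_proxy("https://proxy.example.com:8080"))
```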
It's important to note that in order for Teleport to use HTTP CONNECT tunnelling, the `HTTP_PROXY` and `HTTPS_PROXY` environment variables must be set within Teleport's environment. You can also optionally set the `NO_PROXY` environment variable to avoid use of the proxy when accessing specified hosts/netmasks. When launching Teleport with systemd, this will probably involve adding some lines to your systemd unit file:
[Service]
Environment="HTTP_PROXY=http://proxy.example.com:8080/"
Environment="HTTPS_PROXY=http://proxy.example.com:8080/"
Environment="NO_PROXY=localhost,127.0.0.1,192.168.0.0/16,172.16.0.0/12,10.0.0.0/8"
!!! tip "Note"
`localhost` and `127.0.0.1` are invalid values for the proxy
host. If for some reason your proxy runs locally, you'll need to provide
some other DNS name or a private IP address for it.
PAM Integration
Review our dedicated Using Teleport with PAM guide.
Using Teleport with OpenSSH
Review our dedicated Using Teleport with OpenSSH guide.
Certificate Rotation
Take a look at the Certificates chapter in the architecture document to learn how the certificate rotation works. This section will show you how to implement certificate rotation in practice.
The easiest way to start the rotation is to execute this command on a cluster's auth server:
$ tctl auth rotate
This will trigger a rotation process for both hosts and users with a grace period of 48 hours.
This can be customized, for example:
# rotate only user certificates with a grace period of 200 hours:
$ tctl auth rotate --type=user --grace-period=200h
# rotate only host certificates with a grace period of 8 hours:
$ tctl auth rotate --type=host --grace-period=8h
The rotation takes time, especially for hosts, because each node in a cluster needs to be notified that a rotation is taking place and request a new certificate for itself before the grace period ends.
!!! warning "Warning"
Be careful when choosing a grace period when rotating
host certificates. The grace period needs to be long enough for all nodes in
a cluster to request a new certificate. If some nodes go offline during the
rotation and come back only after the grace period has ended, they will be
forced to leave the cluster, i.e. users will no longer be allowed to SSH
into them.
To check the status of certificate rotation:
$ tctl status
!!! warning "CA Pinning Warning"
If you are using [CA Pinning](#untrusted-auth-servers) when adding new
nodes, the CA pin will change after the rotation. Make sure you use the
_new_ CA pin when adding nodes after rotation.
Ansible Integration
Ansible uses the OpenSSH client by default. This makes it compatible with Teleport without any extra work, except configuring OpenSSH client to work with Teleport Proxy:
- configure your OpenSSH to connect to Teleport proxy and use the `ssh-agent` socket
- enable scp mode in the Ansible config file (default is `/etc/ansible/ansible.cfg`):
scp_if_ssh = True
Kubernetes Integration
Teleport can be configured as a compliance gateway for Kubernetes clusters.
This allows users to authenticate against a Teleport proxy using the `tsh login` and `tsh kube login` commands to retrieve credentials for both SSH and the Kubernetes API.
Follow our Kubernetes guide which contains some more specific examples and instructions.
High Availability
!!! tip "Tip"
Before continuing, please make sure to take a look at the
[Cluster State section](architecture/nodes.md#cluster-state) in the Teleport
Architecture documentation.
Usually there are two ways to achieve high availability. You can "outsource" this function to the infrastructure, for example by using highly available network-based disk volumes (similar to AWS EBS) and by migrating a failed VM to a new host. In this scenario, there's nothing Teleport-specific to be done.
If high availability cannot be provided by the infrastructure (perhaps you're running Teleport on a bare metal cluster), you can still configure Teleport to run in a highly available fashion.
Auth Server HA
In order to run multiple instances of Teleport Auth Server, you must switch to a highly available secrets back-end first. Also, you must tell each node in a cluster that there is more than one auth server available. There are two ways to do this:
- Use a load balancer to create a single auth API access point (AP) and specify this AP in the `auth_servers` section of the Teleport configuration for all nodes in a cluster. This load balancer should do TCP level forwarding.
- If a load balancer is not an option, you must specify each instance of an auth server in the `auth_servers` section of the Teleport configuration.
IMPORTANT: with multiple instances of the auth servers running, special attention needs to be paid to keeping their configuration identical. Settings like `cluster_name`, `tokens`, `storage`, etc. must be the same.
Teleport Proxy HA
The Teleport Proxy is stateless which makes running multiple instances trivial.
If using the default configuration, configure your load balancer to forward ports `3023` and `3080` to the servers that run the Teleport proxy. If you have configured your proxy to use non-default ports, you will need to configure your load balancer to forward the ports you specified for `listen_addr` and `web_listen_addr` in `teleport.yaml`. The load balancer for `web_listen_addr` can terminate TLS with your own certificate that is valid for your users, while the remaining ports should do TCP level forwarding, since Teleport will handle its own SSL on top of that with its own certificates.
!!! tip "NOTE"
If you terminate TLS with your own certificate at a load
balancer you'll need to run Teleport with `--insecure-no-tls`
If your load balancer supports HTTP health checks, configure it to hit the `/readyz` diagnostics endpoint on machines running Teleport. This endpoint must be enabled by using the `--diag-addr` flag to teleport start: `teleport start --diag-addr=127.0.0.1:3000`
The http://127.0.0.1:3000/readyz endpoint will reply `{"status":"ok"}` if the Teleport service is running without problems.
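For environments where the health check has to be scripted rather than configured in a load balancer, the same probe can be approximated in a few lines. This is a minimal sketch, assuming the diagnostics address from the example above. The helper names are hypothetical, and any reply other than `{"status":"ok"}` is simply treated as not ready.

```python
import json
import urllib.request


def is_ready(body):
    """Return True if a /readyz response body reports {"status": "ok"}."""
    try:
        payload = json.loads(body)
    except (ValueError, TypeError):
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


def check_readyz(url="http://127.0.0.1:3000/readyz", timeout=3.0):
    """Query the diagnostics endpoint; connection and HTTP errors count as not ready."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return is_ready(response.read())
    except OSError:
        return False


print(is_ready('{"status":"ok"}'))
```

Calling `check_readyz()` on a machine where Teleport was started with `--diag-addr=127.0.0.1:3000` performs the actual HTTP request.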
!!! tip "NOTE"
As the new auth servers get added to the cluster and the old
servers get decommissioned, nodes and proxies will refresh the list of
available auth servers and store it in their local cache
`/var/lib/teleport/authservers.json` - the values from the cache file will take
precedence over the configuration file.
We'll cover how to use `etcd`, DynamoDB and Firestore storage back-ends to make Teleport highly available below.
Teleport Scalability Tweaks
When running Teleport at scale (for example, when there are 10,000+ nodes connected to a cluster via node tunnelling mode), the following settings should be set on Teleport auth and proxies:
Proxy Servers
These settings alter Teleport's default connection limit from 15000 to 65000.
# Teleport Proxy
teleport:
  cache:
    # use an in-memory cache to speed up the connection of many teleport nodes
    # back to proxy
    type: in-memory
  # set up connection limits to prevent throttling of many IoT nodes connecting to proxies
  connection_limits:
    max_connections: 65000
    max_users: 1000
Auth Servers
# Teleport Auth
teleport:
  connection_limits:
    max_connections: 65000
    max_users: 1000
Using etcd
Teleport can use etcd as a storage backend to achieve highly available deployments. You must take steps to protect access to `etcd` in this configuration because that is where Teleport secrets like keys and user records will be stored.
!!! warning "IMPORTANT"
`etcd` can only currently be used to store Teleport's internal database in a highly-available
way. This will allow you to have multiple auth servers in your cluster for an HA deployment,
but it will not also store Teleport audit events for you in the same way that
[DynamoDB](#using-dynamodb) or [Firestore](#using-firestore) will.
To configure Teleport for using etcd as a storage back-end:
- Make sure you are using etcd version 3.3 or newer.
- Install etcd and configure peer and client TLS authentication using the etcd security guide.
  - You can use this script provided by etcd if you don't already have a TLS setup.
- Configure all Teleport Auth servers to use etcd in the "storage" section of the config file as shown below.
- Deploy several auth servers connected to etcd back-end.
- Deploy several proxy nodes that have `auth_servers` pointed to the list of auth servers to connect to.
teleport:
  storage:
    type: etcd
    # list of etcd peers to connect to:
    peers: ["https://172.17.0.1:4001", "https://172.17.0.2:4001"]
    # required path to TLS client certificate and key files to connect to etcd
    #
    # to create these, follow
    # https://coreos.com/os/docs/latest/generate-self-signed-certificates.html
    # or use the etcd-provided script
    # https://github.com/etcd-io/etcd/tree/master/hack/tls-setup
    tls_cert_file: /var/lib/teleport/etcd-cert.pem
    tls_key_file: /var/lib/teleport/etcd-key.pem
    # optional file with trusted CA authority
    # file to authenticate etcd nodes
    #
    # if you used the script above to generate the client TLS certificate,
    # this CA certificate should be one of the other generated files
    tls_ca_file: /var/lib/teleport/etcd-ca.pem
    # alternative password based authentication, if not using TLS client
    # certificate
    #
    # See https://etcd.io/docs/v3.4.0/op-guide/authentication/ for setting
    # up a new user
    username: username
    password_file: /mnt/secrets/etcd-pass
    # etcd key (location) where teleport will be storing its state under.
    # make sure it ends with a '/'!
    prefix: /teleport/
    # NOT RECOMMENDED: enables insecure etcd mode in which self-signed
    # certificate will be accepted
    insecure: false
    # Optionally sets the limit on the client message size.
    # This is usually used to increase the default which is 2MiB
    # (1.5MiB server's default + gRPC overhead bytes).
    # Make sure this does not exceed the value for the etcd
    # server specified with `--max-request-bytes` (1.5MiB by default).
    # Keep the two values in sync.
    #
    # See https://etcd.io/docs/v3.4.0/dev-guide/limit/ for details
    #
    # This bumps the size to 15MiB as an example:
    etcd_max_client_msg_size_bytes: 15728640
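Two of the constraints mentioned in the comments above (the trailing `/` on `prefix` and keeping the client message size within the server's `--max-request-bytes`) are easy to get wrong, so a quick check can help. The sketch below assumes the `storage` section has already been loaded into a dictionary. `check_etcd_storage` and its `server_max_request_bytes` parameter are hypothetical names used only for this illustration.

```python
MIB = 1024 * 1024


def check_etcd_storage(storage, server_max_request_bytes=None):
    """Return a list of problems found in an etcd `storage` section."""
    problems = []
    if not storage.get("prefix", "").endswith("/"):
        problems.append("prefix must end with '/'")
    client_limit = storage.get("etcd_max_client_msg_size_bytes")
    if (
        client_limit is not None
        and server_max_request_bytes is not None
        and client_limit > server_max_request_bytes
    ):
        problems.append(
            "etcd_max_client_msg_size_bytes exceeds the server's --max-request-bytes"
        )
    return problems


storage = {
    "type": "etcd",
    "prefix": "/teleport/",
    # 15 MiB expressed in bytes, matching the example value above.
    "etcd_max_client_msg_size_bytes": 15 * MIB,
}
print(storage["etcd_max_client_msg_size_bytes"])
print(check_etcd_storage(storage, server_max_request_bytes=15 * MIB))
```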
Using Amazon S3
!!! tip "Tip"
Before continuing, please make sure to take a look at the
[cluster state section](architecture/nodes.md#cluster-state) in Teleport
Architecture documentation.
!!! tip "AWS Authentication"
The configuration examples below contain AWS
access keys and secret keys. They are optional, they exist for your
convenience but we DO NOT RECOMMEND using them in production. If Teleport is
running on an AWS instance it will automatically use the instance IAM role.
Teleport also will pick up AWS credentials from the `~/.aws` folder, just
like the AWS CLI tool.
S3 buckets can only be used as a storage for the recorded sessions. S3 cannot store the audit log or the cluster state. Below is an example of how to configure a Teleport auth server to store the recorded sessions in an S3 bucket.
teleport:
  storage:
    # The region setting sets the default AWS region for all AWS services
    # Teleport may consume (DynamoDB, S3)
    region: us-east-1
    # Path to S3 bucket to store the recorded sessions in.
    audit_sessions_uri: "s3://Example_TELEPORT_S3_BUCKET/records"
    # Teleport assumes credentials. Using provider chains, assuming IAM role or
    # standard .aws/credentials in the home folder.
The AWS authentication settings above can be omitted if the machine itself is running on an EC2 instance with an IAM role.
Using DynamoDB
!!! tip "Tip"
Before continuing, please make sure to take a look at the
[cluster state section](architecture/nodes.md#cluster-state) in Teleport Architecture documentation.
If you are running Teleport on AWS, you can use DynamoDB as a storage back-end to achieve high availability. DynamoDB back-end supports two types of Teleport data:
- Cluster state
- Audit log events
DynamoDB cannot store the recorded sessions. You are advised to use AWS S3 for that as shown above. To configure Teleport to use DynamoDB:
- Make sure you have an AWS access key and a secret key which give you access to your DynamoDB account. If you're using (as recommended) an IAM role for this, the policy with necessary permissions is listed below.
- Configure all Teleport Auth servers to use DynamoDB back-end in the "storage" section of `teleport.yaml` as shown below.
- Deploy several auth servers connected to DynamoDB storage back-end.
- Deploy several proxy nodes.
- Make sure that all Teleport nodes have the `auth_servers` configuration setting populated with the auth servers.
teleport:
  storage:
    type: dynamodb
    # Region location of dynamodb instance, https://docs.aws.amazon.com/en_pv/general/latest/gr/rande.html#ddb_region
    region: us-east-1
    # Name of the DynamoDB table. If it does not exist, Teleport will create it.
    table_name: Example_TELEPORT_DYNAMO_TABLE_NAME
    # This setting configures Teleport to send the audit events to three places:
    # To keep a copy in DynamoDB, a copy on a local filesystem, and also output the events to stdout.
    # NOTE: The DynamoDB events table has a different schema to the regular Teleport
    # database table, so attempting to use same table for both will result in errors.
    # When using highly available storage like DynamoDB, you should make sure that the list always specifies
    # the HA storage method first, as this is what the Teleport web UI uses as its source of events to display.
    audit_events_uri: ['dynamodb://events_table_name', 'file:///var/lib/teleport/audit/events', 'stdout://']
    # This setting configures Teleport to save the recorded sessions in an S3 bucket:
    audit_sessions_uri: s3://Example_TELEPORT_S3_BUCKET/records
- Replace `us-east-1` and `Example_TELEPORT_DYNAMO_TABLE_NAME` with your own settings. Teleport will create the table automatically.
- `Example_TELEPORT_DYNAMO_TABLE_NAME` and `events_table_name` must be different DynamoDB tables. The schema is different for each. Using the same table name for both will result in errors.
- The AWS authentication setting above can be omitted if the machine itself is running on an EC2 instance with an IAM role.
- Audit log settings above are optional. If specified, Teleport will store the audit log in DynamoDB and the session recordings must be stored in an S3 bucket, i.e. both `audit_xxx` settings must be present. If they are not set, Teleport will default to a local file system for the audit log, i.e. `/var/lib/teleport/log` on an auth server.
- If DynamoDB is used for the audit log, the logged events will be stored with a TTL of 1 year. Currently this TTL is not configurable.
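The two rules that most often trip people up here (the HA storage URI must come first in `audit_events_uri`, and the events table must not be the cluster state table) can be checked before deploying a configuration. The following is a hedged sketch that assumes the `storage` section has been loaded into a dictionary. `check_audit_storage` is a hypothetical helper, not a Teleport tool.

```python
from urllib.parse import urlsplit


def check_audit_storage(storage, ha_scheme="dynamodb"):
    """Return a list of problems with the audit settings of a `storage` section."""
    problems = []
    uris = storage.get("audit_events_uri", [])
    # The HA storage method should be listed first, since the web UI reads from it.
    if uris and urlsplit(uris[0]).scheme != ha_scheme:
        problems.append(f"{ha_scheme}:// should be listed first in audit_events_uri")
    # The events table has a different schema and must not reuse table_name.
    for uri in uris:
        parts = urlsplit(uri)
        if parts.scheme == ha_scheme and parts.netloc == storage.get("table_name"):
            problems.append("events table must differ from table_name")
    return problems


storage = {
    "type": "dynamodb",
    "region": "us-east-1",
    "table_name": "Example_TELEPORT_DYNAMO_TABLE_NAME",
    "audit_events_uri": [
        "dynamodb://events_table_name",
        "file:///var/lib/teleport/audit/events",
        "stdout://",
    ],
}
print(check_audit_storage(storage))
```

The `ha_scheme` parameter is there so the same check can be pointed at another HA back-end scheme.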
!!! warning "Access to DynamoDB"
Make sure that the IAM role assigned to
Teleport is configured with sufficient access to DynamoDB. Below is an
example of the IAM policy you can use:
{
"Version": "2012-10-17",
"Statement": [{
"Sid": "AllAPIActionsOnTeleportAuth",
"Effect": "Allow",
"Action": "dynamodb:*",
"Resource": "arn:aws:dynamodb:eu-west-1:123456789012:table/prod.teleport.auth"
},
{
"Sid": "AllAPIActionsOnTeleportStreams",
"Effect": "Allow",
"Action": "dynamodb:*",
"Resource": "arn:aws:dynamodb:eu-west-1:123456789012:table/prod.teleport.auth/stream/*"
}
]
}
Using GCS
!!! tip "Tip"
Before continuing, please make sure to take a look at the
[cluster state section](architecture/nodes.md#cluster-state) in Teleport
Architecture documentation.
Google Cloud Storage (GCS) can only be used as a storage for the recorded sessions. GCS cannot store the audit log or the cluster state. Below is an example of how to configure a Teleport auth server to store the recorded sessions in a GCS bucket.
teleport:
  storage:
    # Path to GCS to store the recorded sessions in.
    audit_sessions_uri: "gs://Example_TELEPORT_STORAGE/records"
    credentials_path: /var/lib/teleport/gcs_creds
Using Firestore
!!! tip "Tip"
Before continuing, please make sure to take a look at the
[cluster state section](architecture/nodes.md#cluster-state) in Teleport Architecture documentation.
If you are running Teleport on GCP, you can use Firestore as a storage back-end to achieve high availability. Firestore back-end supports two types of Teleport data:
- Cluster state
- Audit log events
Firestore cannot store the recorded sessions. You are advised to use Google Cloud Storage (GCS) for that as shown above. To configure Teleport to use Firestore:
- Configure all Teleport Auth servers to use Firestore back-end in the "storage" section of `teleport.yaml` as shown below.
- Deploy several auth servers connected to Firestore storage back-end.
- Deploy several proxy nodes.
- Make sure that all Teleport nodes have the `auth_servers` configuration setting populated with the auth servers or use a load balancer for the auth servers in high availability mode.
teleport:
  storage:
    type: firestore
    # Project ID https://support.google.com/googleapi/answer/7014113?hl=en
    project_id: Example_GCP_Project_Name
    # Name of the Firestore table. If it does not exist, Teleport won't start
    collection_name: Example_TELEPORT_FIRESTORE_TABLE_NAME
    credentials_path: /var/lib/teleport/gcs_creds
    # This setting configures Teleport to send the audit events to three places:
    # To keep a copy in Firestore, a copy on a local filesystem, and also write the events to stdout.
    # NOTE: The Firestore events table has a different schema to the regular Teleport
    # database table, so attempting to use same table for both will result in errors.
    # When using highly available storage like Firestore, you should make sure that the list always specifies
    # the HA storage method first, as this is what the Teleport web UI uses as its source of events to display.
    audit_events_uri: ['firestore://Example_TELEPORT_FIRESTORE_EVENTS_TABLE_NAME', 'file:///var/lib/teleport/audit/events', 'stdout://']
    # This setting configures Teleport to save the recorded sessions in GCP storage:
    audit_sessions_uri: gs://Example_TELEPORT_S3_BUCKET/records
- Replace `Example_GCP_Project_Name` and `Example_TELEPORT_FIRESTORE_TABLE_NAME` with your own settings. Teleport will create the table automatically.
- `Example_TELEPORT_FIRESTORE_TABLE_NAME` and `Example_TELEPORT_FIRESTORE_EVENTS_TABLE_NAME` must be different Firestore tables. The schema is different for each. Using the same table name for both will result in errors.
- The GCP authentication setting above can be omitted if the machine itself is running on a GCE instance with a Service Account that has access to the Firestore table.
- Audit log settings above are optional. If specified, Teleport will store the audit log in Firestore and the session recordings must be stored in a GCP bucket, i.e. both `audit_xxx` settings must be present. If they are not set, Teleport will default to a local file system for the audit log, i.e. `/var/lib/teleport/log` on an auth server.
Upgrading Teleport
Teleport is always a critical component of the infrastructure it runs on. This is why upgrading to a new version must be performed with caution.
Teleport is a much more capable system than a bare bones SSH server. While it offers significant benefits on a cluster level, it also adds some complexity to cluster upgrades. To ensure robust operation Teleport administrators must follow the upgrade rules listed below.
Production Releases
First of all, avoid running pre-releases (release candidates) in production environments. The Teleport development team uses Semantic Versioning, which makes it easy to tell if a specific version is recommended for production use.
Component Compatibility
When running multiple binaries of Teleport within a cluster (nodes, proxies, clients, etc), the following rules apply:
Before 5.0.0
- Only patch versions are always compatible, for example any 4.0.1 component will work with any 4.0.3 component.
- Minor versions are always compatible with the previous minor release. This means you must not attempt to upgrade from 4.1.x straight to 4.3.x. You must upgrade to 4.2.x first.
- Teleport clients `tsh` for users and `tctl` for admins may not be compatible with different versions of the `teleport` service.
After 5.0.0
- Patch and minor versions are always compatible, for example any 5.0.1 component will work with any 5.0.3 component and any 6.1.0 component will work with any 6.7.0 component.
- Major versions are always compatible with the previous major release. This means you must not attempt to upgrade from 5.x.x straight to 7.x.x. You must upgrade to 6.x.x first.
- Teleport clients `tsh` for users and `tctl` for admins may not be compatible with different versions of the `teleport` service.
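The "After 5.0.0" rules reduce to a comparison of major versions, which can be sketched as follows. This is an illustration of the rule as stated here, not an official tool. The helper names are hypothetical, and pre-5.0.0 releases (which follow the minor-version rule instead) are deliberately rejected.

```python
def major(version):
    """Extract the major number from a version string such as "6.1.0"."""
    return int(version.split(".")[0])


def compatible(a, b):
    """Post-5.0.0 rule: components work together if their majors differ by at most one."""
    if major(a) < 5 or major(b) < 5:
        raise ValueError("pre-5.0.0 releases follow the minor-version rule instead")
    return abs(major(a) - major(b)) <= 1


def upgrade_steps(current, target):
    """List the major releases to pass through, one major version at a time."""
    if major(current) < 5:
        raise ValueError("pre-5.0.0 releases follow the minor-version rule instead")
    return [f"{m}.x.x" for m in range(major(current) + 1, major(target) + 1)]


print(compatible("5.0.1", "5.0.3"))
print(compatible("5.2.0", "7.0.0"))
print(upgrade_steps("5.2.0", "7.1.0"))
```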
As an extra precaution you might want to back up your application prior to upgrading. We provide more instructions in Backup before upgrading.
!!! warning "Upgrading to Teleport 4.0+"
Teleport 4.0+ switched to GRPC and HTTP/2 as an API protocol. The HTTP/2 spec bans
two previously recommended ciphers: `tls-rsa-with-aes-128-gcm-sha256` and `tls-rsa-with-aes-256-gcm-sha384`. Make sure these are removed from `teleport.yaml`.
[Visit our community for more details](https://community.gravitational.com/t/drop-ciphersuites-blacklisted-by-http-2-spec/446)
If upgrading you might want to consider rotating CA to SHA-256 or SHA-512 for RSA
SSH certificate signatures. The previous default was SHA-1, which is now considered
weak against brute-force attacks. SHA-1 certificate signatures are also no longer
accepted by OpenSSH versions 8.2 and above. All new Teleport clusters will default
to SHA-512 based signatures. To upgrade an existing cluster, set the following in
your teleport.yaml:
```yaml
teleport:
  ca_signature_algo: "rsa-sha2-512"
```
After updating to 4.3+ rotate the cluster CA [following these docs](#certificate-rotation).
Backup Before Upgrading
As an extra precaution you might want to back up your application prior to upgrading. We have more instructions in Backing up Teleport.
Upgrade Sequence
When upgrading a single Teleport cluster:
- Upgrade the auth server first. The auth server keeps the cluster state and if there are data format changes introduced in the new version this will perform necessary migrations.
- Then, upgrade the proxy servers. The proxy servers are stateless and can be upgraded in any sequence or at the same time.
- Finally, upgrade the SSH nodes in any sequence or at the same time.
!!! warning "Warning"
If several auth servers are running in HA configuration
(for example, in AWS auto-scaling group) you have to shrink the group to
**just one auth server** prior to performing an upgrade. While Teleport
will attempt to perform any necessary migrations, we recommend users
create a backup of their backend before upgrading the Auth Server, as a
precaution. This allows for a safe rollback in case the migration itself
fails.
When upgrading multiple clusters:
- First, upgrade the main cluster, i.e. the one which other clusters trust.
- Upgrade the trusted clusters.
Backing Up Teleport
When planning a backup of Teleport, it's important to know what is stored where and the importance of each component. Teleport's Proxies and Nodes are stateless, and thus only `teleport.yaml` should be backed up.
The Auth server is Teleport's brains, and depending on the backend should be backed up regularly.
For example, a customer running Teleport on AWS with DynamoDB has these key items of data:
What | Where ( Example AWS Customer ) |
---|---|
Local Users ( not SSO ) | DynamoDB |
Certificate Authorities | DynamoDB |
Trusted Clusters | DynamoDB |
Connectors: SSO | DynamoDB / File System |
RBAC | DynamoDB / File System |
teleport.yaml | File System |
teleport.service | File System |
license.pem | File System |
TLS key/certificate | ( File System / Outside Scope ) |
Audit log | DynamoDB |
Session recordings | S3 |
For this customer, we would recommend using AWS best practices for backing up DynamoDB. If DynamoDB is used for the audit log, logged events have a TTL of 1 year.
Backend | Recommended backup strategy |
---|---|
dir ( local filesystem ) | Back up the /var/lib/teleport/storage directory and the output of tctl get all. |
DynamoDB | Follow AWS Guidelines for Backup & Restore |
etcd | Follow etcd Guidelines for Disaster Recovery |
Firestore | Follow GCP Guidelines for Automated Backups |
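For the dir backend, the two items in the table (the storage directory and the output of tctl get all) can be captured together. The following is a minimal sketch under stated assumptions: the function name backup_teleport_dir, the destination directory, and the TCTL environment override are hypothetical and not part of Teleport.

```shell
#!/bin/sh
# Sketch: back up a `dir` backend (storage directory plus `tctl get all`).
# TCTL is a hypothetical override so the command can be swapped or given
# extra arguments; it defaults to the plain `tctl` binary.
set -eu

backup_teleport_dir() {
    storage_dir="$1"   # e.g. /var/lib/teleport/storage
    dest_dir="$2"      # hypothetical destination for backup artifacts
    stamp="$(date +%Y%m%d-%H%M%S)"

    mkdir -p "$dest_dir"

    # Archive the storage directory relative to its parent so the
    # archive contains "storage/..." rather than an absolute path.
    tar -czf "$dest_dir/storage-$stamp.tar.gz" \
        -C "$(dirname "$storage_dir")" "$(basename "$storage_dir")"

    # Export dynamic resources alongside the archive.
    # Left unquoted on purpose so TCTL may carry arguments.
    ${TCTL:-tctl} get all > "$dest_dir/state-$stamp.yaml"

    # Print the archive path for the caller.
    echo "$dest_dir/storage-$stamp.tar.gz"
}
```

A call might look like `backup_teleport_dir /var/lib/teleport/storage /backup/teleport`, where /backup/teleport is a placeholder. Whether the archive should be taken while the auth server is running depends on your setup and is not addressed here.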
Teleport Resources
Teleport uses YAML resources for roles, trusted clusters, local users and auth connectors. These can be created via tctl or via the UI.
GitOps
If running Teleport at scale, it's important for teams to have an automated way to restore Teleport. At a high level, this is our recommended approach:
- Persist and backup your backend
- Share that backend among auth servers
- Store your configs as discrete files in VCS
- Have your CI run tctl create -f *.yaml from that git directory
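The last step above can be sketched as a small CI helper. This is an illustration, not an official script: the function name apply_resources and the TCTL override are hypothetical, and the loop assumes it is safer to pass one file per tctl create invocation than to rely on shell glob expansion.

```shell
#!/bin/sh
# Sketch: apply every YAML resource in a checked-out git directory.
# TCTL is a hypothetical override (defaults to `tctl`).

apply_resources() {
    dir="$1"
    for f in "$dir"/*.yaml; do
        # If the glob matched nothing it stays literal; skip it.
        [ -e "$f" ] || continue
        echo "applying $f" >&2
        # Stop at the first failure so CI reports the offending file.
        ${TCTL:-tctl} create -f "$f" || return 1
    done
}
```

In a pipeline this would typically be called as `apply_resources .` from the root of the config repository, on a host where tctl can reach the auth server.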
Migrating Backends
As of version v4.1 you can now quickly export a collection of resources from Teleport. This feature was designed to help customers migrate from local storage to etcd.
Using tctl get all will retrieve the following items:
- Users
- Certificate Authorities
- Trusted Clusters
- Connectors:
  - GitHub
  - SAML [Teleport Enterprise]
  - OIDC [Teleport Enterprise]
- Roles [Teleport Enterprise]
When migrating backends, you should back up your auth server's data_dir/storage directly.
Example of backing up and restoring a cluster.
# export dynamic configuration state from old cluster
$ tctl get all > state.yaml
# prepare a new uninitialized backend (make sure to port
# any non-default config values from the old config file)
$ mkdir fresh && cat > fresh.yaml << EOF
teleport:
  data_dir: fresh
EOF
# bootstrap fresh server (kill the old one first!)
$ teleport start --config fresh.yaml --bootstrap state.yaml
# from another terminal, verify state transferred correctly
$ tctl --config fresh.yaml get all
# <your state here!>
The --bootstrap flag has no effect except during backend initialization (performed by the auth server on first start), so it is safe for use in supervised/HA contexts.
Limitations
- All the same limitations around modifying the config file of an existing cluster also apply to a new cluster being bootstrapped from the state of an old cluster. Of particular note:
  - Changing the cluster name will break your CAs (this will be caught and Teleport will refuse to start).
  - Some user authentication mechanisms (e.g. U2F) require that the public endpoint of the web UI remains the same (this can't be caught by Teleport, so be careful!).
- Any node whose invite token is defined statically (in the config file of the auth server) will be able to join automatically, but nodes that were added dynamically will need to be re-invited.
Daemon Restarts
As covered in the Graceful Restarts section, Teleport supports graceful restarts. To upgrade a host to a newer Teleport version, an administrator must:
This will perform a graceful restart, i.e. the Teleport daemon will fork a new process to handle new incoming requests, leaving the old daemon process running until existing clients disconnect.
License File
Commercial Teleport subscriptions require a valid license. The license file can be downloaded from the Teleport Customer Portal.
The Teleport license file contains an X.509 certificate and the corresponding private key in PEM format. Place the downloaded file on Auth servers and set the license_file configuration parameter of your teleport.yaml to point to the file location:
auth_service:
  license_file: /var/lib/teleport/license.pem
The license_file path can be either absolute or relative to the configured data_dir. If the license file path is not set, Teleport will look for the license.pem file in the configured data_dir.
!!! tip "NOTE"
Only Auth servers require the license. Proxies and Nodes that do
not also have Auth role enabled do not need the license.
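The lookup rules for license_file described above can be restated as a small shell function, which may help when checking where a given configuration expects the file to be. This is only a sketch of the rules as the text states them, not Teleport's actual implementation; the function name resolve_license_path is hypothetical.

```shell
#!/bin/sh
# Sketch: resolve the effective license path from data_dir and an
# optional license_file value, following the rules in the text:
#   - unset           -> <data_dir>/license.pem
#   - absolute path   -> used as-is
#   - relative path   -> resolved against data_dir

resolve_license_path() {
    data_dir="$1"
    license_file="${2:-}"

    if [ -z "$license_file" ]; then
        printf '%s\n' "$data_dir/license.pem"
        return 0
    fi

    case "$license_file" in
        /*) printf '%s\n' "$license_file" ;;
        *)  printf '%s\n' "$data_dir/$license_file" ;;
    esac
}
```

For example, `resolve_license_path /var/lib/teleport` prints /var/lib/teleport/license.pem, which matches the default location mentioned above.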
Troubleshooting
To diagnose problems, you can configure teleport to run with verbose logging enabled by passing it the -d flag.
!!! tip "NOTE"
It is not recommended to run Teleport in production with verbose
logging as it generates a substantial amount of data.
Sometimes you may want to reset teleport to a clean state. This can be accomplished by erasing everything under the data_dir directory. Assuming the default location, rm -rf /var/lib/teleport/* will do.
Teleport also supports HTTP endpoints for monitoring purposes. They are disabled by default, but you can enable them:
$ teleport start --diag-addr=127.0.0.1:3000
Now you can see the monitoring information by visiting several endpoints:
- http://127.0.0.1:3000/metrics is the list of internal metrics Teleport is tracking. It is compatible with Prometheus collectors. For a full list of metrics review our metrics reference.
- http://127.0.0.1:3000/healthz returns "OK" if the process is healthy or 503 otherwise.
- http://127.0.0.1:3000/readyz is similar to /healthz, but it returns "OK" only after the node successfully joined the cluster, i.e. it draws the difference between "healthy" and "ready".
- http://127.0.0.1:3000/debug/pprof/ is Golang's standard profiler. It's only available when the -d flag is given in addition to --diag-addr.
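The health and readiness endpoints listed above lend themselves to a simple scripted probe. The sketch below assumes curl is installed and that the diagnostic address is the 127.0.0.1:3000 used in this section; the function names status_label and probe are hypothetical. Any status other than 200 is reported as "not-ok" without interpreting it further.

```shell
#!/bin/sh
# Sketch: probe the diagnostic endpoints and summarize the result.

# Map an HTTP status code to a short label.
status_label() {
    case "$1" in
        200) echo ok ;;
        000) echo unreachable ;;   # curl reports 000 when no response arrived
        *)   echo not-ok ;;
    esac
}

# Fetch only the status code for base URL + path and print "<path> <label>".
probe() {
    base="$1"
    path="$2"
    code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$base$path" || true)"
    echo "$path $(status_label "${code:-000}")"
}
```

Typical use would be `probe http://127.0.0.1:3000 /healthz` followed by `probe http://127.0.0.1:3000 /readyz`, for instance from a supervisor or load balancer check.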
Getting Help
If you need help, please ask on our community forum. You can also open an issue on GitHub.
For commercial support, you can create a ticket through the customer dashboard.
For more information about custom features, or to try our Enterprise edition of Teleport, please reach out to us at sales@gravitational.com.