---
obj: application
repo: https://github.com/timescale/timescaledb
website: https://www.timescale.com
rev: 2024-09-30
---
# TimescaleDB
TimescaleDB is an open-source time-series database built on [PostgreSQL](./Postgres.md), designed to handle large volumes of time-series data efficiently. It provides powerful data management features, making it suitable for applications in various domains such as IoT, finance, and analytics.
Features:
- Hypertables: The backbone of TimescaleDB. Hypertables automatically partition data by time into chunks, which keeps very large datasets manageable.
- Continuous Aggregates: Pre-compute and store aggregate data and keep it refreshed as new data arrives, which speeds up common analytical queries.
- Data Compression: Compresses older chunks into a columnar layout, which reduces storage footprint and can also speed up analytical scans.
- Optimized Indexing: Hypertables get an index on the time column by default, and queries that filter on time only need to touch the chunks that overlap the requested range. Hypertables can also be partitioned on additional dimensions.
## Installation
**Create the extension in your database**:
```sql
CREATE EXTENSION IF NOT EXISTS timescaledb;
```
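Some hyperfunctions are provided by the separate `timescaledb_toolkit` extension rather than by the core extension. A minimal sketch for enabling it and checking what is installed, assuming the Toolkit package is already present on the database server:

```sql
-- Assumes the timescaledb_toolkit package is installed on the server.
CREATE EXTENSION IF NOT EXISTS timescaledb_toolkit;

-- List the installed TimescaleDB-related extensions and their versions.
SELECT extname, extversion
FROM pg_extension
WHERE extname LIKE 'timescaledb%';
```

If `CREATE EXTENSION timescaledb` itself fails, check that `timescaledb` is listed in `shared_preload_libraries` in `postgresql.conf`.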
## Hypertables
Hypertables are PostgreSQL tables that automatically partition your data by time. You interact with hypertables in the same way as regular PostgreSQL tables, but with extra features that make managing your time-series data much easier.
In TimescaleDB, hypertables exist alongside regular PostgreSQL tables. Use hypertables to store time-series data, which gives you improved insert and query performance and access to time-series features. Use regular PostgreSQL tables for other relational data.
Behind the scenes, the database sets up and maintains the hypertable's partitions (called chunks), each covering a range of the time column. Meanwhile, you insert and query your data as if it all lived in a single, regular PostgreSQL table.
**Create a hypertable:**
- Create a standard PostgreSQL table:
```sql
CREATE TABLE conditions (
    time        TIMESTAMPTZ       NOT NULL,
    location    TEXT              NOT NULL,
    device      TEXT              NOT NULL,
    temperature DOUBLE PRECISION  NULL,
    humidity    DOUBLE PRECISION  NULL
);
```
- Convert the table to a hypertable. Specify the name of the table you want to convert, and the column that holds its time values. The `by_range` dimension builder requires TimescaleDB 2.13 or later.
```sql
SELECT create_hypertable('conditions', by_range('time'));
```
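Once converted, the hypertable is used like any other table. The following sketch inserts two rows into `conditions`, reads them back, and lists the chunks created behind the scenes. The sample values (`'office'`, `'dev-1'`, the readings) are made up for illustration:

```sql
-- Insert rows exactly as you would into a regular table (sample values are made up).
INSERT INTO conditions (time, location, device, temperature, humidity)
VALUES
    (now() - INTERVAL '2 days', 'office', 'dev-1', 21.4, 40.2),
    (now(),                     'office', 'dev-1', 22.1, 39.8);

-- Query it like any other table.
SELECT time, device, temperature
FROM conditions
WHERE time > now() - INTERVAL '7 days'
ORDER BY time DESC;

-- List the chunks (partitions) TimescaleDB created for the hypertable.
SELECT show_chunks('conditions');
```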
## Hyperfunctions
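Hyperfunctions allow you to query and aggregate your time-series data. Some of them (`time_bucket`, `first`, `last`, `locf`) ship with the core TimescaleDB extension, while others are part of the separate `timescaledb_toolkit` extension.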
### delta
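The `delta` function computes the change in a value over a time range. It helps in understanding how a metric (e.g., temperature, stock price, etc.) changes between the first and last reading. In TimescaleDB Toolkit, `delta` is an accessor applied to a counter or gauge aggregate rather than to a raw column.
Example: Calculate Temperature Change Over a Day
```sql
-- gauge_agg and its accessors come from timescaledb_toolkit; depending on the
-- Toolkit version they may need the toolkit_experimental schema qualifier.
SELECT
    delta(gauge_agg(time, temperature)) AS temp_change
FROM temperature_readings
WHERE time >= '2023-09-01' AND time < '2023-09-02';
```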
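Where the Toolkit is not available, the net change over a range can also be sketched with the core `first` and `last` aggregates. This is an alternative technique rather than the `delta` function itself, and it assumes the `temperature_readings` table has `time` and `temperature` columns as in the example above:

```sql
-- Net change = value at the latest timestamp minus value at the earliest one.
SELECT
    last(temperature, time) - first(temperature, time) AS temp_change
FROM temperature_readings
WHERE time >= '2023-09-01' AND time < '2023-09-02';
```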
### derivative
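The rate of change (derivative) of a series over time can be derived from a counter or gauge aggregate. In TimescaleDB Toolkit this is typically done with the `rate` accessor, which returns the change per second; check your installed Toolkit version for the exact function names it provides.
Example: Calculate the Rate of Temperature Change Per Hour
```sql
-- rate() returns units per second, so multiply by 3600 for a per-hour figure.
-- gauge_agg may live in the toolkit_experimental schema, depending on version.
SELECT
    time_bucket('1 hour', time) AS hour,
    rate(gauge_agg(time, temperature)) * 3600 AS temp_rate_change
FROM temperature_readings
GROUP BY hour
ORDER BY hour;
```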
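A point-to-point rate of change can also be sketched in plain PostgreSQL with the `lag` window function, without any hyperfunction. This assumes the same `temperature_readings` table; the `degrees_per_hour` alias is illustrative:

```sql
SELECT
    time,
    -- Difference to the previous reading, divided by the elapsed seconds,
    -- scaled to one hour. NULLIF avoids division by zero on duplicate timestamps.
    (temperature - lag(temperature) OVER w)
        / NULLIF(extract(epoch FROM (time - lag(time) OVER w)), 0)
        * 3600 AS degrees_per_hour
FROM temperature_readings
WINDOW w AS (ORDER BY time)
ORDER BY time;
```

The first row has no previous reading, so its rate is `NULL`.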
### first & last
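The `first` and `last` hyperfunctions return the value of one column as ordered by another: `first(value, time)` is the value at the earliest `time` in the group, and `last(value, time)` is the value at the latest.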
```sql
SELECT
    time_bucket('1 day', time) AS day,
    first(stock_price, time) AS opening_price,
    last(stock_price, time) AS closing_price
FROM stock_prices
GROUP BY day
ORDER BY day;
```
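Another common use of `last` is fetching the most recent reading per device. A short sketch against the `conditions` hypertable defined earlier:

```sql
SELECT
    device,
    last(temperature, time) AS latest_temperature,
    max(time) AS latest_time
FROM conditions
GROUP BY device
ORDER BY device;
```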
### locf
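The `locf` (Last Observation Carried Forward) function fills missing data by carrying the last known observation forward to the missing timestamps. It must be used together with `time_bucket_gapfill`, which generates the empty buckets and needs an explicit time range in the `WHERE` clause.
```sql
SELECT
    time_bucket_gapfill('1 hour', time) AS hour,
    locf(last(temperature, time)) AS filled_temperature
FROM temperature_readings
-- Both bounds are required so gapfill knows which buckets to generate
-- (the dates here are illustrative).
WHERE time >= '2023-09-01' AND time < '2023-09-02'
GROUP BY hour
ORDER BY hour;
```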
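Gap filling in TimescaleDB is built around `time_bucket_gapfill`, which emits a row for every bucket in the requested range. As an alternative to carrying the last value forward, the core `interpolate` function fills empty buckets by linear interpolation between neighbouring buckets. A sketch, with an illustrative date range:

```sql
SELECT
    time_bucket_gapfill('1 hour', time) AS hour,
    interpolate(avg(temperature)) AS interpolated_temperature
FROM temperature_readings
-- time_bucket_gapfill needs both a lower and an upper time bound.
WHERE time >= '2023-09-01' AND time < '2023-09-02'
GROUP BY hour
ORDER BY hour;
```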
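### interpolated_average
The `interpolated_average` hyperfunction (from TimescaleDB Toolkit) computes the time-weighted average of a series for each bucket, interpolating the values at the bucket boundaries from the neighbouring buckets. It operates on a `time_weight` aggregate rather than on raw columns.
```sql
-- Older Toolkit versions may expose this as toolkit_experimental.interpolated_average.
SELECT
    hour,
    interpolated_average(tws, hour, '1 hour',
        LAG(tws) OVER (ORDER BY hour),
        LEAD(tws) OVER (ORDER BY hour)) AS interpolated_power
FROM (
    SELECT time_bucket('1 hour', time) AS hour,
           time_weight('Linear', time, power_usage) AS tws
    FROM power_data
    WHERE time BETWEEN '2023-09-01' AND '2023-09-07'
    GROUP BY hour
) AS buckets;
```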
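When interpolation at the bucket boundaries is not needed, a plain time-weighted average per bucket is a simpler variant. This sketch uses the Toolkit's `time_weight` aggregate with its `average` accessor, and assumes the `power_data` table from the example above:

```sql
SELECT
    time_bucket('1 hour', time) AS hour,
    -- 'Linear' weights each pair of points by the time between them;
    -- 'LOCF' is the other supported method.
    average(time_weight('Linear', time, power_usage)) AS time_weighted_power
FROM power_data
WHERE time BETWEEN '2023-09-01' AND '2023-09-07'
GROUP BY hour
ORDER BY hour;
```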
### time_bucket
The `time_bucket` hyperfunction groups timestamps into fixed-width intervals, similar to PostgreSQL's `date_trunc` but with arbitrary interval widths. Use it to summarize data over time-based intervals, such as daily averages or hourly sums.
```sql
SELECT
    time_bucket('1 hour', time) AS bucketed_time,
    avg(cpu_usage) AS avg_cpu_usage
FROM server_metrics
-- Half-open range covering all of September 2023.
WHERE time >= '2023-09-01' AND time < '2023-10-01'
GROUP BY bucketed_time
ORDER BY bucketed_time;
```
```