6.2 KiB
6.2 KiB
Time-Series Database Schema Specification
Overview
This document specifies the time-series database schema using ClickHouse or TimescaleDB for storing mempool data, metrics, and analytics time-series data.
Technology Choice
Option 1: TimescaleDB (PostgreSQL extension)
- Pros: PostgreSQL compatibility, SQL interface, easier integration
- Cons: Less optimized for very high throughput
Option 2: ClickHouse
- Pros: Very high performance, columnar storage, excellent compression
- Cons: Different SQL dialect, separate infrastructure
Recommendation: Start with TimescaleDB for easier integration, migrate to ClickHouse if needed for scale.
TimescaleDB Schema
Mempool Transactions Table
Table: mempool_transactions
CREATE TABLE mempool_transactions (
time TIMESTAMPTZ NOT NULL,
chain_id INTEGER NOT NULL,
hash VARCHAR(66) NOT NULL,
from_address VARCHAR(42) NOT NULL,
to_address VARCHAR(42),
value NUMERIC(78, 0),
gas_price BIGINT,
max_fee_per_gas BIGINT,
max_priority_fee_per_gas BIGINT,
gas_limit BIGINT,
nonce BIGINT,
input_data_length INTEGER,
first_seen TIMESTAMPTZ NOT NULL,
status VARCHAR(20) DEFAULT 'pending', -- 'pending', 'confirmed', 'dropped'
confirmed_block_number BIGINT,
confirmed_at TIMESTAMPTZ,
PRIMARY KEY (time, chain_id, hash)
);
SELECT create_hypertable('mempool_transactions', 'time');
CREATE INDEX idx_mempool_chain_hash ON mempool_transactions(chain_id, hash);
CREATE INDEX idx_mempool_chain_from ON mempool_transactions(chain_id, from_address);
CREATE INDEX idx_mempool_chain_status ON mempool_transactions(chain_id, status, time);
Retention Policy: 7 days for detailed data, aggregates for longer periods
Network Metrics Table
Table: network_metrics
CREATE TABLE network_metrics (
time TIMESTAMPTZ NOT NULL,
chain_id INTEGER NOT NULL,
block_number BIGINT,
tps DOUBLE PRECISION, -- Transactions per second
gps DOUBLE PRECISION, -- Gas per second
avg_gas_price BIGINT,
pending_transactions INTEGER,
block_time_seconds DOUBLE PRECISION,
PRIMARY KEY (time, chain_id)
);
SELECT create_hypertable('network_metrics', 'time');
CREATE INDEX idx_network_metrics_chain_time ON network_metrics(chain_id, time DESC);
Aggregation: Pre-aggregate to 1-minute, 5-minute, 1-hour intervals
Gas Price History Table
Table: gas_price_history
CREATE TABLE gas_price_history (
time TIMESTAMPTZ NOT NULL,
chain_id INTEGER NOT NULL,
block_number BIGINT,
min_gas_price BIGINT,
max_gas_price BIGINT,
avg_gas_price BIGINT,
p25_gas_price BIGINT, -- 25th percentile
p50_gas_price BIGINT, -- 50th percentile (median)
p75_gas_price BIGINT, -- 75th percentile
p95_gas_price BIGINT, -- 95th percentile
p99_gas_price BIGINT, -- 99th percentile
PRIMARY KEY (time, chain_id)
);
SELECT create_hypertable('gas_price_history', 'time');
Address Activity Metrics Table
Table: address_activity_metrics
CREATE TABLE address_activity_metrics (
time TIMESTAMPTZ NOT NULL,
chain_id INTEGER NOT NULL,
address VARCHAR(42) NOT NULL,
transaction_count INTEGER,
received_count INTEGER,
sent_count INTEGER,
total_received NUMERIC(78, 0),
total_sent NUMERIC(78, 0),
PRIMARY KEY (time, chain_id, address)
);
SELECT create_hypertable('address_activity_metrics', 'time',
chunk_time_interval => INTERVAL '1 day');
CREATE INDEX idx_address_activity_chain_address ON address_activity_metrics(chain_id, address, time DESC);
Aggregation: Pre-aggregate to hourly/daily for addresses
ClickHouse Schema (Alternative)
Mempool Transactions Table
CREATE TABLE mempool_transactions (
time DateTime('UTC') NOT NULL,
chain_id UInt32 NOT NULL,
hash String NOT NULL,
from_address String NOT NULL,
to_address Nullable(String),
value Decimal128(0),
gas_price UInt64,
max_fee_per_gas Nullable(UInt64),
max_priority_fee_per_gas Nullable(UInt64),
gas_limit UInt64,
nonce UInt64,
input_data_length UInt32,
first_seen DateTime('UTC') NOT NULL,
status String DEFAULT 'pending',
confirmed_block_number Nullable(UInt64),
confirmed_at Nullable(DateTime('UTC'))
) ENGINE = MergeTree()
PARTITION BY toYYYYMM(time)
ORDER BY (chain_id, time, hash)
TTL time + INTERVAL 7 DAY; -- Auto-delete after 7 days
Data Retention and Aggregation
Retention Policies
Raw Data:
- Mempool transactions: 7 days
- Network metrics: 30 days
- Gas price history: 90 days
- Address activity: 30 days
Aggregated Data:
- 1-minute aggregates: 90 days
- 5-minute aggregates: 1 year
- 1-hour aggregates: 5 years
- Daily aggregates: Indefinite
Continuous Aggregates (TimescaleDB)
-- 1-minute network metrics aggregate
CREATE MATERIALIZED VIEW network_metrics_1m
WITH (timescaledb.continuous) AS
SELECT
time_bucket('1 minute', time) AS bucket,
chain_id,
AVG(tps) AS avg_tps,
AVG(gps) AS avg_gps,
AVG(avg_gas_price) AS avg_gas_price,
AVG(pending_transactions) AS avg_pending_tx
FROM network_metrics
GROUP BY bucket, chain_id;
-- Add refresh policy
SELECT add_continuous_aggregate_policy('network_metrics_1m',
start_offset => INTERVAL '1 hour',
end_offset => INTERVAL '1 minute',
schedule_interval => INTERVAL '1 minute');
Query Patterns
Recent Mempool Transactions
SELECT * FROM mempool_transactions
WHERE chain_id = 138
AND time > NOW() - INTERVAL '1 hour'
AND status = 'pending'
ORDER BY time DESC
LIMIT 100;
Gas Price Statistics
SELECT
time_bucket('5 minutes', time) AS bucket,
AVG(avg_gas_price) AS avg_gas_price,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY avg_gas_price) AS median_gas_price
FROM gas_price_history
WHERE chain_id = 138
AND time > NOW() - INTERVAL '24 hours'
GROUP BY bucket
ORDER BY bucket DESC;
Network Throughput
SELECT
time_bucket('1 minute', time) AS bucket,
AVG(tps) AS avg_tps,
MAX(tps) AS max_tps
FROM network_metrics
WHERE chain_id = 138
AND time > NOW() - INTERVAL '1 hour'
GROUP BY bucket
ORDER BY bucket DESC;
References
- Mempool Service: See
../mempool/mempool-service.md - Observability: See
../observability/metrics-monitoring.md