Data Models Specification
Overview
This document specifies the data models used throughout the indexing pipeline and stored in the database. All chain-scoped models support multi-chain operation via a chain_id field; the users table is the only chain-agnostic model.
Core Data Models
Block Schema
Table: blocks
Fields:
blocks (
id BIGSERIAL PRIMARY KEY,
chain_id INTEGER NOT NULL,
number BIGINT NOT NULL,
hash VARCHAR(66) NOT NULL,
parent_hash VARCHAR(66) NOT NULL,
nonce VARCHAR(18),
sha3_uncles VARCHAR(66),
logs_bloom TEXT,
transactions_root VARCHAR(66),
state_root VARCHAR(66),
receipts_root VARCHAR(66),
miner VARCHAR(42),
difficulty NUMERIC,
total_difficulty NUMERIC,
size BIGINT,
extra_data TEXT,
gas_limit BIGINT,
gas_used BIGINT,
timestamp TIMESTAMP NOT NULL,
transaction_count INTEGER DEFAULT 0,
base_fee_per_gas BIGINT, -- EIP-1559
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW(),
UNIQUE(chain_id, number),
UNIQUE(chain_id, hash)
)
Indexes:
- idx_blocks_chain_number ON (chain_id, number)
- idx_blocks_chain_hash ON (chain_id, hash)
- idx_blocks_chain_timestamp ON (chain_id, timestamp)
Relationships:
- One-to-many with transactions
- One-to-many with logs
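To illustrate how a node response could map onto these columns, here is a minimal Python sketch. It assumes the input follows the standard eth_getBlockByNumber JSON-RPC shape (hex-encoded quantities, camelCase keys). The helper names hex_to_int and block_row_from_rpc, the sample values, and the choice to lowercase hashes are assumptions made for illustration, not part of this specification.

```python
from datetime import datetime, timezone


def hex_to_int(value):
    """Convert a JSON-RPC hex quantity such as '0x1b4' to int; None stays None."""
    return None if value is None else int(value, 16)


def block_row_from_rpc(chain_id, block):
    """Map an eth_getBlockByNumber result onto a subset of the blocks columns."""
    return {
        "chain_id": chain_id,
        "number": hex_to_int(block["number"]),
        "hash": block["hash"].lower(),
        "parent_hash": block["parentHash"].lower(),
        "miner": block["miner"].lower(),
        "gas_limit": hex_to_int(block["gasLimit"]),
        "gas_used": hex_to_int(block["gasUsed"]),
        # The RPC timestamp is Unix seconds; converted here to a UTC datetime.
        "timestamp": datetime.fromtimestamp(
            hex_to_int(block["timestamp"]), tz=timezone.utc
        ),
        "transaction_count": len(block.get("transactions", [])),
        # Missing on pre-EIP-1559 blocks, hence the nullable column.
        "base_fee_per_gas": hex_to_int(block.get("baseFeePerGas")),
    }


# Hypothetical sample input, trimmed to the fields used above.
sample_block = {
    "number": "0x1b4",
    "hash": "0x" + "AB" * 32,
    "parentHash": "0x" + "cd" * 32,
    "miner": "0x" + "EF" * 20,
    "gasLimit": "0x1c9c380",
    "gasUsed": "0x5208",
    "timestamp": "0x5f5e100",
    "transactions": ["0x" + "11" * 32],
}
row = block_row_from_rpc(138, sample_block)
```

The remaining columns (state_root, difficulty, and so on) would follow the same pattern: hex quantities become integers, hex strings are stored as text.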
Transaction Schema
Table: transactions
Fields:
transactions (
id BIGSERIAL PRIMARY KEY,
chain_id INTEGER NOT NULL,
hash VARCHAR(66) NOT NULL,
block_number BIGINT NOT NULL,
block_hash VARCHAR(66) NOT NULL,
transaction_index INTEGER NOT NULL,
from_address VARCHAR(42) NOT NULL,
to_address VARCHAR(42), -- NULL for contract creation
value NUMERIC NOT NULL DEFAULT 0,
gas_price BIGINT,
max_fee_per_gas BIGINT, -- EIP-1559
max_priority_fee_per_gas BIGINT, -- EIP-1559
gas_limit BIGINT NOT NULL,
gas_used BIGINT,
nonce BIGINT NOT NULL,
input_data TEXT, -- Contract call data
status INTEGER, -- 0 = failed, 1 = success
contract_address VARCHAR(42), -- NULL if not contract creation
cumulative_gas_used BIGINT,
effective_gas_price BIGINT, -- Actual gas price paid
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW(),
FOREIGN KEY (chain_id, block_number) REFERENCES blocks(chain_id, number),
UNIQUE(chain_id, hash)
)
Indexes:
- idx_transactions_chain_hash ON (chain_id, hash)
- idx_transactions_chain_block ON (chain_id, block_number, transaction_index)
- idx_transactions_chain_from ON (chain_id, from_address)
- idx_transactions_chain_to ON (chain_id, to_address)
- idx_transactions_chain_block_from ON (chain_id, block_number, from_address)
Relationships:
- Many-to-one with blocks
- One-to-many with logs
- One-to-many with internal_transactions
- One-to-many with token_transfers
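The gas columns relate to each other through the EIP-1559 fee rule. The sketch below shows how effective_gas_price could be cross-checked against max_fee_per_gas, max_priority_fee_per_gas, and the block's base_fee_per_gas. Nodes normally report the effective price in the receipt, so this is only a consistency check; the function names are hypothetical, and all values are in wei.

```python
def effective_gas_price(base_fee_per_gas, gas_price=None,
                        max_fee_per_gas=None, max_priority_fee_per_gas=None):
    """Return the price per gas actually paid, in wei.

    Legacy transactions (no max_fee_per_gas) pay gas_price as-is.
    EIP-1559 transactions pay the base fee plus a priority fee that is
    capped by whatever headroom max_fee_per_gas leaves above the base fee.
    """
    if max_fee_per_gas is None:
        return gas_price
    priority_fee = min(max_priority_fee_per_gas,
                       max_fee_per_gas - base_fee_per_gas)
    return base_fee_per_gas + priority_fee


def transaction_fee(gas_used, price_per_gas):
    """Total fee in wei; may exceed 64 bits, so store it as NUMERIC."""
    return gas_used * price_per_gas


GWEI = 10**9
price = effective_gas_price(
    base_fee_per_gas=100 * GWEI,
    max_fee_per_gas=150 * GWEI,
    max_priority_fee_per_gas=2 * GWEI,
)
fee = transaction_fee(21000, price)
```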
Receipt Schema
Note: Receipt data is stored denormalized in the transactions table for efficiency. If separate storage is needed:
Table: transaction_receipts
transaction_receipts (
id BIGSERIAL PRIMARY KEY,
chain_id INTEGER NOT NULL,
transaction_hash VARCHAR(66) NOT NULL,
transaction_index INTEGER NOT NULL,
block_number BIGINT NOT NULL,
block_hash VARCHAR(66) NOT NULL,
from_address VARCHAR(42) NOT NULL,
to_address VARCHAR(42),
gas_used BIGINT,
cumulative_gas_used BIGINT,
contract_address VARCHAR(42),
logs_bloom TEXT,
status INTEGER,
root VARCHAR(66), -- Pre-Byzantium
created_at TIMESTAMP DEFAULT NOW(),
FOREIGN KEY (chain_id, transaction_hash) REFERENCES transactions(chain_id, hash),
UNIQUE(chain_id, transaction_hash)
)
Log Schema
Table: logs
Fields:
logs (
id BIGSERIAL PRIMARY KEY,
chain_id INTEGER NOT NULL,
transaction_hash VARCHAR(66) NOT NULL,
block_number BIGINT NOT NULL,
block_hash VARCHAR(66) NOT NULL,
log_index INTEGER NOT NULL,
address VARCHAR(42) NOT NULL,
topic0 VARCHAR(66), -- Event signature
topic1 VARCHAR(66), -- First indexed parameter
topic2 VARCHAR(66), -- Second indexed parameter
topic3 VARCHAR(66), -- Third indexed parameter
data TEXT, -- Non-indexed parameters
decoded_data JSONB, -- Decoded event data (if ABI available)
created_at TIMESTAMP DEFAULT NOW(),
FOREIGN KEY (chain_id, transaction_hash) REFERENCES transactions(chain_id, hash),
UNIQUE(chain_id, transaction_hash, log_index)
)
Indexes:
- idx_logs_chain_tx ON (chain_id, transaction_hash)
- idx_logs_chain_address ON (chain_id, address)
- idx_logs_chain_topic0 ON (chain_id, topic0)
- idx_logs_chain_block ON (chain_id, block_number)
- idx_logs_chain_address_topic0 ON (chain_id, address, topic0) -- For event filtering
Relationships:
- Many-to-one with transactions
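JSON-RPC returns log topics as a single array, while this table stores them in four fixed columns. A minimal sketch of that mapping follows, assuming the standard eth_getLogs field names; log_row_from_rpc and the sample log are hypothetical.

```python
def log_row_from_rpc(chain_id, log):
    """Spread the variable-length topics array over topic0..topic3."""
    topics = [t.lower() for t in log.get("topics", [])]
    if len(topics) > 4:
        raise ValueError("EVM logs carry at most four topics")
    # Pad with None so unused topic columns are stored as NULL.
    padded = topics + [None] * (4 - len(topics))
    return {
        "chain_id": chain_id,
        "transaction_hash": log["transactionHash"].lower(),
        "block_number": int(log["blockNumber"], 16),
        "block_hash": log["blockHash"].lower(),
        "log_index": int(log["logIndex"], 16),
        "address": log["address"].lower(),
        "topic0": padded[0],
        "topic1": padded[1],
        "topic2": padded[2],
        "topic3": padded[3],
        "data": log["data"],
    }


# Hypothetical sample log with two topics.
sample_log = {
    "transactionHash": "0x" + "11" * 32,
    "blockNumber": "0x10",
    "blockHash": "0x" + "22" * 32,
    "logIndex": "0x3",
    "address": "0x" + "AA" * 20,
    "topics": ["0x" + "dd" * 32, "0x" + "EE" * 32],
    "data": "0x",
}
log_row = log_row_from_rpc(138, sample_log)
```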
Trace Schema
Table: traces
Fields:
traces (
id BIGSERIAL PRIMARY KEY,
chain_id INTEGER NOT NULL,
transaction_hash VARCHAR(66) NOT NULL,
block_number BIGINT NOT NULL,
block_hash VARCHAR(66) NOT NULL,
trace_address INTEGER[], -- Array representing call hierarchy [0,1,2]
subtraces INTEGER, -- Number of child calls
action_type VARCHAR(20) NOT NULL, -- 'call', 'create', 'suicide', 'delegatecall'
action_from VARCHAR(42),
action_to VARCHAR(42),
action_value NUMERIC DEFAULT 0,
action_input TEXT,
action_gas BIGINT,
action_call_type VARCHAR(20), -- 'call', 'delegatecall', 'staticcall'
result_type VARCHAR(20), -- 'callresult', 'createresult'
result_gas_used BIGINT,
result_output TEXT,
result_address VARCHAR(42), -- For create results
result_code TEXT, -- For create results
error TEXT, -- Error message if trace failed
created_at TIMESTAMP DEFAULT NOW(),
FOREIGN KEY (chain_id, transaction_hash) REFERENCES transactions(chain_id, hash)
)
Indexes:
- idx_traces_chain_tx ON (chain_id, transaction_hash)
- idx_traces_chain_block ON (chain_id, block_number)
- idx_traces_chain_from ON (chain_id, action_from)
- idx_traces_chain_to ON (chain_id, action_to)
Note: Trace data can be large. Consider partitioning or separate storage for historical traces.
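The trace_address array encodes each call's position in the call tree: an empty array is the top-level call, and dropping the last element yields the parent. The sketch below shows how parent lookup and a subtraces consistency check could work under that convention. The function names and sample data are hypothetical.

```python
def parent_trace_address(trace_address):
    """Return the parent's trace_address, or None for the top-level call."""
    if not trace_address:
        return None
    return trace_address[:-1]


def direct_children(traces, trace_address):
    """Traces whose parent is exactly trace_address (one level deeper)."""
    return [
        t for t in traces
        if t["trace_address"] and t["trace_address"][:-1] == trace_address
    ]


# Hypothetical call tree: root -> [0] -> ([0,0], [0,1]) and root -> [1].
sample_traces = [
    {"trace_address": [], "subtraces": 2},
    {"trace_address": [0], "subtraces": 2},
    {"trace_address": [0, 0], "subtraces": 0},
    {"trace_address": [0, 1], "subtraces": 0},
    {"trace_address": [1], "subtraces": 0},
]

# subtraces should equal the number of direct children of each trace.
consistent = all(
    t["subtraces"] == len(direct_children(sample_traces, t["trace_address"]))
    for t in sample_traces
)
```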
Internal Transaction Schema
Table: internal_transactions
Purpose: Track value transfers that occur within transactions (via calls).
Fields:
internal_transactions (
id BIGSERIAL PRIMARY KEY,
chain_id INTEGER NOT NULL,
transaction_hash VARCHAR(66) NOT NULL,
block_number BIGINT NOT NULL,
trace_address INTEGER[] NOT NULL,
from_address VARCHAR(42) NOT NULL,
to_address VARCHAR(42) NOT NULL,
value NUMERIC NOT NULL,
call_type VARCHAR(20), -- 'call', 'delegatecall', 'staticcall', 'create'
gas_limit BIGINT,
gas_used BIGINT,
input_data TEXT,
output_data TEXT,
error TEXT,
created_at TIMESTAMP DEFAULT NOW(),
FOREIGN KEY (chain_id, transaction_hash) REFERENCES transactions(chain_id, hash)
)
Indexes:
- idx_internal_tx_chain_tx ON (chain_id, transaction_hash)
- idx_internal_tx_chain_from ON (chain_id, from_address)
- idx_internal_tx_chain_to ON (chain_id, to_address)
- idx_internal_tx_chain_block ON (chain_id, block_number)
Relationships:
- Many-to-one with transactions
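One way to populate this table is to derive its rows from traces. The sketch below assumes input dictionaries keyed by the traces column names and applies a filter that is an assumption, not something this specification prescribes: skip the top-level call (it is already in transactions), keep only traces that move a positive value, and use result_address as the recipient for contract creations.

```python
def internal_transactions_from_traces(traces):
    """Derive internal_transactions rows from rows shaped like the traces table."""
    rows = []
    for trace in traces:
        if not trace["trace_address"]:
            continue  # top-level call: already recorded in transactions
        value = trace.get("action_value") or 0
        if value <= 0:
            continue  # assumption: only value-moving calls are kept
        # Creations have no action_to; the new contract is the recipient.
        to_address = trace.get("action_to") or trace.get("result_address")
        if to_address is None:
            continue  # to_address is NOT NULL in internal_transactions
        rows.append({
            "chain_id": trace["chain_id"],
            "transaction_hash": trace["transaction_hash"],
            "block_number": trace["block_number"],
            "trace_address": trace["trace_address"],
            "from_address": trace["action_from"],
            "to_address": to_address,
            "value": value,
            "call_type": trace.get("action_call_type") or trace["action_type"],
            "error": trace.get("error"),
        })
    return rows


# Hypothetical traces for one transaction.
base = {"chain_id": 138, "transaction_hash": "0x" + "11" * 32, "block_number": 16}
sample = [
    {**base, "trace_address": [], "action_type": "call", "action_call_type": "call",
     "action_from": "0x" + "aa" * 20, "action_to": "0x" + "bb" * 20, "action_value": 5},
    {**base, "trace_address": [0], "action_type": "call", "action_call_type": "call",
     "action_from": "0x" + "bb" * 20, "action_to": "0x" + "cc" * 20, "action_value": 3},
    {**base, "trace_address": [1], "action_type": "call", "action_call_type": "staticcall",
     "action_from": "0x" + "bb" * 20, "action_to": "0x" + "dd" * 20, "action_value": 0},
    {**base, "trace_address": [2], "action_type": "create", "action_call_type": None,
     "action_from": "0x" + "bb" * 20, "action_to": None,
     "result_address": "0x" + "ee" * 20, "action_value": 1},
]
internal_rows = internal_transactions_from_traces(sample)
```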
Token Transfer Schema
Table: token_transfers
Purpose: Track ERC-20, ERC-721, and ERC-1155 token transfers.
Fields:
token_transfers (
id BIGSERIAL PRIMARY KEY,
chain_id INTEGER NOT NULL,
transaction_hash VARCHAR(66) NOT NULL,
block_number BIGINT NOT NULL,
log_index INTEGER NOT NULL,
token_address VARCHAR(42) NOT NULL,
token_type VARCHAR(10) NOT NULL, -- 'ERC20', 'ERC721', 'ERC1155'
from_address VARCHAR(42) NOT NULL,
to_address VARCHAR(42) NOT NULL,
amount NUMERIC, -- For ERC-20 and ERC-1155
token_id VARCHAR(78), -- For ERC-721 and ERC-1155 (can be large)
operator VARCHAR(42), -- For ERC-1155
created_at TIMESTAMP DEFAULT NOW(),
FOREIGN KEY (chain_id, transaction_hash) REFERENCES transactions(chain_id, hash),
FOREIGN KEY (chain_id, token_address) REFERENCES tokens(chain_id, address)
)
Indexes:
- idx_token_transfers_chain_token ON (chain_id, token_address)
- idx_token_transfers_chain_from ON (chain_id, from_address)
- idx_token_transfers_chain_to ON (chain_id, to_address)
- idx_token_transfers_chain_tx ON (chain_id, transaction_hash)
- idx_token_transfers_chain_block ON (chain_id, block_number)
- idx_token_transfers_chain_token_from ON (chain_id, token_address, from_address)
- idx_token_transfers_chain_token_to ON (chain_id, token_address, to_address)
Relationships:
- Many-to-one with transactions
- Many-to-one with tokens
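Rows in this table are typically derived from logs. The sketch below decodes the Transfer(address,address,uint256) event, whose signature hash ERC-20 and ERC-721 share: ERC-20 carries the amount in data (three topics), while ERC-721 indexes the token ID as a fourth topic. The input is assumed to be a dictionary shaped like a logs row, and decode_transfer is a hypothetical name. ERC-1155 uses different events (TransferSingle, TransferBatch) and is not covered here.

```python
# keccak256("Transfer(address,address,uint256)")
TRANSFER_TOPIC0 = (
    "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"
)


def topic_to_address(topic):
    """An indexed address is left-padded to 32 bytes; keep the last 20."""
    return "0x" + topic[-40:].lower()


def decode_transfer(log_row):
    """Return a token_transfers row, or None if the log is not a Transfer."""
    if log_row["topic0"] != TRANSFER_TOPIC0:
        return None
    if log_row["topic1"] is None or log_row["topic2"] is None:
        return None  # non-standard Transfer without indexed from/to
    row = {
        "chain_id": log_row["chain_id"],
        "transaction_hash": log_row["transaction_hash"],
        "block_number": log_row["block_number"],
        "log_index": log_row["log_index"],
        "token_address": log_row["address"],
        "from_address": topic_to_address(log_row["topic1"]),
        "to_address": topic_to_address(log_row["topic2"]),
    }
    if log_row["topic3"] is None:
        data = log_row["data"] or "0x"
        amount = int(data, 16) if len(data) > 2 else 0
        row.update(token_type="ERC20", amount=amount, token_id=None)
    else:
        # Stored as a decimal string, matching token_id VARCHAR(78).
        token_id = str(int(log_row["topic3"], 16))
        row.update(token_type="ERC721", amount=None, token_id=token_id)
    return row


# Hypothetical ERC-20 transfer of 1000 base units.
erc20_log = {
    "chain_id": 138, "transaction_hash": "0x" + "11" * 32, "block_number": 16,
    "log_index": 0, "address": "0x" + "aa" * 20,
    "topic0": TRANSFER_TOPIC0,
    "topic1": "0x" + "00" * 12 + "bb" * 20,
    "topic2": "0x" + "00" * 12 + "cc" * 20,
    "topic3": None,
    "data": "0x" + format(1000, "064x"),
}
erc20_row = decode_transfer(erc20_log)
erc721_row = decode_transfer(
    {**erc20_log, "topic3": "0x" + format(42, "064x"), "data": "0x"}
)
```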
Token Schema
Table: tokens
Purpose: Store token metadata (ERC-20, ERC-721, ERC-1155).
Fields:
tokens (
id BIGSERIAL PRIMARY KEY,
chain_id INTEGER NOT NULL,
address VARCHAR(42) NOT NULL,
type VARCHAR(10) NOT NULL, -- 'ERC20', 'ERC721', 'ERC1155'
name VARCHAR(255),
symbol VARCHAR(50),
decimals INTEGER, -- For ERC-20
total_supply NUMERIC,
holder_count INTEGER DEFAULT 0,
transfer_count INTEGER DEFAULT 0,
logo_url TEXT,
website_url TEXT,
description TEXT,
verified BOOLEAN DEFAULT false,
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW(),
UNIQUE(chain_id, address)
)
Indexes:
- idx_tokens_chain_address ON (chain_id, address)
- idx_tokens_chain_type ON (chain_id, type)
- idx_tokens_chain_symbol ON (chain_id, symbol) -- For search
Relationships:
- One-to-many with token_transfers
- One-to-many with token_holders (if maintained)
Contract Metadata Schema
Table: contracts
Purpose: Store verified contract information.
Fields:
contracts (
id BIGSERIAL PRIMARY KEY,
chain_id INTEGER NOT NULL,
address VARCHAR(42) NOT NULL,
name VARCHAR(255),
compiler_version VARCHAR(50),
optimization_enabled BOOLEAN,
optimization_runs INTEGER,
evm_version VARCHAR(20),
source_code TEXT,
abi JSONB,
constructor_arguments TEXT,
verification_status VARCHAR(20) NOT NULL, -- 'pending', 'verified', 'failed'
verified_at TIMESTAMP,
verification_method VARCHAR(50), -- 'standard_json', 'sourcify', 'multi_file'
license VARCHAR(50),
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW(),
UNIQUE(chain_id, address)
)
Indexes:
- idx_contracts_chain_address ON (chain_id, address)
- idx_contracts_chain_verified ON (chain_id, verification_status)
Relationships:
- One-to-one with contract_abis (if separate ABI storage)
Contract ABI Schema
Table: contract_abis
Purpose: Store contract ABIs for decoding (can be separate from verification).
Fields:
contract_abis (
id BIGSERIAL PRIMARY KEY,
chain_id INTEGER NOT NULL,
address VARCHAR(42) NOT NULL,
abi JSONB NOT NULL,
source VARCHAR(50) NOT NULL, -- 'verification', 'sourcify', 'public', 'user_submitted'
verified BOOLEAN DEFAULT false,
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW(),
UNIQUE(chain_id, address)
)
Indexes:
- idx_abis_chain_address ON (chain_id, address)
Address-Related Models
Address Labels Schema
Table: address_labels
Purpose: User-defined and public labels for addresses.
Fields:
address_labels (
id BIGSERIAL PRIMARY KEY,
chain_id INTEGER NOT NULL,
address VARCHAR(42) NOT NULL,
label VARCHAR(255) NOT NULL,
label_type VARCHAR(20) NOT NULL, -- 'user', 'public', 'contract_name'
user_id UUID, -- NULL for public labels
source VARCHAR(50), -- 'user', 'etherscan', 'blockscout', etc.
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW(),
UNIQUE(chain_id, address, label_type, user_id)
)
Indexes:
- idx_labels_chain_address ON (chain_id, address)
- idx_labels_chain_user ON (chain_id, user_id)
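One caveat applies to the UNIQUE(chain_id, address, label_type, user_id) constraint: SQL treats NULLs as distinct in unique constraints, so it does not stop two public labels (user_id NULL) of the same type on one address. The sketch below demonstrates the effect and one possible remedy, a partial unique index over the public rows. It uses Python's built-in sqlite3 as a stand-in for PostgreSQL, which handles NULLs in unique constraints the same way by default. The index name idx_labels_public_unique is hypothetical.

```python
import sqlite3

DDL = """
CREATE TABLE address_labels (
    id INTEGER PRIMARY KEY,
    chain_id INTEGER NOT NULL,
    address TEXT NOT NULL,
    label TEXT NOT NULL,
    label_type TEXT NOT NULL,
    user_id TEXT,
    UNIQUE (chain_id, address, label_type, user_id)
)
"""

# Hypothetical extra index: at most one public label per address and type.
PUBLIC_UNIQUE = """
CREATE UNIQUE INDEX idx_labels_public_unique
ON address_labels (chain_id, address, label_type)
WHERE user_id IS NULL
"""


def count_public_inserts(with_partial_index):
    """Try to insert two public labels for one address; return how many succeed."""
    conn = sqlite3.connect(":memory:")
    conn.execute(DDL)
    if with_partial_index:
        conn.execute(PUBLIC_UNIQUE)
    inserted = 0
    for label in ("Treasury", "Treasury v2"):
        try:
            conn.execute(
                "INSERT INTO address_labels "
                "(chain_id, address, label, label_type, user_id) "
                "VALUES (?, ?, ?, ?, NULL)",
                (138, "0x" + "ab" * 20, label, "public"),
            )
            inserted += 1
        except sqlite3.IntegrityError:
            pass  # rejected by the partial unique index
    conn.close()
    return inserted
```

On PostgreSQL 15 and later, declaring the constraint with NULLS NOT DISTINCT is another option. The same consideration applies to UNIQUE(chain_id, address, tag, user_id) on address_tags.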
Address Tags Schema
Table: address_tags
Purpose: Categorize addresses (e.g., "exchange", "defi", "wallet").
Fields:
address_tags (
id BIGSERIAL PRIMARY KEY,
chain_id INTEGER NOT NULL,
address VARCHAR(42) NOT NULL,
tag VARCHAR(50) NOT NULL,
tag_type VARCHAR(20) NOT NULL, -- 'category', 'risk', 'protocol'
user_id UUID, -- NULL for public tags
created_at TIMESTAMP DEFAULT NOW(),
UNIQUE(chain_id, address, tag, user_id)
)
Indexes:
- idx_tags_chain_address ON (chain_id, address)
- idx_tags_chain_tag ON (chain_id, tag)
User-Related Models
User Accounts Schema
Table: users
Purpose: User accounts for watchlists, alerts, preferences.
Fields:
users (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
email VARCHAR(255) UNIQUE,
username VARCHAR(100) UNIQUE,
password_hash TEXT, -- If using password auth
api_key_hash TEXT, -- Hashed API key
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW(),
last_login_at TIMESTAMP
)
Indexes:
- idx_users_email ON (email)
- idx_users_username ON (username)
Watchlists Schema
Table: watchlists
Purpose: User-defined lists of addresses to monitor.
Fields:
watchlists (
id BIGSERIAL PRIMARY KEY,
user_id UUID NOT NULL,
chain_id INTEGER NOT NULL,
address VARCHAR(42) NOT NULL,
label VARCHAR(255),
created_at TIMESTAMP DEFAULT NOW(),
FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE,
UNIQUE(user_id, chain_id, address)
)
Indexes:
- idx_watchlists_user ON (user_id)
- idx_watchlists_chain_address ON (chain_id, address)
Data Type Definitions
Numeric Types
- BIGINT: Used for block numbers, gas values, nonces (64-bit integers)
- NUMERIC: Used for token amounts, ETH values (arbitrary precision decimals)
  - Precision: 78 digits (enough for any uint256 value, including wei amounts)
  - Scale: 0 (integers) or configurable for token decimals
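The 78-digit precision follows from the size of the largest uint256 value, which can be checked directly. The sketch below also shows one way to convert a raw integer amount into display units without floating-point loss. The helper names check_uint256 and to_display_units are hypothetical.

```python
from decimal import Decimal, localcontext

UINT256_MAX = 2**256 - 1  # largest value an EVM word can hold


def check_uint256(value):
    """Reject values that could not have come from a uint256 field."""
    if not 0 <= value <= UINT256_MAX:
        raise ValueError(f"value out of uint256 range: {value}")
    return value


def to_display_units(raw_amount, decimals):
    """Shift a raw integer amount by the token's decimals (18 for wei to ETH)."""
    with localcontext() as ctx:
        ctx.prec = 78  # match NUMERIC precision so no digits are rounded away
        return Decimal(check_uint256(raw_amount)).scaleb(-decimals)


digits_needed = len(str(UINT256_MAX))
one_and_a_half_eth = to_display_units(1_500_000_000_000_000_000, 18)
```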
Address Types
- VARCHAR(42): Ethereum addresses (0x + 40 hex chars)
- Normalize to lowercase for consistency
Hash Types
- VARCHAR(66): Transaction/block hashes (0x + 64 hex chars)
- TEXT: For variable-length hex data such as logs_bloom, extra_data, and input_data
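A minimal sketch of how addresses and hashes could be validated and normalized to lowercase before insertion. The function names are hypothetical, and rejecting malformed input with ValueError is an assumption about the pipeline's error handling.

```python
import re

_ADDRESS_RE = re.compile(r"0x[0-9a-fA-F]{40}")  # 0x + 40 hex chars = 42
_HASH_RE = re.compile(r"0x[0-9a-fA-F]{64}")     # 0x + 64 hex chars = 66


def normalize_address(value):
    """Validate an address and return it lowercased for VARCHAR(42) storage."""
    if not _ADDRESS_RE.fullmatch(value):
        raise ValueError(f"not a valid address: {value!r}")
    return value.lower()


def normalize_hash(value):
    """Validate a block or transaction hash and return it lowercased."""
    if not _HASH_RE.fullmatch(value):
        raise ValueError(f"not a valid 32-byte hash: {value!r}")
    return value.lower()


# A checksummed (mixed-case) address and its stored form compare equal
# only after normalization.
checksummed = "0x5aAeb6053F3E94C9b9A09f33669435E7Ef1BeAed"
stored = normalize_address(checksummed)
```

Lowercasing discards the EIP-55 checksum casing, so a display layer would need to recompute it if checksummed output is wanted.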
JSONB Types
- Used for: ABIs, decoded event data, complex nested structures
- Benefits: Indexing, querying, efficient storage
Multi-Chain Considerations
Chain ID Partitioning
All tables except users include chain_id, placed first after the primary key (watchlists places it after user_id):
- Enables efficient partitioning by chain_id
- Ensures data isolation between chains
- Simplifies multi-chain queries
Partitioning Strategy
Recommended: Partition large tables by chain_id:
- blocks, transactions, logs partitioned by chain_id
- Benefits: Faster queries, easier maintenance, parallel processing
Implementation (PostgreSQL):
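-- Example: list partitioning by chain_id.
-- PostgreSQL requires every PRIMARY KEY and UNIQUE constraint on a
-- partitioned table to include the partition key, so the primary key
-- becomes (chain_id, id) rather than id alone.
CREATE TABLE blocks (
    -- columns
) PARTITION BY LIST (chain_id);
CREATE TABLE blocks_chain_138 PARTITION OF blocks FOR VALUES IN (138);
CREATE TABLE blocks_chain_1 PARTITION OF blocks FOR VALUES IN (1);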
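When new chains are added over time, the per-chain partition statements can be generated rather than written by hand. The sketch below produces DDL following the naming pattern of the example above. The function partition_ddl is hypothetical, and the identifier check is a simple guard for illustration, not a full SQL-quoting routine.

```python
import re

_IDENTIFIER_RE = re.compile(r"[a-z_][a-z0-9_]*")


def partition_ddl(table, chain_ids):
    """Build one CREATE TABLE ... PARTITION OF statement per chain_id."""
    if not _IDENTIFIER_RE.fullmatch(table):
        raise ValueError(f"unexpected table name: {table!r}")
    statements = []
    for chain_id in sorted({int(c) for c in chain_ids}):
        statements.append(
            f"CREATE TABLE IF NOT EXISTS {table}_chain_{chain_id} "
            f"PARTITION OF {table} FOR VALUES IN ({chain_id});"
        )
    return statements


ddl = partition_ddl("blocks", [138, 1, 138])
```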
Data Consistency
Foreign Key Constraints
- Enforce referential integrity where possible
- Consider performance impact for high-throughput inserts
- May disable for initial backfill, enable after catch-up
Unique Constraints
- Prevent duplicate blocks, transactions, logs
- Enable idempotent processing
- Use ON CONFLICT for upserts
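The following sketch shows how a unique constraint plus ON CONFLICT makes block ingestion idempotent: processing the same block twice leaves one row, carrying the latest values. It runs against Python's built-in sqlite3 as a stand-in, with a reduced blocks table. The ON CONFLICT ... DO UPDATE clause has the same form in PostgreSQL, though placeholder style depends on the driver (psycopg uses %s rather than ?).

```python
import sqlite3

UPSERT_BLOCK = """
INSERT INTO blocks (chain_id, number, hash)
VALUES (?, ?, ?)
ON CONFLICT (chain_id, number) DO UPDATE SET hash = excluded.hash
"""


def demo_upsert():
    """Insert the same (chain_id, number) twice and return the table contents."""
    conn = sqlite3.connect(":memory:")
    # Reduced version of the blocks table, keeping only the conflict target.
    conn.execute(
        "CREATE TABLE blocks ("
        " id INTEGER PRIMARY KEY,"
        " chain_id INTEGER NOT NULL,"
        " number INTEGER NOT NULL,"
        " hash TEXT NOT NULL,"
        " UNIQUE (chain_id, number))"
    )
    conn.execute(UPSERT_BLOCK, (138, 100, "0x" + "aa" * 32))
    # Re-processing the block (for example after a reorg) updates in place.
    conn.execute(UPSERT_BLOCK, (138, 100, "0x" + "bb" * 32))
    rows = conn.execute("SELECT chain_id, number, hash FROM blocks").fetchall()
    conn.close()
    return rows


rows = demo_upsert()
```

Where re-processing should never overwrite existing data, DO NOTHING can replace the DO UPDATE clause.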
Indexing Strategy
Index Types
- B-tree: Default for most indexes (equality, range queries)
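- Hash: For exact match lookups on a single column (addresses, hashes); PostgreSQL hash indexes cannot span multiple columns or enforce uniqueness, so the composite (chain_id, ...) indexes above remain B-tree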
- GIN: For JSONB columns (ABIs, decoded data)
- BRIN: For large ordered columns (block numbers, timestamps)
Index Maintenance
- Regular VACUUM and ANALYZE
- Monitor index bloat
- Consider partial indexes for filtered queries
References
- Indexer Architecture: See indexer-architecture.md
- Database Schema: See ../database/postgres-schema.md
- Search Index Schema: See ../database/search-index-schema.md