# PostgreSQL Database Schema Specification

## Overview

This document specifies the complete PostgreSQL database schema for the explorer platform. The schema is designed to support multi-chain operation, high-performance queries, and data consistency.

## Schema Design Principles

1. **Multi-chain Support**: All chain-scoped tables include `chain_id` for chain isolation
2. **Normalization**: Normalized structure to avoid data duplication
3. **Performance**: Strategic indexing for common query patterns
4. **Consistency**: Foreign key constraints where appropriate
5. **Extensibility**: JSONB columns for flexible data storage
6. **Partitioning**: Large tables partitioned by `chain_id`

## Core Tables

### Blocks Table

See `../indexing/data-models.md` for detailed block schema.

**Partitioning**: Partition by `chain_id` for large deployments.

**Key Indexes**:
- Primary: `(chain_id, number)`
- Unique: `(chain_id, hash)`
- Index: `(chain_id, timestamp)` for time-range queries

### Transactions Table

See `../indexing/data-models.md` for detailed transaction schema.

**Key Indexes**:
- Primary: `(chain_id, hash)`
- Index: `(chain_id, block_number, transaction_index)` for block queries
- Index: `(chain_id, from_address)` for address queries
- Index: `(chain_id, to_address)` for address queries
- Index: `(chain_id, block_number, from_address)` for compound queries

### Logs Table

See `../indexing/data-models.md` for detailed log schema.

**Key Indexes**:
- Primary: `(chain_id, transaction_hash, log_index)`
- Index: `(chain_id, address)` for contract event queries
- Index: `(chain_id, topic0)` for event type queries
- Index: `(chain_id, address, topic0)` for filtered event queries
- Index: `(chain_id, block_number)` for block-based queries

### Traces Table

See `../indexing/data-models.md` for detailed trace schema.

**Key Indexes**:
- Primary: `(chain_id, transaction_hash, trace_address)`
- Index: `(chain_id, action_from)` for address queries
- Index: `(chain_id, action_to)` for address queries
- Index: `(chain_id, block_number)` for block queries

### Internal Transactions Table

See `../indexing/data-models.md` for detailed internal transaction schema.

**Key Indexes**:
- Primary: `(chain_id, transaction_hash, trace_address)`
- Index: `(chain_id, from_address)`
- Index: `(chain_id, to_address)`
- Index: `(chain_id, block_number)`

## Token Tables

### Tokens Table

```sql
CREATE TABLE tokens (
    id BIGSERIAL,
    chain_id INTEGER NOT NULL,
    address VARCHAR(42) NOT NULL,
    type VARCHAR(10) NOT NULL CHECK (type IN ('ERC20', 'ERC721', 'ERC1155')),
    name VARCHAR(255),
    symbol VARCHAR(50),
    decimals INTEGER CHECK (decimals >= 0 AND decimals <= 18),
    total_supply NUMERIC(78, 0),
    holder_count INTEGER DEFAULT 0,
    transfer_count INTEGER DEFAULT 0,
    logo_url TEXT,
    website_url TEXT,
    description TEXT,
    verified BOOLEAN DEFAULT false,
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW(),
    -- A primary key on a partitioned table must include the partition key
    PRIMARY KEY (chain_id, id),
    UNIQUE (chain_id, address)
) PARTITION BY LIST (chain_id);

CREATE INDEX idx_tokens_chain_address ON tokens(chain_id, address);
CREATE INDEX idx_tokens_chain_type ON tokens(chain_id, type);
CREATE INDEX idx_tokens_chain_symbol ON tokens(chain_id, symbol);
```

### Token Transfers Table

```sql
CREATE TABLE token_transfers (
    id BIGSERIAL,
    chain_id INTEGER NOT NULL,
    transaction_hash VARCHAR(66) NOT NULL,
    block_number BIGINT NOT NULL,
    log_index INTEGER NOT NULL,
    token_address VARCHAR(42) NOT NULL,
    token_type VARCHAR(10) NOT NULL CHECK (token_type IN ('ERC20', 'ERC721', 'ERC1155')),
    from_address VARCHAR(42) NOT NULL,
    to_address VARCHAR(42) NOT NULL,
    amount NUMERIC(78, 0),
    token_id VARCHAR(78),
    operator VARCHAR(42),
    created_at TIMESTAMP DEFAULT NOW(),
    -- A primary key on a partitioned table must include the partition key
    PRIMARY KEY (chain_id, id),
    FOREIGN KEY (chain_id, transaction_hash) REFERENCES transactions(chain_id, hash),
    FOREIGN KEY (chain_id, token_address) REFERENCES tokens(chain_id, address),
    UNIQUE (chain_id, transaction_hash, log_index)
) PARTITION BY LIST (chain_id);

CREATE INDEX idx_token_transfers_chain_token ON token_transfers(chain_id, token_address);
CREATE INDEX idx_token_transfers_chain_from ON token_transfers(chain_id, from_address);
CREATE INDEX idx_token_transfers_chain_to ON token_transfers(chain_id, to_address);
CREATE INDEX idx_token_transfers_chain_tx ON token_transfers(chain_id, transaction_hash);
CREATE INDEX idx_token_transfers_chain_block ON token_transfers(chain_id, block_number);
CREATE INDEX idx_token_transfers_chain_token_from ON token_transfers(chain_id, token_address, from_address);
CREATE INDEX idx_token_transfers_chain_token_to ON token_transfers(chain_id, token_address, to_address);
```

### Token Holders Table (Optional)

**Purpose**: Maintain current token balances for efficient queries.

```sql
CREATE TABLE token_holders (
    id BIGSERIAL,
    chain_id INTEGER NOT NULL,
    token_address VARCHAR(42) NOT NULL,
    address VARCHAR(42) NOT NULL,
    balance NUMERIC(78, 0) NOT NULL DEFAULT 0,
    token_id VARCHAR(78), -- For ERC-721/1155
    updated_at TIMESTAMP DEFAULT NOW(),
    -- A primary key on a partitioned table must include the partition key
    PRIMARY KEY (chain_id, id),
    FOREIGN KEY (chain_id, token_address) REFERENCES tokens(chain_id, address)
) PARTITION BY LIST (chain_id);

-- A UNIQUE constraint cannot contain expressions, so uniqueness over the
-- nullable token_id is enforced with a unique expression index instead
CREATE UNIQUE INDEX idx_token_holders_unique
    ON token_holders(chain_id, token_address, address, COALESCE(token_id, ''));
CREATE INDEX idx_token_holders_chain_token ON token_holders(chain_id, token_address);
CREATE INDEX idx_token_holders_chain_address ON token_holders(chain_id, address);
```

## Contract Tables

### Contracts Table

```sql
CREATE TABLE contracts (
    id BIGSERIAL,
    chain_id INTEGER NOT NULL,
    address VARCHAR(42) NOT NULL,
    name VARCHAR(255),
    compiler_version VARCHAR(50),
    optimization_enabled BOOLEAN,
    optimization_runs INTEGER,
    evm_version VARCHAR(20),
    source_code TEXT,
    abi JSONB,
    constructor_arguments TEXT,
    verification_status VARCHAR(20) NOT NULL CHECK (verification_status IN ('pending', 'verified', 'failed')),
    verified_at TIMESTAMP,
    verification_method VARCHAR(50),
    license VARCHAR(50),
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW(),
    -- A primary key on a partitioned table must include the partition key
    PRIMARY KEY (chain_id, id),
    UNIQUE (chain_id, address)
) PARTITION BY LIST (chain_id);

CREATE INDEX idx_contracts_chain_address ON contracts(chain_id, address);
CREATE INDEX idx_contracts_chain_verified ON contracts(chain_id, verification_status);
CREATE INDEX idx_contracts_abi_gin ON contracts USING GIN (abi); -- For ABI queries
```

### Contract ABIs Table

```sql
CREATE TABLE contract_abis (
    id BIGSERIAL,
    chain_id INTEGER NOT NULL,
    address VARCHAR(42) NOT NULL,
    abi JSONB NOT NULL,
    source VARCHAR(50) NOT NULL,
    verified BOOLEAN DEFAULT false,
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW(),
    -- A primary key on a partitioned table must include the partition key
    PRIMARY KEY (chain_id, id),
    UNIQUE (chain_id, address)
) PARTITION BY LIST (chain_id);

CREATE INDEX idx_abis_chain_address ON contract_abis(chain_id, address);
CREATE INDEX idx_abis_abi_gin ON contract_abis USING GIN (abi);
```

### Contract Verifications Table

```sql
CREATE TABLE contract_verifications (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    chain_id INTEGER NOT NULL,
    address VARCHAR(42) NOT NULL,
    status VARCHAR(20) NOT NULL CHECK (status IN ('pending', 'processing', 'verified', 'failed', 'partially_verified')),
    compiler_version VARCHAR(50),
    optimization_enabled BOOLEAN,
    optimization_runs INTEGER,
    evm_version VARCHAR(20),
    source_code TEXT,
    abi JSONB,
    constructor_arguments TEXT,
    verification_method VARCHAR(50),
    error_message TEXT,
    verified_at TIMESTAMP,
    version INTEGER DEFAULT 1,
    is_active BOOLEAN DEFAULT true,
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW(),
    FOREIGN KEY (chain_id, address) REFERENCES contracts(chain_id, address)
);

CREATE INDEX idx_verifications_chain_address ON contract_verifications(chain_id, address);
CREATE INDEX idx_verifications_status ON contract_verifications(status);
```

## Address-Related Tables

### Address Labels Table

```sql
CREATE TABLE address_labels (
    id BIGSERIAL,
    chain_id INTEGER NOT NULL,
    address VARCHAR(42) NOT NULL,
    label VARCHAR(255) NOT NULL,
    label_type VARCHAR(20) NOT NULL CHECK (label_type IN ('user', 'public', 'contract_name')),
    user_id UUID,
    source VARCHAR(50),
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW(),
    PRIMARY KEY (id),
    UNIQUE (chain_id, address, label_type, user_id),
    FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE
);

CREATE INDEX idx_labels_chain_address ON address_labels(chain_id, address);
CREATE INDEX idx_labels_chain_user ON address_labels(chain_id, user_id);
```

### Address Tags Table

```sql
CREATE TABLE address_tags (
    id BIGSERIAL,
    chain_id INTEGER NOT NULL,
    address VARCHAR(42) NOT NULL,
    tag VARCHAR(50) NOT NULL,
    tag_type VARCHAR(20) NOT NULL CHECK (tag_type IN ('category', 'risk', 'protocol')),
    user_id UUID,
    created_at TIMESTAMP DEFAULT NOW(),
    PRIMARY KEY (id),
    UNIQUE (chain_id, address, tag, user_id),
    FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE
);

CREATE INDEX idx_tags_chain_address ON address_tags(chain_id, address);
CREATE INDEX idx_tags_chain_tag ON address_tags(chain_id, tag);
```

## User Tables

### Users Table

```sql
CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    email VARCHAR(255) UNIQUE,
    username VARCHAR(100) UNIQUE,
    password_hash TEXT,
    api_key_hash TEXT,
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW(),
    last_login_at TIMESTAMP
);

CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_users_username ON users(username);
```

### Watchlists Table

```sql
CREATE TABLE watchlists (
    id BIGSERIAL,
    user_id UUID NOT NULL,
    chain_id INTEGER NOT NULL,
    address VARCHAR(42) NOT NULL,
    label VARCHAR(255),
    created_at TIMESTAMP DEFAULT NOW(),
    PRIMARY KEY (id),
    UNIQUE (user_id, chain_id, address),
    FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE
);

CREATE INDEX idx_watchlists_user ON watchlists(user_id);
CREATE INDEX idx_watchlists_chain_address ON watchlists(chain_id, address);
```

### API Keys Table

```sql
CREATE TABLE api_keys (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id UUID NOT NULL,
    key_hash TEXT NOT NULL UNIQUE,
    name VARCHAR(255),
    tier VARCHAR(20) NOT NULL CHECK (tier IN ('free', 'pro', 'enterprise')),
    rate_limit_per_second INTEGER,
    rate_limit_per_minute INTEGER,
    ip_whitelist TEXT[], -- Array of CIDR blocks
    last_used_at TIMESTAMP,
    expires_at TIMESTAMP,
    revoked BOOLEAN DEFAULT false,
    created_at TIMESTAMP DEFAULT NOW(),
    FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE
);

CREATE INDEX idx_api_keys_user ON api_keys(user_id);
CREATE INDEX idx_api_keys_hash ON api_keys(key_hash);
```

## Multi-Chain Partitioning

### Partitioning Strategy

**Large Tables**: Partition by `chain_id` using LIST partitioning.

**Tables to Partition**:
- `blocks`
- `transactions`
- `logs`
- `traces`
- `internal_transactions`
- `token_transfers`
- `tokens`
- `token_holders` (if used)
- `contracts`
- `contract_abis`

### Partition Creation

**Example for blocks table**:

```sql
-- Create parent table
CREATE TABLE blocks (
    -- columns
) PARTITION BY LIST (chain_id);

-- Create partitions
CREATE TABLE blocks_chain_138 PARTITION OF blocks FOR VALUES IN (138);
CREATE TABLE blocks_chain_1 PARTITION OF blocks FOR VALUES IN (1);

-- Indexes defined on the parent table are created automatically
-- on every partition (PostgreSQL 11+)
```

**Benefits**:
- Faster queries (partition pruning)
- Easier maintenance (per-chain operations)
- Parallel processing
- Data isolation

## Indexing Strategy

### Index Types

1. **B-tree**: Default for most indexes (equality, range, sorting)
2. **Hash**: For exact match only (rarely used, B-tree usually better)
3. **GIN**: For JSONB columns (ABIs, decoded data)
4. **BRIN**: For large ordered columns (block numbers, timestamps)
5. **Partial**: For filtered indexes (e.g., verified contracts only)

### Index Maintenance

**Regular Maintenance**:
- Run `VACUUM ANALYZE` regularly (keep autovacuum enabled)
- `REINDEX` if needed (bloat, corruption)
- Monitor index usage (`pg_stat_user_indexes`)

**Index Monitoring**:
- Track index sizes
- Monitor index bloat
- Remove unused indexes

## Data Retention and Archiving

### Retention Policies

**Hot Data**: Recent data (last 1 year)
- Fast access required
- All indexes maintained

**Warm Data**: Older data (1-5 years)
- Archive to slower storage
- Reduced indexing

**Cold Data**: Very old data (5+ years)
- Archive to object storage
- Minimal indexing

### Archiving Strategy

**Approach**:
1. Partition tables by time ranges (monthly/yearly)
2. Move old partitions to archive storage
3. Query archive when needed (slower but available)

**Implementation**:
- Use PostgreSQL table partitioning by date range
- Move partitions to archive storage (S3, etc.)
- Query via foreign data wrappers if needed

## Migration Strategy

### Versioning

**Migration Tool**: Use a migration tool (Flyway, Liquibase, or custom).

**Versioning Format**: `YYYYMMDDHHMMSS_description.sql`

**Example**:

```
20240101000001_initial_schema.sql
20240115000001_add_token_holders.sql
20240201000001_add_partitioning.sql
```

### Migration Best Practices

1. **Backward Compatible**: Additive changes preferred
2. **Reversible**: All migrations should be reversible
3. **Tested**: Test on staging before production
4. **Documented**: Document breaking changes
5. **Rollback Plan**: Have a rollback strategy

### Schema Evolution

**Adding Columns**:
- Use `ALTER TABLE ADD COLUMN` with default values
- Avoid NOT NULL without defaults (use a two-step migration: add the column as nullable, backfill, then add the constraint)

**Removing Columns**:
- Mark as deprecated first
- Remove after migration period

**Changing Types**:
- Create new column
- Migrate data
- Drop old column
- Rename new column

## Performance Optimization

### Query Optimization

**Common Query Patterns**:
1. Get block by number: Use `(chain_id, number)` index
2. Get transaction by hash: Use `(chain_id, hash)` index
3. Get address transactions: Use `(chain_id, from_address)` or `(chain_id, to_address)` index
4. Filter logs by address and event: Use `(chain_id, address, topic0)` index

### Connection Pooling

**Configuration**:
- Use a connection pooler (PgBouncer, pgpool-II)
- Pool size: 20-100 connections per application server
- Statement-level pooling for better concurrency (PgBouncer's statement mode disallows multi-statement transactions; use transaction-level pooling for clients that need them)

### Read Replicas

**Strategy**:
- Primary: Write operations
- Replicas: Read operations (load balanced)
- Async replication (small lag acceptable)

## Backup and Recovery

### Backup Strategy

**Full Backups**: Daily full database backups (point-in-time recovery replays WAL on top of a physical base backup such as one taken with `pg_basebackup`; a logical `pg_dump` dump cannot serve as that base)

**Incremental Backups**: Continuous WAL archiving

**Point-in-Time Recovery**: Enabled via WAL archiving

### Recovery Procedures

**RTO Target**: 1 hour

**RPO Target**: 5 minutes (max data loss)

## References

- Data Models: See `../indexing/data-models.md`
- Indexer Architecture: See `../indexing/indexer-architecture.md`
- Search Index Schema: See `search-index-schema.md`
- Multi-chain Architecture: See `../multichain/multichain-indexing.md`
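
## Appendix: Query Pattern Sketches

The four common query patterns listed under Query Optimization can be sketched as prepared statements. This is a minimal illustration, not the platform's actual query layer: the statement names and the `LIMIT` values are hypothetical, and the column names are taken only from the index definitions in this document (the full table schemas live in `../indexing/data-models.md`). Parameter types are left for PostgreSQL to infer from the column types.

```sql
-- 1. Get block by number: served by the (chain_id, number) primary key
PREPARE get_block AS
SELECT *
FROM blocks
WHERE chain_id = $1
  AND number = $2;

-- 2. Get transaction by hash: served by the (chain_id, hash) primary key
PREPARE get_transaction AS
SELECT *
FROM transactions
WHERE chain_id = $1
  AND hash = $2;

-- 3. Get address transactions: each branch can use its own index,
--    (chain_id, from_address) or (chain_id, to_address).
--    The second branch excludes self-transfers already returned by the first.
PREPARE get_address_transactions AS
SELECT * FROM transactions
WHERE chain_id = $1 AND from_address = $2
UNION ALL
SELECT * FROM transactions
WHERE chain_id = $1 AND to_address = $2 AND from_address <> $2
ORDER BY block_number DESC, transaction_index DESC
LIMIT 50;

-- 4. Filter logs by address and event: served by (chain_id, address, topic0)
PREPARE get_contract_events AS
SELECT *
FROM logs
WHERE chain_id = $1
  AND address = $2
  AND topic0 = $3
ORDER BY block_number DESC, log_index DESC
LIMIT 100;

-- Example call, using the chain ID from the partitioning example
-- and a hypothetical block number
EXECUTE get_block(138, 1000000);
```

Because every statement filters on `chain_id` first, the planner can prune to a single chain partition before using the index. Running `EXPLAIN` on the underlying query is a quick way to confirm that the intended index is actually chosen.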