Files
explorer-monorepo/docs/specs/database/graph-schema.md

5.8 KiB

Graph Database Schema Specification

Overview

This document specifies the Neo4j graph database schema for storing cross-chain entity relationships, address clustering, and protocol interactions.

Schema Design

Node Types

Address Node

Labels: Address, Chain{chain_id} (e.g., Chain138)

Properties:

{
  address: "0x...",           // Unique identifier
  chainId: 138,                // Chain ID
  label: "My Wallet",          // Optional label
  isContract: false,           // Is contract address
  firstSeen: timestamp,        // First seen timestamp
  lastSeen: timestamp,         // Last seen timestamp
  transactionCount: 100,       // Transaction count
  balance: "1.5"               // Current balance (string for precision)
}

Constraints:

CREATE CONSTRAINT address_address_chain_id FOR (a:Address) 
REQUIRE (a.address, a.chainId) IS UNIQUE;

Contract Node

Labels: Contract, Address

Properties: Inherits from Address, plus:

{
  name: "MyToken",
  verificationStatus: "verified",
  compilerVersion: "0.8.19"
}

Token Node

Labels: Token, Contract

Properties: Inherits from Contract, plus:

{
  symbol: "MTK",
  decimals: 18,
  totalSupply: "1000000",
  type: "ERC20"  // ERC20, ERC721, ERC1155
}

Protocol Node

Labels: Protocol

Properties:

{
  name: "Uniswap V3",
  category: "DEX",
  website: "https://uniswap.org"
}

Relationship Types

TRANSFERRED_TO

Purpose: Track token transfers between addresses.

Properties:

{
  amount: "1000000000000000000",
  tokenAddress: "0x...",
  transactionHash: "0x...",
  blockNumber: 12345,
  timestamp: timestamp
}

Example:

(a1:Address {address: "0x..."})-[r:TRANSFERRED_TO {
  amount: "1000000000000000000",
  tokenAddress: "0x...",
  transactionHash: "0x..."
}]->(a2:Address {address: "0x..."})

CALLED

Purpose: Track contract calls between addresses.

Properties:

{
  transactionHash: "0x...",
  blockNumber: 12345,
  timestamp: timestamp,
  gasUsed: 21000,
  method: "transfer"
}

OWNS

Purpose: Track token ownership (current balances).

Properties:

{
  balance: "1000000000000000000",
  tokenId: "123",  // For ERC-721/1155
  updatedAt: timestamp
}

Example:

(a:Address)-[r:OWNS {
  balance: "1000000000000000000",
  updatedAt: timestamp
}]->(t:Token)

INTERACTS_WITH

Purpose: Track protocol interactions.

Properties:

{
  interactionType: "swap",  // swap, deposit, withdraw, etc.
  transactionHash: "0x...",
  timestamp: timestamp
}

Example:

(a:Address)-[r:INTERACTS_WITH {
  interactionType: "swap",
  transactionHash: "0x..."
}]->(p:Protocol)

CLUSTERED_WITH

Purpose: Link addresses that belong to the same entity (address clustering).

Properties:

{
  confidence: 0.95,  // Clustering confidence score
  method: "heuristic",  // Clustering method
  createdAt: timestamp
}

Purpose: Link transactions across chains via CCIP messages.

Properties:

{
  messageId: "0x...",
  sourceTxHash: "0x...",
  destTxHash: "0x...",
  status: "delivered",
  timestamp: timestamp
}

Example:

(srcTx:Transaction)-[r:CCIP_MESSAGE_LINK {
  messageId: "0x...",
  status: "delivered"
}]->(destTx:Transaction)

Query Patterns

Find Token Holders

MATCH (t:Token {address: "0x...", chainId: 138})-[r:OWNS]-(a:Address)
WHERE r.balance > "0"
RETURN a.address, r.balance
ORDER BY toFloat(r.balance) DESC
LIMIT 100;

Find Transfer Path

MATCH path = (a1:Address {address: "0x..."})-[:TRANSFERRED_TO*1..3]-(a2:Address {address: "0x..."})
WHERE ALL(r in relationships(path) WHERE r.tokenAddress = "0x...")
RETURN path
LIMIT 10;

Find Protocol Users

MATCH (a:Address)-[r:INTERACTS_WITH]->(p:Protocol {name: "Uniswap V3"})
RETURN a.address, count(r) as interactionCount
ORDER BY interactionCount DESC
LIMIT 100;

Address Clustering

MATCH (a1:Address)-[r:CLUSTERED_WITH]-(a2:Address)
WHERE a1.address = "0x..."
RETURN a2.address, r.confidence, r.method;
MATCH (srcTx:Transaction {hash: "0x..."})-[r:CCIP_MESSAGE_LINK]-(destTx:Transaction)
RETURN srcTx, r, destTx;

Data Ingestion

Transaction Ingestion

Process:

  1. Process transaction from indexer
  2. Create/update address nodes
  3. Create TRANSFERRED_TO relationships for token transfers
  4. Create CALLED relationships for contract calls
  5. Update OWNS relationships for token balances

Batch Ingestion

Strategy:

  • Use Neo4j Batch API for bulk inserts
  • Batch size: 1000-10000 operations
  • Use transactions for atomicity

Incremental Updates

Process:

  • Update relationships as new transactions processed
  • Maintain OWNS relationships (update balances)
  • Add new relationships for new interactions

Performance Optimization

Indexing

Indexes:

CREATE INDEX address_address FOR (a:Address) ON (a.address);
CREATE INDEX address_chain_id FOR (a:Address) ON (a.chainId);
CREATE INDEX transaction_hash FOR (t:Transaction) ON (t.hash);

Relationship Constraints

Uniqueness: Use MERGE to avoid duplicate relationships

Example:

MATCH (a1:Address {address: "0x...", chainId: 138})
MATCH (a2:Address {address: "0x...", chainId: 138})
MERGE (a1)-[r:TRANSFERRED_TO {
  transactionHash: "0x..."
}]->(a2)
ON CREATE SET r.amount = "1000000", r.timestamp = timestamp();

Data Retention

Strategy:

  • Keep all current relationships
  • Archive old relationships (older than 1 year) to separate database
  • Keep aggregated statistics (interaction counts) instead of all relationships

References

  • Entity Graph: See ../multichain/entity-graph.md
  • CCIP Integration: See ../ccip/ccip-tracking.md