Contract Verification Pipeline Specification

Overview

This document specifies the pipeline for verifying smart contracts on the explorer platform. Contract verification allows users to submit source code, which is compiled and compared against deployed bytecode to enable source code viewing, debugging, and ABI extraction.

Architecture

flowchart TB
    subgraph Submit[Submission]
        User[User Submits<br/>Source Code]
        UI[Verification UI]
        API[Verification API]
    end
    
    subgraph Validate[Validation]
        Val[Validate Input]
        Check[Check Contract Exists]
        Dup[Check Duplicate]
    end
    
    subgraph Compile[Compilation]
        Comp[Compiler Service]
        Versions[Compiler Version<br/>Registry]
        Build[Build Artifacts]
    end
    
    subgraph Verify[Verification]
        Match[Bytecode Matching]
        Construct[Constructor Args<br/>Extraction]
        MatchResult[Match Result]
    end
    
    subgraph Store[Storage]
        DB[(Database)]
        Artifacts[Artifact Storage<br/>S3/Immutable]
        ABI[ABI Registry]
    end
    
    User --> UI
    UI --> API
    API --> Val
    Val --> Check
    Check --> Dup
    Dup --> Comp
    Comp --> Versions
    Comp --> Build
    Build --> Match
    Match --> Construct
    Construct --> MatchResult
    MatchResult --> DB
    MatchResult --> Artifacts
    MatchResult --> ABI

Source Code Submission Workflow

Submission Methods

1. Standard JSON Input (Recommended)

  • Submit Solidity compiler's standard JSON input format
  • Includes source files, compiler settings, optimization
  • Most reliable for complex contracts

2. Multi-file Upload

  • Upload individual source files
  • Specify compiler version and settings
  • The pipeline constructs the standard JSON input from the uploaded files

3. Sourcify Integration

  • Verify via Sourcify API
  • Automatic source code and metadata retrieval
  • Supports verified contracts from Sourcify registry

4. Flattened Source

  • Single flattened source file
  • All imports inlined
  • Simpler but less flexible

Submission API

Endpoint: POST /api/v1/contracts/{address}/verify

Request Body:

{
  "chain_id": 138,
  "address": "0x...",
  "compiler_version": "v0.8.19+commit.7dd6d404",
  "optimization_enabled": true,
  "optimization_runs": 200,
  "evm_version": "london",
  "source_code": "...", // or standard_json_input
  "constructor_arguments": "0x...",
  "library_addresses": {
    "Lib1": "0x..."
  },
  "verification_method": "standard_json"
}

Response:

{
  "status": "pending",
  "verification_id": "uuid",
  "message": "Verification submitted"
}

Input Validation

Validation Rules:

  1. Contract Address: Must be valid Ethereum address, must exist on chain
  2. Compiler Version: Must be supported compiler version
  3. Source Code: Must be valid Solidity/Vyper code
  4. Constructor Arguments: Must match deployed contract (if provided)
  5. Library Addresses: Must match deployed libraries (if provided)

Error Handling:

  • Invalid address: 400 Bad Request
  • Unsupported compiler: 400 Bad Request
  • Invalid source code: 400 Bad Request
  • Contract not found: 404 Not Found
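
The static part of these rules can be sketched as a pure function that runs before any chain lookup. This is a minimal illustration, not the platform's implementation: the function name, the error strings, and the regular expressions are assumptions, and the 10 MB limit is taken from the Security Considerations section. The on-chain existence check (404) and the constructor/library checks need chain access and are not covered here.

```python
import re
from typing import Dict, List, Set

ADDRESS_RE = re.compile(r"^0x[0-9a-fA-F]{40}$")
# Matches version strings such as "v0.8.19+commit.7dd6d404".
COMPILER_RE = re.compile(r"^v\d+\.\d+\.\d+\+commit\.[0-9a-f]{8}$")
MAX_SOURCE_BYTES = 10 * 1024 * 1024  # 10 MB


def validate_submission(body: Dict, supported_versions: Set[str]) -> List[str]:
    """Return validation errors; an empty list means the input passed."""
    errors = []
    if not ADDRESS_RE.match(str(body.get("address", ""))):
        errors.append("invalid address")
    version = str(body.get("compiler_version", ""))
    if not COMPILER_RE.match(version) or version not in supported_versions:
        errors.append("unsupported compiler")
    source = body.get("source_code") or ""
    if not source or len(source.encode("utf-8")) > MAX_SOURCE_BYTES:
        errors.append("invalid source code")
    return errors
```

Any non-empty result would map to a 400 Bad Request response in the scheme above.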

Compiler Version Management

Compiler Registry

Purpose: Manage available compiler versions and their metadata.

Storage:

compiler_versions (
    id SERIAL PRIMARY KEY,
    version VARCHAR(50) UNIQUE NOT NULL,
    compiler_type VARCHAR(20) NOT NULL, -- 'solidity', 'vyper'
    evm_version VARCHAR(20),
    optimizer_available BOOLEAN DEFAULT true,
    download_url TEXT,
    checksum VARCHAR(64),
    installed BOOLEAN DEFAULT false,
    installed_path TEXT,
    created_at TIMESTAMP DEFAULT NOW()
)

Compiler Installation

Methods:

  1. Pre-installed: Common versions pre-installed on compilation servers
  2. On-demand: Download and install when needed
  3. Docker: Use compiler Docker images (isolated, reproducible)

Recommended: Docker-based compilation for isolation and reproducibility.

Docker Setup:

FROM ethereum/solc:0.8.19
# Or use solc-select for version management

Version Selection

Strategy:

  • Exact match: User specifies exact version
  • Pragma matching: Extract version from source code pragma
  • Latest compatible: Use latest compatible version if exact not available

Pragma Parsing:

  • Extract the version constraint from the pragma, e.g. pragma solidity ^0.8.0; or pragma solidity >=0.8.0 <0.9.0;
  • Resolve to specific compiler version
  • Handle caret (^), tilde (~), and range operators
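
A minimal sketch of pragma resolution, assuming installed versions are plain "major.minor.patch" strings and that all constraints in a pragma are combined with AND. The helper names are hypothetical, and alternatives written with `||` are not handled.

```python
import re
from typing import List, Optional, Tuple

Version = Tuple[int, int, int]
TOKEN_RE = re.compile(r"(\^|~|>=|<=|>|<|=)?\s*(\d+\.\d+\.\d+)")


def parse_version(text: str) -> Version:
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def upper_bound(op: str, v: Version) -> Version:
    """Exclusive upper bound for caret and tilde constraints."""
    major, minor, _ = v
    if op == "~" or major == 0:
        # ~0.8.4 and ^0.8.4 both stop before 0.9.0.
        return (major, minor + 1, 0)
    return (major + 1, 0, 0)


def satisfies(version: Version, constraint: str) -> bool:
    for op, text in TOKEN_RE.findall(constraint):
        bound = parse_version(text)
        if op in ("^", "~"):
            ok = bound <= version < upper_bound(op, bound)
        elif op == ">=":
            ok = version >= bound
        elif op == "<=":
            ok = version <= bound
        elif op == ">":
            ok = version > bound
        elif op == "<":
            ok = version < bound
        else:  # a bare version or "=" means an exact match
            ok = version == bound
        if not ok:
            return False
    return True


def resolve_pragma(source: str, installed: List[str]) -> Optional[str]:
    """Return the newest installed version allowed by the pragma, if any."""
    match = re.search(r"pragma\s+solidity\s+([^;]+);", source)
    if match is None:
        return None
    constraint = match.group(1)
    candidates = [v for v in installed if satisfies(parse_version(v), constraint)]
    return max(candidates, key=parse_version, default=None)
```

For example, `resolve_pragma("pragma solidity ^0.8.0;", ["0.7.6", "0.8.0", "0.8.19"])` returns `"0.8.19"`. A version chosen this way is only a starting guess: different compiler versions generally emit different bytecode, so the exact version used at deployment is still needed for a match.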

Compilation Process

Standard JSON Input Format

Structure:

{
  "language": "Solidity",
  "sources": {
    "Contract.sol": {
      "content": "pragma solidity ^0.8.0; ..."
    }
  },
  "settings": {
    "optimizer": {
      "enabled": true,
      "runs": 200
    },
    "evmVersion": "london",
    "outputSelection": {
      "*": {
        "*": ["abi", "evm.bytecode", "evm.deployedBytecode"]
      }
    }
  }
}

Compilation Steps

  1. Prepare Input: Construct standard JSON input from user submission
  2. Select Compiler: Choose appropriate compiler version
  3. Resolve Imports: Handle import statements (local files, external URLs)
  4. Compile: Execute compiler with standard JSON input
  5. Extract Artifacts: Extract ABI, bytecode, deployed bytecode
  6. Handle Errors: Parse compilation errors and return to user
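
Step 1 can be sketched as follows. The function name and parameter names are hypothetical and simply mirror the submission fields shown earlier; the output follows the standard JSON structure above.

```python
import json
from typing import Dict


def build_standard_json(
    sources: Dict[str, str],
    optimization_enabled: bool,
    optimization_runs: int,
    evm_version: str,
) -> str:
    """Assemble compiler standard JSON input from a multi-file submission."""
    payload = {
        "language": "Solidity",
        "sources": {path: {"content": text} for path, text in sources.items()},
        "settings": {
            "optimizer": {"enabled": optimization_enabled, "runs": optimization_runs},
            "evmVersion": evm_version,
            "outputSelection": {
                "*": {"*": ["abi", "evm.bytecode", "evm.deployedBytecode"]}
            },
        },
    }
    # Sorted keys keep the serialized input stable for a given submission.
    return json.dumps(payload, sort_keys=True)
```

The resulting string is what step 4 would pass to the compiler, for example on the standard input of `solc --standard-json`.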

Import Resolution

Import Types:

  • Local Files: Included in submission
  • External URLs: Fetch from URL (GitHub, IPFS, etc.)
  • Standard Libraries: Known library packages (OpenZeppelin, etc.)

Resolution Strategy:

  1. Check local files first
  2. Try external URL fetching
  3. Check standard library registry
  4. Fail if cannot resolve
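
The resolution order above amounts to a chain of resolvers tried in sequence. A minimal sketch, assuming each resolver is a callable that returns the file content or `None`; the resolver type and the dictionary-backed example are illustrative assumptions, and a real URL fetcher would need allow-listing and timeouts.

```python
from typing import Callable, Dict, List, Optional

Resolver = Callable[[str], Optional[str]]


def dict_resolver(files: Dict[str, str]) -> Resolver:
    """Build a resolver backed by an in-memory mapping of path to content."""
    return lambda path: files.get(path)


def resolve_import(path: str, resolvers: List[Resolver]) -> str:
    """Try each resolver in order and fail if none can supply the file."""
    for resolver in resolvers:
        content = resolver(path)
        if content is not None:
            return content
    raise FileNotFoundError(f"cannot resolve import: {path}")
```

In use, the list would hold the local-file resolver first, then the URL fetcher, then the standard library registry, so the order of the list encodes the strategy.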

Optimization Settings

Optimizer Configuration:

  • Enabled: Boolean flag
  • Runs: Optimization runs (affects bytecode size vs gas cost)
  • EVM Version: Target EVM version (affects bytecode generation)

Matching Strategy:

  • Must match deployed contract's optimization settings exactly
  • Try multiple optimization combinations if initial match fails
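
One way to organize the retry is to generate candidate settings with the submitted combination first. The default run counts and EVM versions below are illustrative assumptions, not a list mandated by this specification.

```python
from itertools import product
from typing import Dict, Iterator, Sequence


def candidate_settings(
    submitted: Dict,
    runs_options: Sequence[int] = (200, 1, 1000, 10000),
    evm_versions: Sequence[str] = ("london", "paris", "shanghai"),
) -> Iterator[Dict]:
    """Yield the submitted settings first, then fallbacks, without duplicates."""
    first = (
        submitted["optimization_enabled"],
        submitted["optimization_runs"],
        submitted["evm_version"],
    )
    combos = [first]
    combos += [(True, runs, evm) for runs, evm in product(runs_options, evm_versions)]
    combos += [(False, submitted["optimization_runs"], evm) for evm in evm_versions]
    seen = set()
    for enabled, runs, evm in combos:
        if (enabled, runs, evm) in seen:
            continue
        seen.add((enabled, runs, evm))
        yield {
            "optimization_enabled": enabled,
            "optimization_runs": runs,
            "evm_version": evm,
        }
```

Each candidate costs a full compilation, so the fallback lists should stay short and the loop should stop at the first match.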

Bytecode Matching

Matching Process

Goal: Compare compiled bytecode with deployed bytecode.

Steps:

  1. Fetch deployed bytecode from chain via eth_getCode(address)
  2. Extract deployed bytecode from compilation artifacts
  3. Compare bytecodes (exact match required after normalization)
  4. Handle constructor arguments (appended to the creation code in the deployment transaction; they are not part of the deployed bytecode)

Bytecode Normalization

Normalization Steps:

  1. Remove metadata hash (typically the last 53 bytes; the final 2 bytes encode the length of the metadata section)
  2. Remove constructor arguments (when comparing contract creation bytecode)
  3. Compare remaining bytecode

Metadata Hash:

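  • Solidity appends CBOR-encoded metadata, including the metadata hash, to bytecode
  • Format: 0xa2646970667358221220... + 43 bytes (53 bytes in total for an IPFS hash plus compiler version; the length varies with compiler version and settings)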
  • Should be excluded from comparison
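
The metadata section can be located without hard-coding its length, because Solidity ends the bytecode with a 2-byte big-endian value giving the length of the CBOR data that precedes it. A minimal sketch under that assumption (function names are hypothetical):

```python
def strip_metadata(bytecode: bytes) -> bytes:
    """Drop the trailing CBOR metadata and its 2-byte length suffix."""
    if len(bytecode) < 2:
        return bytecode
    cbor_length = int.from_bytes(bytecode[-2:], "big")
    total = cbor_length + 2  # CBOR payload plus the length field itself
    if total > len(bytecode):
        # The suffix cannot be a valid length; leave the bytecode untouched.
        return bytecode
    return bytecode[:-total]


def normalized_match(compiled_runtime: bytes, deployed_runtime: bytes) -> bool:
    """Compare two runtime bytecodes while ignoring their metadata sections."""
    return strip_metadata(compiled_runtime) == strip_metadata(deployed_runtime)
```

Bytecode compiled without appended metadata would be truncated incorrectly by this approach, so a production check would probably also confirm that the stripped bytes decode as CBOR.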

Constructor Arguments Extraction

Purpose: Extract constructor arguments from the contract creation transaction.

Process:

  1. Creation transaction input: creation_code + constructor_args
  2. Deployed bytecode: runtime_code (constructor args are not part of it)
  3. Extract constructor args: the trailing bytes of the creation input after the compiled creation_code (length = creation_input.length - creation_code.length)

Validation:

  • Verify extracted constructor args match user-provided args (if provided)
  • Decode constructor args if ABI available
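
Constructor arguments are ABI-encoded and appended to the creation code in the deployment transaction's input data. A minimal sketch of extracting and checking them, assuming the creation input has already been fetched and that the compiled creation code matches it byte for byte (function names are hypothetical):

```python
def extract_constructor_args(creation_input: bytes, compiled_creation_code: bytes) -> bytes:
    """Return the bytes that follow the creation code in the deployment input."""
    if not creation_input.startswith(compiled_creation_code):
        raise ValueError("creation code does not match the transaction input")
    return creation_input[len(compiled_creation_code):]


def hex_to_bytes(value: str) -> bytes:
    """Decode a hex string with or without a 0x prefix."""
    return bytes.fromhex(value[2:] if value.startswith("0x") else value)


def args_match(extracted: bytes, user_provided_hex: str) -> bool:
    """Check extracted arguments against the user-supplied hex string."""
    return extracted == hex_to_bytes(user_provided_hex)
```

If the metadata hash in the compiled creation code differs from the deployed one, the prefix check fails; a fuller implementation would compare everything except the metadata section before slicing off the arguments.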

Library Linking

Problem: Contracts using libraries have placeholders in bytecode.

Solution:

  1. Identify library placeholders in compiled bytecode
  2. Replace placeholders with actual library addresses
  3. Compare linked bytecode with deployed bytecode

Library Placeholder Format:

  • __$...$__ (Solidity)
  • Must match user-provided library addresses
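
In current Solidity versions each placeholder is 40 characters long: `__$`, 34 hex characters derived from the keccak256 hash of the fully qualified library name, and `$__`. It is replaced by the 40 hex characters of the library address. The sketch below assumes the caller already has a mapping from those 34-character keys to addresses; computing the keys needs keccak256, which is not in the Python standard library, so that step is left out. The function name is hypothetical.

```python
import re
from typing import Dict

PLACEHOLDER_RE = re.compile(r"__\$([0-9a-f]{34})\$__")
ADDRESS_RE = re.compile(r"^(0x)?[0-9a-fA-F]{40}$")


def link_libraries(unlinked_hex: str, addresses_by_key: Dict[str, str]) -> str:
    """Replace every library placeholder with its 40-character hex address."""

    def substitute(match):
        key = match.group(1)
        if key not in addresses_by_key:
            raise KeyError(f"no address supplied for placeholder {key}")
        address = addresses_by_key[key]
        if not ADDRESS_RE.match(address):
            raise ValueError(f"invalid library address: {address}")
        # Drop the optional 0x prefix so the bytecode length is unchanged.
        return address[-40:].lower()

    return PLACEHOLDER_RE.sub(substitute, unlinked_hex)
```

Because placeholder and address have the same length, linking never shifts code offsets, which is why the linked result can be compared directly with deployed bytecode.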

Verification Status Tracking

Status States

States:

  1. pending: Verification submitted, queued for processing
  2. processing: Compilation/verification in progress
  3. verified: Bytecode matches, contract verified
  4. failed: Verification failed (mismatch, compilation error, etc.)
  5. partially_verified: Some source files verified (multi-file contracts)

Status Updates

Database Schema:

contract_verifications (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    chain_id INTEGER NOT NULL,
    address VARCHAR(42) NOT NULL,
    status VARCHAR(20) NOT NULL,
    compiler_version VARCHAR(50),
    optimization_enabled BOOLEAN,
    optimization_runs INTEGER,
    evm_version VARCHAR(20),
    source_code TEXT,
    abi JSONB,
    constructor_arguments TEXT,
    verification_method VARCHAR(50),
    error_message TEXT,
    verified_at TIMESTAMP,
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW(),
    FOREIGN KEY (chain_id, address) REFERENCES contracts(chain_id, address)
)

Status Transitions:

  • pending → processing → verified or failed
  • Webhook/notification on status change (optional)
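
The transitions can be enforced with a small lookup table. In this sketch the move from processing to partially_verified is an assumption, since the specification lists that state but not how it is reached; the function name is also hypothetical.

```python
ALLOWED_TRANSITIONS = {
    "pending": {"processing"},
    # Assumption: partially_verified is reached from processing.
    "processing": {"verified", "failed", "partially_verified"},
    "verified": set(),
    "failed": set(),
    "partially_verified": set(),
}


def transition(current: str, new: str) -> str:
    """Return the new status, or raise if the move is not allowed."""
    if new not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal status transition: {current} -> {new}")
    return new
```

Treating verified and failed as terminal means a retry creates a new verification record instead of mutating an old one, which fits the history-keeping approach described under Version Compatibility.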

Build Artifact Storage

Artifact Types

Artifacts to Store:

  1. Source Code: Original submitted source files
  2. Standard JSON Input: Compiler input
  3. Compiler Output: Full compiler JSON output
  4. ABI: Extracted ABI
  5. Bytecode: Creation and runtime bytecode
  6. Metadata: Compiler metadata

Storage Strategy

Immutable Storage:

  • Use S3-compatible storage (AWS S3, MinIO, etc.)
  • Immutable after verification (no updates)
  • Versioned storage if updates needed

Storage Path Structure:

contracts/{chain_id}/{address}/verification_{id}/
  - source_code.sol
  - standard_json_input.json
  - compiler_output.json
  - abi.json
  - bytecode.txt
  - metadata.json

Database Reference:

  • Store artifact storage path in database
  • Link to contract record
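
A minimal sketch of deriving object keys from the path structure above. Lowercasing the address is an assumption made here to avoid duplicate prefixes for checksummed and non-checksummed forms; the function names are hypothetical.

```python
from typing import Dict

ARTIFACT_FILES = (
    "source_code.sol",
    "standard_json_input.json",
    "compiler_output.json",
    "abi.json",
    "bytecode.txt",
    "metadata.json",
)


def artifact_prefix(chain_id: int, address: str, verification_id: str) -> str:
    """Build the storage prefix for one verification attempt."""
    return f"contracts/{chain_id}/{address.lower()}/verification_{verification_id}/"


def artifact_keys(chain_id: int, address: str, verification_id: str) -> Dict[str, str]:
    """Map each artifact file name to its full object key."""
    prefix = artifact_prefix(chain_id, address, verification_id)
    return {name: prefix + name for name in ARTIFACT_FILES}
```

Only the prefix needs to be stored in the database, since the file names under it are fixed.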

Access Control

Public Access:

  • Verified contracts: Public read access
  • Source code: Public read access
  • Artifacts: Public read access

Private Access:

  • Pending verifications: Owner only
  • Failed verifications: Owner only (optional public)

Sourcify Integration

Sourcify API

Endpoint: GET /api/v1/verify/{chain_id}/{address}

Process:

  1. Query Sourcify API for contract verification
  2. Retrieve source files and metadata
  3. Verify match with deployed bytecode
  4. Store in our database if match

Benefits:

  • Leverage existing verified contracts
  • Automatic verification for popular contracts
  • Reduces manual verification workload

Sourcify Format

Structure:

contracts/
  - {chain_id}/
    - {address}/
      - metadata.json
      - sources/
        - Contract.sol

Metadata Format:

  • Compiler version
  • Settings
  • Source file mapping

Multi-Compiler Version Support

Supported Compilers

Solidity:

  • Versions: 0.4.x through latest
  • Multiple versions per contract (updates)

Vyper:

  • Versions: 0.1.x through latest
  • Similar workflow to Solidity

Version Compatibility

Handling:

  • Support multiple verification attempts with different versions
  • Store all verification attempts (history)
  • Mark latest successful verification as active

Database Schema:

contract_verifications (
    -- ... fields ...
    version INTEGER DEFAULT 1, -- Increment for each new verification
    is_active BOOLEAN DEFAULT true -- Latest successful verification
)

Error Handling

Compilation Errors

Error Types:

  • Syntax errors
  • Type errors
  • Import resolution errors
  • Optimization errors

Response:

  • Return detailed error messages to user
  • Include file and line number
  • Suggest fixes when possible

Verification Failures

Failure Reasons:

  • Bytecode mismatch
  • Constructor arguments mismatch
  • Library address mismatch
  • Optimization settings mismatch

Response:

  • Return specific mismatch reason
  • Suggest correct settings if possible
  • Allow retry with corrected input

Performance Considerations

Compilation Performance

Optimization:

  • Cache compilation results (same source + settings)
  • Parallel compilation for multiple contracts
  • Compiler server pool for load distribution
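
Caching by "same source + settings" needs a key that is stable across equivalent inputs. One possible sketch hashes the compiler version together with a canonical serialization of the standard JSON input; the function name and the choice of SHA-256 are assumptions.

```python
import hashlib
import json
from typing import Dict


def compilation_cache_key(compiler_version: str, standard_json_input: Dict) -> str:
    """Derive a deterministic cache key from the compiler version and input."""
    # Sorted keys and fixed separators make the serialization canonical.
    canonical = json.dumps(standard_json_input, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256()
    digest.update(compiler_version.encode("utf-8"))
    digest.update(b"\x00")  # separator between the version and the payload
    digest.update(canonical.encode("utf-8"))
    return digest.hexdigest()
```

Including the compiler version in the key matters because the same source and settings can compile to different bytecode under different compiler versions.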

Queue Management

Queue System:

  • Use message queue (RabbitMQ, Kafka) for verification jobs
  • Priority queue: User submissions before automated checks
  • Rate limiting per user/IP

Processing Time:

  • Target: < 30 seconds for simple contracts
  • Target: < 5 minutes for complex contracts
  • Timeout: 10 minutes maximum
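
The priority rule can be illustrated with an in-memory queue. This is only a sketch of the ordering, not a replacement for RabbitMQ or Kafka; the class name and priority constants are hypothetical.

```python
import heapq
import itertools

USER_SUBMISSION = 0  # lower value is served first
AUTOMATED_CHECK = 1


class VerificationQueue:
    """Serve user submissions before automated checks, FIFO within a priority."""

    def __init__(self) -> None:
        self._heap = []
        self._counter = itertools.count()  # tie-breaker that preserves arrival order

    def push(self, job_id: str, priority: int) -> None:
        heapq.heappush(self._heap, (priority, next(self._counter), job_id))

    def pop(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)
```

The counter keeps jobs of equal priority in arrival order, so a burst of automated checks cannot reorder earlier user submissions.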

Security Considerations

Source Code Validation

Validation:

  • Validate source code size (max 10MB)
  • Sanitize input to prevent injection attacks
  • Validate compiler version (whitelist known versions)

Artifact Storage Security

Access Control:

  • Verify ownership before allowing updates
  • Audit log all verification submissions
  • Rate limit submissions per user/IP

API Endpoints

Submit Verification

POST /api/v1/contracts/{address}/verify

Check Status

GET /api/v1/contracts/{address}/verification/{verification_id}

Get Verified Contract

GET /api/v1/contracts/{address}

List Verification History

GET /api/v1/contracts/{address}/verifications

References

  • Indexer Architecture: See indexer-architecture.md
  • Data Models: See data-models.md
  • Database Schema: See ../database/postgres-schema.md
  • API Specification: See ../api/rest-api.md