This project manages ENS (Ethereum Name Service) related Google BigQuery data, providing comprehensive analytics on ENS domains, resolvers, registrations, and reverse records.
# Run the smart hybrid incremental pipeline
./scripts/run_incremental_pipeline.sh --project-id=web3-publicgoodsCost:
- First run: ~$30.75 (processes all historical data)
- Subsequent runs: ~$0.35 (incremental updates only)
- 98.9% cost reduction after first run!
# Set up daily scheduled queries with proper time intervals
./scripts/setup_daily_scheduled_queries.sh --project-id=web3-publicgoodsDaily Scheduling System:
- Incremental extraction: Raw events every 2 hours (already running)
- Daily updates: 14 separate queries with staggered execution times
- 2:00-2:45 AM: Event decoding (5 queries, 10-min intervals)
- 3:00-3:30 AM: State computation (4 queries, 5-10 min intervals)
- 4:00-4:30 AM: Production tables (5 queries, 5-10 min intervals)
- Individual queries: Each DDL file runs independently (easier debugging)
- Web UI monitoring in BigQuery console
- ~$1.20/day operating cost (96% reduction from $30.75!)
- Incremental stages: Raw event extraction (expensive operations)
- Full rebuild stages: State computation and aggregation (cost-effective)
- 17 total stages: From raw Ethereum logs to production analytics tables
labels- 134M+ keccak256 → plaintext mappings (enables name resolution)registry- 4M+ ENS domains with hierarchical namesresolvers- 3.8M+ resolver records with text records and addressesreverse_records- 421K+ validated reverse lookupsregistration_periods- 4.8M+ registration events with USD costsresolutions- Joined registry + resolver data
_raw_*- Raw event extractions from Ethereum logs_decoded_*- ABI-decoded event tables_state_*- Latest state computation tables_agg_*- Aggregation intermediate tables
# Validate all SQL without running queries
./scripts/run_incremental_pipeline.sh --dry-run# Option 1: Force full rebuild with incremental pipeline
./scripts/run_incremental_pipeline.sh --force-full
# Option 2: Use legacy full rebuild pipeline (guaranteed fresh start)
./scripts/run_pipeline.sh --project-id=web3-publicgoods# Check scheduled query status
bq ls --transfer_config --transfer_location=us --max_results=10
# Monitor pipeline checkpoint
bq query "SELECT * FROM \`web3-publicgoods.ens._pipeline_checkpoint\`"
# Check table counts
bq query "SELECT COUNT(*) FROM \`web3-publicgoods.ens.registry\`"
bq query "SELECT COUNT(*) FROM \`web3-publicgoods.ens.resolvers\`"
bq query "SELECT COUNT(*) FROM \`web3-publicgoods.ens.reverse_records\`"
# Web UI monitoring
echo "https://console.cloud.google.com/bigquery/scheduled-queries?project=web3-publicgoods"- Primary:
bigquery-public-data.goog_blockchain_ethereum_mainnet_us.* - ENS Labels:
preimagedb.preimages.keccak256(134M+ mappings) - Target Dataset:
web3-publicgoods.ens.*
- Checkpoint Table - Incremental processing tracker
- Functions - UDF functions (NAMEHASH, address extraction)
- Raw Events - Extract ENS events from Ethereum logs (INCREMENTAL)
- Controller Events - Decode NameRegistered/NameRenewed
- Base Registrar Events - Decode NameMigrated events (INCREMENTAL)
- Resolver Events - Decode all resolver events
- Historical Traces - Legacy resolver setName calls
- Registry Events - Decode NewOwner, Transfer events
- Resolver State - Latest values per node
- Registry State - Latest ownership per node
- Aggregated Resolver - Combine text records and addresses
- Resolver Table - Main resolver production table
- Aggregated Registry - Registry analytics
- Registry Table - Main registry production table
- Registration Periods - Registration events with USD costs
- Reverse Records - Validated reverse lookups
- Resolutions - Combined registry + resolver data
- ENS Registry (current + legacy)
- BaseRegistrarImplementation
- ETHRegistrarController
- PublicResolver (all versions)
- ReverseRegistrar
- NameWrapper
The pipeline includes automatic backup capabilities:
# Backup current data before major changes
./scripts/backup_dataset.sh web3-publicgoodsThe pipeline uses a smart hybrid approach:
- Expensive extraction (stages 3 & 5): Incremental processing
- Everything else: Full rebuild (more cost-effective for smaller data)
- Manual runs: $0.35 incremental vs $30.75 full rebuild (98.9% savings)
- Daily scheduled: $1.20/day (includes all 14 production tables)
ddl/ # SQL DDL files
├── create_functions.sql # UDF functions
├── create_*_events.sql # Event extraction and decoding
├── create_state_*.sql # Latest state computation
├── create_aggregated_*.sql # Data aggregation
└── create_*_table.sql # Production table creation
scripts/ # Pipeline automation
├── run_incremental_pipeline.sh # Main pipeline runner (manual)
├── run_pipeline.sh # Legacy full rebuild
├── setup_daily_scheduled_queries.sh # Daily scheduled queries setup
└── backup_dataset.sh # Data backup utility
Ready to get started? Run: ./scripts/run_incremental_pipeline.sh --project-id=web3-publicgoods