No TX Index without ReceiptsManager #3926

Open
ScottyPoi opened this issue Mar 20, 2025 · 3 comments

@ScottyPoi
Contributor

Context:

When --saveReceipts is set, the client maintains a "transaction index" in its DB. The "transaction index" is a key/value mapping of txHash -> [blockHash, txIndex].

To find a transaction by hash, we look up the txHash to retrieve the blockHash and index (index here meaning which tx in the block.transactions array). We then use the blockHash to retrieve the block, and find the transaction at the right index in that block.
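The two-step lookup described above can be sketched roughly as follows. This is a minimal in-memory illustration, not the client's actual code: `txIndexTable`, `blockTable`, `indexBlock`, and `findTxByHash` are hypothetical names, and plain `Map`s stand in for the real DB.

```typescript
// Hypothetical stand-ins for the client's DB tables (assumption: plain Maps).
interface SketchTx {
  hash: string
  data: string
}
interface SketchBlock {
  hash: string
  transactions: SketchTx[]
}

// txHash -> [blockHash, txIndex]
const txIndexTable = new Map<string, [string, number]>()
// blockHash -> block
const blockTable = new Map<string, SketchBlock>()

// Store a block and record where each of its transactions lives.
function indexBlock(block: SketchBlock): void {
  blockTable.set(block.hash, block)
  block.transactions.forEach((tx, i) => {
    txIndexTable.set(tx.hash, [block.hash, i])
  })
}

// Step 1: txHash -> [blockHash, txIndex]. Step 2: fetch the block and pick the tx.
function findTxByHash(txHash: string): SketchTx | undefined {
  const entry = txIndexTable.get(txHash)
  if (entry === undefined) return undefined
  const [blockHash, txIndex] = entry
  return blockTable.get(blockHash)?.transactions[txIndex]
}

indexBlock({
  hash: '0xb1',
  transactions: [
    { hash: '0xaa', data: 'first' },
    { hash: '0xbb', data: 'second' },
  ],
})
```

Note that the index stores only a pointer into the block body, so the transaction data itself is not duplicated.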

Problem:

This is only possible via the ReceiptsManager, which only exists if --saveReceipts is on. Transactions are not indexed otherwise. This does not feel like a necessary coupling. Additionally, in the process of looking up a transaction by hash (via eth_getTransactionByHash), we currently also do an unnecessary retrieval of the receipt from the DB.

Solution:

If we were to maintain the TxIndex independently of Receipts, we could enable Tx lookup by hash without also needing to have --saveReceipts on.

  • At a minimum, we should refactor getTransactionByHash to avoid the extra DB lookup.
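One way the proposed decoupling might look is sketched below. Everything here is an assumption for illustration: `StandaloneTxIndex` and `SketchClient` are hypothetical names, the storage is in-memory, and the real ReceiptsManager API is not modeled. The point is only that the tx index is always maintained, receipts are stored only when the flag is on, and the lookup never touches receipts.

```typescript
interface IndexedTx {
  hash: string
}
interface IndexedBlock {
  hash: string
  transactions: IndexedTx[]
}

// Hypothetical standalone index with no dependency on receipt storage.
class StandaloneTxIndex {
  private entries = new Map<string, { blockHash: string; index: number }>()

  addBlock(block: IndexedBlock): void {
    block.transactions.forEach((tx, index) => {
      this.entries.set(tx.hash, { blockHash: block.hash, index })
    })
  }

  get(txHash: string): { blockHash: string; index: number } | undefined {
    return this.entries.get(txHash)
  }
}

// Hypothetical client: receipts exist only when saveReceipts is true.
class SketchClient {
  readonly txIndex = new StandaloneTxIndex()
  readonly receipts?: Map<string, string[]>
  private blocks = new Map<string, IndexedBlock>()

  constructor(opts: { saveReceipts: boolean }) {
    if (opts.saveReceipts) this.receipts = new Map()
  }

  putBlock(block: IndexedBlock, blockReceipts: string[]): void {
    this.blocks.set(block.hash, block)
    this.txIndex.addBlock(block) // always indexed
    this.receipts?.set(block.hash, blockReceipts) // only with the flag
  }

  // No receipt retrieval on this path: index entry, then block body.
  getTransactionByHash(txHash: string): IndexedTx | undefined {
    const entry = this.txIndex.get(txHash)
    if (entry === undefined) return undefined
    return this.blocks.get(entry.blockHash)?.transactions[entry.index]
  }
}
```

With `new SketchClient({ saveReceipts: false })`, lookups by hash still work even though no receipts are stored.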
@jochem-brouwer
Member

Can we use this issue to write down the database key/values we are currently maintaining, and whether each is always present or only present under a flag?

How do other clients do this? It seems to me that our lookup of txhash -> [blockhash, txindex] and then retrieving the block and then the txindex (?) to lookup the tx is not the way to go (though we also don't want to keep too many key/value pairs, so that the DB size stays manageable).

It would make sense to me to create a table keyed by txhash, and to store two things in this table: (1) the tx data itself and (2) the blockhash of the block it is included in on the canonical chain. That way we can look up which block a tx was included in, or whether it was included at all (pending txs in the mempool would have a data field but no blockhash field).
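The table proposed here could be sketched like this. It is only an illustration of the idea in this comment, with hypothetical names (`TxRecord`, `txRecords`, `addPendingTx`, `markTxIncluded`, `txStatus`) and an in-memory `Map` in place of a DB table.

```typescript
// One record per tx hash: the tx data, plus the including block if there is one.
interface TxRecord {
  data: string
  blockHash?: string // absent while the tx is still pending in the mempool
}

type TxStatus =
  | { status: 'unknown' }
  | { status: 'pending'; data: string }
  | { status: 'included'; data: string; blockHash: string }

const txRecords = new Map<string, TxRecord>()

function addPendingTx(txHash: string, data: string): void {
  txRecords.set(txHash, { data })
}

// Called when the tx lands in a canonical block.
function markTxIncluded(txHash: string, blockHash: string): void {
  const record = txRecords.get(txHash)
  if (record !== undefined) record.blockHash = blockHash
}

function txStatus(txHash: string): TxStatus {
  const record = txRecords.get(txHash)
  if (record === undefined) return { status: 'unknown' }
  if (record.blockHash === undefined) return { status: 'pending', data: record.data }
  return { status: 'included', data: record.data, blockHash: record.blockHash }
}
```

The trade-off is the one raised above: a single lookup returns the tx, at the cost of storing tx data a second time alongside the block bodies.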

@ScottyPoi
Contributor Author

ScottyPoi commented Apr 9, 2025

@jochem-brouwer I think this is comprehensive. I'm not sure about the Skeleton stuff, which is probably dependent on sync flags. I'm also not entirely sure what Heads refers to in the chain DB.

  1. Meta Database:
  • Receipts

    • Value: block transaction receipts
    • Key Derivation: block hash
    • Flags: --saveReceipts
  • TxHash

    • Value: TxIndex object: { blockhash, index }
    • Key Derivation: tx hash
    • Flags: --saveReceipts, --txLookupLimit
  • SkeletonBlock

    • Value: Skeleton block data
    • Key Derivation: block number
  • SkeletonBlockHashToNumber

    • Value: block number
    • Key Derivation: block hash
  • SkeletonUnfinalizedBlockByHash

    • Value: skeleton block data
    • Key Derivation: blockHash
  • SkeletonStatus

    • Value: sync status
    • Key Derivation: fixed key
  • Preimage

    • Value: preimage
    • Key Derivation: hashed key
  2. Blockchain Database:
  • Heads

    • Value: JSON object containing chain heads
    • Key Derivation: Fixed HEADS_KEY constant
  • HeadHeader

    • Value: Latest header
    • Key Derivation: Fixed HEAD_HEADER_KEY constant
  • HeadBlock

    • Value: Latest block
    • Key Derivation: Fixed HEAD_BLOCK_KEY constant
  • HashToNumber

    • Value: Block number
    • Key Derivation: blockHash
  • NumberToHash

    • Value: Block hash
    • Key Derivation: block number
  • TotalDifficulty

    • Value: RLP encoded total difficulty at given block
    • Key Derivation: tdKey(blockNumber, blockHash)
  • Body

    • Value: block body
    • Key Derivation: bodyKey(blockNumber, blockHash)
  • Header

    • Value: block header
    • Key Derivation: headerKey(blockNumber, blockHash)
  3. Trie Database:
  • Trie Node
    • Value: Serialized node
    • Key Derivation: node hash
  • Root
    • Value: root hash
    • Key Derivation: Fixed ROOT constant
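To illustrate what a composite key such as bodyKey(blockNumber, blockHash) could look like, here is a rough sketch. The layout (a one-byte prefix, then the block number as 8 big-endian bytes, then the block hash) and the prefix value are assumptions made for illustration; the client's real constants and encoding may differ. `SKETCH_BODY_PREFIX`, `uint64BE`, `concatBytes`, and `sketchBodyKey` are hypothetical names.

```typescript
// Assumed prefix byte (ASCII 'b'); not taken from the client's source.
const SKETCH_BODY_PREFIX = new Uint8Array([0x62])

// Encode a non-negative safe integer as 8 big-endian bytes.
function uint64BE(n: number): Uint8Array {
  const out = new Uint8Array(8)
  let rest = n
  for (let i = 7; i >= 0; i--) {
    out[i] = rest % 256
    rest = Math.floor(rest / 256)
  }
  return out
}

// Concatenate byte arrays into one new array.
function concatBytes(...parts: Uint8Array[]): Uint8Array {
  const out = new Uint8Array(parts.reduce((len, p) => len + p.length, 0))
  let offset = 0
  for (const p of parts) {
    out.set(p, offset)
    offset += p.length
  }
  return out
}

// prefix || blockNumber (8 bytes, big-endian) || blockHash
function sketchBodyKey(blockNumber: number, blockHash: Uint8Array): Uint8Array {
  return concatBytes(SKETCH_BODY_PREFIX, uint64BE(blockNumber), blockHash)
}
```

Putting the big-endian block number ahead of the hash means that keys with the same prefix sort by block number in an ordered key/value store, which is one common reason for this kind of layout.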

@ScottyPoi
Contributor Author

> How do other clients do this? It seems to me that our lookup of txhash -> [blockhash, txindex] and then retrieving the block and then the txindex (?) to lookup the tx is not the way to go

I am curious too, but I believe that others do the same. We wouldn't want to store tx data, because we're already storing it in the block bodies. As far as I know, looking up the blockhash and index (meaning which tx in the block the hash refers to), and then finding the tx data in the block, is a pretty standard approach.
