Bitcoingraph - A Python library for exploring the Bitcoin transaction graph.
First, install the current version of Bitcoin Core (v0.11.1), either from source or from a pre-compiled executable.
Once installed, you'll have access to three programs: bitcoind (= full peer), bitcoin-qt (= peer with GUI), and bitcoin-cli (RPC command line interface). The following instructions have been tested with bitcoind and assume you can start and run a Bitcoin Core peer as follows:
bitcoind -printtoconsole
Second, you must make sure that your bitcoin client accepts JSON-RPC connections by modifying the Bitcoin Core configuration file as follows:
# server=1 tells Bitcoin-QT to accept JSON-RPC commands.
server=1
# You must set rpcuser and rpcpassword to secure the JSON-RPC api
rpcuser=your_rpcuser
rpcpassword=your_rpcpass
# How many seconds bitcoin will wait for a complete RPC HTTP request
# after the HTTP connection is established.
rpctimeout=30
# Listen for RPC connections on this TCP port:
rpcport=8332
# Index non-wallet transactions (required for fast txn and block lookups)
txindex=1
# Enable unauthenticated REST API
rest=1
Test whether the JSON-RPC interface is working by starting your Bitcoin Core peer, waiting until it has finished starting up, and sending the following cURL request (with your own username and password):
curl --data-binary '{"jsonrpc": "1.0", "id":"curltext", "method": "getblockchaininfo", "params": [] }' -H 'content-type: text/plain;' http://your_rpcuser:your_rpcpass@localhost:8332/
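The same check can be scripted. The following is a minimal sketch using only the Python standard library; it assumes the placeholder credentials and port from the configuration above. The helper names `build_rpc_request` and `call_rpc` are made up for this illustration and are not part of Bitcoingraph.

```python
import base64
import json
import urllib.request


def build_rpc_request(method, params=None, request_id="curltext"):
    """Return the JSON-RPC 1.0 payload as bytes, mirroring the cURL call."""
    payload = {
        "jsonrpc": "1.0",
        "id": request_id,
        "method": method,
        "params": params or [],
    }
    return json.dumps(payload).encode("utf-8")


def call_rpc(method, params=None, user="your_rpcuser", password="your_rpcpass",
             host="localhost", port=8332, timeout=30):
    """Send one JSON-RPC call to bitcoind and return its 'result' field."""
    # HTTP basic authentication, equivalent to user:pass@host in the cURL URL.
    token = "{}:{}".format(user, password).encode("utf-8")
    credentials = base64.b64encode(token).decode("ascii")
    request = urllib.request.Request(
        "http://{}:{}/".format(host, port),
        data=build_rpc_request(method, params),
        headers={
            "Content-Type": "text/plain",
            "Authorization": "Basic {}".format(credentials),
        },
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        reply = json.loads(response.read().decode("utf-8"))
    if reply.get("error"):
        raise RuntimeError(reply["error"])
    return reply["result"]
```

With a running peer, `call_rpc("getblockchaininfo")` should return a dictionary describing the chain state. A connection error usually means that `server=1` is missing or the peer has not finished starting up.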
Third, since Bitcoingraph needs to access non-wallet blockchain transactions by their ids, you need to enable the transaction index in the Bitcoin Core database. This can be achieved by adding the following property to your bitcoin.conf:
txindex=1
... and restarting your Bitcoin Core peer as follows (rebuilding the index can take a while):
bitcoind -reindex
Test non-wallet transaction data access by taking an arbitrary transaction id and issuing the following request using cURL:
curl --data-binary '{"jsonrpc": "1.0", "id":"curltext", "method": "getrawtransaction", "params": ["110ed92f558a1e3a94976ddea5c32f030670b5c58c3cc4d857ac14d7a1547a90", 1] }' -H 'content-type: text/plain;' http://your_rpcuser:your_rpcpass@localhost:8332/
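With the verbose flag set to 1, the request returns the decoded transaction as JSON. As a rough sketch of how such a reply can be inspected, the hypothetical helper below pulls out the transaction id, the coinbase flag, and the outputs. It assumes the field layout of Bitcoin Core releases of this era (`vin`, `vout`, `scriptPubKey.addresses`); newer releases may name some fields differently. The sample dictionary is made-up illustration data, not a real transaction.

```python
def summarize_transaction(tx):
    """Extract id, coinbase flag, and outputs from a decoded transaction."""
    outputs = []
    for vout in tx.get("vout", []):
        script = vout.get("scriptPubKey", {})
        outputs.append({
            "n": vout["n"],
            "value": vout["value"],
            "type": script.get("type"),
            # Some script types carry no address, so default to an empty list.
            "addresses": script.get("addresses", []),
        })
    # Coinbase inputs carry a "coinbase" field instead of a previous txid.
    is_coinbase = any("coinbase" in vin for vin in tx.get("vin", []))
    return {"txid": tx["txid"], "coinbase": is_coinbase, "outputs": outputs}


# Made-up sample in the shape of a verbose getrawtransaction result.
sample_tx = {
    "txid": "ab" * 32,
    "vin": [{"coinbase": "04ffff001d0104", "sequence": 4294967295}],
    "vout": [{
        "value": 50.0,
        "n": 0,
        "scriptPubKey": {"type": "pubkey", "addresses": ["example-address"]},
    }],
}
summary = summarize_transaction(sample_tx)
```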
Finally, bitcoingraph also makes use of Bitcoin Core's HTTP REST interface, which is enabled using the following parameter:
bitcoind -rest
Test it by requesting a sample block by its hash:
http://localhost:8332/rest/block/000000000000000e7ad69c72afc00dc4e05fc15ae3061c47d3591d07c09f2928.json
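The REST check can also be done from Python. This is a minimal sketch that assumes the peer listens on localhost:8332 as configured above; `rest_block_url` and `fetch_block` are illustrative names, not Bitcoingraph functions.

```python
import json
import urllib.request


def rest_block_url(block_hash, host="localhost", port=8332):
    """Build the REST URL that returns a block as JSON."""
    return "http://{}:{}/rest/block/{}.json".format(host, port, block_hash)


def fetch_block(block_hash, host="localhost", port=8332, timeout=30):
    """Fetch and decode one block; the REST interface needs no credentials."""
    url = rest_block_url(block_hash, host, port)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_block` with the sample hash above should return a dictionary for that block, provided the peer was started with the REST interface enabled and has already synchronised the block.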
Once you have reached this point, your Bitcoin Core setup is working. Terminate all running bitcoind instances and launch a new background daemon with the REST interface enabled:
bitcoind -daemon -rest
Bitcoingraph is being developed in Python 3.4. Make sure it is installed on your machine:
python --version
Now clone Bitcoingraph...
git clone https://github.com/behas/bitcoingraph.git
...test and install the Bitcoingraph library:
cd bitcoingraph
pip install -r requirements.txt
py.test
python setup.py install
Running Bitcoingraph on a Mac requires coreutils to be installed, for example via Homebrew:
brew install coreutils
Bitcoingraph stores Bitcoin transactions as a directed labelled graph in a Neo4j graph database instance. This database can be bootstrapped by exporting an initial blockchain dump, performing entity computation over the entire dump as described by Ron and Shamir, and ingesting the result into a running Neo4j instance.
Bitcoingraph provides the bcgraph-export tool for exporting transactions in a given block range from the blockchain. The following command exports all transactions contained in block range 0 to 1000 using Neo4j's header format and separate CSV header files:
bcgraph-export 0 1000 -u your_rpcuser -p your_rpcpass
The following CSV files are created (with separate header files):
- addresses.csv: sorted list of Bitcoin addresses
- blocks.csv: list of blocks (hash, height, timestamp)
- transactions.csv: list of transactions (hash, coinbase/non-coinbase)
- outputs.csv: list of transaction outputs (output key, id, value, script type)
- rel_block_tx.csv: relationship between blocks and transactions (block_hash, tx_hash)
- rel_input.csv: relationship between transactions and the outputs they spend as inputs (tx_hash, output key)
- rel_output_address.csv: relationship between outputs and addresses (output key, address)
- rel_tx_output.csv: relationship between transactions and transaction outputs (tx_hash, output key)
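Because column names live in separate header files, the data files have no header row of their own. To inspect a dump from Python, the two can be combined as in the sketch below. It assumes the `<name>_header.csv` / `<name>.csv` naming used by the import command in this document and a single-line, comma-separated header; `read_with_header` is a hypothetical helper.

```python
import csv
import os


def read_with_header(directory, name):
    """Yield rows of <name>.csv as dicts keyed by <name>_header.csv columns."""
    header_path = os.path.join(directory, name + "_header.csv")
    data_path = os.path.join(directory, name + ".csv")
    # The header file is assumed to hold exactly one line of column names.
    with open(header_path, newline="") as header_file:
        columns = next(csv.reader(header_file))
    with open(data_path, newline="") as data_file:
        for row in csv.reader(data_file):
            yield dict(zip(columns, row))
```

For example, `next(read_with_header("blocks_0_1000", "blocks"))` would return the first exported block as a dictionary. All values are returned as strings.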
The following command computes entities for a given blockchain data dump:
bcgraph-compute-entities -i blocks_0_1000
Two additional files are created:
- entities.csv: list of entity identifiers (entity_id)
- rel_address_entity.csv: assignment of addresses to entities (address, entity_id)
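The heuristic attributed to Ron and Shamir treats all addresses that appear together as inputs of one transaction as belonging to the same entity. The following sketch illustrates that idea with a union-find structure. It is not Bitcoingraph's actual implementation, and the function name and input format are assumptions made for the example.

```python
def compute_entities(transactions):
    """Map each address to an entity id using the multiple-input heuristic.

    `transactions` is an iterable of lists; each list holds the input
    addresses of one transaction.
    """
    parent = {}

    def find(address):
        # Register unseen addresses as their own root.
        parent.setdefault(address, address)
        root = address
        while parent[root] != root:
            root = parent[root]
        # Path compression: point every visited node directly at the root.
        while parent[address] != root:
            next_address = parent[address]
            parent[address] = root
            address = next_address
        return root

    def union(first, second):
        root_first, root_second = find(first), find(second)
        if root_first != root_second:
            parent[root_second] = root_first

    for input_addresses in transactions:
        if not input_addresses:
            continue
        first = input_addresses[0]
        find(first)
        for other in input_addresses[1:]:
            union(first, other)

    # Assign consecutive integer ids, one per connected component.
    entity_ids = {}
    result = {}
    for address in sorted(parent):
        root = find(address)
        result[address] = entity_ids.setdefault(root, len(entity_ids))
    return result
```

Given the inputs `[["A", "B"], ["B", "C"], ["D"]]`, addresses A, B, and C end up in one entity because B links the first two transactions, while D forms an entity of its own.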
Download and install Neo4J community edition (>= 2.3.0):
tar xvfz neo4j-community-2.3.0-unix.tar.gz
Test Neo4J installation:
sudo neo4j start
http://localhost:7474/
Make sure Neo4j is not running and that pre-existing databases are removed:
sudo neo4j stop
sudo rm -rf /var/lib/neo4j/data/*
Switch back into the dump directory and create a new database using Neo4J's CSV importer tool:
sudo neo4j-admin import \
--nodes=:Block=blocks_header.csv,blocks.csv \
--nodes=:Transaction=transactions_header.csv,transactions.csv \
--nodes=:Output=outputs_header.csv,outputs.csv \
--nodes=:Address=addresses_header.csv,addresses.csv \
--nodes=:Entity=entities.csv \
--relationships=CONTAINS=rel_block_tx_header.csv,rel_block_tx.csv \
--relationships=APPENDS=rel_block_block_header.csv,rel_block_block.csv \
--relationships=OUTPUT=rel_tx_output_header.csv,rel_tx_output.csv \
--relationships=INPUT=rel_input_header.csv,rel_input.csv \
--relationships=USES=rel_output_address_header.csv,rel_output_address.csv \
--relationships=BELONGS_TO=rel_address_entity.csv
Then, start the Neo4j shell:
$NEO4J_HOME/bin/neo4j-shell -path $NEO4J_HOME/data
and create the following uniqueness constraints:
CREATE CONSTRAINT ON (a:Address) ASSERT a.address IS UNIQUE;
CREATE CONSTRAINT ON (o:Output) ASSERT o.txid_n IS UNIQUE;
Finally, start Neo4j:
sudo neo4j start
Some Bitcoin addresses have associated public identity information. Bitcoingraph provides an example script which collects such information from blockchain.info:
utils/identity_information.py
The resulting CSV file can be imported into Neo4j with the Cypher statement:
LOAD CSV WITH HEADERS FROM "file://<PATH>/identities.csv" AS row
MERGE (a:Address {address: row.address})
CREATE (a)-[:HAS]->(i:Identity
{name: row.tag, link: row.link, source: "https://blockchain.info/"})
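The Cypher statement reads the columns `address`, `tag`, and `link` from the CSV file. If you want to import identity information from another source, a file with the same columns can be produced as sketched below. The exact output of utils/identity_information.py is not shown here, so the column list is inferred from the statement above, and `write_identities` is a hypothetical helper.

```python
import csv

# Column names inferred from the LOAD CSV statement (row.address, row.tag, row.link).
FIELDS = ["address", "tag", "link"]


def write_identities(path, records):
    """Write identity records (dicts with FIELDS as keys) to a CSV file."""
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(records)
```

Each record is a dictionary such as `{"address": "...", "tag": "...", "link": "..."}`. The header row is required because the statement uses `WITH HEADERS`.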
To install the Neo4j entity plugin, clone its git repository and compile it from source. This requires Maven and a Java JDK to be installed.
git clone https://github.com/romankarl/entity-plugin.git
cd entity-plugin
mvn package
Copy the JAR package into Neo4j's plugin directory.
service neo4j-service stop
cp target/entities-plugin-0.0.1-SNAPSHOT.jar $NEO4J_HOME/plugins/
service neo4j-service start
Bitcoingraph provides a synchronisation script, which reads blocks from bitcoind and writes them into Neo4j. It is intended to be called by a cron job which runs daily or more frequently. For performance reasons it is not a substitute for the bulk import described in steps 1-3.
bcgraph-synchronize -s localhost -u RPC_USER -p RPC_PASS -S localhost -U NEO4J_USER -P NEO4J_PASS --rest
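If the cron job should call a wrapper rather than the command directly, for instance to keep credentials out of the crontab, a small Python script can assemble the call. This is only a sketch: it uses the flags shown in the command above, while the function names and the way credentials are passed in are assumptions.

```python
import subprocess


def build_sync_command(rpc_user, rpc_pass, neo4j_user, neo4j_pass,
                       bitcoind_host="localhost", neo4j_host="localhost"):
    """Assemble the bcgraph-synchronize argument list shown above."""
    return [
        "bcgraph-synchronize",
        "-s", bitcoind_host, "-u", rpc_user, "-p", rpc_pass,
        "-S", neo4j_host, "-U", neo4j_user, "-P", neo4j_pass,
        "--rest",
    ]


def run_sync(command):
    """Run the synchronisation and return its exit code."""
    # Passing a list avoids shell quoting problems with special characters.
    return subprocess.call(command)
```

A wrapper would call `run_sync(build_sync_command(...))` with credentials read from a protected file or the environment and exit with the returned code so that cron can report failures.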
This library is released as open source under the MIT license.