
In our last post we introduced the cornerstone of scaling up blockchain analysis, commonspend, and its pitfalls. In this blog post we’ll explore more complex and novel blockchain analysis scaling methods, their drawbacks and why time is a critical feature of blockchain analytics.
Change prediction is the second most commonly applied UTXO heuristic. It aims to predict which receiving address is controlled by the sender. A hallmark of UTXO blockchains is that an unspent output cannot be partially spent: when an address transacts, every output it uses as an input is consumed in full. The surplus, less the transaction fee, is normally returned to the sender via a change address.
Consider the transaction below and try spotting the change address that belongs to the sender:
The change address is likely 374jbPUojy5pbmpjLGk8eS413Az4YyzBq6. Why? In this case, the prediction logic relies on the fact that this address is in the same format as the input addresses: both use P2SH, where addresses start with a “3”.
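The format-matching logic described above can be sketched in a few lines of Python. This is a minimal illustration rather than any analytics tool's actual implementation: the function names, the prefix table, and the placeholder input and recipient addresses are all assumptions made for the example, and the classifier looks only at mainnet address prefixes without validating checksums.

```python
from typing import Optional, Sequence


def address_format(address: str) -> str:
    """Classify a mainnet Bitcoin address by prefix only (no checksum validation)."""
    if address.startswith("bc1p"):
        return "P2TR"        # bech32m, witness version 1
    if address.startswith("bc1q"):
        return "SEGWIT_V0"   # bech32, witness version 0
    if address.startswith("3"):
        return "P2SH"
    if address.startswith("1"):
        return "P2PKH"       # "legacy" addresses
    return "UNKNOWN"


def predict_change_by_format(
    inputs: Sequence[str], outputs: Sequence[str]
) -> Optional[str]:
    """Return the single output whose format matches the inputs, else None."""
    input_formats = {address_format(a) for a in inputs}
    # Abstain unless every input shares one recognizable format.
    if len(input_formats) != 1 or "UNKNOWN" in input_formats:
        return None
    (fmt,) = input_formats
    matches = [a for a in outputs if address_format(a) == fmt]
    # Abstain when zero or several outputs match: the signal is ambiguous.
    return matches[0] if len(matches) == 1 else None


# Hypothetical placeholders, except the output address quoted in the text.
inputs = ["3HypotheticalInputA", "3HypotheticalInputB"]
outputs = ["374jbPUojy5pbmpjLGk8eS413Az4YyzBq6", "1HypotheticalRecipient"]
print(predict_change_by_format(inputs, outputs))
```

Returning `None` when several outputs match is a deliberate choice in this sketch: the heuristic abstains rather than guesses, since a forced answer is exactly how mis-clustering creeps in.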
Among other factors, rounded amounts (e.g. 0.05 or 0.1 BTC) are often recognized as the actual send, with the rest being redirected to the change address. This suggests that change prediction relies not only on technical indicators, but also on elements of human behavior, like our affinity for rounded numbers.
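The rounded-amount signal can be sketched in the same way. The snippet below is illustrative only: the 0.01 BTC granularity used to decide what counts as "round", the function names, and the placeholder addresses and amounts are assumptions, not values taken from any real transaction. Amounts are handled as integer satoshis to avoid floating-point rounding issues.

```python
from typing import Mapping, Optional

SATS_PER_BTC = 100_000_000


def is_round_amount(value_sats: int, step_sats: int = SATS_PER_BTC // 100) -> bool:
    """Treat a value as round if it is a positive multiple of step_sats (0.01 BTC here)."""
    return value_sats > 0 and value_sats % step_sats == 0


def predict_change_by_round_amount(outputs: Mapping[str, int]) -> Optional[str]:
    """Return the only non-round output when at least one other output is round."""
    non_round = [addr for addr, value in outputs.items() if not is_round_amount(value)]
    has_round = len(non_round) < len(outputs)
    # Abstain unless exactly one output is non-round and a round payment exists.
    if has_round and len(non_round) == 1:
        return non_round[0]
    return None


# Hypothetical outputs: a 0.05 BTC payment and an irregular remainder.
outputs = {
    "recipient_placeholder": 5_000_000,
    "change_candidate_placeholder": 3_141_590,
}
print(predict_change_by_round_amount(outputs))
```

As with the format check, this sketch yields no prediction when both outputs are round or both are irregular, because a behavioral signal this weak is better treated as supporting evidence than as proof.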
Naturally, a more liberal change prediction logic, one that weighs multiple variables in favor of a desired outcome, can lead to misattribution and mis-clustering. In particular, blockchain analytics tools can inadvertently fall into the trap of unsupervised change prediction, which is why it is vital for blockchain investigators to be mindful of the limitations of this approach.
Consider a more challenging example:
We have legacy addresses (starting with a “1”) sending on to two other legacy addresses. So which one is the change address?
The best way to figure out which address is the change address is to look at how each address spends BTC onwards. Usually output addresses receiving rounded amounts are not…










