Bitcoin P2P Network Security

Bitcoin aims to change the current monetary system by having nodes run all around the world. The consensus layer makes this possible: everything boils down to validating blocks and transactions.

But there is another component: the nodes have to be able to communicate with each other, which is made possible by the peer-to-peer network.

A node on the peer-to-peer network has concerns that include, but are not limited to, questions like:

  • What nodes to connect to?
  • Does it have enough peers?
  • How to manage limited bandwidth?
  • Will my transaction get mined?
  • What transactions should I relay?
  • Are my peers malicious?

Fundamentally we can distill all of this down to network nodes gossiping three different pieces of information:

  1. Blocks
  2. Transactions (not yet mined)
  3. Addresses (information that allows nodes to connect with each other)

Peer to Peer Network Design Goals:

  1. Reliable - If a node submits a valid message to the network, it will eventually reach all nodes on the network.
  2. Timely - Messages also need to propagate in a timely manner, and the definition of 'timely' changes from message to message and from user to user. For example, what counts as a reasonable block propagation time is very different for miners than for ordinary users.
  3. Accessible - Anyone should be able to participate without the cost being too high. Right now you can run a Bitcoin node on a Raspberry Pi.
  4. Private - This is important because Bitcoin is money. You don't want your persistent identity to be linked to your transactions.
  5. Upgradable - If a user buys into the ruleset at a particular point in time, they should not be subject to a development cycle that they do not necessarily participate in or agree with.

Reliability

Say you are a node and you create a transaction. You want it to propagate to all of the different mempools, because you don't know where the miners are. One thing that threatens this is a split in the network: your side of the split might not have a miner in it.

The other possibility is that your side of the split has a miner, but not the same hash power as the other side, which would lead to a chain fork. That would be a really bad situation to be in, one that could cause Bitcoin to fail.

So, we don't want network partitions to occur.

There are a few different ways they can occur:

  1. Some partitions can be unintentional, for example through peer prioritisation logic. Suppose we give a little boost to peers that serve us transactions and blocks more quickly. That would probably improve delivery guarantees and help keep our view of the main chain tip up to date.

    Over time, however, nodes could start prioritising peers that are geographically closer to them, because that is where latency is lowest.

    You could imagine Australia and New Zealand gradually becoming their own network, and Asia and Europe becoming their own networks. Eventually this could create a network split.

    We need to take this into consideration when upgrading the software as well: if upgraded nodes prioritise other upgraded nodes, they could end up disconnecting older nodes.

  2. Other partitions could be intentional attacks. One reason to do this is as a technique for double spending: if you are able to split the network into two parts and you are the only one who knows about it, you can spend the same coin on both sides.

    A particular type of partition attack is the eclipse attack. Say there is a victim that thinks it is connected to the network, but in reality all of its connections are to the same adversary. The victim has then effectively been eclipsed from the network.

Under these circumstances the adversary can execute a whole range of attacks. First and foremost, they have unlimited time to generate blocks that include double spends.

Because of how consensus works, the adversary still has to find valid proof-of-work, but they do not need to compete with the hash rate of the rest of the network.
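To get a feel for why "unlimited time" matters, here is a rough back-of-the-envelope sketch. It assumes the difficulty stays fixed and that block discovery time scales inversely with the share of hash rate; the function name and the example fractions are illustrative, not taken from any Bitcoin implementation.

```python
def expected_minutes_per_block(hashrate_fraction: float,
                               target_minutes: float = 10.0) -> float:
    """Expected time for a miner holding `hashrate_fraction` of the total
    hash rate to find one block at the current difficulty (simplified model)."""
    if not 0 < hashrate_fraction <= 1:
        raise ValueError("hashrate_fraction must be in (0, 1]")
    return target_minutes / hashrate_fraction


# Hypothetical adversaries with 1%, 10% and 50% of the network hash rate.
for fraction in (0.01, 0.10, 0.50):
    minutes = expected_minutes_per_block(fraction)
    print(f"{fraction:.0%} of hash rate -> about {minutes / 60:.1f} hours per block")
```

An eclipsed victim has no honest chain to compare against, so even a slow adversary can eventually feed it a valid-looking block.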

Another thing they could do is censor which of the victim's transactions get sent to the network.

They could also de-anonymise the transaction source (i.e. the victim).

They could broadcast fake transactions, which the victim cannot detect because it has no view of the rest of the network to compare against. This has implications for Lightning.

Mitigating Eclipse Attacks

Increasing Number of Peers

We can increase the number of connections. The more connections you have, the more expensive it is for the adversary to take control of all of them. But this runs up against bandwidth, which is a limited resource at both the individual and the network level.

The network-level limit comes from the distinction between inbound and outbound connections. All nodes make outbound connections, but only some nodes accept inbound connections. So we have to make sure that the network as a whole is able to support all the outbound connections needed.

Default numbers are:

  • 8 outbound connections
  • 125 total connections

    NOTE: If your node initiated the connection, it's outbound, otherwise it's inbound. Nodes will send and receive data from both types of connections exactly the same way.

So this is a useful way of mitigating eclipse attacks, but it has its limits.
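The effect of adding connections can be illustrated with a deliberately simple model. It assumes each outbound connection independently lands on an adversary-controlled address with probability `adversary_share`; real peer selection is not this simple, and both the function and the numbers below are illustrative assumptions.

```python
def eclipse_probability(adversary_share: float, outbound: int) -> float:
    """Chance that every outbound connection goes to the adversary,
    assuming independent picks (a simplifying assumption)."""
    if not 0 <= adversary_share <= 1:
        raise ValueError("adversary_share must be in [0, 1]")
    if outbound < 1:
        raise ValueError("outbound must be at least 1")
    return adversary_share ** outbound


# Hypothetical adversary controlling half of the addresses the victim knows.
for outbound in (1, 8, 16):
    print(outbound, "outbound ->", eclipse_probability(0.5, outbound))
```

In this model every extra connection multiplies the attacker's required luck, but each connection also costs bandwidth, which is the bound mentioned above.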

Choosing Diverse Peers

Bitcoin nodes do implement this, but they work with a limited set of information, usually only what is available at the network layer. The logic sorts known addresses into buckets and tries to choose peers from a diversity of buckets, which increases the cost of executing an eclipse attack.
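A minimal sketch of the bucketing idea follows. It assumes IPv4 addresses grouped by their /16 prefix and picks at most one peer per group; the helper names and sample addresses are hypothetical, and this is not how any particular node software stores its address tables.

```python
import ipaddress
import random
from collections import defaultdict


def netgroup(ip: str) -> str:
    """Coarse network group: the /16 prefix of an IPv4 address (assumption)."""
    return str(ipaddress.ip_network(f"{ip}/16", strict=False))


def pick_diverse_peers(candidates, count, rng=None):
    """Pick up to `count` peers, never two from the same network group."""
    rng = rng or random.Random()
    groups = defaultdict(list)
    for ip in candidates:
        groups[netgroup(ip)].append(ip)
    chosen_groups = rng.sample(sorted(groups), min(count, len(groups)))
    return [rng.choice(groups[group]) for group in chosen_groups]


# Three of these sample addresses share one /16, so at most one is chosen.
known = ["10.1.0.1", "10.1.0.2", "10.1.9.9", "192.168.1.1", "8.8.8.8"]
print(pick_diverse_peers(known, 3, random.Random(1)))
```

An adversary who rents many addresses inside one network range gains little under this scheme, because the whole range competes for a single slot.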

Node Operator: Run multiple nodes on different network interfaces

This way the adversary would have to discover all of your nodes and eclipse every one of them, increasing the cost of the attack. This is a good mitigation technique, but it depends on how the user wants to run the software.

Value long-lasting connections

The final technique is to value long-lasting connections. If an adversary has to maintain a connection for a week or a month, that is much more expensive than an hour or a day. But this runs up against the principle of privacy.
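One way to picture "valuing" long-lasting connections is an eviction rule that never drops the oldest peers. The sketch below is a hypothetical policy written for illustration: the `Peer` type, the `protect_longest` parameter and the choice to evict the youngest unprotected peer are all assumptions, not a description of real eviction logic.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Peer:
    name: str
    connected_seconds: int


def choose_eviction_candidate(peers: List[Peer],
                              protect_longest: int = 2) -> Optional[Peer]:
    """Protect the `protect_longest` oldest connections, then return the
    most recently connected of the remaining peers (or None if all are protected)."""
    by_age = sorted(peers, key=lambda p: p.connected_seconds, reverse=True)
    unprotected = by_age[protect_longest:]
    return unprotected[-1] if unprotected else None


peers = [Peer("a", 100), Peer("b", 5000), Peer("c", 30), Peer("d", 900)]
print(choose_eviction_candidate(peers))
```

Under such a rule an attacker cannot displace established peers just by opening many fresh connections; they have to keep connections alive for a long time first.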

Privacy

Privacy at the peer-to-peer level fundamentally means that nobody should be able to connect your persistent identity with your on-chain transactions. One way this link can be made is if a spy is able to see that a particular node was the first to broadcast a transaction. Then, with a high degree of certainty, they can say that the node is participating in the transaction, likely as the one sending the funds.

One piece of information that contributes to this is knowledge of the network topology.

There are many techniques to increase your privacy at the application layer.

Say you are a node and you create a transaction, which is now in your mempool. Ideally the transaction would appear in everyone's mempool at the same time, but that is not how computers work. The way it is implemented, there is a random delay before the transaction is propagated to other mempools, so a node a couple of hops away may receive the transaction before your immediate neighbours do. This makes it harder for a spy to identify the topology and the source.
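The effect of the random delay can be seen in a toy simulation. It assumes an origin node with two neighbours, A and B, where B forwards to a further node C, and it assumes every hop draws an independent exponentially distributed delay. The topology, the delay distribution and the parameter values are illustrative assumptions only.

```python
import random


def two_hop_beats_neighbour(trials: int = 20000,
                            mean_delay: float = 2.0,
                            seed: int = 0) -> float:
    """Fraction of trials in which C (two hops away via B) receives the
    transaction before A (a direct neighbour of the origin)."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        origin_to_a = rng.expovariate(1 / mean_delay)
        origin_to_b = rng.expovariate(1 / mean_delay)
        b_to_c = rng.expovariate(1 / mean_delay)
        if origin_to_b + b_to_c < origin_to_a:
            wins += 1
    return wins / trials


print(two_hop_beats_neighbour())
```

In this toy model the two-hop node wins roughly a quarter of the time, so arrival order alone is an unreliable guide to who sits next to the source.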

But you can imagine that with enough data points the spy will eventually be able to deduce the layout of the network and where transactions are coming from.

One way to make this harder is to have connections that change over time, because now the spy has to hit a moving target. Dynamic connections help maintain transaction privacy, while long-lasting connections help with reliable delivery, so there is an intrinsic tension.

Reliability is the idea that you want everyone to know about your message, but privacy is the idea that you don't want anyone to know the persistent identity behind those transactions. So these ideas are seemingly at odds with one another.

Deeper dive into Privacy

Each message type offers a unique set of information that allows an adversary to de-anonymise your connections, that is, to learn who your neighbours are. Let's dig into each one:

  1. Addresses - Say we have a victim and an adversary. The adversary sends the victim a message advertising a new address, and the victim relays it to its immediate peers. When those peers try to connect to that address, the adversary learns that they are the victim's immediate connections. This is a great oversimplification of how you would actually carry out the attack.
  2. Transactions - There are many techniques that can leak information. For example, the timing of a transaction's propagation can be used to de-anonymise connections.

    Another technique is to create a double spend: two versions of a transaction that conflict with one another. You send one version to each of two target nodes, then probe other nodes to find out which version they have, because a node only accepts the version it saw first. This tells you which of the two target nodes each in-between node is closer to.

    Another probing technique is to create chains of mempool transactions that spend unconfirmed outputs and to observe how nodes respond to queries about them. See the TxProbe paper for details.

  3. Blocks - Blocks are a lot harder to use for figuring out a node's connections. First of all, they are infrequent: the target is one every 10 minutes, and even when one arrives after only a couple of minutes, that is nothing compared to how frequently addresses and transactions can be created and sent around.

    Additionally, there is a network called FIBRE that miners have used to send blocks to one another. If you are not part of that network, blocks can seem to suddenly show up at many different nodes at once.

    So with blocks we don't have the same need to keep the source private as we do with transactions. With blocks, you are just trying to get them out as quickly as possible.

    Transaction Propagation Video(2020):

    https://www.dsn.kastel.kit.edu/bitcoin/videos/transactions/2020.mp4

    Source: https://www.dsn.kastel.kit.edu/bitcoin/videos.html

    Block Propagation Video(2020):

    https://www.dsn.kastel.kit.edu/bitcoin/videos/blocks/2020.mp4

    Source: https://www.dsn.kastel.kit.edu/bitcoin/videos.html

    These videos depict transactions and blocks showing up at different nodes. Transactions show up in mempools at a fairly consistent rate. Blocks are different: within the first second or two most nodes have the block, but then there is a huge slowdown, and the tail end is just nodes that are poorly connected, have limited bandwidth, or something similar.

So each message type gives the adversary a unique set of information for identifying a node's topology, but blocks leak far less of it. We can use that to come up with a solution that increases both privacy and reliability: the block-relay-only connection. A full-relay connection sends addresses, transactions and blocks, but a block-relay-only connection only sends blocks.

[Image: network graph with full-relay and block-relay-only connections]

The idea is that you now have a network graph with both sorts of connection (full-relay: blue, block-relay-only: purple). If a node's full-relay connections get taken over by an adversary, it hopefully still has its block-relay-only connections to learn the chain tip. That dramatically reduces the range of attacks an adversary can execute even after taking over the full-relay connections. There are also important advantages for the network as a whole.
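The difference between the two connection types can be written down as a small relay filter. This is a sketch for illustration: the enum, the message kind strings and `should_relay` are hypothetical names, not identifiers from any node implementation.

```python
from enum import Enum


class ConnType(Enum):
    FULL_RELAY = "full-relay"
    BLOCK_RELAY_ONLY = "block-relay-only"


# Which kinds of gossip each connection type carries, per the text above.
RELAYED = {
    ConnType.FULL_RELAY: {"addr", "tx", "block"},
    ConnType.BLOCK_RELAY_ONLY: {"block"},
}


def should_relay(conn: ConnType, message_kind: str) -> bool:
    """Return True if this kind of message is sent over this connection type."""
    return message_kind in RELAYED[conn]


for kind in ("addr", "tx", "block"):
    print(kind, should_relay(ConnType.FULL_RELAY, kind),
          should_relay(ConnType.BLOCK_RELAY_ONLY, kind))
```

Because addresses and transactions never travel over a block-relay-only link, the probing techniques described earlier have much less to work with on those links.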

This is really cool because earlier we thought that privacy and reliability were at odds with one another, but here we have a solution that increases both.

This is pretty recent: it was merged in September 2019. Building on that, @hebasto opened an issue in October 2019 introducing the idea of anchors to further increase security against eclipse attacks. The idea is that a running node can persist some of its connections to a file and try to reconnect to them on startup. This would be good for reliability but bad for privacy, for the reasons mentioned earlier. So we persist only block-relay-only connections and try to reconnect to those on startup, which avoids the privacy leak of transaction relay discussed above. (Read PR #17428 on bitcoin/bitcoin - https://github.com/bitcoin/bitcoin/pull/17428)
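The anchor idea can be sketched in a few lines. The JSON file format, the `max_anchors` value, the dictionary layout and the sample addresses are assumptions made for this illustration; the PR linked above is the place to look for the actual design.

```python
import json
import os
import tempfile
from pathlib import Path


def save_anchors(path, connections, max_anchors=2):
    """On shutdown, persist only block-relay-only peers (hypothetical format)."""
    anchors = [c["addr"] for c in connections
               if c["type"] == "block-relay-only"][:max_anchors]
    Path(path).write_text(json.dumps(anchors))


def load_anchors(path):
    """On startup, return previously saved anchors, or an empty list."""
    file = Path(path)
    if not file.exists():
        return []
    return json.loads(file.read_text())


# Demo with made-up peers from documentation address ranges.
connections = [
    {"addr": "192.0.2.1:8333", "type": "full-relay"},
    {"addr": "203.0.113.5:8333", "type": "block-relay-only"},
    {"addr": "198.51.100.7:8333", "type": "block-relay-only"},
]
anchor_file = os.path.join(tempfile.mkdtemp(), "anchors.json")
save_anchors(anchor_file, connections)
restored = load_anchors(anchor_file)
print(restored)
```

Full-relay peers are deliberately left out of the file, so reconnecting on startup does not reintroduce the transaction-relay privacy leak.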

Review: https://bitcoincore.reviews/17428