Bitcoin Math Puzzle #003: Selfish Mining Pt. 3 \ stacker news

This will be a series of increasingly difficult Bitcoin math puzzles, designed to walk through the game theory of various bitcoin scenarios in a formal way. We're currently taking a look at selfish mining.

There are two miners, Alice and Bob. Bob is an honest miner and always mines on the public (longest) chain. Alice is a potentially honest, potentially selfish miner who can decide to mine on the public chain, or to secretly work on a private chain, only to publicly broadcast her private chain later.

For the purposes of this problem, we'll use the notation set forth in these posts: #1473180, #1475564, and #1477308.

In particular, we'll assume the following:

Alice has a hash rate of $h_{A}$ , measured in EH/s
Bob has a hash rate of $h_{B}$ , measured in EH/s
Alice's hash rate is less than Bob's: $h_{A} < h_{B}$
The effective difficulty is $d$ , measured in EH. Effective difficulty is the average number of exahashes required to find a block
The block reward for a found block is $R$ , measured in sats (includes any transaction fees, which we'll assume are constant and deterministic)
The cost to produce 1 exahash is $c$ , measured in sats per EH, for both Alice and Bob
The discount rate is $r$ , measured in percent per year, for both Alice and Bob

For now, we'll ignore difficulty adjustments, ignore block subsidy halvings, and ignore latency.

PreliminariesPreliminaries

Based on the above assumptions, we can say the following:

Alice's block finding rate is $λ_{A} = h_{A} / d$
Bob's block finding rate is $λ_{B} = h_{B} / d$
At any given moment, the probability that Alice finds the next block is $x = \frac{h _{A}}{h _{A} + h _{B}}$
At any given moment, the probability that Bob finds the next block is $1 - x = \frac{h _{B}}{h _{A} + h _{B}}$
The average amount of time for Alice to find a block is $1/ λ_{A}$
The average amount of time for Bob to find a block is $1/ λ_{B}$

Description of selfish miningDescription of selfish mining

Whenever Alice and Bob are both mining on the same public chain, we'll call that chain C.

If Bob finds the next block, the chain becomes C-B. He immediately publishes it, and both Alice and Bob will mine on that new chain. Since they're both mining on the same chain, we rename C-B → C.

If Alice finds the next block, she now has chain C-A. Since Alice is selfish, she can decide whether to publish it or not.

If she publishes it, the public chain becomes C-A and both Alice and Bob will mine on it, so we rename C-A → C.
If, however, Alice decides to hide her private chain, the two miners are now mining on different chains. Alice is now mining on C-A and Bob is now mining on C. If Bob finds the next block after that, Alice now has C-A and Bob has C-B. However, if Alice finds the next block after that, she now has C-A-A while Bob still has C. Alice can again decide whether she wants to publish C-A-A, or to keep it private.

In other words, Bob always publishes any block he finds, and always mines on top of the longest chain. Alice, when she finds a block, may decide whether to keep her chain private and mine on it secretly, or to publicize her chain.

In this series of problems, we're going to derive conditions under which Alice has an incentive to not publish her chain, but rather to mine privately.

Value function and state notationValue function and state notation

The strategic state space can be described by two numbers, a and b. a measures how many blocks Alice has mined on top of C in her private chain, and b measures how many blocks Bob has mined on top of C in the public chain (only when the private chain and the public chain diverge).

So for example:

If Alice is mining on C-A while Bob is mining on C, then (a,b) = (1,0).
If Alice is mining on C-A while Bob is mining on C-B, then (a,b) = (1,1).
If Alice is mining on C-A-A while Bob is mining on C, then (a,b) = (2,0).
If Alice and Bob are both mining on C, then (a,b) = (0,0).
In fact, if at any time Alice and Bob are both mining on the same public chain, then (a,b) = (0,0).
If Alice ever publishes her private chain which is ahead of the public chain, the state resets to (a,b) = (0,0) because Bob always mines on the longest published chain.

Let $V (a, b)$ be Alice's value function in state (a,b) (i.e. the net present value of mining while the state is a, b).

Let $P (a, b)$ be Alice's value function of publishing her private chain while in state (a,b). We note that:

P (a, b) = {R a + V (0, 0) if a > b V (0, 0) if a \leq b

This is because if Alice publishes when $a > b$ , then her chain becomes the longest public chain (which Bob now mines on), and she earns the accumulated rewards she had hidden up to this point. But if she publishes when $a \leq b$ , her chain is not the longest and will not be accepted as the public chain.

ProblemsProblems

Write a Bellman equation for the value function $V (0, 0)$ . Hint: It should include the terms $h_{A}$ , $c$ , $λ_{A}$ , $P (1, 0)$ , and $V (1, 0)$ .

Bounty: 500 sats (payable by zap) to the best answer after 24 hours. This means you have to not only give a correct answer, but also document your reasoning. Suspected AI use will be ignored.

If I make a mistake in my puzzles, please do let me know. We're working this through together!

Hint on Bellman equations

If the current state is $x$ that gives flow utility $u (x)$ , and two possible events could happen, event 1 and event 2, with Poisson rates $λ_{1}$ and $λ_{2}$ , and event 1 pushes you into state $x_{1}$ while event 2 pushes you into state $x_{2}$ , the Bellman equation would be:

r V (x) = u (x) + λ_{1} [V (x_{1}) - V (x)] + λ_{2} [V (x_{2}) - V (x)]

4 replies \ @Murch 29 Apr

But if she publishes when a ≤ b, her chain is not the longest and will not be accepted as the public chain.

Nit: when a = b, it is an alternative longest chain, but most nodes would have seen Bob’s chain already and therefore would not switch to her chaintip.

Alice and Bob produce about d exahashes until a new block is found. Alice is producing $xd$ of those hashes. Therefore, she is expected to expend $u(x) = -xdc$ until a block is found either by Alice or Bob.

$r V (0, 0) = - x d c + λ_{A} [V (1, 0) - V (0, 0)] + λ_{B} [V (0, 1) - V (0, 0)]$

If Bob finds the next block, Bob immediately publishes his block, and both Alice and Bob resume mining on top of a shared chaintip. Therefore, the state $V(0, 1) = V(0, 0)$ and the transition results only in the aforementioned cost and no value addition: $λ_{B}[V(0, 1) - V(0, 0)] = λ_{B}[V(0, 0) - V(0, 0)] = 0$

$r V (0, 0) = - x d c + λ_{A} [V (1, 0) - V (0, 0)] + 0$

If Alice finds a block, she pulls ahead and gains one block reward, i.e., $λ_{A}[V(1, 0) - V(0, 0)] = λ_{A}R×1$.

$r V (0, 0) = - x d c + λ_{A} R$

TBH, I’m writing this up a bit too quickly, and I’m not sure whether the flow of utility shouldn’t rather be per second, so please don’t be shocked if I’m completely lost—this was just a quick and dirty guess. ;)

Edit: I’m also not sure why the inline math is not rendering as expected. :(
Time for dinner, tho.

46 sats \ 3 replies \ @SimpleStacker OP 29 Apr

You are quite close.

The flow cost should be $h_{A} c$ rather than $x d c$ , since I'm no longer constraining $h_{A} + h_{B} = d$ as I was previously.
You are correct that $V (0, 1) = V (0, 0)$ because there's no point of Alice mining secretly on C while Bob mines on C-B.
Where you are incorrect: We cannot say that $V (1, 0) - V (0, 0) = R$ . That's because $V (1, 0)$ is the value function of privately mining on C-A while Bob mines on C.
- Note: We can say that $P (1, 0) - V (0, 0) = R$ . $P (1, 0)$ is the value function of publishing C-A when Alice gets ahead. But $V (1, 0) \neq = P (1, 0)$ necessarily.

I’m also not sure why the inline math is not rendering as expected

Inline math needs double dollar signs: $$ y = f(x) $$

525 sats \ 2 replies \ @Murch 29 Apr

Thanks, I feel I understand much better what you meant with V(x) and P(x), then.

Adding to above reasoning, I replace the flow cost with the corrected value: $u (x) = - h_{A} c$

Yet, I am still not sure how to exactly compose the value of Alice finding the next block. It seems to me that it is dependent on whether Alice will publish once she gets ahead. From what I know this would depend on Alice’s hashrate. I therefore introduce another term, a binary decision function that returns 0 or 1 if Alice would publish the state based on a, b, and x: D(a, b, x):

$r V (0, 0) = - h_{A} c + λ_{A} {D (1, 0, x) [P (1, 0) - V (0, 0)] + [1 - D (1, 0, x)] [V (1, 0) - V (0, 0)]}$

If she publishes, she is one block ahead and gets one block reward:

$[P (1, 0) - V (0, 0)] = R$

$r V (0, 0) = - h_{A} c + λ_{A} {D (1, 0, x) R + [1 - D (1, 0, x)] [V (1, 0) - V (0, 0)]}$

If she does not publish, she gains the reward of mining with a one-block lead V(1, 0), which in turn would either resolve to Bob catching up or Alice pulling ahead further, but is left as a black box because it was given in the problem statement.

167 sats \ 1 reply \ @SimpleStacker OP 29 Apr

Yes, this is correct! Congrats

The challenge going forward will be characterizing the function you introduced, $D (a, b)$ .

Just as a quick note, I would have written it like this, but your introduction of D(a,b) is also correct.

r V (0, 0) = - h_{A} c + λ_{A} [max {P (1, 0), V (1, 0)} - V (0, 0)]

71 sats \ 0 replies \ @Murch 29 Apr

Aye! Expressing it with a maximum function is much more elegant. Thanks for the guidance. This series is fun.

Oh and thanks on clearing up what I was doing wrong about the inline math. Looks much better now. :)

71 sats \ 2 replies \ @BITC0IN 28 Apr

I understand some of these words.

21 sats \ 1 reply \ @SimpleStacker OP 28 Apr

It's a bit advanced in terms of notation. But the answer doesn't require any tricky computation. Once a person understands how to write a Bellman equation, the answer becomes fairly straightforward.

4 sats \ 0 replies \ @BITC0IN 28 Apr

said the wizard to the frog