Teach me zero knowledge proofs

Posted · 7 min read

Suhas

I want to learn zero knowledge proofs mathematically. Can you explain them to me and then generate sample questions that I can solve? You’ll then have to grade my answers.

GPT-4

Zero-Knowledge Proofs are a fascinating concept from the realm of cryptography, privacy, and theoretical computer science. They provide a method by which one party (the prover) can prove to another party (the verifier) that they know a value x, without conveying any information apart from the fact that they know the value x.

Let’s break this down further:

  1. Completeness: If the statement is true, the honest verifier (that is, one following the protocol properly) will be convinced of this fact by an honest prover.

  2. Soundness: If the statement is false, no cheating prover can convince the honest verifier that it is true, except with some small probability.

  3. Zero-knowledge: If the statement is true, no verifier learns anything other than this fact. This is formalized by showing that every verifier has some simulator that, given only the statement to be proven (and no access to the prover), can produce a transcript that looks like an interaction between the honest prover and the verifier in question.

A classic example of zero-knowledge proofs involves demonstrating knowledge of a solution to a Sudoku puzzle. But let’s consider a simpler mathematical instance: graph three-coloring.

Zero-Knowledge Graph Three-Coloring:

Suppose Alice has a graph that she claims to have colored with only three colors such that no two adjacent nodes share the same color. Bob doesn’t believe her, but Alice doesn’t want to simply show him the colored graph (this is the secret she wants to keep). Instead, Alice wants to convince Bob that the graph is three-colorable without revealing the actual coloring.

Here’s how they might accomplish that using a Zero-Knowledge Proof:

Alice randomly permutes the three colors and sends the graph to Bob with every node’s color hidden (committed to, but not revealed).

Bob picks an edge, that is, two adjacent nodes, and asks Alice to reveal their colors.

If they’re different colors, Bob becomes more confident that Alice has a valid coloring. If they’re the same color, Alice is caught lying.

Repeat the steps several times. Each time, Alice must re-permute the colors and Bob checks another randomly chosen edge. After a sufficient number of rounds, Bob can be arbitrarily certain that Alice has a valid three-coloring.
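The rounds above can be sketched as a toy simulation in Python. Note this is only an illustration: the hidden colors live in a plain dictionary, whereas a real protocol would use a cryptographic commitment scheme so Bob genuinely cannot peek.

```python
import random

def run_round(edges, coloring):
    """One round: Alice re-permutes the colors, Bob checks one random edge.

    Returns True if the two revealed endpoints have different colors
    (i.e., Alice passes this round)."""
    # Alice applies a fresh random permutation of the three colors
    perm = dict(zip([0, 1, 2], random.sample([0, 1, 2], 3)))
    hidden = {node: perm[c] for node, c in coloring.items()}  # "committed" colors
    # Bob picks a random edge and asks Alice to reveal its endpoints
    u, v = random.choice(edges)
    return hidden[u] != hidden[v]

# Toy example: a triangle, which is properly three-colorable
edges = [(0, 1), (1, 2), (0, 2)]
good_coloring = {0: 0, 1: 1, 2: 2}
print(all(run_round(edges, good_coloring) for _ in range(100)))   # True

# A cheating Alice (nodes 0 and 1 share a color) is caught eventually
bad_coloring = {0: 0, 1: 0, 2: 1}
print(any(not run_round(edges, bad_coloring) for _ in range(200)))  # almost certainly True
```

A valid coloring survives every round, because permuting the colors preserves distinctness; an invalid one is exposed as soon as Bob happens to pick the bad edge.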

Mathematical Questions:

In the zero-knowledge graph-coloring problem, how many rounds would you need to be 99.9% sure that Alice is telling the truth, given that the graph has 100 nodes and about 150 edges? Assume that a cheating Alice has at least one miscolored edge, which could be any of the 150.

In a Zero-Knowledge Proof protocol, if a dishonest prover has a 5% chance of successfully fooling the verifier in one round, what is the probability that the prover can fool the verifier in three rounds of interaction?

I’ll grade your answers once you’re done.

Suhas

Wow, well I’m taking a wild guess here for 1, as I’m not sure I’m clever enough to figure this out quickly — is it similar to a monte carlo trial — so with each round my confidence increases — but to get to 99.9% — maybe a few 100s of rounds? I’m not sure actually. How would you do this?

GPT-4

The question indeed hinges on a statistical approach, reminiscent of Monte Carlo simulations, as you’ve noticed.

Here’s a more detailed way to reason about this:

We know that if Alice is lying about the coloring, there’s at least one pair of adjacent nodes with the same color. If Bob picks an edge uniformly at random to check, then there’s a chance that he’ll hit upon this incorrectly colored edge and expose Alice’s lie.

If there are about 150 edges and at least one is colored incorrectly, Bob has at least a 1/150 chance of picking that edge in each round. However, Alice could potentially get lucky and keep getting away with her lie.

We want to find out how many rounds we need such that we are 99.9% sure that Bob will catch a lying Alice. Essentially, we want to know when the cumulative probability of Alice escaping detection falls below 0.1%.

Let’s consider a single round. The chance that Alice gets away with a lie in that round is 1 - (1/150) = 149/150.

If we have n rounds, Alice has to get lucky n times in a row. Because the rounds are independent, we can multiply the probabilities together. So, the chance that Alice gets away with a lie for n rounds is (149/150)^n.

We want to find the smallest n such that (149/150)^n < 0.001. This involves some logarithmic calculation:

Let’s rewrite (149/150)^n = 0.001

Taking the natural log of both sides, we get n * ln(149/150) = ln(0.001)

Rearranging for n, we have n = ln(0.001) / ln(149/150).

Using a calculator, we find n ≈ 1032.65. We always round up in these situations (since we can’t do a fraction of a round), so we need 1033 rounds.

This means you would need 1033 rounds to be 99.9% sure Alice is telling the truth if the graph has 100 nodes and about 150 edges.
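The calculation is easy to check with a couple of lines of Python:

```python
import math

p_catch = 1 / 150   # Bob's chance of hitting the miscolored edge in one round
target = 0.001      # allowed probability that a lying Alice survives every round

# Smallest integer n with (149/150)^n < 0.001
n = math.ceil(math.log(target) / math.log(1 - p_catch))
print(n)  # 1033
```

With only a 1-in-150 chance of being caught per round, it takes over a thousand rounds to drive Alice’s escape probability below 0.1%.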

Now, for the second question: if a dishonest prover has a 5% chance of successfully fooling the verifier in one round, what is the probability that the prover can fool the verifier in three rounds of interaction?

Suhas

OK, so it has to get increasingly harder to fool the verifier… so a conditional probability where E1 = first pick, E2 = second pick and E3 = third pick… if E1 = 5%. we need E1 * E2 * E3? How do you get to 5% though from 150 edges? Do I assume that there are X miscolored edges? I think I’ll need your help with this too…

GPT-4

In this question, it’s given that the dishonest prover has a 5% chance of successfully fooling the verifier in one round. That is, in any single round, the prover can mislead the verifier with a probability of 0.05. This doesn’t tie in with the graph coloring problem directly; it’s a separate question.

For each round of the zero-knowledge proof protocol, the chances are independent (that is, what happened in previous rounds doesn’t affect the current round). This means we can simply multiply the probabilities together for each round.

So, if the chance of fooling the verifier once is 0.05 (or 5%), then the chance of fooling them three times in a row is:

0.05 * 0.05 * 0.05 = 0.000125.

So the dishonest prover has a 0.0125% chance of successfully fooling the verifier in all three rounds of interaction. It’s indeed increasingly hard for the prover to fool the verifier as the number of rounds increases. This is a key aspect of zero-knowledge proofs; the more rounds of interaction, the more confident the verifier can be in the prover’s claim.
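Because the rounds are independent, the general rule is simply that the survival probability is p raised to the number of rounds. A quick sketch (the function name `fool_probability` is just an illustrative choice):

```python
def fool_probability(p, n):
    """Chance a dishonest prover survives all n independent rounds."""
    return p ** n

print(f"{fool_probability(0.05, 3):.4%}")  # 0.0125%
```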

Suhas

Oh of course duh :D

GPT-4

No worries at all! These are complex concepts and it’s completely normal to take some time to understand them. That’s what learning is all about! If you have any more questions about zero-knowledge proofs or any other topic, feel free to ask. I’m here to help!

Suhas

Can you give me another problem to solve?

GPT-4

Sure, I’d be happy to. Here’s another question that deals with the probability aspects of zero-knowledge proofs.

Question:

Let’s continue with the scenario of a dishonest prover trying to fool a verifier in the context of zero-knowledge proofs. If the prover has a 2% chance of successfully fooling the verifier in one round, what is the probability that the prover can fool the verifier in 10 rounds of interaction?

Remember, each round is independent and the chances of success for the prover in each round remain constant at 2%.

Solve this and I’ll check your work.

Suhas

Simple, 0.02 ^ 10

GPT-4

Exactly right! Each round is independent, and the prover has a 2% (or 0.02) chance of successfully fooling the verifier in each round. To find the probability of this happening 10 times in a row, we simply raise 0.02 to the power of 10.

Using a calculator, 0.02^10 = 1.024e-17, or about 1.024 x 10^-17.

This is an extremely small probability, demonstrating how unlikely it is for a dishonest prover to fool a verifier over multiple rounds of interaction in a zero-knowledge proof protocol. This property is why these protocols can be very reliable for demonstrating knowledge without revealing any additional information.
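A one-liner confirms the order of magnitude:

```python
p, n = 0.02, 10
print(f"{p ** n:.3e}")  # 1.024e-17
```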

Great job! If you have any more questions or need further clarification, don’t hesitate to ask.