Lecture 36: 04/27/2012

Pollard’s Rho Algorithm and Integer Factorization.

Suppose you’re given a number n = p*q where p and q are large primes. We’d like to factor n efficiently, say in O(\sqrt(p)) time, where p <= q.

We bounced a couple ideas around like the grade school method of factoring a number, or even using the birthday problem. However, these don't give us a good enough running time.

At last we came to the idea of Floyd's cycle-finding algorithm and using it to factor our number n. We start with two pebbles, one slow the other fast, at x_0, and at each step we compute x_slow = f(x_slow) and x_fast = f(f(x_fast)). If we find that x_slow == x_fast, then we've found a cycle.
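
To make the two-pebble idea concrete, here is a minimal Python sketch. The function name floyd_meeting_step and the example map f(x) = x^2 + 1 (mod 11) are illustrative choices of mine, not something from lecture.

```python
def floyd_meeting_step(f, x0):
    """Return the first step t >= 1 at which the slow and fast pebbles coincide.

    After t steps the slow pebble sits at x_t and the fast pebble at x_{2t}.
    """
    slow = f(x0)        # one step per round
    fast = f(f(x0))     # two steps per round
    steps = 1
    while slow != fast:
        slow = f(slow)
        fast = f(f(fast))
        steps += 1
    return steps


if __name__ == "__main__":
    print(floyd_meeting_step(lambda x: (x * x + 1) % 11, 0))
```

For f(x) = x^2 + 1 (mod 11) starting at 0, the sequence is 0, 1, 2, 5, 4, 6, 4, 6, ... so the pebbles first meet after 4 steps (x_4 = x_8 = 4).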

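We'd like to iterate a pseudo-random generator such as f(x) = x^2 + 1 (mod p) and look for a cycle, but that supposes we know p ahead of time. Pollard's algorithm is a modified version of Floyd's algorithm that gets around this: we iterate f(x) = x^2 + 1 (mod n) instead, and at each step we compute d = gcd(|x_slow – x_fast|, n). If x_slow == x_fast (mod p), then p divides d, so whenever d != 1 we've found a non-trivial factor of n, unless of course x_slow == x_fast (mod n), in which case d = n.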

The time until x_slow == x_fast (mod p) is the length of the transient segment plus at most the length of the cycle. By the birthday problem, this is expected to be about \sqrt(p), so the algorithm terminates within the desired running time. If it returns an answer, the answer is correct. However, there is a non-zero probability that the algorithm will not return an answer at all.
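
Here is a short Python sketch of the whole procedure, assuming the sequence is iterated mod n (since p is unknown in practice). The name pollard_rho, the starting value x0 = 2 and the constant c = 1 are my own illustrative defaults. Returning None stands for the "no answer" case.

```python
from math import gcd


def pollard_rho(n, x0=2, c=1):
    """Try to find a non-trivial factor of n; return None if this run fails."""
    def f(x):
        return (x * x + c) % n

    slow = fast = x0
    while True:
        slow = f(slow)          # slow pebble: one step
        fast = f(f(fast))       # fast pebble: two steps
        d = gcd(abs(slow - fast), n)
        if d == n:
            # The pebbles met mod n itself, so this run gives no information.
            return None
        if d != 1:
            return d


if __name__ == "__main__":
    print(pollard_rho(8051))
```

For example, 8051 = 83 * 97, and with the defaults above the third step gives gcd(|677 – 871|, 8051) = 97. When the function returns None, one can retry with a different x0 or c.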

There was one last topic we wanted to cover, although we didn't have time. What if you want to generate a random number AND know its factorization? There is a nice randomizing algorithm to generate some large n and its prime factorization. If you'd like to see it, let me know and I'll write up a nice post about it.

Also, Professor Blum will not be giving any more lectures next week, but we can still organize some lectures with students on Wednesday and Friday to see something new. If anyone is interested in learning about something, let me know and we'll make it happen.

Our last class will be on Monday, taught by Steven Rudich. Please remember to sign up for a final exam slot starting on Tuesday. I will make a separate post for the exam requirements.


Lecture 35: 04/25/2012

Today: Matrix Multiplication and Set Equality

We have a nice implementation of a fast matrix multiplication algorithm, say the Coppersmith-Winograd algorithm, which runs in O(n^2.38) time. It’s unbelievably complicated and you’d like to know if it even computes the correct answer.

How could you verify the answer? Well we could just do the vanilla matrix multiplication, but that runs in O(n^3) time, which would defeat the purpose of having a faster algorithm. How do we do better? I have an idea! Let’s try a randomizing algorithm!

Note that matrix multiplication is associative. One thing we can do is choose a random vector r \in {0,1}^n (flip coins, look at the stars). Compute d = B*r and then e = A*d, so that e = A*(B*r) = (A*B)*r, in O(n^2) time. Likewise, compute f = C*r in O(n^2) time, where C is the claimed product. We can show that if A*B != C, then e != f with probability at least 1/2.
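
As a rough sketch of this check in Python (assuming square matrices stored as lists of rows; the names mat_vec and probably_equal_product and the trials parameter are mine), repeating the test drives the error probability down to at most 2^(-trials):

```python
import random


def mat_vec(M, v):
    """Multiply matrix M (list of rows) by vector v in O(n^2) time."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def probably_equal_product(A, B, C, trials=20):
    """Return False if A*B != C is detected, True if every random test passes."""
    n = len(C)
    for _ in range(trials):
        r = [random.randint(0, 1) for _ in range(n)]   # random 0/1 vector
        e = mat_vec(A, mat_vec(B, r))                  # A*(B*r), never forms A*B
        f = mat_vec(C, r)
        if e != f:
            return False    # definitely A*B != C
    return True             # A*B == C, except with probability <= 2**-trials


if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    B = [[5, 6], [7, 8]]
    print(probably_equal_product(A, B, [[19, 22], [43, 50]]))
```

A False answer is always right. Only a True answer can be wrong, and only when A*B != C.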

Since this is one of the final exam questions, make sure you understand all of the details!

I’ll be making a separate post for the final exam so that everyone understands what they have to do.

Lecture 34: 04/23/2012

The Final: 1/2 hour meeting with the professor to discuss some problems

I’ll make a Doodle registration link soon and you’ll sign up for meetings just like last time. However, this time each meeting will be 30 minutes, and we’ll make sure there’s time for a restroom break and lunch.

The problems are listed on the boards so don’t put off studying!

Today’s topic (at the end): A randomizing algorithm to detect if an error in transmission of a string occurred.

It has a similar flavor to the Schwartz-Zippel theorem for testing if a polynomial is identically zero. However, while the latter uses the fact that a polynomial can only have so many roots, the former uses the Chinese Remainder Theorem under the hood.

It’s a nice application of the CRT so make sure you take a look!
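
These notes don't spell the algorithm out, so the following is only a sketch of one standard scheme with this flavor, and it may differ from what was done in lecture: treat the string as a big integer and compare it with the received copy modulo a random prime. All names here (is_prime, random_prime, fingerprint, looks_intact) and the prime bound of 10^6 are my own assumptions.

```python
import random


def is_prime(m):
    """Trial division; fine for the small primes used in this sketch."""
    if m < 2:
        return False
    i = 2
    while i * i <= m:
        if m % i == 0:
            return False
        i += 1
    return True


def random_prime(limit):
    """Pick a random prime below limit by rejection sampling."""
    while True:
        candidate = random.randrange(2, limit)
        if is_prime(candidate):
            return candidate


def fingerprint(message, p):
    """Interpret the bytes as one big integer and reduce it mod p."""
    return int.from_bytes(message, "big") % p


def looks_intact(sent, received, limit=10**6, trials=10):
    """Compare fingerprints of both copies under several random primes."""
    for _ in range(trials):
        p = random_prime(limit)
        if fingerprint(sent, p) != fingerprint(received, p):
            return False    # definitely corrupted
    return True             # probably identical


if __name__ == "__main__":
    print(looks_intact(b"hello world", b"hello worle"))
```

In a real transmission only p and the short fingerprint would be sent alongside the message. A corrupted message slips through only if the chosen prime divides the difference of the two integers, and a non-zero integer has only a limited number of distinct prime factors.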

Lecture 33: 04/18/2012

Another go at perfect matchings but this time, we want to look at bipartite graphs. The difference is we don’t want to use Tutte’s theorem to help us. So what can we do?

We may proceed using an idea similar to the Tutte matrix. However, since the graph is bipartite, the Tutte matrix will have many zeroes. Therefore, we may simplify our representation of the adjacency matrix.

At the end we saw a neat application of using the inverse of our simplified matrix to tell us about the matchings in the graph.
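
A sketch of how the randomized test might look for bipartite graphs, assuming the simplified matrix has one row per left vertex and one column per right vertex, with a variable in entry (i, j) whenever i and j share an edge. That choice of matrix, the prime P and all function names are my assumptions, and the code only tests existence of a perfect matching (it does not use the inverse).

```python
import random

P = 2**31 - 1   # a prime; all arithmetic is done mod P


def det_mod_p(M, p):
    """Determinant of a square matrix mod prime p by Gaussian elimination."""
    M = [[x % p for x in row] for row in M]
    n = len(M)
    det = 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return 0
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            det = -det                        # a row swap flips the sign
        det = det * M[col][col] % p
        inv = pow(M[col][col], p - 2, p)      # inverse via Fermat's little theorem
        for r in range(col + 1, n):
            factor = M[r][col] * inv % p
            if factor:
                for c in range(col, n):
                    M[r][c] = (M[r][c] - factor * M[col][c]) % p
    return det % p


def bipartite_has_perfect_matching(n, edges, trials=5):
    """edges: pairs (i, j), left vertex i and right vertex j, both in range(n)."""
    for _ in range(trials):
        M = [[0] * n for _ in range(n)]
        for i, j in edges:
            M[i][j] = random.randrange(1, P)  # random value for this edge's variable
        if det_mod_p(M, P) != 0:
            return True     # a non-zero determinant certifies a perfect matching
    return False            # probably no perfect matching


if __name__ == "__main__":
    print(bipartite_has_perfect_matching(2, [(0, 0), (1, 1)]))
```

A True answer is always correct. A False answer can be wrong only if every random substitution happened to hit a root of the determinant polynomial.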

I’m uploading the pictures of the chalk boards, but as a special treat, we get to look at Prof. Blum’s personal notes. If these are helpful please let me or the professor know.

Lecture 32: 04/16/2012

More Matching. It’s starting to seem like perfect matchings are everywhere in computer science.

Lots of examples with graphs and Tutte matrices. We also looked at what the determinant of a Tutte matrix actually tells you about a graph when you compute it all out. Surprisingly, it tells you more than just perfect matchings. It will tell you if there is a Hamiltonian cycle in a graph.

So what? Well now we know that computing a determinant symbolically is at least as hard as determining if a graph has a Hamiltonian cycle, which is NP-complete. That’s hard, let me tell you.

Lecture 31: 04/13/2012

A recap of Wednesday’s material at the beginning. We changed our definition of the degree of a multivariable polynomial slightly to make the proof of the Schwartz-Zippel Theorem a bit simpler.

Let f(x_1, …, x_n) be a polynomial in n variables over some field F.

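A deterministic algorithm works like this:
Fix d to be the degree of f.

Choose d+1 values of each x_i for 1 <= i <= n, and evaluate f at every combination of those values. f is identically zero exactly when all of these evaluations are zero.

A randomized algorithm for one variable works like this: choose a set S of k*d elements of F for some k > 1, and evaluate f(x) for x \in S chosen uniformly at random. If f is not identically zero, then there’s at most a 1/k probability that f(x) = 0, since f has at most d roots. If we repeat this test a few times using elements selected uniformly at random from S, we get a pretty high probability of obtaining the correct answer.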

If we extend to multivariate functions, we still choose random elements from a set S of size k*d from the field F where the polynomial is defined. However, we now evaluate f(x_1, …, x_n) at a tuple chosen uniformly at random from S^n. If f is not identically zero, we get the same bound: the probability that f(x_1, …, x_n) = 0 is at most 1/k.
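
Here is a minimal Python sketch of the repeated random test. It assumes f is given as a black-box Python function with integer coefficients (so we can work with ordinary integers), that d >= 1 is an upper bound on its degree, and that S = {0, 1, ..., k*d - 1}. The name probably_zero and the default parameters are mine.

```python
import random


def probably_zero(f, n_vars, d, k=2, trials=20):
    """Randomized identity test for a polynomial f in n_vars variables.

    Returns False as soon as a non-zero value is seen (f is certainly not
    identically zero). Returns True if all trials evaluate to zero, which is
    wrong with probability at most (1/k)**trials.
    """
    S = range(k * d)    # a set of k*d distinct values
    for _ in range(trials):
        point = [random.choice(S) for _ in range(n_vars)]
        if f(*point) != 0:
            return False
    return True


if __name__ == "__main__":
    # (x + y)^2 - (x^2 + 2xy + y^2) is identically zero
    print(probably_zero(lambda x, y: (x + y) ** 2 - (x * x + 2 * x * y + y * y), 2, 2))
```

Note the one-sided error: only the "identically zero" answer can be wrong.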

Now for another cool application of testing polynomials. Suppose we have a graph G and we want to know if there exists a perfect matching.

A matching is a 1-regular subgraph of G. A perfect matching is a 1-regular spanning subgraph of G. All this means is we pair up vertices where every pair of vertices share an edge, and every vertex appears in exactly one pair.

How do we do this? Well we can look for obvious signs of not having a perfect matching, such as having an odd number of vertices. Ok, but what if I have an even number of vertices? Try all the matchings! Brute force is always the way to go (just kidding).

There’s a deterministic algorithm that runs in O(m sqrt(n)) time, but can we do better using randomization? Yes!

We assign every edge a variable and form an adjacency matrix of the graph G. However, let’s orient the edges arbitrarily so that each edge contributes a 1 and a -1, and every time we have a 1 or -1, we replace it with the variable of that edge, keeping the sign. This is the Tutte matrix of G.

Now, we compute the determinant symbolically. The resulting polynomial is non-zero exactly when the graph has a perfect matching. However, computing determinants symbolically is hard, so we can accomplish the same thing by substituting random values for the variables and checking if the determinant is zero.
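
A sketch of the random substitution in Python, assuming vertices are numbered 0..n-1 and arithmetic is done modulo a fixed prime P (my choice, to keep the numbers small). The function names and the trials parameter are illustrative, not from lecture.

```python
import random

P = 2**31 - 1   # a prime; all arithmetic is done mod P


def det_mod_p(M, p):
    """Determinant of a square matrix mod prime p by Gaussian elimination."""
    M = [[x % p for x in row] for row in M]
    n = len(M)
    det = 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return 0
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            det = -det                        # a row swap flips the sign
        det = det * M[col][col] % p
        inv = pow(M[col][col], p - 2, p)      # inverse via Fermat's little theorem
        for r in range(col + 1, n):
            factor = M[r][col] * inv % p
            if factor:
                for c in range(col, n):
                    M[r][c] = (M[r][c] - factor * M[col][c]) % p
    return det % p


def has_perfect_matching(n, edges, trials=5):
    """Randomized test on the Tutte matrix of a graph with vertices 0..n-1."""
    if n % 2 == 1:
        return False                          # odd number of vertices
    for _ in range(trials):
        T = [[0] * n for _ in range(n)]
        for u, v in edges:                    # orient each edge as u -> v
            x = random.randrange(1, P)        # random value for this edge's variable
            T[u][v] = x
            T[v][u] = (-x) % P
        if det_mod_p(T, P) != 0:
            return True     # non-zero determinant: a perfect matching exists
    return False            # probably no perfect matching


if __name__ == "__main__":
    print(has_perfect_matching(4, [(0, 1), (1, 2), (2, 3), (3, 0)]))
```

On a 4-cycle this should report True, while on a star with three leaves the determinant is identically zero, so the answer is always False.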

Lecture 30: 04/11/2012

I hope the homework went (is going) well!

Tomorrow (today) we’ll be talking about something totally different. We’re gonna dive into some randomized algorithms with the Schwartz-Zippel Lemma.

Try to look it up on the internet before class so you’re not totally lost but here’s some basics:

We have a polynomial a_0 + a_1 x + … + a_d x^d and we’d like to check if it’s the zero-polynomial. That’s easy if we have it in coefficient form (just check if they’re all zero!). But what if I give you something like (1 – x)(x + 2x^2 -19x^2)…(1 – x^3)? Well you’d have to multiply it all out (which can take exponential time) and check the coefficients.


Schwartz-Zippel thought: what if you check that polynomial of degree d at some random point? If the value is not 0, then the polynomial can’t be the zero-polynomial. If it was zero, then there’s a good chance the polynomial is zero everywhere. So just repeat this test a bunch of times and if you always get zero, then it’s probably zero.

It gives us some cool applications so look forward to that!

What a great lecture today!

We first looked at the algebraic or deterministic way of determining if a polynomial is identically 0. You just have to check d+1 points for a polynomial of degree d. That’s great! So why bother with the randomized approach?

If you have a polynomial of degree d in two variables, then you have to check (d+1)(d+1) = d^2 + 2d + 1 points. Now it looks like our linear time algorithm is going to pay off.

We’ll look at how the probability bound changes when we go from one variable to several variables on Friday.

At the end, we looked at a very interesting application of this. We have two sets A and B. We want to check if they’re equal. Well, just sort ’em and check each element one by one. That takes O(n lg n) time, but what if n is huge??

Let A = [a_0, …, a_(n-1)] and B = [b_0, …, b_(n-1)]. Let P_A = (x – a_0)(…)(x – a_(n-1)) and P_B = (x – b_0)(…)(x – b_(n-1)). Now we can check if A = B by asking if (P_A – P_B) == 0. Using our probabilistic algorithm, we can pull this off in linear time and get the correct answer with very very very high probability.
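
A small Python sketch of this check, assuming the elements are integers in [0, P) for a fixed large prime P and evaluating both products at a random point mod P. The prime, the name same_elements and the trials parameter are my own choices. Since the leading terms cancel, P_A – P_B has degree below n, so a wrong "equal" answer happens with probability less than n/P per trial.

```python
import random

P = 2**61 - 1   # a large prime; elements are assumed to lie in [0, P)


def same_elements(A, B, trials=3):
    """Randomized check that lists A and B hold the same elements (with multiplicity)."""
    if len(A) != len(B):
        return False
    for _ in range(trials):
        x = random.randrange(P)               # random evaluation point
        pa = pb = 1
        for a in A:
            pa = pa * (x - a) % P             # P_A(x) = product of (x - a_i)
        for b in B:
            pb = pb * (x - b) % P             # P_B(x) = product of (x - b_i)
        if pa != pb:
            return False    # definitely different
    return True             # equal with very high probability


if __name__ == "__main__":
    print(same_elements([3, 1, 2], [1, 2, 3]))
```

Each trial is one linear pass over A and one over B, with no sorting needed.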