Sunday, March 20, 2016

Prime Testing

For decades, a common test of a computer's abilities (and its programmer's, too) is calculating prime numbers. Right now a job applicant is being asked to provide yet another function to generate primes. The largest known prime (discovered a couple of months ago) contains over 22 million digits. Yesterday a couple of students and I were working on answering the question, "What is the 10,001st prime number?" We started at the beginning, with the question of how we would tell whether a number is prime.

Obviously we'd divide it by every lower number. If there are no divisors, it's prime. The Python code for this is

def isPrime(n):
    for i in range(2,n):
        if n % i == 0:
            return False
    return True

One of my students suggested we start a list to store the primes in, so we can easily tell the program to stop when the list has reached the desired length. Here's the code for that:

def primeList(length):
    primes = []
    num = 2
    while len(primes) < length:
        if isPrime(num):
            primes.append(num)
        num += 1
    return primes

Once we got this working correctly, my student went straight for the 10,001st:
 
primes = primeList(10001)
print(primes[-1]) #prints the last number in the list

Using Python's time module, you can time how long it takes to run your program. At the top of your file, you import the time module and record the starting time:

import time
time1 = time.time()

At the bottom of the code, after printing the answer, you record the ending time and print the elapsed time:

time2 = time.time()
print(time2 - time1)

On the computer the student was on this took a minute and a half. On my desktop, it took 45 seconds. Still a long time, since we wanted to generate a much bigger list of primes and save it for future use. How could we save the computer some effort?

This process of optimizing algorithms is a big deal in computer programming, and it's the reason we'll still need mathematical thinking in the digital age. Our first effort was "brute force," meaning we checked all the numbers up to n as possible divisors, and then we checked every number until we got a list of 10,001 primes. Do we really need to check all these numbers?

Obviously, after 2 there will be no more even primes. So we can skip all the even numbers. We changed our code to this:

def isPrime(n):
    #start at 3 and go up to n by 2's
    for i in range(3,n,2):
        if n % i == 0:
            return False
    return True

and

def primeList(length):
    primes = [2] #the only even prime
    num = 3 #the next number to try
    while len(primes) < length:
        if isPrime(num):
            primes.append(num)
        num += 2 #only check odd numbers
    return primes

I was sure this would cut the processing time in half, and it did. We got our 10,001st prime in 22.6 seconds.

But could we do even better? In my book Hacking Math Class with Python, I show how to speed up the process much more. To check whether a number is prime, you only have to check the numbers up to the square root of that number. Past that point, if there's a divisor, it must be multiplied by a smaller number, and you've already checked every smaller number. We changed our "isPrime" code to this:

from math import sqrt

def isPrime(n):
    for i in range(3,int(sqrt(n))+1,2):
        if n % i == 0:
            return False
    return True

Now we got our answer in 0.3 seconds!

With that kind of speed we could create a .txt file of primes to have on our computer so we wouldn't need to wait:

f = open('primelist.txt','w')
for prime in primes:
    f.write(str(prime)+ ' ')
f.close()

Now we can just access the file anytime and loop through our primes without having to generate them all over again.

No comments:

Post a Comment