Tuesday, March 22, 2016

Drawing Triangles the Hard Way

I'm a big fan of fractals, and dedicated a whole chapter of Hacking Math Class with Python to them. The Sierpinski Triangle (or "Sieve") is a famous shape that pops up in the most unexpected ways.


In my book I showed the easy way to draw one by making a turtle do it, but there's a little more complicated way that involves another famous triangle. It was known to the Chinese and Indians while we Europeans were still hiding from the Vikings, but we like to call it Pascal's Triangle after a more recent French mathematician.


The numbers can be generated by adding the numbers in the row above, but using a computer it's quicker to use the combination formula.



Here's the (Python 2) code I copied and pasted from dheerosaur on Stack Overflow:

import operator as op
...
def ncr(n, r):
    r = min(r, n-r)
    if r == 0: return 1
    numer = reduce(op.mul, xrange(n, n-r, -1))
    denom = reduce(op.mul, xrange(1, r+1))
    return numer//denom

(To convert it to Python 3, just change "xrange" to "range" and add "from itertools import reduce" to the imports.)

The 1st row and column are really the 0th, so the "20" in row 7 is actually row 6, column 3:

>>> ncr(6,3)
20

How could you possibly get The Sierpinski Sieve from Pascal's Triangle? Let's use Pygame.

I use a template I found at Professor Craven's excellent Pygame tutorial. This sets up the screen and the colors and the display loop. All I have to do is draw shapes; it even updates the display and quits for me. I have to create a function to draw a triangle where I want it and what size I want it. The number (calculated with ncr above) will determine whether it's filled with color or not:

def triangle(x,y,size,filled):
    '''draw an equilateral triangle'''
    #if the number isn't a multiple of 2, color it in.
    if filled % 2 != 0: 
        pygame.draw.polygon(screen,VIOLET,[[x,y],
                                       [x-size/2.,y+1.732*size/2.],
                                       [x+size/2.,y+1.732*size/2.]],
                                        0)
    else: thick = 0 #leave multiples of 2 empty

The "gasket" function will draw the triangles at the right spots:

def gasket(rows):
    size = 400./rows #resize depending on the number of rows
    for i in range(rows):
        for j in range(i+1):
            y = 1.732*i*size/2. #height of equilateral triangle
            x = -i*size/2. + j * size + WIDTH/2. #width of triangle
            triangle(x,y,size,ncr(i,j)) #draw the triangle


Now running "gasket(50)" will get you 50 rows of Pascal's triangle, with the multiples of 2 left empty:



Look familiar? You can change one digit in the "triangle" function to leave out the multiples of 3 instead:


Or the multiples of 8. The pattern of empty triangles changes every time. Try other numbers!


This blog entry has been brought to you by O'Reilly books.:-)

Sunday, March 20, 2016

Prime Testing

For decades, a common test of a computer's abilities (and its programmer's, too) is calculating prime numbers. Right now a job applicant is being asked to provide yet another function to generate primes. The largest known prime (discovered a couple of months ago) contains over 22 million digits. Yesterday a couple of students and I were working on answering the question, "What is the 10,001st prime number?" We started at the beginning, with the question of how we would tell whether a number is prime.

Obviously we'd divide it by every lower number. If there are no divisors, it's prime. The Python code for this is

def isPrime(n):
    for i in range(2,n):
        if n % i == 0:
            return False
    return True

One of my students suggested we start a list to store the primes in, so we can easily tell the program to stop when the list has reached the desired length. Here's the code for that:

def primeList(length):
    primes = []
    num = 2
    while len(primes) < length:
        if isPrime(num):
            primes.append(num)
        num += 1
    return primes

Once we got this working correctly, my student went straight for the 10,001st:
 
primes = primeList(10001)
print(primes[-1]) #prints the last number in the list

Using Python's time module, you can time how long it takes to run your program. At the top of your file, you import the time module and record the starting time:

import time
time1 = time.time()

At the bottom of the code, after printing the answer, you record the ending time and print the elapsed time:

time2 = time.time()
print(time2 - time1)

On the computer the student was on this took a minute and a half. On my desktop, it took 45 seconds. Still a long time, since we wanted to generate a much bigger list of primes and save it for future use. How could we save the computer some effort?

This process of optimizing algorithms is a big deal in computer programming, and it's the reason we'll still need mathematical thinking in the digital age. Our first effort was "brute force," meaning we checked all the numbers up to n as possible divisors, and then we checked every number until we got a list of 10,001 primes. Do we really need to check all these numbers?

Obviously, after 2 there will be no more even primes. So we can skip all the even numbers. We changed our code to this:

def isPrime(n):
    #start at 3 and go up to n by 2's
    for i in range(3,n,2):
        if n % i == 0:
            return False
    return True

and

def primeList(length):
    primes = [2] #the only even prime
    num = 3 #the next number to try
    while len(primes) < length:
        if isPrime(num):
            primes.append(num)
        num += 2 #only check odd numbers
    return primes

I was sure this would cut the processing time in half, and it did. We got our 10,001st prime in 22.6 seconds.

But could we do even better? In my book Hacking Math Class with Python, I show how to speed up the process much more. To check whether a number is prime, you only have to check the numbers up to the square root of that number. Past that point, if there's a divisor, it must be multiplied by a smaller number, and you've already checked every smaller number. We changed our "isPrime" code to this:

from math import sqrt

def isPrime(n):
    for i in range(3,int(sqrt(n))+1,2):
        if n % i == 0:
            return False
    return True

Now we got our answer in 0.3 seconds!

With that kind of speed we could create a .txt file of primes to have on our computer so we wouldn't need to wait:

f = open('primelist.txt','w')
for prime in primes:
    f.write(str(prime)+ ' ')
f.close()

Now we can just access the file anytime and loop through our primes without having to generate them all over again.

Saturday, March 12, 2016

Hacking the Election (Data)

A student at the Coder School was creating his own election results program, to print out who won or if there was a tie. He was doing a lot of stuff manually, like entering his own data, and writing lots of conditionals for "if Candidate_A > Candidate_B and Candidate_A > Candidate_C" and so on. I figured Python could offer us a way to get real data and visualize it in a graph.

Python has a built-in module for reading csv ("comma separated values") files you find on the web. I found a .csv file of the results of the Virginia Republican Primary, held less than two weeks ago. I downloaded it and opened it in a spreadsheet program, but it had a lot of info I wasn't interested in:

This wasn't even the whole sheet, which contained 22 columns, from "Election Name" to "District Type" to lots of crazy "IDs." I was only interested in the "Last Name" and "Total Votes" columns, D and F.

And instead of just a dozen or so rows of the total votes each candidate received in Virginia, I had 38,000 rows of all the totals for each candidate from every county in Virginia! Clearly some pruning was needed. I saved the file as "results.csv" and fired up Python to sort things out.

I started by adapting the "read_csv" function from Chapter 3 of Amit Saha's brilliant book Doing Math With Python. The csv module has a "reader" function which loops over the rows of the file. This code will open the file and start an iterator called "rows":

def read_csv(filename):
    '''Returns the results in a dictionary'''
    with open(filename) as f:
        rows = csv.reader(f)
        next(rows)

So now I can go through all the rows and add up everybody's votes. I'll put the 4 top candidates (Trump, Cruz, Rubio and Kasich) into a dictionary called "results" and start their totals at 0:

results = {'Trump':0, 'Rubio': 0, 'Cruz':0,'Kasich':0}

Now the program will loop over the rows and check if column D is one of those four names. If so, it'll add the number in column F to his total. The indices start at 0, so column D is the item in "row" with index 3, or "row[3]" and column F is "row[5]."

for row in rows:
    if row[3] in results:
         results[row[3]] += int(row[5])

That should work perfectly, but instead it gives me a ValueError because it hit a blank row or something. Let's just have it continue if it hits a snag like that:

for row in rows:
    try:
        if row[3] in results:
            results[row[3]] += int(row[5])
    except ValueError:
        continue

Now we can print the results!

print(results)

And at the bottom of the file, call the "read_csv" function on running the program:

if __name__ == '__main__':
    read_csv('results.csv')

These are the totals for the top 4 Republican candidates:

{'Kasich': 97791, 'Rubio': 327936, 'Trump': 356896, 'Cruz': 171162}

There was no mistake: those totals match the official results on http://results.elections.virginia.gov/. After printing the results we could add some code to print the winner. First we find the "key," in this case the name, associated with the maximum value. Then we print it.

winner = max(results.keys(), key=(lambda k: results[k]))
print('The winner is {0}, with {1} votes.'.\
       format(winner,results[winner]))

Now it adds the winner line:

The winner is Trump, with 356896 votes.

To make a nice bar chart out of the numbers, add this code:




And here's the chart:


Did I leave somebody out? If you want to add a candidate, just add them to the dictionary like this:

results = {'Trump':0, 'Rubio': 0, 'Cruz':0, 'Kasich':0, 
           'Carson':0, 'Bush':0}

Run the program and you'll get an updated chart:
(It looks a little different because I ran it in an IPython notebook.)

A great introduction to using Python for data analysis and very timely, too!

Update 3/13/16:
It's even easier to get the totals using a database like pandas:

Thursday, December 10, 2015

3D Graphics Made Easy

I'm a big fan of doing math using very visual tools and loved using Visual Python to do 3D graphics. I was sad to learn it doesn't work on the Raspberry Pi, and since the Pi is a slower processor than the average desktop or laptop, I must have had the impression that 3D graphics were too much for the Pi. Imagine my surprise when after some googling I found Pi3D. It's a free, open-source Python package which is easy to install on Linux, and by making use of Numpy and the Graphics Processing Unit (GPU), the graphics are surprisingly fast!

After getting a lot of help from the Pi3D Google group and one of the developers, Paddy Gaunt, Pi3D has become my favorite Graphics package to play around with. I've installed it on every new Pi image I've burned, on the Ubuntu machines I work with at the Coder School, on my laptop that runs Manjaro Linux and even old Red, my Windows laptop. So I definitely have a lot of experience installing it, making me the perfect person to make a tutorial video on Getting Started with Pi3D.

Paddy Gaunt had already posted a dozen or more cool videos demonstrating the graphics he created with Pi3D but this is the first tutorial explaining to beginners how to install Pi3D and write a simple program. Here it is:


Paddy was the first to comment on the video, and he added a link to it in the official documentation. There's my name!





Thursday, November 19, 2015

The Code of Clocks

One of the exercises at the program I work at is reading and drawing clocks. I figured that would be a great programming exercise. Drawing lines in Pygame would be simple and I'd just put it on a clock background. First I downloaded a picture of a clock and called it "clock2.jpg."

The code for loading a background image and making it transparent is:

# Load and set up graphics.
background_image = pygame.image.load("clock2.jpg").convert()
background_image.set_colorkey(WHITE) #make it transparent

You need to write commands for drawing the hands. In Pygame a line is defined by its endpoints. Naturally the center of the screen is width/2, height/2, but the other end of the hand is a math problem! What would be the best coordinate system to use? It wouldn't be easy to point to a specific point in Cartesian (x- and y-) coordinates but it's not hard to figure out how far to rotate the hand in polar coordinates. See my book Hacking Math Class for how to explore polar coordinates using Python. Here's how to find our point using the length of the hands and the angle of rotation:


The point would be (r*cos(theta), r*sin(theta))
The length of the hand is r. I made all the hands the same length for testing. They'd be different colors, though:

# Define some colors
WHITE = (255, 255, 255)
BLACK = (0, 0, 0)
RED   = (255, 0, 0)
BLUE = (0, 0, 255)

#clock hands:
hour_length = 290
minute_length = 290
second_length = 290

Unlike the figure above, in a clock face the rotation is, well, clockwise from straight up. The easy solution to that is to switch sine and cosine. Now what's theta? All programming languages measure angles in radians, not degrees. No problem: import pi from the math or numpy module and 2*pi is a whole circle.

from math import pi, sin, cos

Each hour is a twelfth of the whole circle, or 2*pi/12. That means if it's 1 o'clock, the rotation is 1 * 2*pi divided by 12, 2 o'clock would be 2 * 2*pi /12 and so on. That makes the code for our hour hand this:

#draw hour hand
h_theta = hour*2*pi/12

and the Pygame syntax for drawing a line:

h_tip = [300+hour_length*sin(h_theta), 300-hour_length*cos(h_theta)]
pygame.draw.line(screen, BLACK, [300,300], h_tip, 2)

The "300" comes from the fact that the center of my clock is (300,300).

So that draws a perfect hour hand. Here's 1 and 2 o'clock:
hour = 1 or hour = 2

 

Let's add a minute hand. Now the circle (2*pi radians) is cut up into 60 slices, so we'll add that to our code:

#draw minute hand
m_theta = minute*2*pi/60
m_tip = [300+minute_length*sin(m_theta), 300- \ minute_length*cos(m_theta)]
pygame.draw.line(screen, BLUE, [300,300], m_tip, 2)

It'll be just as long as the hour hand, but blue instead of black. Now 1:00 looks like this:
That's correct, but when I put in 1:30 I get an incorrect clock:
hour = 1
minute = 30
The hour hand should be halfway between the 1 and the 2. Every hour the hour hand moves 2 * pi/12 radians but it should move 1/60th of that every minute. I'm changing the hour hand code to this:

h_theta = hour*2*pi/12 + minute*2*pi/(12*60)

That should fix the mistake:

We'll add a similar line for our minute hand to take into account the number of seconds, add a seconds hand and we'll have a three-handed clock.

h_theta = hour*2*pi/12 + minute*2*pi/(12*60)
...
m_theta = minute*2*pi/60 + second*2*pi/(60*60)
...
s_theta = second*2*pi/60

Of course, you need the usual Pygame code to get stuff on the screen. I highly recommend Professor Craven's tutorials, and that's where I got a template for Pygame programs.

Here's the whole code:

import pygame
from pygame.locals import *
from math import pi, sin, cos
from random import randint

# Define some colors
WHITE = (255, 255, 255)
BLACK = (0, 0, 0)
RED   = (255, 0, 0)
BLUE = (0, 0, 255)

#clock hands:
hour_length = 290
minute_length = 290
second_length = 290

#Change these numbers to show the time:
hour = 1
minute = 45
second = 50

# Create an 600x600 sized screen
screen = pygame.display.set_mode([600, 600])

# This sets the name of the window
pygame.display.set_caption('Clocks!')

# Set positions of graphics
background_position = [0, 0]

# Load and set up graphics.
background_image = pygame.image.load("clock2.jpg").convert()
background_image.set_colorkey(WHITE)

# Call this function so the Pygame library can initialize itself
pygame.init()

clock = pygame.time.Clock()

done = False
#main loop:
while not done:
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            done = True
            
    screen.fill(WHITE)
    # Copy image to screen:
    screen.blit(background_image, background_position)

    #draw hour hand
    h_theta = hour*2*pi/12 +minute*2*pi/(12*60)
    h_tip = [300+hour_length*sin(h_theta), 300-hour_length*cos(h_theta)]
    pygame.draw.line(screen, BLACK, [300,300], h_tip, 2)

    #draw minute hand
    m_theta = minute*2*pi/60 + second*2*pi/(60*60)
    m_tip = [300+minute_length*sin(m_theta), 300-minute_length*cos(m_theta)]
    pygame.draw.line(screen, BLUE, [300,300], m_tip, 2)

    #draw seconds hand
    s_theta = second*2*pi/60
    s_tip = [300+second_length*sin(s_theta), 300-second_length*cos(s_theta)]
    pygame.draw.line(screen, RED, [300,300], s_tip, 2)

    pygame.display.flip()

    clock.tick(60)

pygame.quit()

You can use the time module in Python to make a clock read the current time:

>>> import time
>>> the_time = time.time()
>>> dt = time.localtime(the_time)
>>> print(dt.tm_hour,":",dt.tm_min,":",dt.tm_sec)
16 : 41 : 33

In our clocks program just set the 'hour' variable equal to 'dt.tm_hour" and so on. Yes, you have a clock on your computer and your phone already but you made this one!

Monday, November 9, 2015

Crypto-Code

A student of mine was thinking out loud about encoding messages and it got stuck in my head. The next day I challenged my Coder School students to try it. How do you even replace a letter with a number? You could use dozens of if-statements:

for letter in message:
    if letter == 'a':
        print(1)
    elif letter == 'b':
        print(2)

And so on. But an easier way is to create a string of characters:

ALPHA = "abcdefghijklmnopqrstuvwxyz ',.?"

Now each letter has a number, an index, its place in the string. The letter a is ALPHA[0], b is ALPHA[1] and so on . You can replace each letter with its index this way:

def encode(msg):
    '''takes a message and prints number code'''
    for letter in msg:
        print(ALPHA.index(letter), end=' ')

The last bit is thanks to Python 3, where print is a function. After printing the number you can specify printing something at the end, like a space. And it doesn't automatically print a line break. Running encode('call me ishmael.') we get

2 0 11 11 26 12 4 26 8 18 7 12 0 4 11 28

The decode function will require a little Python trickery. The numbers can be fed back in as a list if you're patient enough to type tons of commas, or it can be copied and pasted in, as a string inside quotes, to a function like

decode('2 0 11 11 26 12 4 26 8 18 7 12 0 4 11 28')

The decode function converts the string to a list using Python's split() function. Then you can just iterate over the list, printing the letter in the ALPHA list with that index number.

def decode(msg):
    '''takes numbers and prints decoded letters'''
    msg2 = msg.split() #converts string to list
    for item in msg2:
        print(ALPHA[int(item)],end = '')

So decoding another message, like

decode('8 26 22 0 13 19 26 19 14 26 7 14 11 3 26 24 14 20 17 26 7 0 13 3 28')

we get 

i want to hold your hand.

But that cipher wouldn't fool spies even from a thousand years ago. We need to make it a little sneakier. We could get rid of the spaces if all the numbers were 2-digits long:

def encode(msg):
    '''takes a message and returns number code'''
    code = ''
    for letter in msg:
        #get the index of letter and
        #convert it into a string
        num = str(ALPHA.index(letter))
        if len(num) == 1: #if it's only one digit
            num = '0' + num #add a zero in front
        code += num         #add that to the code
    print(code)

Now execute encode("beautiful is better than ugly.") and you'll get a more confusing-looking code:

010400201908052011260818260104191904172619070013262006112428

Now decoding it is fairly simple. You just have to take 2-digit slices from the string and print out the element of ALPHA that has that index.

def decode(msg):
    '''takes numbers and prints decoded letters'''
    for n in range(0,len(msg),2): #n goes up by 2's
        print(ALPHA[int(msg[n:n+2])],end='') #take 2-digit slices

Your enemy might notice a bunch of zeroes. My student's suggestion was to convert to binary numbers, make all the numbers 5-digits, then string them together! I already have a binary converter in my book Hacking Math Class, so we'll assume you have the binary function. You've converted the letters to numbers, then send them to this function:

def binary5(number):
    '''converts number to 5-digit binary'''
    number = binary(number) #get the binary form
    number = str(number) #convert to string form
    x = len(number)     #to find number of digits
    number = (5-x)*'0' + number #add zeroes if it's not 5
    print(number)

Now entering binary5(5) will add zeros to the front to make it a 5 digit number:

00101

I created strings of even and odd numbers for random choosing:

EVENS = '02468'

ODDS = '13579'

Here's the code (instead of printing) for converting the zeros to any even digit and the ones to any odd digit:

    #convert to random evens and odds
    binrand = ''     #string for random evens or odds
    for digit in number:        #go over every character
        if int(digit) % 2 == 0: #if its integer form is even
            #replace with random even digit
            binrand += random.choice(EVENS)
        else: #otherwise, replace it with random odd digit
            binrand += random.choice(ODDS)
    return binrand

Here's the new encode function:

def encode(msg):
    '''takes a message and returns number code'''
    code = '' #empty string for numbers
    for letter in msg: #goes over every letter
        #take the index of that letter, convert to
        #5-digit binary and add that to number string
        code += binary5(ALPHA.index(letter))
    print(code)
    print()#blank line

Now running the encode function will encode our message nicely.

>>> encode("what's my age again?")
9217424331466441621919055700927185047984178469303480480845780254853858208286417028442012044592133956

The first 5-digit slice, '92174', using the evens = '0' and odds = '1' transform, is the binary number 10110. That's 22 in decimal, or w's place in the ALPHA string.

The decode function reverses the process. First it takes the message and slices it into 5-digit slices.

def decode(msg):
    '''decodes a message'''
    msg2 = [] #list for 5-digit numbers
    for n in range(0,len(msg),5): #n goes up by 5's
        #add each 5-digit slice to list
        msg2.append(int(msg[n:n+5]))

Then it creates a list to store the 5-digit binary numbers and converts all the even digits to zeros and all the odds to ones.

    msg3 = [] #list for 5-digit binary numbers
    for number in msg2:
        number2 = str(number) #turn it into a string
        number3 = ''        #empty string for binaries
        for digit in number2: #go through digit by digit
            if digit in EVENS: #if digit is even
                digit = '0'         #replace by 0
            else: digit = '1'       #or replace by 1
            number3 += digit        #add to binary string
        msg3.append(number3)        #add binary string to message
        #print(msg3)

Finally you have to convert all the binary numbers to decimals. It's a good thing leading zeros are ignored by Python's 'int' function. That saves us a step. I already have a function for converting binary to decimal, (called "binDec") so I didn't give the code here.

    for number in msg3:     #go through binaries
        #convert to int, then decimal, then letter:
        print(ALPHA[binDec(int(number))],end = '')
    print()             #blank line after message

Now entering a nonsensical string of unbreakable code will yield a message of power and beauty:

>>> decode('164732271528482768355715312256534944844083853656139
161595498463010797645033218905045253958')
that's all, folks!

Update: Naturally my Python mentor Paddy Gaunt showed me how to do it in an eighth of the code.

import numpy as np

ALPHA = "abcdefghijklmnopqrstuvwxyz ',.?"
ALPHA = {c:i for i,c in enumerate(ALPHA)} # make it into a dict

def encode(str):
    narr = np.array([ALPHA[c] for c in str], dtype=np.uint8)
    narr = np.unpackbits(narr).reshape(-1, 8)[:,3:].reshape(-1)
    evod = np.array([np.random.choice([0,2,4,6,8], len(narr)),
                     np.random.choice([1,3,5,7,9], len(narr))])
    return evod[narr, np.arange(len(narr))]


print(encode("the quick brown fox jumps over the lazy dog"))

Wednesday, September 16, 2015

Boring Stuff, Automated

Thanks to Al Sweigart for the title to his latest Python book. Automating the boring stuff is reason enough to learn a bit of programming. Here's how I used Python to automate something boring today.

Every week or so in my new teaching job I have to time students one by one as they write strings of symbols. The way it's usually done is with a stopwatch, and the student pauses to write down the time for each of the 25 trials. Since I was looking at testing all my students this week, I wondered if I could write a Python program that would help.

This summer I worked at a camp introducing Python programming using Pygame and I've been keeping up my skills. Many of the programs involve interaction with the keyboard or mouse and I woke up this morning with an idea to write a stopwatch program using the space bar. I fired up the Pygame template I adapted from Professor Craven of Simpson College. First I had to create a list to store all the times, and create a variable to tell when the program is timing or not.


Then I just added to the usual key-events code where Pygame checks if the user clicked to exit the program or pressed a key. Here's the code for "if the space key is pressed and if you're not timing, record the present time and start timing (and print "timing" in the shell so I'll know)."


But if the program is already timing, record the present time, calculate the elapsed time, and add that to the times list. If the times list is 25 items long, stop the program.


I like that last line. Whenever the space bar is clicked, it toggles between timing and not timing.

Here Comes Numpy

The tedious part is calculating the means and standard deviations of 3 blocks of times. Most teachers type the numbers into a web page, but since we've already got the times in a list I knew it would be child's play to get the means and standard deviations. All I needed to do was to make the times into a Numpy array and use the "mean" and "std" functions.


Here's what the final result looked like.


The students were impressed I could produce their results so quickly!