Sunday, January 30, 2011

The Great Apple Software

I bet you guessed that there was a software aspect to my last post on the Great Apple Experiment.  You were right.

First, there's the fairly easy job of randomly assigning numbered jars to Scouts and marking each jar with its purpose: Love, Hate, etc.  This comes down to maintaining a collection of jars and a collection of spots (22 Scouts x 6 jars per Scout = 132 spots).  Then I just use a pseudo-random number generator to pick a random jar and assign it to a random spot until all the jars are gone.
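The jar assignment described above can be sketched roughly like this. It is a minimal sketch in Python, not the program I actually ran: the function name `assign_jars`, the `seed` parameter, and the placeholder labels after "Love" and "Hate" are all assumptions made for illustration.

```python
import random

def assign_jars(num_scouts=22, jars_per_scout=6, labels=None, seed=None):
    """Randomly assign numbered jars to (scout, purpose) spots."""
    if labels is None:
        # Only "Love" and "Hate" are named in the post; the rest are placeholders.
        labels = ["Love", "Hate"] + [
            f"Purpose {i}" for i in range(3, jars_per_scout + 1)
        ]
    rng = random.Random(seed)
    jars = list(range(1, num_scouts * jars_per_scout + 1))
    spots = [(scout, label)
             for scout in range(1, num_scouts + 1)
             for label in labels]
    assignment = {}
    # Pick a random jar and a random spot until all the jars are gone.
    while jars:
        jar = jars.pop(rng.randrange(len(jars)))
        spot = spots.pop(rng.randrange(len(spots)))
        assignment[jar] = spot
    return assignment
```

Passing a fixed `seed` makes the assignment reproducible, which is handy if the labels ever need to be reprinted.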

The harder part is collecting the results from the experiment.  My goal is to have all 22 Scouts rank all 132 apple slices and to crunch the numbers before the end of the meeting.  Entering the 2904 numbers by hand will be a serious challenge within the one-hour limit I have.  So, I decided to speed up the process.  I can bring a scanner to the meeting and scan the score cards to produce black and white bitmap images.  The only problem now is capturing the results from the bitmaps.

OCR technology would be rather hard to use - especially when the Scouts will be writing the results by hand with pencils. I decided to go with a "fill in the dots" approach.  Now the problem is that the scanner isn't perfect and the page can be rotated by several degrees.  I can't just look at an x/y point on the page and expect the circle to be there.

Here's a scanned score card:

And here's a magnified version of the top left part of the card with red marks added by my software:

The circles are arranged in 4 columns of 33 rows each (4 x 33 = 132). At the top of the page, I include one circle at the top left and one at the top right as registration marks.

I was able to locate the registration marks and find their centers. Comparing those centers with the ones on a carefully scanned sample sheet tells me the orientation of the page. I computed the centers of the circles on the sample sheet and stored that information in the program.  Then, for each scanned image, I take each of those points (the centers of the circles, shown as red dots in the image above), transform it according to the registration marks, draw a 21x21 square around it (shown as red squares), and capture the bits (1 or 0) in the square.  Any square with more than 200 black bits (0's) out of its 441 counts as ON.
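Here's a rough sketch of that registration-and-sampling step in Python. It assumes the page is only shifted, rotated, and uniformly scaled, so two registration marks are enough to define the transform (done here with complex-number arithmetic), and it assumes the bitmap is a list of rows of 0/1 values. The function names are hypothetical; my actual program may differ in the details.

```python
def make_transform(ref_left, ref_right, scan_left, scan_right):
    """Return a function mapping sample-sheet (x, y) to scanned-image (x, y).

    ref_* are the registration mark centers on the sample sheet,
    scan_* are the same marks as found on the scanned page.
    """
    r0, r1 = complex(*ref_left), complex(*ref_right)
    s0, s1 = complex(*scan_left), complex(*scan_right)
    a = (s1 - s0) / (r1 - r0)  # rotation and scale as one complex factor

    def transform(point):
        z = s0 + a * (complex(*point) - r0)
        return (z.real, z.imag)

    return transform

def count_black(image, center, half=10):
    """Count black (0) bits in the 21x21 square around center (half=10)."""
    cx, cy = round(center[0]), round(center[1])
    total = 0
    for y in range(cy - half, cy + half + 1):
        for x in range(cx - half, cx + half + 1):
            # Pixels that fall off the page are simply ignored.
            if 0 <= y < len(image) and 0 <= x < len(image[0]):
                if image[y][x] == 0:
                    total += 1
    return total

def is_filled(image, center, threshold=200):
    """A circle counts as ON when more than `threshold` bits are black."""
    return count_black(image, center) > threshold
```

In use, each stored sample-sheet center would go through `transform` first, and the result would be handed to `is_filled` along with the scanned bitmap.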

This actually worked (wow) and I was able to get reliable results. The result is an array of 132 numbers, each from 1 to 9, or 0 if no circle was filled in.  This is exactly the information I need to run the stats.
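Turning the ON/OFF squares into that array might look something like the sketch below. It assumes each apple slice has nine circles, one per score from 1 to 9, and it assumes that a slice with more than one filled circle is reported as 0 just like a blank one. Neither detail is spelled out above, so treat both as guesses.

```python
def row_score(flags):
    """flags: nine booleans for circles 1..9. Returns 1-9, or 0 if unreadable."""
    filled = [i + 1 for i, on in enumerate(flags) if on]
    # Assumption: more than one filled circle is ambiguous, so it is 0 as well.
    return filled[0] if len(filled) == 1 else 0

def card_scores(all_flags):
    """all_flags: one list of nine booleans per apple slice."""
    return [row_score(flags) for flags in all_flags]
```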

You'll notice that I print a series of bars at the top of the page.  My thought was that I'd read the Scout's number from the scan by detecting which bars were long and which were short.  The code would always start and end with a long bar; in between, short bars are 0 and long bars are 1, making a binary number that represents the Scout's number.  I decided in the end not to use it because it doesn't take long for me to manually enter the Scout number, and it allows me to print generic score cards instead of ones specific to each Scout. I expect that some of the Scouts will mess up their score cards and will need to redo them.  It's a whole lot easier if I can just give them a generic sheet.
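For the curious, the bar scheme could be encoded and decoded along these lines. This is only a sketch: the 5-bit width (enough for 22 Scouts) and the most-significant-bit-first order are assumptions, and bars are represented as the letters "L" and "S" rather than measured from a bitmap.

```python
def encode_bars(scout_number, bits=5):
    """Return a list of 'L'/'S' bars, framed by a long bar at each end."""
    if not 0 <= scout_number < 2 ** bits:
        raise ValueError("scout number does not fit in the bar code")
    payload = format(scout_number, f"0{bits}b")
    return ["L"] + ["L" if b == "1" else "S" for b in payload] + ["L"]

def decode_bars(bars):
    """Inverse of encode_bars: long bar = 1, short bar = 0."""
    if len(bars) < 3 or bars[0] != "L" or bars[-1] != "L":
        raise ValueError("bar code must start and end with a long bar")
    value = 0
    for bar in bars[1:-1]:
        value = value * 2 + (1 if bar == "L" else 0)
    return value
```

The long bars at each end give the decoder something to check before it trusts the bits in between.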

Part 2 of the experiment is on Tuesday.  I'll let you know how it goes.
