Thursday, October 1, 2015

Simple Example of Mapmaking with GIS

For a while, I have had it in mind to produce a physical map of stores and resources one might wish to visit before or while working on a project at Dallas Makerspace.  The map would consist of places one would go to procure raw materials, consumable supplies, or tools to finish a project in electronics, arts, woodworking, scientific endeavors, or you name it.  After spending a while scouring message board threads for local stores and resources previously mentioned by members, I put out a call for suggestions beyond those I had already scouted.  After this, I sought out each place's address, geocoded it, and then visited each point using Web tools in order to test the accuracy of the geocoded list.

Here is a diary of my trials and tribulations throughout this journey.  I had not used GIS software before, and it turns out many people have spent years building in all sorts of intricacies for dealing with many situations, from the different datums one can select to orient a map all the way to accounting for atmospheric conditions when using aerial or satellite imagery as map layers.  I will keep mentions of such riffraff and unnecessary steps and settings to a minimum, but you would need to learn these advanced settings in order to really build a map from scratch using all your own measurements and imagery.

Starting with QGIS on Linux

At first, QGIS seemed like a natural place to start.  I could use it on my Linux box with a huge 4K monitor, and it was easy to install from the Ubuntu Software Center.  There is a nice plugin for it called OpenLayers that makes it easy to add raster map imagery from various OpenStreetMap sources.  However, I ran into three problems with QGIS version 2.10 "Pisa" on Linux:

1. Won't download all map imagery at once.

If the map is big and/or to be printed at a high DPI, this of course requires highly detailed map imagery which can take a long time to download.  Unfortunately, QGIS does not wait for all of the map tiles to finish downloading before it begins to render, so you will see only a rendered circle (imagine a Japanese flag where the red area is now a map) if you haven't waited long enough.

Here, I tried to show the same area with four different map servers.  You can see how fast (or not) the various servers responded to my request.

An example of a single map tile, © OpenStreetMap contributors.  Each map tile is designed to be 256×256 pixels.  All of the tiles containing imagery near this tile could be used to create a large map of the United Kingdom.
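As an aside, the way these servers index tiles is standardized: at zoom level z, the world is split into 2^z by 2^z tiles of 256×256 pixels each.  If you're curious which tile covers a given coordinate, here is a quick sketch of the usual OpenStreetMap "slippy map" arithmetic (the Dallas coordinates below are approximate):

    import math

    def deg2tile(lat_deg, lon_deg, zoom):
        # Standard OpenStreetMap tile numbering: x comes from longitude,
        # y from a Mercator-projected latitude
        n = 2 ** zoom
        xtile = int((lon_deg + 180.0) / 360.0 * n)
        ytile = int((1.0 - math.asinh(math.tan(math.radians(lat_deg))) / math.pi) / 2.0 * n)
        return xtile, ytile

    print(deg2tile(32.78, -96.80, 12))   # -> (946, 1652), near downtown Dallas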

2. After upgrading from 2.8.0 to 2.10, it started giving me weird Python errors.  (Turns out these errors weren't really a big deal, at least for my use case.)

3. The print composer orients the points differently than the image renderer -- the image rendering is inaccurate and useless, especially for map insets.

Can GRASS GIS do any better?

To work around the problems faced in Linux, I installed GRASS GIS for Windows.  It makes you set up some things in advance, whereas QGIS lets you start going to town right away.  The interface can be a bit confusing and intimidating at first, but once you realize that most windows need to be expanded for you to see everything, and that there is more than just one window, it becomes easy to navigate.

I wanted to import my CSV file, but then realized the Address field had commas in it too, which was throwing off the import wizard.  I re-exported the data as a tab-delimited file, and at least that problem was fixed (a quick script for this is sketched just after the list below).  However, GRASS still gave me a fair share of problems:

1. Relatively obscure error messages that don't exactly tell you why things fail.

2. Column names can't have spaces in them as you set up your database; if they can, you may need to surround them with single quotes or escaped double quotes in your command.

3. GRASS GIS kept emptying out the contents of the points file at some particular step.  I had to leave the file open and make sure to Save it each time the text editor told me the contents had changed.

4. I'm not sure if it likes file names with spaces in them.  Either it wasn't reading the file because it wasn't putting quotes around the "in=filename.txt" part of the command string, or it was trying to read from an empty file.  (I thought the whole reason for hitting the "Load" button was so the program would actually parse the points data from the Preview textbox or from allocated memory, rather than having to reread the file once you hit "Run".)

5. Given all the frustrating failures I was having with importing my points file, I actually tried Command Line mode for a while.  Naturally, the instructions on this page didn't work for me because, of course, they'd released a whole 'nother major revision of GRASS GIS since I'd installed it, but it gave me a good starting point nonetheless.  After successfully importing my data, I tried to switch back to GRASS GUI but could not see the data I'd just imported!!! Why not?!?  I ended up having to re-import it through the GUI, carefully making sure it reconstructed the exact query I needed in order to get it to import correctly.

6. GRASS GIS can't handle special characters when parsing database data, because their Python script hasn't been set up to handle characters beyond ASCII 127.
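As for the comma problem called out above, the tab-delimited re-export can also be scripted in a few lines of Python using the standard csv module, which respects quoted commas inside fields like Address.  The file names here are placeholders:

    import csv

    # Rewrite a comma-delimited points file as tab-delimited so embedded
    # commas no longer confuse the import wizard.
    with open("stores.csv", newline="") as src, open("stores.txt", "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst, delimiter="\t")
        for row in reader:
            writer.writerow(row)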

Obviously, as a new user myself, there are probably questions and blocking issues in here that someone more familiar with the program could address.  Maybe I just need to ditch the old version and try the newer one for a better user experience.  Nevertheless, with all these headaches, I finally gave up and embarked on the last frontier: Mac OS X.

Something that actually worked

There is, fortunately, a QGIS version built for Mac OS X.  It does require the manual installation of some dependencies, but it comes out just like the Linux version.  I quickly set up my desired map style and layers, then built the print composer to specify where to put the map, scale, titles, grid marks, and other text on the final rendering.  I checked to make sure the points came out the same between the map view and the Print Composer view, and sure enough, they came out OK.  It was time to hold my breath, not try anything different or unusual, and render the map.

For these specific steps, assuming you have QGIS for Mac and OpenLayers already installed:

  1. Add your Base Map layer.  In the menu, go to Web -> OpenLayers plugin and then select your map provider and map style.  Use your mouse to position the map in the window as desired.
  2. Add your data layer(s).  Go to Layer -> Add Layer -> Add Delimited Text Layer... (assuming you have a CSV-formatted list of points), then follow the prompts to guide it to your attribute names, which columns are latitude & longitude, and other such settings as desired.
  3. Specify the Coordinate Reference System (CRS) of choice if it hasn't prompted you to do so yet.  Check with your coordinates provider to see which CRS/datum they base their coordinates on.  I typically use one of the World Geodetic System 1984 (WGS84) sets.  To specify this, go into Project -> Project Properties -> Coordinate reference systems of the world, and make sure QGIS confirms your selection as the one to use in your project.
  4. Fine-tune the symbols used for your placemarks.  Notice the Layers panel in the bottom left-hand portion of the QGIS window, where you should now have at least two layers: one for your base map, and one for each data layer you added.  Right-click on one of your data layers and hit Properties.  In the Style tab, you can select several different ways to assign colors to your placemarks from the dropdown list box on the top left.  Graduated is good for data on a continuous numeric scale.  Categorized is good for points associated with non-numeric categories, such as the type of store it is.  After you choose which "Column" contains the data on which you wish to index, use this panel to select the color, point type, and point size to associate with each category you specify.
  5. Ideally, your very simple map is now positioned within the window just as you imagined it.  Now, it's time to export it into an image file for uploading or printing.  Go to Project -> New Print Composer and enter a name for your new print composer.  All this represents is one specific arrangement for which you wish to export the map.  Imagine if you want to make a version of a map to hang in a police station, a version to use in the car (if you're mobile-app averse :-P), and a version to hand out to citizens on a pamphlet; you could set up several print composers in your project to format it just perfectly for the different media you're using for each purpose.
  6. Set the media size.  On the right, there is a spot where you can pick between three different panels: "Composition", "Item properties", and "Atlas generation".  Choose "Composition", then specify your paper size, resolution in DPI, and orientation.
  7. Take time to learn the toolbars in the Composer Editor.  There are very few words in this view, so the icons will really help you out.  The most important one is the "Add new map" icon.  Click this icon and drag across the area where you wish to place the new map.
  8. Set up your map.  Notice the "Item properties" panel on the right side.  Hopefully you have your map object selected in the print composer.  If so, you should be able to click "Set to map canvas extent" in the "Item properties" pane so it shows exactly what you intend it to.  Chances are the aspect ratio of your map canvas (the original window you started working with) is not exactly the same as the media you chose for the Print Composer, so you may need to scroll up a bit and adjust the Scale (zoom level) of the map.  You can also tweak the map data placement by adjusting the map in the map canvas and clicking "Set to map canvas extent" again.
  9. Add other features, such as text labels with the map's name, your name, attribution information, scale bars, grids, shapes, legends, or whatever makes you happy.
  10. Export your image.  In the menu bar, go to Composer -> Export as Image... and choose the name of your output file.  Take careful note of the output; just because it finishes (i.e. unfreezes itself) does not mean it will render what you expect.
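Incidentally, if you'd rather script steps 1 through 4 than click through them, the QGIS Python console can do roughly the same thing.  This is just a sketch against the QGIS 2.x API; the file path and the latitude/longitude field names are placeholders for whatever is in your own points file:

    # Add a delimited-text points layer programmatically (QGIS 2.x Python console)
    uri = "file:///Users/me/stores.csv?type=csv&xField=longitude&yField=latitude&crs=epsg:4326"
    layer = QgsVectorLayer(uri, "DMS resources", "delimitedtext")
    if layer.isValid():
        QgsMapLayerRegistry.instance().addMapLayer(layer)
    else:
        print("Layer failed to load -- check the URI")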

Unfortunately, for several minutes after opening the print composer, it renders just a filled circle of map imagery, not the whole map, similar to the Linux version above.

I continued attempting to render the map, and each time, the print composer took a little longer to finish rendering, but at least the rendered circle got a little bigger too.  Eventually, after about the 13th attempt, the whole map was filled in.  Great!  Now it's time to make the one inset I need to put in to show a far-away town.

Oh crap, now it needs to re-download the imagery for both maps once again... and eventually, only certain random tiles from the original map are showing.  I restarted QGIS, then took off the original map so that only the inset would be rendered.  (To "take off" items in this context means I excluded them from the render without necessarily deleting them from the print composer.  To do this, click on the desired item, go to the "Item properties" view, and scroll down to the "Rendering" options.  Simply check the box for "Exclude item from exports".)  Now when I try to render the inset, it comes out with a completely different part of the map than what it's supposed to be rendering.  Not sure what that's all about, but it's highly annoying.  I eventually give up on doing the inset, leaving it blank and saving a spot for it so I can print it and paste it on later.

Each step of the way, I'm careful to save the final output with an incremental number, and am diligent in deleting useless copies I won't be able to send to the print shop.  After the fiasco with the inset map, I simply leave a rectangle for where it should go, and then re-enable rendering of the original map.  Now it has to spend a bunch of time re-downloading map imagery before I finally get the desired output, which can now get sent to the print shop.

A Physically Interesting Final Result

I went through some A/B testing exercises with folks at the Dallas Makerspace (really, more like A/B/C/D and A/B/C testing) regarding which map style and what type of points would be preferred.  I may have forgotten to take into account their preference for map style when creating the final output (oh well, that's what Version 2.0 is for), but among the ways to present placemarks, there was a clear winner.  The choices presented were to print the placemarks directly on the map, use strong magnets as placemarks, or use standard pushpins.  User feedback was that printed placemarks would get out of date and would obsolete the map in the event that places had to be added or (especially) removed; magnets could get disturbed by accidental or malicious activities; and pushpins would leave holes where they once existed, leaving the map to look tattered after years' worth of updates.  Weighing all these alternatives, the pushpin approach came out ahead, but I still sought to keep compatibility with magnetic markers for temporary placemarks.

For the placemarks, I considered what colors tend to be evoked when one thinks of a particular type of material or supply.  Given the number of categories, and my disdain for graphs with so many lines on them that the colors start to run together and become very hard to distinguish, I also made up shapes to pair with the colors; the shapes actually represent "category groups" including "Consumables/Materials" and "Tools".  Since I am looking out for those with disabilities too, I installed Color Oracle on my Mac to test what all these colors would look like to colorblind people.  Color blindness is a very common disability, mostly seen in men, and since the DMS membership is overwhelmingly male, it is an especially important consideration for this audience.  I spent time rotating between the three forms of color blindness one can test for: deuteranopia, protanopia, and tritanopia.  Among these, I found deuteranopia and protanopia to be quite ugly, so if I had to choose what colorblindness to get, I would definitely pick tritanopia (the rarest form).  It would still suck to have, but at least you don't have so many ugly yellow colors.

Magnetic backings are available in several forms.  The most common is, of course, a whiteboard, but whiteboards are impenetrable by common push pins -- not to mention expensive per square foot.  Another approach recommended to me in the Dallas Makerspace Forums was to use magnetic paint.  I'd never heard of that before, and after reading some reviews, was a bit skeptical.  I obtained a scrap piece of foam core and applied three coats of magnetic paint to one side.  Here are my hints for optimal magnetic paint application:

  • Stir it well, as the globs of magnetic ore tend to clump together over time, leaving you with nothing but a heavy oil.  If stirring it seems risky (because you're wearing your nice slacks and shirt), just grab one of the big globs with your stir stick and mush it down with your paint roller.
  • Apply several coats.  One won't do it for you.  Even after several coats, areas where you may have spread around clumps of ore will come out stronger than where you applied just the paint by itself.
  • You may have had the Paint Department at your hardware store stir it up for you, but you will need to stir it again.  Even after spending time in their machine, it was still really clumpy by the time I got it back home.

The paint is also extremely oily, so bear that in mind when you choose your brush/roller.  Nevertheless, I was very satisfied with the outcome, and I can actually hold up the whole piece of sample foam by gripping just one neodymium magnet stuck to it.

After getting the printout and proper foam core delivered from the shop, I had just a basic map with really small dots guiding where I needed to place pin markers indicating each location I chose to show and exactly what type of place it is.  It took about 1 hour to place all the pins in their right location.  Without clipping all these pins, it's impossible to put it in a conventional frame and mount it against the wall, so currently it's balanced standing up on a table against the wall.

The index of all these places took me a little while to construct, as it's the heart and soul of what makes this map actually meaningful.  The places listed are formatted into 2 columns, and organized into categories with a Table of Contents.  The next step is to add color-coded tabs along the side so people can flip directly to the desired category.

While this map only has one or two instances of chain stores listed for this entire metro region, we are planning a Web-based version of this Makers' Markers project that will allow people to filter by location as well as by which places are currently open.  Nevertheless, this experience has broadened my horizons in the endeavor of producing nice static maps with more flexibility than is afforded by Google Maps or OpenStreetMap alone.

Thursday, July 23, 2015

Adding Stock Quotes to a BriteBlox Display


People interested in the stock market might be inclined to have a cool stock ticker in their room or office.  While it's not quite as useful as a whole bunch of monitors displaying entire panels of quotes and graphs, it does provide at least a mesmerizing distraction, not to mention your room now looks at least a little bit more like a real trading floor in a big office.  This was the original use case for BriteBlox, the modular LED marquee designed by #DoesItPew and myself.  Not only is the marquee itself modular (i.e. you can configure it in various lengths, heights, and shapes), but ideally the control software is modular, leading to a design where any coder can develop plugins to drive the board from any data source they can convert into animated pixels.

Unfortunately, we haven't gotten the software quite to the point where it is incredibly easy for third-party developers to write plugins; right now, they would actually have to integrate the feature into the program with an understanding of what needs to be called where in the UI thread and how it ties in with the display thread.  But, as we hope to have shipping for our Kickstarter backers wound down by the end of this week (finally!), there should be more time to flesh out this modular software design.

That being said, there was another challenge: despite actually developing this marquee with the idea of displaying stock quotes, there was the problem of actually finding a legitimate source of quotes that's free and not on a delay.  For those who haven't done this before and are searching Google for the first time, there are many money-hungry demons thickly spamming Google's search results pages with false promises of a useful product for my purpose.  And, even though I have several brokerage accounts with various institutions, it's hard to find one that actually provides an API for stock quotes unless you meet the $25,000 minimum equity requirement usually required for day trading.  You might get lucky and find one that interfaces with DDE for ancient versions of Microsoft Office or Visual Basic for Applications, but it's been a very long time since I've touched any of these, and I don't want a service that requires a whole lot of other dependencies for the user to install.  The most general approach to take seems to be to build an RSS feed reader.

The Best Solution For General Cases

Really Simple Syndication (RSS) is useful for quickly scanning sites for updates.  Most of the time, it is provided for news sites or sites whose content changes frequently.  Of course, stock quotes also change frequently; since RSS offers "pull style" updates (nothing gets updated until you refresh), it works well with our protocol, since there's no need to manage which particular symbol appears where or what to do with an updated price.  Sometimes, on days when one particular stock is trading at a volume an order of magnitude above others, you'll see tickers dominated by quotes from that stock.  Our mechanism won't do that, because each ticker symbol is represented on each update, and updates occur each time the marquee is done scrolling all messages.

Python can easily parse RSS feeds by means of the feedparser module which you can install with pip.  Once you download the feed, you need to parse through its XML with xml.etree.ElementTree.  This is all pretty easy, but if you're using the NASDAQ feed in particular, you'll notice the quotes are embedded in ugly, unwieldy HTML.  It is difficult to parse because they do not provide unique identifiers as to what information is contained in what table cell, so you have to do a little bit of exploration ahead of time in order to see which cells contain what you want.  Here is how I'm currently handling the parsing, from end to end:

    # feedparser and xml.etree.ElementTree (as ET) are imported at module level;
    # console and globals (the LEDgoesGlobals module), along with setColor,
    # yellow, green, red, and endColor, come from the BriteBlox PC Interface.
    d = feedparser.parse(self.feedURL)
    if d.bozo:  # feedparser sets "bozo" when the feed can't be fetched or parsed
        console.cwrite("There was an error in fetching the requested RSS document.")
        self.updating = False  # illustrative name for the class's busy flag
        return
    info = []
    # NASDAQ embeds the quotes in HTML; wrap it in <body> tags so that
    # ElementTree will parse it as a well-formed XML document
    feed = "<body>%s</body>" % d.entries[0].summary_detail.value.replace("&nbsp;", "")
    tree = ET.ElementTree(ET.fromstring(feed))
    root = tree.getroot()
    # Find the last updated time
    last = root.find(".//table[@width='180']/tr/td")
    info.append("%s%s  %s" % ((setColor % yellow), last.text.strip(), endColor))
    # Find all the quotes; each quote occupies 13 table cells
    counter = 0
    for elem in root.findall(".//table[@width='200']"):
        for elem2 in elem.findall(".//td"):
            for text in elem2.itertext():
                idx = counter % 13
                if idx == 0:  # Ticker symbol
                    info.append("%s%s " % ((setColor % yellow), text.strip()))
                if idx == 3:  # Last trade
                    info.append("%s %s" % (text.strip(), endColor))
                if idx == 5:  # Change sign
                    sign = text.strip()
                    info.append("%s%s" % ((setColor % (green if sign == "+" else red)), sign))
                if idx == 6:  # Change amount
                    info.append("%s %s" % (text.strip(), endColor))
                counter += 1
    # We're done parsing, so join everything together
    newMessage = globals.html % ''.join(info)
    # FIXME: For now, this will be Message #1
    globals.richMsgs[0] = newMessage

Now, the next challenge was to come up with the means to integrate stock quotes with the usual message display thread.  Twitter is a unique case; since its API updates my app via push notifications, it can tell my app to save the incoming message into the next available message slot in the queue, and then when it's time for the serial output thread to refresh what goes out onto the matrix, any messages currently in the queue get shown.  Ideally, it'd be nice to find a stock API that behaved in a similar manner, even though it'd expose us to possibly showing multiple quotes for the same stock in one run through the message queue -- there are ways we could work around this if needed.  However, since this is a pull-based notification, I needed a way for the serial output thread to signal pull-based threads to refresh their data.

There were, in fact, two approaches I debated while developing this feature:

  • Create an array of flags that the serial update thread raises before each text update; then, each respective class instance who registered a flag in this array will update its message in the appropriate slot
  • Tie a reference to a class instance or flag in with the RawTextItem objects (derived from Qt's QListWidgetItem) initialized whenever you make a new message in Raw Text mode; this would be empty for raw text and push-based notifications, but would be populated for text requiring pull-based notifications, and would require the serial output thread to iterate over these items, which are typically stored in the UI class instance

Ultimately, I settled on the first design.  A plugin developer would be required to know where to register the flag in either case, and I thought it'd be better to make that an array defined in the LEDgoesGlobals module rather than requiring people to pull in the UI class instance just to have access to that.  Also, they're not having to add extra data to something that gets displayed on the UI thread.  As you can imagine, my biggest pain points were simply refactoring & debugging all the little changes, made mostly throughout the data structures used to pass the bits from the computer onto the marquee.
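To make the chosen design concrete, here is a rough sketch of the flag-array mechanism; the names are illustrative, not the actual LEDgoesGlobals API:

    import threading

    pullFlags = []   # would live in LEDgoesGlobals; one entry per pull-based plugin

    class QuoteFeedThread(object):
        def __init__(self, slot):
            self.slot = slot                    # message-queue slot this plugin owns
            self.refresh = threading.Event()
            pullFlags.append(self.refresh)      # register the flag
            t = threading.Thread(target=self._loop)
            t.daemon = True
            t.start()

        def _loop(self):
            while True:
                self.refresh.wait()             # raised by the serial output thread
                self.refresh.clear()
                # ... re-fetch the feed and rewrite the message in self.slot ...

    # In the serial output thread, just before each text update:
    def signalPullUpdates():
        for flag in pullFlags:
            flag.set()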

In the process of writing this big update to support pull updates between refreshes of the matrix, I also cleaned up code in the serial update thread that was iterating through the XML tree twice for no good reason other than just to put the letter(s) into the appropriate color data structure (i.e. red or green).  I also started to make this modular by defining colors in LEDgoesGlobals, but there are still many parts of the code that treat colors individually by name rather than agnostically (by simply sending a particular color data structure to a particular set of chip addresses).

As with most things, there is still some work left on this code before it's "perfect," but it's available on our GitHub right now if you are capable of running the BriteBlox PC Interface from source and would like to check it out.

Thursday, July 16, 2015

I Finally Found an Application For My CUDA Cores!

During graduate school, I was exposed to the power of CUDA cores through my parallel computing class.  Back then, there was a relatively small number of such cores on the video card inside their shared server, something like 40 if I remember correctly.  With my NVIDIA GeForce GTX 650 Ti video card, however, I now have 768 CUDA cores at my disposal -- almost 20 times as many as in grad class 4 years ago!

Not being much of a mathematician at heart, and generally spending time on logic problems, application testing, or new HTML5 & browser paradigms rather than crunching big data, I was never really inspired to do much with these cores.  This all changed while watching the Google I/O 2015 keynote address, when they showed off the capability for you to draw (as best you can) an emoji, whereupon Google's engine will try to recognize your scrawl and offer you up several professionally-drawn emojis to represent whatever it is you're trying to express.  With recent changes in my life that have augmented my ability to "Go Get 'Em" and increased the likelihood that my ideas will actually reach customers, I immediately began scheming to learn how they set out doing this.  Obviously image analysis was involved, but what algorithms did they use?  Thinking back to my Digital Image Analysis class, I began researching how applicable Hough transforms would be to my problem.  I would need to teach the computer what certain symbols looked like in that particular mathematical space, which would probably take me a while, since it's not really one of my strong points.  Another discouraging bit of trivia is that Hough transforms can be difficult to apply to complex shapes, because there starts to be very little margin for error.  Well, scratch that; back to the drawing board.

Then, thinking back to Machine Learning class, one algorithm in particular seemed adaptable to all sorts of problems, and is even designed with the same (or very similar) scientific principles as human thought.  This particular learning algorithm has received quite a bit of buzz lately, with projects such as MarI/O and Google's "Inceptionism" experiments: neural networks.  With neural networks, you ultimately end up with (through some sort of black magic that occurs through repetitive training exercises) a series of very simple algebraic equations that will help you arrive at an answer given one or more inputs (it usually helps to have at least two inputs to make things at all interesting).  Through stacked layers of various sizes, each comprised of various quanta called "perceptrons" (which fulfill a very similar role to neurons), the neural network will begin to perceive features in a set of data in much the same way a human will analyze a visual scene and pick out all the items they can see.  There are many variables involved with coming up with a good neural network for a specific problem; for instance, the number of iterations you run training on the network, and the functions your perceptrons use when weighing inputs to make the final decision.  The neural network can also, unfortunately, be easily biased by the training data it sees during formation, so sometimes it can perceive things that aren't really there.
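To give a flavor of just how simple those algebraic pieces are, here is a toy perceptron in Python; the weights are invented for illustration, whereas a real network learns them during training:

    import math

    def perceptron(inputs, weights, bias):
        # Weighted sum of the inputs, squashed into (0, 1) by a sigmoid function
        total = sum(i * w for i, w in zip(inputs, weights)) + bias
        return 1.0 / (1.0 + math.exp(-total))

    # Two inputs, as suggested above; the output would feed the next layer
    print(perceptron([0.5, -1.2], [0.8, 0.3], 0.1))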

Given a set of data that could end up being very large, it became desirable to find a way to train the neural network using some sort of parallel framework, if possible.  Luckily, people have already solved this problem: NVIDIA has devised a library of primitives for neural networks (including Deep Neural Networks and Convolutional Neural Networks) called cuDNN.  Computer scientists at UC Berkeley have developed a DNN framework called Caffe, a highly-optimized neural network creator; it happens to support cuDNN, which you specify support for when you build it, and this takes its existing capabilities to a whole new, much faster level.

Getting My Caffe to Brew

Important note: This is all cutting-edge information, and is subject to change over time.  Some of the sources I used to put this article together are already slightly out of date, and so I expect this post will eventually go out of date too.  You've been warned!

Unfortunately, Caffe with cuDNN requires quite a few dependencies; these are all called out on this particular introductory post.  I chose to install some directly from source (by downloading the source or cloning from GitHub), and others were installed through Synaptic Package Manager on Ubuntu.  For this particular project, I installed the following binaries from the following sources:

Expected Package        Installed                                   Method
CUDA                    CUDA 7.0.28                                 Synaptic
BLAS                    OpenBLAS 0.2.14                             Direct download (see Note 2)
OpenCV                  OpenCV 3.0.0                                Direct download
protobuf (see Note 3)   protobuf 2.6.1 (originally 3.0.0-alpha3)    Direct download
glog                    glog 0.3.3                                  Direct download
gflags (see Note 1)     gflags 2.1.2                                Direct download
hdf5                    libhdf5-dev 1.8.11-5ubuntu7                 Synaptic
leveldb                 libleveldb1 1.15.0-2                        Synaptic
snappy                  libsnappy1 1.1.0-1ubuntu1                   Synaptic
lmdb                    liblmdb0, liblmdb-dev 0.9.10-1              Synaptic
And finally...
Caffe                   Merge 805a995 7d3a8e9, 7/3/15               Git clone

Note 1: When making gflags, take a moment to go into the Advanced option of ccmake, and specify the CMAKE_CXX_FLAGS variable (how, you ask? read the next paragraph).  You need to set this variable to contain the compilation flag -fPIC thusly, or else later on, when you try to build Caffe, it will complain that the files you built for gflags aren't suitable to be used as shared objects by Caffe.

Note 2: For reasons unknown, installing OpenBLAS from a Git clone didn't work out for me, so I ended up downloading this version directly and installing it successfully.

Note 3: At the time of this writing, you will run into trouble if you try to use the Python wrapper for exploring Caffe models if you build Caffe with protobuf 3.0.  Until this is fixed, use protobuf 2.6.1.

If you've never used cmake before, it's not very difficult at all.  At its heart, cmake facilitates making build instructions for multiple platforms in one convenient place, so that users of Windows, Linux, and Mac only need to tell it about certain paths to libraries and include files that don't already exist on their PATH or in some environment variable.  To set up your Makefile with cmake, the easiest thing to do is to go into the directory one level above cmake (e.g. caffe/, which contains caffe/cmake) and write ccmake . on the command line (note the two C's and the dot).  If you're into isolating new work, you may wish to create a build directory inside the project root directory, then run ccmake .. so that it's easy to trash all temporary files.

However, setting up the configuration for Caffe itself was not so easy for me.  After installing all the dependencies, the system just flat out refused to believe I wanted to use OpenBLAS rather than Atlas, so I ended up actually having to delete several lines of the Dependencies.cmake file -- specifically, the parts that specified which environment variables to read from if the user had specified Atlas or MKL -- as indicated by the "stack trace" being provided by ccmake.  Ultimately, not too difficult an adjustment to make; I just never have too much fun adjusting Makefiles by hand, so if it can be done through the configuration tool, I'd much prefer that.

Building a Useful Data Model

Once you have done all these steps to make Caffe with cuDNN, a great real-world example to run through is the "mnist" example which hashes through several thousand samples of handwritten numeric digits from the National Institute of Standards & Technology that were taken back in the early '90s (i.e. the MNIST database).  These scans are very low-resolution by today's standards, but are still often used as a benchmark for the performance of neural networks on handwriting samples (just as the picture of Lena Soderberg from a 1972 Playboy centerfold is still used as a benchmark for image processing algorithms, except with a lot less sexist undertones :-P).  Nevertheless, my machine took just under 4 minutes and 17 seconds to crank through a 10,000-iteration training cycle for a neural network that will classify image input as a digit.  The demo (linked to above) was very simple to run, as all of the work to create the neural network structure and the mechanism of the perceptrons was all done for me in advance; all I had to do was kick off the script that iteratively runs the training so it drills down on salient features distinguishing each digit from each other.  The only hangup was that some of the scripts expected files to be located in the ./build/ directory, but my particular installation skipped the ./build/ and went directly to the desired paths.

Putting the Model To Use: Classifying Hand-Drawn Numbers

After doing a bit of reading on how to extract the features from the neural network, I decided it'd be easiest to stick to the Python wrapper until I get some more experience with what operations exactly get run where, which is highly dependent on the way your deployment prototxt file is set up.  One thing that would have been nice to know: the link seen in many places in the Caffe documentation that supposedly describes how to use the Python module is wrong; they omitted a "00-", so the notebook is really 00-classification.ipynb.  On my environment, some Python dependencies also needed to be installed before the Python wrapper would run properly.  Here's what I had to do:

  1. for req in $(cat requirements.txt); do sudo pip install $req; done -- Installs many of the Python modules required, but leaves a little bit to be desired (which is accounted for in the next steps)
  2. Install python-scipy and python-skimage using Synaptic
  3. Uninstall protobuf-3.0.0-alpha3, and install an older version (in accordance with Caffe issue #2092 on GitHub)... would have been nice to know this ahead of time.  (Don't forget to run sudo ldconfig so you can verify the installation by running protoc --version).
  4. Rebuild caffe so it knows where to find my "new (old)" version of protobuf
Once my dependency issues were sorted, I managed to find the deployment prototxt file for this particular neural net in caffe/examples/mnist/lenet.prototxt.  Now, I can run the model simply by issuing the following Terminal command:

caffe/python$ python classify.py --model-def=../examples/mnist/lenet.prototxt --pretrained_model=../examples/mnist/lenet_iter_10000.caffemodel --gpu --center_only --channel_swap='0' --images_dim='28,28' --mean_file='' ../examples/images/inverted2.jpg ../lenet-output.txt

lenet_iter_10000.caffemodel is the trained model from the training exercise performed earlier from the Caffe instructions.  inverted2.jpg is literally a 28x28 image of a hand-drawn number 2, and lenet-output.txt.npy is where I expect to see the classification as proposed by the model (it tacks on .npy).  The channel swap argument relates to how OpenCV handles RGB images (really as BGR), so by default, the value is "2,1,0".  By carefully scrutinizing this command, you may notice two things:

  • The input image should be inverted -- i.e. white number on black background.
  • The input image should only have one channel.

Thus, before running my model, I need to make sure the image I'm classifying is compliant with the format required for this classifier.  For further confirmation, take a look at the top of lenet.prototxt:

input_dim: 64   # number of pictures to send to the GPU at a time -- increase this to really take advantage of your GPU if you have tons of pictures...
input_dim: 1   # number of channels in your image
input_dim: 28   # size of the image along a dimension
input_dim: 28   # size of the image along another dimension

You may be tempted to change the second input_dim to 3 in order to use images saved in the standard 3-channel RGB format, or even 4-channel RGBA.  However, since you trained this neural network on grayscale images, it will give you a Check failed: ShapeEquals(proto) shape mismatch (reshape not set) error if you do this.  Thus, it's important the image is of single-channel format and inverted, as mentioned above.
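If you're wondering how to massage an arbitrary drawing into that shape, a few lines of Pillow will do it.  This is just a sketch (Pillow is not part of the Caffe toolchain, and "two.png" is a placeholder for your own drawing):

    from PIL import Image, ImageOps

    img = Image.open("two.png").convert("L")   # collapse to one grayscale channel
    img = ImageOps.invert(img)                 # white digit on a black background
    img = img.resize((28, 28))                 # match the network's input dimensions
    img.save("inverted2.jpg")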

Finally, so that classify.py properly handles the single-channel image, you need to make some amendments to it.  Take a look at this comment on the Caffe GitHub page for an explanation of exactly what you need to do; in short, change the two calls to caffe.io.load_image() so they pass False as the second argument, i.e. caffe.io.load_image(fname, False), and then use the channel_swap argument as specified in the syntax above.  However, you may just wish to hold out for, incorporate, or check out the Git branch containing Caffe Pull Request #2359, as this contains some code that'll clean up classify.py so you can simply use one convenient command-line flag, --force_grayscale, instead of having to specify --mean_file and --channel_swap and rewrite code to handle single-channel images.  It'll also allow you to conveniently print out labels along with the probability of the image being each category.
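For reference, here is the amended loading call in isolation; the second positional argument of caffe.io.load_image() is its color flag:

    import caffe

    # color=False yields a single-channel (H x W x 1) grayscale array
    img = caffe.io.load_image("../examples/images/inverted2.jpg", False)
    print(img.shape)   # (28, 28, 1)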

Now that you've been exposed to the deployment prototxt file and have an idea of what layers are present in the system, you can start extracting them by using this straightforward guide, or possibly this other guide if you're interested in making HDF5 and Mocha models.


Before discovering lenet.prototxt, I tried to make my own deploy.prototxt.  First, I utilized lenet_train_test.prototxt as my baseline.
  • If you leave the file as it is but do not initialize the database properly, you will see Check failed: mdb_status == 0
  • I deleted the "Data" layers that are included on phase TRAIN and phase TEST.  I am not using LMDB as my picture source; I'm using an actual JPEG, so I need to follow something along the lines of this file format:
    name: "LeNet"   # this line stays unchanged
    input: "data"   # specify your "layer" name
    input_dim: 1   # number of pictures to send to the GPU at a time -- increase this to really take advantage of your GPU if you have tons of pictures...
    input_dim: 1   # number of channels in your image
    input_dim: 28   # size of the image along a dimension
    input_dim: 28   # size of the image along another dimension
    layer: {
      name: "conv1"   # continue with this layer, make sure to delete other data layers

  • Delete the "accuracy" layer, since it's used in TEST only, and protobuf doesn't like barewords like TEST in the syntax anyway.
  • Replace the "loss" layer with a "prob" layer.  It should look like:
    layer {
      name: "prob"
      type: "Softmax"
      bottom: "ip2"
      top: "prob"
    If you're simply replacing the loss layer with the new text, rather than removing and replacing, it's important to take out the bottom: "label" part, or else you'll probably get an error along the lines of Unknown blob input label to layer 1.  Also, just use plain Softmax as your perceptron type in this layer; nothing else.
  • Make sure you don't have any string values (barewords) that don't have quotes around them, such as type: SOFTMAX or phase: TEST.
  • If you have both the "loss" layer and the "prob" layer in place in deploy.prototxt, you will see Failed to parse NetParameter.  Again, be sure you replaced the "loss" layer with the "prob" layer.
  • If you forget the --channel_swap="0" argument on a single-channel image, and you don't have something in your code to the effect of Git pull #2359 mentioned above, you will see the message "Channel swap needs to have the same number of dimensions as the input channels."


Later on, as this algorithm gets closer to deployment in a large production setting, it could be nice to tweak it in order to get the best success rate on the test data.  There are some neural networks developed to classify the MNIST data so well that they have actually scored higher than their well-trained human counterparts on recognizing even the most chicken-scratch of handwritten digits.  It has also been noted that some algorithms end up getting significantly weaker performance on other datasets such as the USPS handwritten digit dataset.


Thursday, July 9, 2015

Restoring the Granddaddy of Modern Computers: the IBM 5150

It Was a Dinosaur Back Then...

At some point a very long time ago in my life, I acquired an IBM 5150 PC from my grandfather.  I'm not sure why he wanted to give it to me at that time, but I did have fond memories of playing old games on 5.25" floppy such as Grand Prix Circuit and Wheel of Fortune with my cousins on hot summer days in Grandpa's garage in Houston (along with a similarly vintage Ferrari 308GTB which always remained under wraps -- I didn't even realize it was blue until after he died), so I was definitely happy to take it.  (The computer, of course. ;)  By most people's definition, the 5150 is the root of the modern personal computer, but in the late '90s when I received the machine, I did not have the right skills nor tools to get it up and working; moreover, with no expansion cards installed (no floppy disk controller nor video card in particular), it would not have been very useful nor even easy to triage and fix.  Fast forward 16 or 17 years since my last attempt, during which time I got degrees in Computer Engineering and Computer Science, and with a little bit of inclination from an outside event, now there's a newly-restored 5150 sucking gobs of power off the grid. :-P

This restoration project was spurred by the closing of a computer store in Arlington, TX called Electronic Discount Sales.  They have been on Pioneer Pkwy for nearly 30 years, but the owner has finally decided after all this time that he wants to "semi-retire," so he has been working on closing the store for several months now by trying to self-liquidate all of the remaining merchandise.  It is quite a large building, probably occupied by a grocery store in its former life before EDS moved in.  The thing that makes this a highly unusual case, though, is EDS contains stuff he received new 10 or even 20 years ago that still has yet to sell.  From a business perspective, we're surprised he's stayed in business this long.  But from a nerd perspective, it is amazing a place such as this still exists with all sorts of retro gadgets we used to enjoy throughout our lives.  Whatever we need to fix or to get further enjoyment out of an old computer or console system, he probably has.  Despite some of this stuff reaching its "knee" in the market (it has stopped losing value and is actually gaining value again), they still had some of these gadgets at shockingly low prices, especially in the video games section.  (To be fair, it did seem like they were asking a lot of money for certain other things they were selling, particularly the "old but not quite vintage yet" laptops.)  Also, for those who remember the earliest PCs, they had a Computer Museum devoted to this old technology, which is also in the process of being liquidated.  It is mostly from the Computer Museum that I have been able to restore my machine to something that works, at least in the most "BASIC" way.

A view of Electronic Discount Sales

A shelf of software from the 1990s

80% Off Everything tends to reveal all the obscure artifacts from an era in computing I'm sure no one misses...

After sitting in my grandpa's garage for a long time, the old relic sat in my mom's garage for yet another 16 or 17 years.  When I opened it up that long ago, it seemed impossible that this was the exact same machine I had played all those games on -- after all, it hadn't been that long since I'd played them, and yet this machine was pretty well stripped down and even seemed to be a different color than I remembered.  I didn't even feel confident in turning it on, so I left it alone until two days after visiting EDS, when Mom was able to dig it up from her garage.  After acquiring it, I started reading up on the machine and trying to learn its capabilities and what would make it tick.  Many sources pointed to a book called Upgrading & Repairing PCs, of which I have several editions at Mom's house (including the original edition), so the next day, I met her to get that original edition.  By then, I had about four different ways to confirm some important information:
  • The BIOS on the system is the 3rd iteration BIOS from 10/27/1982 (the most bug-free of them all, but still not great).
  • The system needs expansion cards to do anything useful, such as display video or read from a floppy.
  • The power supply in this particular machine is not stock, and that is a good thing.  The original PSU was very noisy yet only about 40% as powerful as the one provided to me.

The first thing a circuit-savvy individual might wish to do upon receiving an ancient circuit, especially one that has been sitting in Texas garages for most of its life, is to replace any electrolytic capacitors.  We did this on our Gold Wings pinball machine from 1986, and along with other electronic modifications, it now runs like a champ.  Electrolytic capacitors tend to dry up over time due to either low-quality manufacturing or heat stresses on their bodies, which will crack the dielectric, let in moisture, and introduce "gremlins" (odd phenomena you can't explain or that are hard to troubleshoot when using an electronic device).  There are 16 electrolytic capacitors inside the power supply, but thankfully, none on the motherboard.  After making careful notes detailing capacitance, voltage, and placement, I sent +DoesItPew to Tanner Electronics in Carrollton to obtain the needed capacitors for the PSU as well as for the two 360K 5.25" full-height floppy disk drives (which I'll get to restoring later).  I spent two or three hours removing and replacing these capacitors, and just after midnight, began putting it back together and eventually tested it on an early SATA hard drive that still had a 4-pin Molex connection for power.  It fired right up like a champ, and the voltages coming out of other Molex connectors appeared to be correct!  Obviously, the capacitor replacement worked (ultimately I'm not sure if it was totally necessary, but for the purposes of doing a good job on a long-lasting restoration, electrolytic capacitors should get replaced).

The next step was to plug this PSU into the motherboard.  However, the inside of this computer was caked full of dust and dirt from the onset.  It is not a good idea to turn on a dirty machine, so I spent a while carefully removing the motherboard and one of the floppy disk drives, and then using a dry toothbrush to brush off all sorts of dust and dirt from the case and motherboard.  I followed everything up with squirts of compressed air, and also did all this work outside in order to keep the house clean and make sure the dust doesn't get a chance to resettle in my machine (as it was a windy day, so the dust particles would blow elsewhere).

With the PC now cleaned (and some of the chips having their shiny luster restored :-P), I plugged the PSU into the motherboard, prayed to the computer gods above, and flipped the big On switch.


Not even a sound from the speaker.

Thanks to many folks who have been down this road before, there are some good troubleshooting guides for debugging problems starting an IBM 5150 PC.  The possible symptoms when an IBM 5150 does not beep are that you have a bad power supply or a bad motherboard.  Just because you replace capacitors in the PSU doesn't mean it's all good; again, heat stresses or dust can work "magic" on your circuits so they don't work as intended.  While the computer was running, I whipped out a multimeter and probed the power supply lines in order to assure myself that the voltages coming out of it are good.  Everything checked out within the operating specifications.  Then, I powered the system off and checked the resistance between various "hot" lines and Ground on the motherboard.  Again, all these values appeared within spec.  All this work proves that the power supply is good and there are no shorts on the motherboard.  It has been reported, though, that the tantalum capacitors regulating the power right by the PSU connectors can go bad and cause a short on the motherboard.  Luckily, I didn't need to rework a capacitor, but ultimately I did need to rework something that's much more of a pain -- you'll see later.

A Potentially Huge Time Sink

If the problem has been found not to be in the power supply, yet the computer does not beep a POST code to you, then it's either in the speaker or somewhere on the motherboard.  The speaker was measured to have the correct amount of resistance, and the cone was still in good shape, so that wasn't the issue.  This left the daunting task of finding out what was wrong on the motherboard.  However, there is a culprit far and away more likely than anything else: faulty memory chips.  The memory in the IBM 5150 is unreliable and often goes bad.  Toggling several DIP switches in order to try to adjust the memory got me nowhere, so I elected to remove all of the memory chips in banks 1-3 (bank 0 is soldered into the motherboard directly).  After this, the computer still wouldn't make any noise, so I probed several other things with the multimeter and experimented with some more DIP switch settings, also to no avail.

There is a technique known as "piggybacking" where you take a good chip and set its legs right onto the legs of the bad chip.  This is an unreliable method to triage a PC, though, as you probably don't know if the good chip is actually good, you don't know which bad chip it is, and it's not guaranteed to make the circuit behave as expected if the bad chip is not totally dead.  Nevertheless, I figured I'd give it a shot; it beats the alternative of having to order an obscure ROM chip, program it with a diagnostic tool that's notoriously buggy, and then make an adapter for it just so it fits in the original BIOS slot on the motherboard.  That sounded like an even bigger waste of time than just piggybacking, so I put a random memory chip on top of a random chip in Bank 0, and turned it on.


I went back to double-check my DIP switch settings to make sure they indicated the absolute minimum amount of memory installed, and...

Voila!  It worked!

(In retrospect, this was actually a pretty good random guess, since the computer tends to appear dead if the memory fault occurs on the first two chips of Bank 0; I happened to pick the very first one, Bit 0 of Bank 0.)  The first signs of life out of this PC were the long-short-short beep code, indicating it is expecting a video card but did not detect one.  Immediately, I packed everything up and headed down to EDS in Arlington.

The liquidation sale has been going on for quite some time, so what's left of the inventory was rather disheveled.  I sifted through several buckets of ISA cards, but did not turn up much at all of the 8-bit ISA variety required for this PC.  I went back to their PC Museum area hoping to find any sort of useful PC card, and one of the associates helped me track down two IBM-compatible CGA/EGA cards to put into my machine.  One card, the Epson Y1272040002, was only $30.  The other, a Compaq Merlin "Enhanced Color Graphics Board", ran for $150.  Given the rarity of the Compaq card on the Internet -- it seemed like I had just stumbled across Unobtainium -- I ended up plunking down for both without much hesitation.  Fortunately, it turns out everything in their Museum is 35% off, so the total was just over $100 for both cards.  After spending about two hours searching that store, I think I found the last things of use to me there.  What a sad day.  From EDS, I obtained a $65 Tandy CGA monitor and two video cards totaling about $120 (all prices after the 35% discount).  It's also amusing that I'm retrofitting the IBM PC with IBM-compatible parts, though that's simply due to supply issues more than anything.

After getting home and having some dinner, I tried both of the cards in the PC as-is, and neither of them seemed to do the trick.  The PC was still emitting the long-short-short beeps indicative of no video found.  I decided to take the simpler of the two cards (guess which one that was :-P) and switch its setting from Monochrome to Color.  Upon firing it up... just one short beep!  That's exactly what you want to hear.

I ran to get the CGA monitor from the other room, and plugged it in next to the PC.  Immediately, I was greeted with PC Basic, which is what you see when you don't have any working or enabled floppy disk drives.  This was enough for me, though; I was extremely satisfied with four days' worth of work after work.

Picture of the first video signal emitted from this PC in a very long time
And on the fourth night, the PC Gods proclaimed, "Let there be video!"

I spent the remainder of the night trying to conjure up my BASIC programming skills, yet incorporating some of the differences I had only read about when comparing original BASIC to the QBASIC I used when first starting programming in the late '90s.  One particularly amusing aspect is that you can move the cursor wherever you want to on-screen, so I altered the PC's greeting to say some immature things that were amusing until my program started to scroll the window deep into the depths of spaghetti code (what else are you going to write when you don't exactly know BASIC?).  Overall, I'd say that was pretty impressive to restore a 5150 in just a few hours a day over four days.

Can't just rest on your laurels...

Of course, it's not wise to trust a piggybacked chip for very long.  It needs to be soldered into the board eventually.  Over the long July 4th weekend, I took some time to desolder the bad memory chip from the motherboard and replace it with a DIP socket, so that any chip that sits in that spot will be removable thereafter.  This process took a while because I went about it not by simply clipping the pins, but by trying to heat up the solder in each via, then pushing each pin to the center of its via.  After each pin was centered, and ChipQuik was applied to each via as well (bismuth lowers the melting temperature of solder), I would apply yet more heat to several holes at once, and eventually managed to pry the chip out with a screwdriver.  Unfortunately, my IC extractor was too thick to navigate around some of the other socketed ICs, so I had to use a screwdriver (a more brutish, primitive method).  Also, while centering the pins, occasionally I would push too hard with the tool and scrape off some of the protective coating on the traces surrounding the chip.  Next time I know a chip is dead, I likely won't even bother with all this trouble.

Once the chip was removed, I used a desoldering vacuum and solder wick to remove the old solder and bismuth, then set a new DIP socket into place.  It was soldered in with new lead-free solder, and one of the memory chips from Bank 3 was installed into place.  The old chip was indignantly thrown away.  I was very proud when the motherboard I had just reworked successfully powered on and booted to BASIC!

Now that the rework was successful, I took some time to notice the errors thrown up on screen just before BASIC would come up.  First, I was curious as to what "301" meant -- it turns out that the 301 error indicates a problem with the keyboard.  For some reason, I have to leave my keyboard unplugged until after the computer boots up, or else it initializes with the wrong data rate and sends a bunch of gibberish.  In any event (plugged or unplugged), I get the 301.  No big deal right now; I'll try it with one of my Model Ms and see how it goes.

Once I discovered that 301 was an error, though, it got me thinking about the "201" also displaying on my screen.  It turns out 201 is a lot more interesting, and indicates a memory error.  The specific memory error I was getting indicated there were problems with Bit 2, 4, 6, and 8 in Bank 1 (the message was 1055 201 -- 10 = Bank 1, and 0x55 = 0101 0101 in binary, where ones indicate problem bits).  This was because I had no memory installed in Bank 1 anymore, due to trying to isolate RAM problems, so I repopulated Banks 2 & 3 and booted once again.  This time, the machine was satisfied.
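As a quick sanity check of that 0x55 decoding, you can expand the bit pattern yourself:

    bits = 0x55
    pattern = format(bits, "08b")   # '01010101'
    # Counting positions from the left starting at 1, the ones land on...
    bad = [i + 1 for i, b in enumerate(pattern) if b == "1"]
    print(pattern, bad)             # 01010101 [2, 4, 6, 8]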

There are a couple bugs on the 10/27/1982 IBM PC BIOS that wreak havoc with the memory on the 64-256KB board (which is the one I have).  The first is that, due to a portion of a byte being set with an incorrect value when not all 4 banks of memory are enabled, the system multiplies the number of chips by the wrong number and seriously under-reports the amount of RAM installed in the system.  The second is that, for the same reason but in a portion of a different byte, the system tries to check much more memory in its POST initialization than what might actually be installed.  For instance, if Bank 1 is enabled, it will try to test memory in Banks 1-3.  When Banks 1 & 2 are enabled, it thinks there's so much memory that you would need the memory expansion card in order for all tests to pass.  Luckily, the expected behavior is exhibited when all 4 banks are enabled -- it runs the tests in exactly the 4 banks.  Based on these two bugs, it makes very little sense to run a 64-256K IBM 5150 with less than 256K of memory.

Nevertheless, I used these glitches to my advantage when testing the remainder of the memory chips.  It turns out that only the one chip at Bank 0 Bit 0 was bad, so I have been in contact with some of the local electronics stores to see if any of them happen to carry a suitable replacement.  Luckily, it turns out that technology has run in the bloodstream of the Dallas economy for some time, so I shouldn't be too far away from finding the chip.  However, I have other projects to tend to, now that this system is at least booting up happily...

Useful Sites

If you too are on a quest to restore an IBM PC, XT, or AT system, here are some good places: