Prettiest Little QR Code Ever
This is the chronicle of a misguided attempt to create a small, aesthetically pleasing QR Code.
The premise was absurd. The smallest QR Codes are 21 dots across. That's not enough room for any significant art, much less aesthetic beauty. Even worse is the fact that QR Codes compress information and support error correction. Well, that's great for the code, but bad news for the aesthetic.
Successful compression guarantees uniform density of information. There's visual noise everywhere. Art requires something else: sparsity, shape or symmetry. Beauty requires something to bring it out.
So, I was starting with guaranteed uniform noise, and I wanted shape, symmetry and sparsity.
Yeah, this was going to work.
So, why did I even decide to try?
Well, I'm really proud of the QR Code for http://dlma.com/. Because the link is so short, the QR Code that represents it is the smallest possible type, at 21 dots across.

When I look at that image it's so obviously a combination of a dancing Rasta banana (dancing for tips in a hat), a jackhammer, and a Phoenix.

You see it? Of course you do.
If I could get the awesome banana-jackhammer-phoenix without even trying, I bet I could make a bee-yoo-tee-ful QR Code if I asked the computer nicely. Or made it work really hard at it.
I took a quick stab at every possible 21x21 QR Code that'd direct to my domain, but they all sucked. If I allowed myself to paint on a 25x25 canvas (the next size up for QR Codes), that'd give me a lot more breathing room.
So here's what I did: Starting at 0, or 0000000000, I had the computer count up in base 36 numbers to about zzzzzzzzzz. (Yep, "zzzzzzzzzz" is a number in base 36, it is over 3.5 quadrillion.)
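For the curious, here's a minimal Python sketch of that counting scheme. The helper name `to_base36` is mine for illustration, and isn't necessarily what the original program used. Python's built-in `int(text, 36)` handles the reverse direction.

```python
import string

# The 36 digits of base 36: 0-9 followed by a-z.
DIGITS = string.digits + string.ascii_lowercase


def to_base36(n: int) -> str:
    """Return the base-36 representation of a non-negative integer."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return "0"
    out = []
    while n:
        n, rem = divmod(n, 36)
        out.append(DIGITS[rem])
    return "".join(reversed(out))


# The largest ten-digit base-36 number is 36**10 - 1.
print(int("zzzzzzzzzz", 36))  # 3656158440062975
print(to_base36(int("52gb", 36)))  # 52gb
```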
For each and every number, I had the computer create many possible URLs that go to dlma.com, and for each one of those URLs, I had the computer do every possible QR Code encoding at 25 dots across. (There are about three or four encodings at different levels of error correction.)
For example, for the base 36 number "52gb", there are 32 different URLs at the domain dlma.com, like so:
| URLs for 52gb | | |
|---|---|---|
| http://dlma.com/52gb | http://dlma.com/52gb/ | http://dlma.com/52g/b |
| http://dlma.com/52g/b/ | http://dlma.com/52/gb | http://dlma.com/52/gb/ |
| http://dlma.com/52/g/b | http://dlma.com/52/g/b/ | http://dlma.com/5/2gb |
| http://dlma.com/5/2gb/ | http://dlma.com/5/2g/b | http://dlma.com/5/2g/b/ |
| http://dlma.com/5/2/gb | http://dlma.com/5/2/gb/ | http://dlma.com/5/2/g/b |
| http://dlma.com/5/2/g/b/ | http://5.dlma.com/2gb | http://5.dlma.com/2gb/ |
| http://5.dlma.com/2g/b | http://5.dlma.com/2g/b/ | http://5.dlma.com/2/gb |
| http://5.dlma.com/2/gb/ | http://5.dlma.com/2/g/b | http://5.dlma.com/2/g/b/ |
| http://52.dlma.com/gb | http://52.dlma.com/gb/ | http://52.dlma.com/g/b |
| http://52.dlma.com/g/b/ | http://52g.dlma.com/b | http://52g.dlma.com/b/ |
| http://52gb.dlma.com | http://52gb.dlma.com/ | |
If I were doing 5-digit numbers, they'd generate 64 different URLs each, and 6-digit numbers would generate 128. (Every extra digit doubles the count.) For each URL, the computer generated a few different QR Codes at different levels of error correction, and each of those QR Codes got rated for different visual attributes.
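Here's a rough sketch of how those URL variants could be enumerated. It's a reconstruction from the 52gb table, not the original code: the function name `url_variants` and its parameters are made up for illustration. The idea is that some leading characters may move into a subdomain, every gap between the remaining characters may or may not hold a slash, and each result comes with and without a trailing slash.

```python
from itertools import product


def url_variants(token: str, domain: str = "dlma.com") -> list[str]:
    """Enumerate the URL spellings of `token`, following the 52gb table."""
    urls = []
    # `split` is how many leading characters become a subdomain (0 = none).
    for split in range(len(token) + 1):
        sub, path = token[:split], token[split:]
        host = f"{sub}.{domain}" if sub else domain
        # Each gap between adjacent path characters gets a slash or nothing.
        gaps = max(len(path) - 1, 0)
        for slashes in product(("", "/"), repeat=gaps):
            pieces = [path[:1]]
            for sep, ch in zip(slashes, path[1:]):
                pieces.append(sep + ch)
            body = "".join(pieces)
            if body:
                urls.append(f"http://{host}/{body}")
                urls.append(f"http://{host}/{body}/")
            else:
                # The whole token is in the subdomain, so there is no path.
                urls.append(f"http://{host}")
                urls.append(f"http://{host}/")
    return urls


print(len(url_variants("52gb")))  # 32
```

With this scheme an n-digit number yields 2^(n+1) URLs, which matches the 32, 64 and 128 figures for 4, 5 and 6 digits.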
It can take nearly 10 seconds just to do 100 numbers. The program I used displayed 16 images at a time while it was doing the processing.
Believe it or not, the codes above are the best of breed for certain categories of aesthetic of small QR Codes that lead to my domain. If you mouse over the image, a legend will appear that says to which category each one of those codes belongs. "Darkest" means "has the most black dots" and "lightest" means the opposite. "Least lines" could also be called, "has the most individual dots, making checkerboard patterns." So "most lines" would mean has the least individual dots, making it the "clumpiest" code of them all. Technically speaking, that is.
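To make a couple of those categories concrete, here's a small sketch of how such scores might be computed. The real program's scoring isn't shown here, so treat everything in it as an assumption: the grid is a list of rows with 1 for a black dot and 0 for a white one, and `darkness` and `isolated_dots` are hypothetical names. "Lightest" would just be the lowest darkness, and "least lines" the highest count of isolated dots.

```python
Grid = list[list[int]]  # 1 = black dot, 0 = white dot (assumed representation)


def darkness(grid: Grid) -> int:
    """Count the black dots; higher means a darker code."""
    return sum(sum(row) for row in grid)


def isolated_dots(grid: Grid) -> int:
    """Count black dots with no black neighbour above, below, left or right."""
    count = 0
    for r, row in enumerate(grid):
        for c, dot in enumerate(row):
            if not dot:
                continue
            neighbours = ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
            touching = any(
                0 <= y < len(grid) and 0 <= x < len(grid[y]) and grid[y][x]
                for y, x in neighbours
            )
            if not touching:
                count += 1
    return count


checkerboard = [[1, 0, 1], [0, 1, 0], [1, 0, 1]]
clump = [[1, 1], [1, 1]]
print(darkness(checkerboard), isolated_dots(checkerboard))  # 5 5
print(darkness(clump), isolated_dots(clump))  # 4 0
```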
The program that did the analysis had a lot of great features. It was all written in Python, except for the QR Code generator, which was a C++ Python module. The UI ran in one thread and the worker in another. Every once in a while, the worker thread phoned home to a remote web service and reported its results. That way, I could have multiple computers running the same analysis without stepping on each other. Each one also kept track of the best fifty codes for each category (16 categories at fifty apiece), giving me the chance to review 800 interesting QR Codes. I let computers run the program for days.
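Keeping only the best fifty per category is a classic job for a bounded min-heap. This is a sketch of one way to do it with the standard library's `heapq`, not the original implementation. The class name `BestOfBreed` and its methods are invented for illustration.

```python
import heapq
from collections import defaultdict


class BestOfBreed:
    """Remember the top `keep` (score, url) pairs for each category."""

    def __init__(self, keep: int = 50):
        self.keep = keep
        self._heaps = defaultdict(list)  # category -> min-heap of (score, url)

    def offer(self, category: str, score: float, url: str) -> None:
        heap = self._heaps[category]
        entry = (score, url)
        if len(heap) < self.keep:
            heapq.heappush(heap, entry)
        elif entry > heap[0]:
            # The new entry beats the current worst, so swap it in.
            heapq.heapreplace(heap, entry)

    def best(self, category: str) -> list[tuple[float, str]]:
        """Return the kept entries, highest score first."""
        return sorted(self._heaps[category], reverse=True)


board = BestOfBreed(keep=2)
for score, url in [(1, "a"), (5, "b"), (3, "c")]:
    board.offer("darkest", score, url)
print(board.best("darkest"))  # [(5, 'b'), (3, 'c')]
```

For a category where lower is better, such as "lightest", the same structure works if the score is negated before it's offered.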
The result?
Well, let's just say that there aren't many naturally beautiful QR Codes of size 25 or less. I ended up picking out three codes: one that sorta resembles an electrified soot sprite from Totoro, from the heavy-center category; one that resembles Cthulhu, from the H Symmetry category; and another that looks menacingly like Skeletor.
Skeletor | Soot Sprite | Cthulhu
All those hours of computing, and this was the best the program could come up with. No beautiful butterfly, eerily symmetric pattern, or falling rain of Matrix codes. Ah, well. At least I have a nice framework for whatever project I start next.
My Movie Rating Service
The Setting
My wife would call me from the video store when the new DVDs came out and ask what we should watch. She'd already have grabbed a couple of DVDs with interesting covers, and would want to know if they were any good.
I'd hop online and see what the IMDB ratings were. But IMDB only showed one movie at a time, so we'd have to keep track of all the ratings in our heads as we mulled it over.
Worse, we wouldn't take advantage of the fact that I've got an account at Netflix, which has a very sophisticated algorithm to predict how much I'd enjoy a new movie based on the ratings I've given for other movies. It was too much trouble to navigate from the IMDB to Netflix, you see.
Worse still, we were only looking at the movies that the video store was showcasing. We weren't being made aware of movies that were generally thought to be better than the major releases, or movies that Netflix thought we'd love far more than the average viewer.
The Solution
I wrote a small web service, imdb.dlma.com, that does a few things:
- For each movie I ask about, display the IMDB rating, the Netflix average rating, and best of all, the personalized Netflix predicted rating.
- Remember the last few movies I asked about, and display them next to each other in a convenient table.
- Show me the best new releases of the week.

The service has been alive for a couple of weeks now, and it's paid off in spades! It's brought to our attention movies that we'd never heard of that we'd enjoy far more than the average person. And even more frequently, it's suggested to us that the movie we're thinking about renting won't be worth our time.
Thank you, little Movie Rating Service of mine!
Details
My website had a cron job that would scan all the weekly new releases from a Netflix feed, and of those, see which ones had 500 or more votes at the IMDB. (The first votes are generally skewed higher, because the first voters have a vested interest in the success of the movie.) Then it would get the average IMDB rating for the movie, the average Netflix rating for the movie, and then my predicted rating of that movie from the Netflix algorithm.
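The vote threshold is simple enough to sketch. This isn't the cron job's actual code: the `Release` record, its field names, and the best-first sort are all assumptions, and the sample titles and numbers are invented placeholders. Fetching from the Netflix feed and the IMDB is left out entirely.

```python
from dataclasses import dataclass

MIN_VOTES = 500  # threshold mentioned above


@dataclass
class Release:
    title: str
    imdb_votes: int
    imdb_rating: float


def worth_listing(releases: list[Release], min_votes: int = MIN_VOTES) -> list[Release]:
    """Drop releases with too few IMDB votes, then sort best-rated first."""
    kept = [r for r in releases if r.imdb_votes >= min_votes]
    return sorted(kept, key=lambda r: r.imdb_rating, reverse=True)


# Invented sample data, purely for illustration.
this_week = [
    Release("Movie A", imdb_votes=12000, imdb_rating=6.1),
    Release("Movie B", imdb_votes=40, imdb_rating=9.4),  # too few votes to trust
    Release("Movie C", imdb_votes=500, imdb_rating=7.8),
]
print([r.title for r in worth_listing(this_week)])  # ['Movie C', 'Movie A']
```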
To do this, I had to use the official Netflix API, and grant permission, as a Netflix user, for my web service to request the predicted ratings for me. Yay, Netflix, for respecting my privacy, and for providing such an awesome API.
In addition to the cron job, the site allows me to enter the title of a movie, and it'll do its best to get the IMDB and Netflix ratings based on just that. If it can't tell exactly which movie I meant, it'll offer me a list of titles, and I'll select the one I meant.
Challenges
Far and away, the biggest challenge was that I was dealing with databases maintained by two different companies, with data entered inconsistently in both databases. It's very difficult to determine an exact match when dealing with such sets of data.
Consider that the IMDB associates movies with their original title in the original language. Thus, "The Good, the Bad, and the Ugly" is actually "Il buono, il brutto, il cattivo." And "Star Wars: Episode IV - A New Hope" is actually "Star Wars" at the IMDB. At Netflix, it's "Star Wars: Episode IV: A New Hope." Notice the colon where the hyphen was.
The databases can get anything wrong: titles, names, year-of-release. For example, Adventureland: the IMDB has the year of release at 2009, but Netflix thinks it was released in 2008.
Further confounding the issue is that the databases contain TV series and video games, too. Consider that "Cloudy With A Chance of Meatballs" the movie and game were both released in the same year with the same title and have the same cast. How is an algorithm to determine which entry to use? (Hint: The ESRB rating values are thankfully different from the MPAA's.)
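That hint can be turned into a tiny check. This is a sketch under assumptions, not the service's code: the function name `kind_from_rating` is hypothetical, and it presumes each database entry exposes its content rating as a plain string. It works because the two rating systems share no labels.

```python
# Rating labels used by each board; the two sets don't overlap.
MPAA_RATINGS = {"G", "PG", "PG-13", "R", "NC-17"}
ESRB_RATINGS = {"EC", "E", "E10+", "T", "M", "AO", "RP"}


def kind_from_rating(rating: str) -> str:
    """Guess whether an entry is a movie or a game from its rating label."""
    label = rating.strip().upper()
    if label in MPAA_RATINGS:
        return "movie"
    if label in ESRB_RATINGS:
        return "game"
    return "unknown"


print(kind_from_rating("PG"))  # movie
print(kind_from_rating("E"))   # game
```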
Sample Test Searches That Fail Without Fuzzy Matching
I wrote a "fuzzy match" algorithm that tries to accommodate as many near-misses between the databases as possible. The following list of titles illustrates some of the challenges that have cropped up.
| Search title | What goes wrong |
|---|---|
| Pride and Prejudice | Sometimes written as "and" or &, sometimes HTML encoded. |
| Adventureland | Different years-of-release in the databases. |
| Dil Se.. | Ellipses at the end of one title, but not the other. |
| Rabu Hina | Quotation marks surrounding one of the titles but not the other. |
| The Good, the Bad and the Ugly | Title in Italian at IMDB, not at Netflix. |
| Run, Fatboy, Run | Commas at Netflix, "Fatboy" one word at IMDB. |
| Silent Light | "Stellet licht" at IMDB. |
| The Good | This is an exact hit. Algorithm should bypass choices. |
| Star Wars | Called "Episode IV..." at Netflix, with inconsistent subtitle separators. |
| First Blood | Called "Rambo" at Netflix. |
| Cloudy with a Chance of Meatballs | Game and movie have identical details. |
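
Several of the cases in that table come down to normalizing titles before comparing them. Here's a minimal sketch of the idea using only the standard library. It is not the fuzzy matcher I actually wrote: the function names, the one-year tolerance, and the use of `difflib` are all assumptions for illustration.

```python
import html
import re
from difflib import SequenceMatcher


def normalize_title(title: str) -> str:
    """Lower-case a title and strip the noise the two databases disagree on."""
    t = html.unescape(title).lower()  # "&amp;" -> "&"
    t = t.replace("&", " and ")
    # Drop commas, colons, hyphens, quotes and dots. ASCII-only for simplicity.
    t = re.sub(r"[^a-z0-9\s]", " ", t)
    return " ".join(t.split())  # collapse runs of whitespace


def title_similarity(a: str, b: str) -> float:
    """Score two titles from 0.0 (nothing alike) to 1.0 (same after cleanup)."""
    return SequenceMatcher(None, normalize_title(a), normalize_title(b)).ratio()


def years_compatible(a: int, b: int, tolerance: int = 1) -> bool:
    """Treat release years as matching if they differ by at most `tolerance`."""
    return abs(a - b) <= tolerance


print(normalize_title("Pride &amp; Prejudice"))  # pride and prejudice
print(normalize_title("Dil Se.."))               # dil se
print(years_compatible(2009, 2008))              # True
```

Cleanup like this handles the punctuation, ampersand, and year cases. It can't help when the two sites use entirely different titles, as with "Stellet licht" or "Rambo"; those need some other signal, such as cast or year.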



