30 January 2019

MD5: nice to see you here, old friend!

I recently assembled a new desktop computer system. In the course of moving my data to it from its predecessor, I managed to corrupt the database that my photo organizer (Shotwell) uses to manage my photo collection. I built a new database from the photos themselves, but this lost metadata like comments and edits that I'd applied. I hoped, though, that I'd be able to recover that information later from the older versions of the database tables. I managed to do this, thanks in part to one, er, key element in the database structure.

In Shotwell's database, each photo has a row in the PhotoTable, with many columns containing information about it. There's a unique ID for each photo, but the IDs generated as photos were imported into the earlier database couldn't be assumed to be the same as when I reimported them into the new database. It would clearly be a Bad Thing to apply the tags for photo #457 in the old database to photo #457 in the new database. What to do?

Looking at the PhotoTable columns, I noticed that each photo had an entry for an MD5 hash of the image. Hash functions are great and useful things. It's unlikely (and, I mean, highly, probabilistically, unlikely) that I'm going to encounter two different images in my collection that yield the same MD5 value. (Even though MD5 isn't recommended today for security-relevant applications, it's still doing its job here in distinguishing among image files that came out of my cameras, which haven't generally acted as hostile attackers.) I expect that Shotwell's code uses the stored MD5 as a quick and effective means to determine whether or not a photo has already been imported into its database. When I saw the MD5 column in the table, I realized that it also gave me a means to match photo entries and their IDs in the old table with the corresponding entries in the new table. Thusly armed, SQL of this form followed:

REPLACE INTO PhotoTable ( named-columns )
SELECT named-columns
FROM old-PhotoTable src
INNER JOIN PhotoTable dest ON src.md5 = dest.md5

which took less than a second to replace corresponding metadata into a table representing about 23,000 photos. I restarted Shotwell with the resulting table, and found my edits accurately restored. I was pleased to have been able to accomplish this. I was glad to have been using an open source organizer with an accessible and documented database representation, and emerged with refreshed respect for the power and value of hash functions.
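
For anyone curious how a join like this behaves, here is a minimal sketch using Python's built-in sqlite3 module and an in-memory database. The table name OldPhotoTable, the four-column layout, and the sample rows are made up for illustration; Shotwell's real PhotoTable has many more columns, and this is not the exact statement I ran. The sketch assumes that selecting the new table's id is what lets REPLACE overwrite the matching row in place.

```python
import hashlib
import sqlite3


def md5_of(data: bytes) -> str:
    """Return the hex MD5 digest of some image bytes."""
    return hashlib.md5(data).hexdigest()


# Stand-ins for image files; real code would read the bytes from disk.
images = {"owl.jpg": b"owl pixels", "cat.jpg": b"cat pixels"}

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE OldPhotoTable (id INTEGER PRIMARY KEY, filename TEXT, md5 TEXT, comment TEXT);
CREATE TABLE PhotoTable    (id INTEGER PRIMARY KEY, filename TEXT, md5 TEXT, comment TEXT);
""")

# Old database: comments are present.
conn.executemany(
    "INSERT INTO OldPhotoTable VALUES (?, ?, ?, ?)",
    [
        (1, "owl.jpg", md5_of(images["owl.jpg"]), "screech owl"),
        (2, "cat.jpg", md5_of(images["cat.jpg"]), "sleepy cat"),
    ],
)

# New database: same photos reimported in a different order,
# so the IDs no longer line up and the comments are gone.
conn.executemany(
    "INSERT INTO PhotoTable VALUES (?, ?, ?, ?)",
    [
        (1, "cat.jpg", md5_of(images["cat.jpg"]), None),
        (2, "owl.jpg", md5_of(images["owl.jpg"]), None),
    ],
)

# Keep the new table's id (its primary key) and carry the comment
# across from whichever old row has the same MD5.
conn.execute("""
REPLACE INTO PhotoTable (id, filename, md5, comment)
SELECT dest.id, dest.filename, dest.md5, src.comment
FROM OldPhotoTable AS src
INNER JOIN PhotoTable AS dest ON src.md5 = dest.md5
""")

restored = dict(conn.execute("SELECT id, comment FROM PhotoTable ORDER BY id").fetchall())
print(restored)
```

Matching on id instead of md5 would have attached the owl's comment to the cat, which is exactly the Bad Thing to avoid.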

29 January 2019

I thought it was a nice owl

There are things worth getting outdoors to see, even in winter, and I thought this screech owl last week was a good example. And, it's been a while since I've posted a picture here, so here one is. If you're viewing, thanks, you're welcome.

This and many more collected bird images can be found here, now augmented with improved site navigation capabilities. Ah, projects!

13 January 2019

Cold Days for Code Monkeying

I've taken and enjoyed a number of online courses in technical topics like programming and web technology, such as this example which I'm doing now. For no cost or a nominal fee, MOOCs often offer valuable means to refresh and update skills, providing opportunities to pursue engaging projects and inspiration for further ones. There's nothing like the satisfaction of building something and making it work. And, there's no better season for such indoor activities than wind-chilly winter days, so I've been doing that lately. Having been through this experience a number of times now, I find that the usual (or at least my usual) flow tends to fall into a sequence of four phases:
  1. The "how will I ever assemble this project" phase, associated with pondering, hesitation, and sometimes procrastination.
  2. The "OK, I'll get started" phase, setting up prerequisites and frameworks and assembling components to the reassuring point where a basic code skeleton operates.
  3. The "Check the boxes" phase, where I go through the project requirements and add support for them one or a few at a time. This usually breaks down nicely into a series of coding and testing sessions, each of which adds a few features.
  4. The "Cleanup and Embellish" phase, where I get rid of false starts accumulated during the prior phases and customize the overall result to add features that seem intriguing or add further capabilities.
After these steps, I'll submit the code, but it's not as if its grading is a primary goal. The point is the learning, and the enjoyment of the process that gets there.

07 January 2019

Making coffee: unit conversions in everyday life

We replaced our coffee grinder and coffee maker. This fact wouldn't ordinarily be blog-worthy; we've generally found these everyday appliances to have a useful life of a few years, after which point they don't seem to work as well. This replacement round brought a small puzzle and lesson, though. Our prior grinder had a dial setting, numbered in "cups", corresponding to the number of cups of coffee which a grinding cycle was meant to serve. I put "cups" in quotes, because the unit was intended to align with a corresponding measure on the same manufacturer's coffeemaker. I'd taken to calling them Arbitrary Coffee Units, or ACUs. I think that those "cups" may have been 6 ounces each, though our new coffeemaker designates its "cups" as 5 ounces. Clearly, either of those is smaller than the 8 ounces in a standard, measuring, unquoted cup.

But, our new grinder doesn't have a numbered dial. Perhaps to embrace coffee purists and other minimalists, or perhaps to avoid the cost of a timer mechanism, it just has an on-off switch. You put as much whole-bean coffee into the hopper as you need, and grind until it's done and ground. You wouldn't want to grind more, as the excess would go stale quickly. But how much whole-bean coffee do you need to brew a pot? One recommendation urged a large number of tablespoons of ground coffee, which seemed messy and inconvenient and still wouldn't answer the question of how many beans to process in order to obtain that ground result. A friend suggested their practice of weighing 66 grams of coffee to fuel their 8-cup coffeemaker - and, no, I'm not sure how large its cups are - but we didn't have a suitably precise kitchen scale and didn't want to add something else to the counter. I converted weight to volume by weighing a measuring cup with and without coffee beans on a postal scale set to metric, and found that an (8-ounce) cup of the coffee beans that I tested weighed about 72 grams, not too far off. In our kitchen for the moment, therefore, we grind a loosely-filled measuring cup of beans to yield 8 of our current ACUs and are enjoying the result.
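
The arithmetic behind that conclusion can be written out as a short Python sketch. The numbers are the ones quoted above (72 grams per measuring cup of beans, the friend's 66 grams, 5-ounce and possibly 6-ounce ACUs); the function and constant names are just labels I've invented for illustration, and the 72-gram figure only holds for the beans I happened to weigh.

```python
OUNCES_PER_MEASURING_CUP = 8.0   # a standard, unquoted cup
OUNCES_PER_NEW_ACU = 5.0         # the new coffeemaker's "cup"
OUNCES_PER_OLD_ACU = 6.0         # the old one's "cup" (my best guess)
GRAMS_PER_CUP_OF_BEANS = 72.0    # postal-scale measurement, one bean variety
FRIEND_DOSE_GRAMS = 66.0         # friend's dose for an 8-"cup" pot


def grams_to_cups_of_beans(grams: float) -> float:
    """Convert a weight of whole beans to a volume in measuring cups."""
    return grams / GRAMS_PER_CUP_OF_BEANS


def acus_to_measuring_cups(acus: float, ounces_per_acu: float) -> float:
    """Convert a pot size in ACUs to standard 8-ounce cups of liquid."""
    return acus * ounces_per_acu / OUNCES_PER_MEASURING_CUP


friend_dose_cups = grams_to_cups_of_beans(FRIEND_DOSE_GRAMS)
new_pot_cups = acus_to_measuring_cups(8, OUNCES_PER_NEW_ACU)
old_pot_cups = acus_to_measuring_cups(8, OUNCES_PER_OLD_ACU)
grams_per_new_acu = GRAMS_PER_CUP_OF_BEANS / 8  # one cup of beans spread over 8 ACUs

print(f"Friend's 66 g is about {friend_dose_cups:.2f} measuring cups of beans")
print(f"8 new ACUs = {new_pot_cups:.1f} standard cups of coffee")
print(f"8 old ACUs = {old_pot_cups:.1f} standard cups of coffee")
print(f"Our dose works out to {grams_per_new_acu:.1f} g of beans per new ACU")
```

So the friend's 66 grams is a little under one measuring cup of these beans, which is why a loosely-filled cup seemed close enough.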