General

debdiff

Hmmm, never ran across a debdiff before today. Gonna see if the one posted for this libvirt problem helps me.

Update (which was actually posted simultaneously with the original): yup. Yay, open source is just so sweet to me.

Things undone

Occasionally I feel bad about all the things left undone.

In software development, there are two things we routinely do that occasion such feelings. (Ooops, I’ve been told I’ve been using the word ‘feel’ too much.) We use ticket-tracking systems, and we write TODO comments.

Ticket-tracking systems are a way to keep track of work to be done and the evolution of the product. A ticket is a short, detailed description of an action to be taken to improve the product. They come in different types: bug, improvement, feature, task, etc. The description is action-oriented and specific, and assumes the current product as the context (at least the best tickets are like that; maybe I’ll talk a little more later about less-than-best tickets). Here’s a decent-looking example of a ticket. I could go on for quite a while about the good and the bad of ticket-tracking systems… For now, the salient points are:

  • a ticket represents some work to be done
  • tickets are assigned to someone, to do the work
  • as a developer, you spend time daily looking at the list of tickets assigned to you

TODO comments are a simpler manifestation of a similar idea. When writing code, you might see room for improvement in some particular technical aspect, but you don’t have the time to tackle it right now. So you put a little TODO comment in there explaining what you think might be improved. Here’s a little contrived example:
[python]
if n == 0:
    print("none")
elif n > 0:
    print("some")
# TODO: what if n is negative?
[/python]

In this case, I can see that the set of all numbers is not covered by the two conditions, so I’m led to wonder whether I should be handling the case where n is negative. Now, if I really thought that case was going to arise naturally and soon, I’d figure out how to handle it and write the code. But in this example, I’m thinking: it could happen, but I can’t see any reason it should. Let’s say n is the count of apples in the box from the apple-counting machine. No reason for that to be negative. But, ya know, it could be, somehow. So I note that I feel uneasy about leaving that case uncovered. Later, when browsing through the code or debugging it, I might see that comment, and due to an increased understanding of the world or an abundance of spare time or something, I’ll decide to cover the case. Maybe I learned that the apple-counter counts a zucchini as -1 apples, and somehow a zucchini gets in the box occasionally.
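If the zucchini case ever did turn up, one way to close the gap would be to fail loudly instead of silently falling through. This is only a sketch: the function name describe_count and the choice of raising ValueError are my own assumptions for illustration, not something the apple-counting machine dictates.

```python
def describe_count(n):
    """Return a label for an apple count, refusing impossible values."""
    if n == 0:
        return "none"
    elif n > 0:
        return "some"
    # The case the TODO worried about: complain instead of printing nothing.
    raise ValueError("negative apple count: %d" % n)
```

Whether raising is the right response depends on what a negative count turns out to mean; until that’s known, the TODO comment may well be the more honest choice.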

Where tickets and TODOs can get depressing is when you look at your list and realize that there are 30 tickets on it, 10 of which are more than a year old, and 23 of which you’ve been systematically ignoring for a long time. Or you run across a TODO and a scenario flashes through your mind, where the improbable thing happens, and the resulting chain of causality ends with a lawsuit in Taiwan.

It might be easy to say “Well, this simply should never happen! These tasks should be dispatched with great haste and with all available resources!”. Certainly one can imagine a process or organization where that is the rule, and buildups of old cruft never happen. But follow me for now into my world: in the places where I’ve worked, there is no such rule, and the cruft does build up. I’ve left jobs with dozens of tickets and TODOs left in my wake, where they either evaporated when I left or are still lurking around somewhere today.

The rule in these places is more like “Good ideas should be captured immediately and dealt with when there’s time”. That’s a perfectly valid rule to use, but it does tend to lead to a situation where there are lists of lots of good ideas left unattended. If you believe in good ideas and you want to perfect your work, then these lists sometimes look pretty grim, numerically speaking. There are always more good ideas than time to implement them, by a laaarge factor.

So, you learn to live with the backlog.

A DSL menagerie

Someone should compile a menagerie of domain specific languages, to highlight all the different ways people have used DSLs to solve real-world problems…

Digging for answers

Frustration. While it feels like a waste of time and can put my stomach in knots, I think it can help me become a better person, in general, at least.

Lately I’ve been working on adapting this big open source server project for use for a client (sorry about the vagueness, but the stuff I’m working on is proprietary). In the best of all worlds, I’d deploy the client’s app on the server and it would just work. And it won’t surprise you to learn that we don’t live in that world. The world we live in has features like:

  • the server has an old version of the main framework that the client’s app uses, meaning I have to find where the app uses newer features and port those uses back to the older API
  • the server project has issued very little documentation. That might make me look for an alternative server, but it happens that there are only two alternatives and this one seems, from various viewpoints, to be the best by far
  • there is some problem

That last point might seem even more vague than the others, and believe me, I wish I could be more specific. But ya see, that’s approximately all the information I can get the server to give me about this particular problem. I try to run a tiny piece of code, and it tells me “Error: 500”. I’ve spent hours digging through the pieces of the server; it has load-balancers and caches and proxies and RPC servers talking to database adapters talking to databases, talking in HTTP and a few other protocols; it has components written in Python and Ruby, Java, and C; it has log files in many different places and with different ways to enable logging in each component; etc.

So far, all this digging has led me to the conclusion that yes, there still is a problem. That’s frustrating. But the process leads me to probe into the server components with various techniques, so I’m learning. Learned about ngrep, tonight, for example. Learned how to use the shells provided with the various databases to try to see what’s going on in them. Learned about the wonders of libvirt. Learned about some new network protocols.

At a higher level, I think these experiences teach me about patience, persistence and investigative techniques. Those abilities come in handy, and it seems I still somehow have less than the maximum amount of each of them.

Sense of accomplishment

I have a difficult relationship with the concept of ‘a sense of accomplishment’. I feel like it’s somewhat necessary to true motivation, but then I feel it’s a character flaw to truly feel such a sense. I feel like I’ve done a lot of good work, but there’s always more I could have done. I feel like the things I’ve done are significant, but if I pointed them out to the average person, they’d be far less than impressed.

One of my clients reached a big product milestone recently. It’s pretty amazing that I’ve been with this product since its inception until this milestone. I felt the need to reflect on what I’ve worked on in the product in that time.

  • code generation for AS3 to Python RPC
  • HTML and RTF paste
  • spell checking
  • highlights and callouts
  • XBRL HTML slicing
  • equation parsing and evaluation
  • slimming Flex module download size
  • HTML import
  • browser issues with keys and mousewheel
  • Google AppEngine/EC2 integration
  • parallelization of translation functions
  • Undisclosed Big-Deal Project
  • PDF export

When I look at each of those bullets, I remember lots of work that I had to do on each, the difficult problems that arose and the solutions to them, the necessity of each function and the contribution toward the overall product. But I also remember the things left undone that could make each function more perfect, and the work that others did that I can’t take credit for. I remember that if I pointed out these functions in the app, someone unfamiliar with the job wouldn’t understand the work they represent, and that in at least one of these projects, all of my code is now dead.

So my sense of accomplishment is a complicated and fragile thing. That’s not a problem; actually, when I say it, I feel like that’s a more mature attitude than one that’s more monolithic. Maybe that’s a partial solution to the question I mentioned above of whether it’s a character flaw to feel a sense of accomplishment: maybe it’s only hubris to feel good about the foreground of one’s accomplishments if one doesn’t also understand the inseparable background against which they are viewed.

Encrypted directories and Bacula restore

Assuming you have machines using Ubuntu’s helpers for eCryptfs to create a Private directory backed by an encrypted .Private directory, and a Bacula configuration for these machines created in the days before Bacula had encryption support, you might take the easy road to encrypting your backups of the Private directories. You can just instruct Bacula to ignore the Private directories, since by default it’s going to pick up the .Private directories, which hold the content but are encrypted. That ensures the data is all there but is stored in encrypted form in the Bacula backup volumes.

However, this can be a little painful when it comes to restore time. If you’re restoring a whole machine or the whole home directory or something, it shouldn’t really present any problem. But so far, the only time I’ve ever had to restore something from Bacula was because of some dumb user error where I deleted a subdirectory. This happened yesterday (back when I was dumb); I deleted a significant subtree of a project directory that hadn’t been checked into my client’s source control lately, so I lost some work.

In Bacula, when you restore, you can browse the backed-up data and select a set of files to restore. So I needed to restore a subdirectory of Private by finding the corresponding subdirectory of .Private. eCryptfs encrypts the filenames, which makes that job fairly impossible. In this case, though, I had the previous version of the subdirectory, checked out of source control, on the machine. With the help of ‘du’, I was able to translate the target Private path name into a .Private path name, by matching du sizes down the tree. I restored the appropriate .Private path. Bacula restores to /tmp, so I copied the contents into the real .Private. eCryptfs caches unencrypted versions of stuff, so I rebooted (not wanting to bother to learn how to flush the eCryptfs cache). My work was restored.
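The du matching I did by hand could be roughed out in a script. This is a sketch under assumptions, not what I actually ran: the names tree_size and pair_by_size are made up, and it pairs entries by size rank rather than exact size, on the assumption that the encrypted copies are somewhat larger than the plaintext ones but keep the same ordering.

```python
import os

def tree_size(path):
    """Total bytes of all files under path, without following symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.lstat(os.path.join(root, name)).st_size
    return total

def pair_by_size(plain_dir, encrypted_dir):
    """Pair the entries of two directories by size rank, largest first.

    Returns a list of (plain_name, encrypted_name) tuples.
    """
    def ranked(directory):
        entries = []
        for name in os.listdir(directory):
            full = os.path.join(directory, name)
            if os.path.isdir(full) and not os.path.islink(full):
                size = tree_size(full)
            else:
                size = os.lstat(full).st_size
            entries.append((size, name))
        return sorted(entries, reverse=True)

    plain = ranked(plain_dir)
    encrypted = ranked(encrypted_dir)
    if len(plain) != len(encrypted):
        raise ValueError("entry counts differ; cannot pair by rank")
    return [(p[1], e[1]) for p, e in zip(plain, encrypted)]
```

Applied one directory level at a time, this walks down from Private to the target subdirectory. Entries with equal or nearly equal sizes can pair up wrongly, so anything ambiguous still needs a look by eye.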

This is mainly a cautionary tale, but it might also help someone who’s stuck needing to restore part of an encrypted directory. I’m going to switch to using Bacula’s encryption support pretty soon, to make this sort of thing more sane in the future…

Method of aerogel manufacture

I want to invent a method of aerogel manufacture that works by filling a volume with gas A, then squirting in a tiny bit of gas B (or changing the ambient temperature/pressure, etc.) which catalyzes rapid conversion from gaseous state to aerogel state. That would allow completely form-fitting aerogel packaging materials to be manufactured in situ and without chemical waste. Variations in the gases, pressures, etc. would allow variations in the final properties of the material, including pore size, hydrophobicity/hydrophilicity, etc.

The material could also rapidly transition back to the gaseous state upon exposure to some other agent, to unpack.

But I have no idea how to invent such a thing, just a baseless wish/hunch that something like that could be possibly possible in some sort of world. So, if you’re reading this, please go invent that and give me 1% of your licensing fees in perpetuity. Thank you.

libguestfs

Hmmm, interesting project: libguestfs. Back when Windows VMs were more a daily part of my life, I might have really had a use for this; now I’m just sort of curious about it.

From the description:

libguestfs is a set of tools for accessing and modifying virtual machine (VM) disk images… libguestfs can access nearly any type of filesystem including: all known types of Linux filesystem (ext2/3/4, XFS, btrfs etc), any Windows filesystem (VFAT and NTFS), any Mac OS X and BSD filesystems, LVM2 volume management, MBR and GPT disk partitions, raw disks, qcow2, VirtualBox VDI, VMWare VMDK, CD and DVD ISOs, SD cards, and dozens more.

my first thought was that they had integrated a whole lot of libraries to support reading filesystems from disk images, layered perhaps on top of some libraries to support reading the various disk image formats. But they actually use a sorta more interesting approach: they use qemu to boot a little VM that attaches the target disk image and then acts as a server with which the library communicates to do its things. I assume (haven’t verified) that the little VM is based on a Linux kernel, for which drivers are available for a lot of that stuff. ‘course, depending on how abstract you want to be, you could consider the two approaches to be the same, with different meanings of the term ‘libraries’ and different sorts of layering. Anyway, it’s an approach I’ll have to keep in the back of my mind in case there are other problems where it can be applied…

Nosy phone

Did I ever share my little vision of future phones that listen in and contribute to conversations? When voice recognition and some basic conversational AI are well-developed, I can imagine that phones would listen to the conversations, both calls and environmental conversations (when they’re sitting on the table between two people), and constantly scan for some sort of question they could answer.

Just now someone asked me in a call “What time is it?”. Of course, I had to look at my phone to answer that, so it’s sorta obvious that a nosy phone could answer that one. But the scope could be much larger than that with Internet-connected phones running little background searches for anything interesting and piping in with answers.

Futurama had a good bit about that, though the computer wasn’t in portable form. See number 12 here.

XSLT exceptions

As of XSLT 2.0, and as-far-as-I-know-correct-me-if-I’m-wrong, there’s no native mechanism in the language for exception handling. (Update: although that’s still true, I should have looked at 2.1 or Saxon’s extension. Though I’m still going with this method because I don’t have 2.1 and I’m not using the EE version of Saxon.)

I have some stylesheets that attempt to do some processing on specified chunks of an input document, copying everything else unaltered. There are rare exceptional conditions that I can’t easily detect before I start producing output for a given chunk. These are rare enough that when I encounter them, all I want to do is cancel processing on this chunk and emit it unaltered. Some sort of exception handling is in order, but XSLT doesn’t help very much.

Here’s an example of the sort of scenario I’m talking about. Here’s an input document:
[xml]
<doc>
<block>some text, just copy.</block>
<!-- the following table should have B substituted for a -->
<table>
<tr><td>a</td><td>b</td><td>c</td></tr>
<tr><td>b</td><td>a</td><td>c</td></tr>
<tr><td>b</td><td>c</td><td>a</td></tr>
</table>
<block>some more text, just copy.</block>
<!-- the following table should be copied unaltered because of the presence of an x -->
<table>
<tr><td>a</td><td>b</td><td>c</td></tr>
<tr><td>b</td><td>a</td><td>x</td></tr>
<tr><td>b</td><td>c</td><td>a</td></tr>
</table>
</doc>
[/xml]

I want to look through each table and replace all cell values ‘a’ with ‘B’. However, if there’s an ‘x’ somewhere in the table, I want to just copy the table unmodified. I know that in this case, I could just do a tr/td[.='x'] test on the table to discover this condition. In the real case, though, it’s not so easy to test ahead of time for the condition.

Here’s some XSLT that doesn’t account for the exception:
[xslt]
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Tables: copy the element and its attributes, process the content in "inner" mode -->
  <xsl:template match="table">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates mode="inner"/>
    </xsl:copy>
  </xsl:template>

  <!-- Cells: replace the value 'a' with 'B', keep any other value -->
  <xsl:template mode="inner" match="td">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:choose>
        <xsl:when test=". = 'a'">
          <xsl:value-of select="'B'"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:value-of select="."/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:copy>
  </xsl:template>

  <!-- Identity copy for everything else inside a table -->
  <xsl:template mode="inner" match="@*|node()" priority="-10">
    <xsl:copy>
      <xsl:apply-templates mode="inner" select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Identity copy for everything outside a table -->
  <xsl:template match="@*|node()" priority="-10">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
[/xslt]

The output of that is:
[xml]
<?xml version="1.0" encoding="UTF-8"?><doc>
<block>some text, just copy.</block>
<!-- the following table should have B substituted for a -->
<table>
<tr><td>B</td><td>b</td><td>c</td></tr>
<tr><td>b</td><td>B</td><td>c</td></tr>
<tr><td>b</td><td>c</td><td>B</td></tr>
</table>
<block>some more text, just copy.</block>
<!-- the following table should be copied unaltered because of the presence of an x -->
<table>
<tr><td>B</td><td>b</td><td>c</td></tr>
<tr><td>b</td><td>B</td><td>x</td></tr>
<tr><td>b</td><td>c</td><td>B</td></tr>
</table>
</doc>
[/xml]

(It did the substitutions in the second table, which I don’t want.)

My current solution is to do this:

  1. Emit each table into a variable instead of directly into the output
  2. If the exception occurs, emit an <EXCEPTION/> tag
  3. After each table is processed, look through the variable for the <EXCEPTION/> tag.
  4. If the exception happened, copy the original table, else copy the contents of the variable.

Here’s the modified code and output:
[xslt]
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Tables: build the rewritten table in a variable, then decide what to emit -->
  <xsl:template match="table">
    <xsl:variable name="result">
      <xsl:copy>
        <xsl:copy-of select="@*"/>
        <xsl:apply-templates mode="inner"/>
      </xsl:copy>
    </xsl:variable>
    <xsl:choose>
      <!-- An EXCEPTION marker anywhere in the result: emit the original table -->
      <xsl:when test="$result//EXCEPTION">
        <xsl:copy-of select="."/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:copy-of select="$result"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Cells: replace 'a' with 'B'; flag an 'x' with an EXCEPTION marker -->
  <xsl:template mode="inner" match="td">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:choose>
        <xsl:when test=". = 'a'">
          <xsl:value-of select="'B'"/>
        </xsl:when>
        <xsl:when test=". = 'x'">
          <EXCEPTION/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:value-of select="."/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:copy>
  </xsl:template>

  <!-- Identity copy for everything else inside a table -->
  <xsl:template mode="inner" match="@*|node()" priority="-10">
    <xsl:copy>
      <xsl:apply-templates mode="inner" select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Identity copy for everything outside a table -->
  <xsl:template match="@*|node()" priority="-10">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
[/xslt]

[xml]
<?xml version="1.0" encoding="UTF-8"?><doc>
<block>some text, just copy.</block>
<!-- the following table should have B substituted for a -->
<table>
<tr><td>B</td><td>b</td><td>c</td></tr>
<tr><td>b</td><td>B</td><td>c</td></tr>
<tr><td>b</td><td>c</td><td>B</td></tr>
</table>
<block>some more text, just copy.</block>
<!-- the following table should be copied unaltered because of the presence of an x -->
<table>
<tr><td>a</td><td>b</td><td>c</td></tr>
<tr><td>b</td><td>a</td><td>x</td></tr>
<tr><td>b</td><td>c</td><td>a</td></tr>
</table>
</doc>
[/xml]

It works, but I’m still wondering if there’s a better approach…
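For comparison, here’s roughly how the same buffer-then-decide idea looks in a language that does have exceptions. This is a sketch in Python using the standard library’s xml.etree.ElementTree, not a replacement for the stylesheet. The names CancelChunk, rewrite_table and transform are made up for illustration, and it only handles tables that are direct children of the root, as in the sample document.

```python
import copy
import xml.etree.ElementTree as ET

class CancelChunk(Exception):
    """Raised mid-chunk to abandon the rewrite of the current table."""

def rewrite_table(table):
    """Return a rewritten copy of table, or raise CancelChunk on an 'x' cell."""
    result = copy.deepcopy(table)  # work on a buffer, like the XSLT variable
    for td in result.iter("td"):
        if td.text == "x":
            raise CancelChunk()
        if td.text == "a":
            td.text = "B"
    return result

def transform(doc):
    """Rewrite each table child of doc in place; cancelled tables stay as they were."""
    for index, child in enumerate(list(doc)):
        if child.tag != "table":
            continue
        try:
            doc[index] = rewrite_table(child)
        except CancelChunk:
            pass  # keep the original table unaltered
    return doc
```

The shape is the same as the XSLT version: the rewritten table is built off to the side and only replaces the original if nothing went wrong. The difference is that the exception replaces the search for a marker element afterwards.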