Category Archives: General

Catch-all category

General

Encrypted directories and Bacula restore

Assuming you have machines using Ubuntu’s helpers for eCryptfs to create a Private directory backed by an encrypted .Private directory, and a Bacula configuration for these machines created in the days before Bacula had encryption support, you might take the easy road to encrypting your backups of the Private directories. You can just instruct Bacula to ignore the Private directories, since by default it’s going to pick up the .Private directories, which hold the content but are encrypted. That ensures the data is all there but is stored in encrypted form in the Bacula backup volumes.

However, this can be a little painful when it comes to restore time. If you’re restoring a whole machine or the whole home directory or something, it shouldn’t really present any problem. But so far, the only time I’ve ever had to restore something from Bacula was because of some dumb user error where I deleted a subdirectory. This happened yesterday (back when I was dumb); I deleted a significant subtree of a project directory that hadn’t been checked into my client’s source control lately, so I lost some work.

In Bacula, when you restore, you can browse the backed-up data and select a set of files to restore. So I needed to restore a subdirectory of Private by finding the corresponding subdirectory of .Private. eCryptfs encrypts the filenames, which makes that job close to impossible in general. In this case, though, I had the previous version of the subdirectory, checked out of source control, on the machine. With the help of ‘du’, I was able to translate the target Private path name into a .Private path name by matching du sizes at each level down the tree. I restored the appropriate .Private path. Bacula restored the files under /tmp, so I copied the contents into the real .Private. eCryptfs caches unencrypted versions of stuff, so I rebooted (not wanting to bother to learn how to flush the eCryptfs cache). My work was restored.
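The size-matching step could be scripted roughly like this. It is only a sketch of the idea, not what I actually ran: the function names are made up for illustration, it sums file sizes as a stand-in for what du reports, and it pairs entries by size rank rather than exact size, on the assumption that encrypted files carry some overhead and so won't match byte for byte. Entries with similar sizes will be ambiguous, so treat the result as a hint to check by hand.

```python
from pathlib import Path


def tree_size(path: Path) -> int:
    """Sum of file sizes under path (a rough stand-in for a du total)."""
    if path.is_file():
        return path.stat().st_size
    return sum(f.stat().st_size for f in path.rglob("*") if f.is_file())


def match_children(plain_dir: Path, encrypted_dir: Path) -> dict:
    """Pair each entry of plain_dir with an entry of encrypted_dir by size rank.

    Heuristic only: assumes both directories hold the same set of entries
    and that the size ordering survives encryption.
    """
    plain = sorted(plain_dir.iterdir(), key=tree_size)
    encrypted = sorted(encrypted_dir.iterdir(), key=tree_size)
    if len(plain) != len(encrypted):
        raise ValueError("entry counts differ; the trees are not comparable")
    return {p.name: e.name for p, e in zip(plain, encrypted)}
```

Calling match_children() on Private and .Private, then again on each matched pair of subdirectories, walks the path down one level at a time, the same way as eyeballing du output.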

This is mainly a cautionary tale, but it might also help someone who’s stuck needing to restore part of an encrypted directory. I’m going to switch to using Bacula’s encryption support pretty soon, to make this sort of thing more sane in the future…

Method of aerogel manufacture

I want to invent a method of aerogel manufacture that works by filling a volume with gas A, then squirting in a tiny bit of gas B (or changing the ambient temperature/pressure, etc.) which catalyzes rapid conversion from gaseous state to aerogel state. That would allow completely form-fitting aerogel packaging materials to be manufactured in situ and without chemical waste. Variations in the gases, pressures, etc. would allow variations in the final properties of the material, including pore size, hydrophobicity/hydrophilicity, etc.

The material could also rapidly transition back to the gaseous state upon exposure to some other agent, to unpack.

But I have no idea how to invent such a thing, just a baseless wish/hunch that something like that could be possibly possible in some sort of world. So, if you’re reading this, please go invent that and give me 1% of your licensing fees in perpetuity. Thank you.

libguestfs

Hmmm, interesting project: libguestfs. Back when Windows VMs were more a daily part of my life, I might have really had a use for this; now I’m just sort of curious about it.

From the description:

libguestfs is a set of tools for accessing and modifying virtual machine (VM) disk images… libguestfs can access nearly any type of filesystem including: all known types of Linux filesystem (ext2/3/4, XFS, btrfs etc), any Windows filesystem (VFAT and NTFS), any Mac OS X and BSD filesystems, LVM2 volume management, MBR and GPT disk partitions, raw disks, qcow2, VirtualBox VDI, VMWare VMDK, CD and DVD ISOs, SD cards, and dozens more.

My first thought was that they had integrated a whole lot of libraries to support reading filesystems from disk images, layered perhaps on top of some libraries to support reading the various disk image formats. But they actually use a more interesting approach: they use qemu to boot a little VM that attaches the target disk image and then acts as a server, with which the library communicates to do its work. I assume (haven’t verified) that the little VM is based on a Linux kernel, for which drivers are available for a lot of that stuff. ‘Course, depending on how abstract you want to be, you could consider the two approaches to be the same, with different meanings of the term ‘libraries’ and different sorts of layering. Anyway, it’s an approach I’ll have to keep in the back of my mind in case there are other problems where it can be applied…

Nosy phone

Did I ever share my little vision of future phones that listen in and contribute to conversations? When voice recognition and some basic conversational AI are well-developed, I can imagine that phones would listen to the conversations, both calls and environmental conversations (when they’re sitting on the table between two people), and constantly scan for some sort of question they could answer.

Just now someone asked me in a call “What time is it?”. Of course, I had to look at my phone to answer that, so it’s sorta obvious that a nosy phone could answer that one. But the scope could be much larger than that with Internet-connected phones running little background searches for anything interesting and piping in with answers.

Futurama had a good bit about that, though the computer wasn’t in portable form. See number 12 here.

XSLT exceptions

As of XSLT 2.0, as far as I know (correct me if I’m wrong), there’s no native mechanism in the language for exception handling. (Update: although that’s still true of 2.0, I should have looked at the 2.1 draft, since published as XSLT 3.0 with xsl:try and xsl:catch, or at Saxon’s extension. I’m still going with this method, though, because I don’t have 2.1 and I’m not using the EE version of Saxon.)

I have some stylesheets that attempt to do some processing on specified chunks of an input document, copying everything else unaltered. There are rare exceptional conditions that I can’t easily detect before I start producing output for a given chunk. These are rare enough that when I encounter them, all I want to do is cancel processing on this chunk and emit it unaltered. Some sort of exception handling is in order, but XSLT doesn’t help very much.

Here’s an example of the sort of scenario I’m talking about. Here’s an input document:
[xml]
<doc>
<block>some text, just copy.</block>
<!-- the following table should have B substituted for a -->
<table>
<tr><td>a</td><td>b</td><td>c</td></tr>
<tr><td>b</td><td>a</td><td>c</td></tr>
<tr><td>b</td><td>c</td><td>a</td></tr>
</table>
<block>some more text, just copy.</block>
<!-- the following table should be copied unaltered because of the presence of an x -->
<table>
<tr><td>a</td><td>b</td><td>c</td></tr>
<tr><td>b</td><td>a</td><td>x</td></tr>
<tr><td>b</td><td>c</td><td>a</td></tr>
</table>
</doc>
[/xml]

I want to look through each table and replace all cell values ‘a’ with ‘B’. However, if there’s an ‘x’ somewhere in the table, I want to just copy the table unmodified. I know that in this case, I could just do a tr/td[.='x'] test on the table to discover this condition. In the real case, though, it’s not so easy to test ahead of time for the condition.

Here’s some XSLT that doesn’t account for the exception:
[xslt]
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="table">
<xsl:copy>
<xsl:copy-of select="@*"/>
<xsl:apply-templates mode="inner"/>
</xsl:copy>
</xsl:template>

<xsl:template mode="inner" match="td">
<xsl:copy>
<xsl:copy-of select="@*"/>
<xsl:choose>
<xsl:when test=". = 'a'">
<xsl:value-of select="'B'"/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="."/>
</xsl:otherwise>
</xsl:choose>
</xsl:copy>
</xsl:template>

<xsl:template mode="inner" match="@*|node()" priority="-10">
<xsl:copy>
<xsl:apply-templates mode="inner" select="@*|node()"/>
</xsl:copy>
</xsl:template>

<xsl:template match="@*|node()" priority="-10">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
[/xslt]

The output of that is:
[xml]
<?xml version="1.0" encoding="UTF-8"?><doc>
<block>some text, just copy.</block>
<!-- the following table should have B substituted for a -->
<table>
<tr><td>B</td><td>b</td><td>c</td></tr>
<tr><td>b</td><td>B</td><td>c</td></tr>
<tr><td>b</td><td>c</td><td>B</td></tr>
</table>
<block>some more text, just copy.</block>
<!-- the following table should be copied unaltered because of the presence of an x -->
<table>
<tr><td>B</td><td>b</td><td>c</td></tr>
<tr><td>b</td><td>B</td><td>x</td></tr>
<tr><td>b</td><td>c</td><td>B</td></tr>
</table>
</doc>
[/xml]

(It did the substitutions in the second table too, which I don’t want.)

My current solution is to do this:

  1. Emit each table into a variable instead of directly into the output.
  2. If the exception occurs, emit an <EXCEPTION/> tag.
  3. After each table is processed, look through the variable for the <EXCEPTION/> tag.
  4. If the exception happened, copy the original table; otherwise, copy the contents of the variable.

Here’s the modified code and output:
[xslt]
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="table">
<xsl:variable name="result">
<xsl:copy>
<xsl:copy-of select="@*"/>
<xsl:apply-templates mode="inner"/>
</xsl:copy>
</xsl:variable>
<xsl:choose>
<xsl:when test="$result//EXCEPTION">
<xsl:copy-of select="."/>
</xsl:when>
<xsl:otherwise>
<xsl:copy-of select="$result"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>

<xsl:template mode="inner" match="td">
<xsl:copy>
<xsl:copy-of select="@*"/>
<xsl:choose>
<xsl:when test=". = 'a'">
<xsl:value-of select="'B'"/>
</xsl:when>
<xsl:when test=". = 'x'">
<EXCEPTION/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="."/>
</xsl:otherwise>
</xsl:choose>
</xsl:copy>
</xsl:template>

<xsl:template mode="inner" match="@*|node()" priority="-10">
<xsl:copy>
<xsl:apply-templates mode="inner" select="@*|node()"/>
</xsl:copy>
</xsl:template>

<xsl:template match="@*|node()" priority="-10">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
[/xslt]

[xml]
<?xml version="1.0" encoding="UTF-8"?><doc>
<block>some text, just copy.</block>
<!-- the following table should have B substituted for a -->
<table>
<tr><td>B</td><td>b</td><td>c</td></tr>
<tr><td>b</td><td>B</td><td>c</td></tr>
<tr><td>b</td><td>c</td><td>B</td></tr>
</table>
<block>some more text, just copy.</block>
<!-- the following table should be copied unaltered because of the presence of an x -->
<table>
<tr><td>a</td><td>b</td><td>c</td></tr>
<tr><td>b</td><td>a</td><td>x</td></tr>
<tr><td>b</td><td>c</td><td>a</td></tr>
</table>
</doc>
[/xml]

It works, but I’m still wondering if there’s a better approach…
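For comparison, here is roughly what the stylesheet is emulating, sketched in a language that has native exceptions. This is Python with the standard ElementTree module, not part of my actual project. The names AbortChunk, substitute_cells and transform are made up for the illustration, and it assumes the tables are direct children of the root element, as in the sample document. Each table is processed in a scratch copy, which is only committed if no exception was raised.

```python
import copy
import xml.etree.ElementTree as ET


class AbortChunk(Exception):
    """Raised when a chunk turns out to be unprocessable partway through."""


def substitute_cells(table: ET.Element) -> None:
    """Replace 'a' cells with 'B' in place; abort the chunk on an 'x' cell."""
    for td in table.iter("td"):
        if td.text == "x":
            raise AbortChunk("found an x")
        if td.text == "a":
            td.text = "B"


def transform(doc: ET.Element) -> ET.Element:
    """Return a copy of doc with each table processed, or untouched on abort."""
    out = copy.deepcopy(doc)
    for i, child in enumerate(list(out)):
        if child.tag != "table":
            continue
        scratch = copy.deepcopy(child)  # work in a buffer, like the variable
        try:
            substitute_cells(scratch)
        except AbortChunk:
            continue  # discard the buffer and keep the original table
        out[i] = scratch  # no abort, so commit the buffer
    return out
```

The buffer plays the same role as the XSLT variable: partial output from an aborted table never reaches the result. The <EXCEPTION/> marker is just a way of carrying the "abort" signal through a language that can't throw.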

OK/Cancel

Whenever someone asks me a yes-or-no question, the first answer that pops to mind, though one which I rarely articulate, is always “well… no, and yes, and yes and no.”

Saxon and generate-id()

I’m using Saxon as my XSLT 2.0 processor in a project. The stylesheet relies on generate-id(), and my unit testing currently relies on string comparison of the output to expected output. Occasionally, the values generate-id() returns change on me and I get annoyed.

After a little investigation, I believe that the first little bit of the ID comes from some internal index of XML documents that Saxon is keeping track of. Things like <xsl:include/> and document() calls can change these indexes. So there ya go.
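One way to make the string comparison less brittle would be to normalize the generated IDs before comparing. The following Python sketch is an idea rather than something I have in place. The regular expression assumes the IDs look like d1e42 (a document index followed by a node index), which is a guess based on the output I've seen, and normalize_ids is a made-up helper name. Each distinct ID is renamed to id1, id2, and so on, in order of first appearance, so references to the same ID still line up.

```python
import re

# Assumption: generated IDs look like "d1e42" (document index, node index).
ID_PATTERN = re.compile(r"\bd\d+e\d+\b")


def normalize_ids(text: str) -> str:
    """Rename each distinct generated ID to id1, id2, ... by first appearance."""
    seen = {}

    def rename(match):
        # setdefault keeps the first name assigned to a given ID
        return seen.setdefault(match.group(0), f"id{len(seen) + 1}")

    return ID_PATTERN.sub(rename, text)
```

Comparing normalize_ids(actual) with normalize_ids(expected) would then ignore a change in the document index while still catching a broken cross-reference. Any ordinary text that happens to match the pattern would also get rewritten, so the pattern may need tightening for real output.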

The anatomy of a semantic mishap

Last night I threw out a bunch of my neighbors’ stuff.

It was an accident. And they recovered their stuff before it was truly lost. And I think this record will show that I did not act too strangely. But I still feel bad about it, and I’m going to be a bit more cautious in such situations in the future. And I like to ramble on uselessly about these sorts of weird corner-cases in life, so here goes.

I am the unofficial trash-curber for my duplex (the neighbors I’m referencing are the people in the other apartment). I took on that role when I moved in, after a little discussion with the previous tenant in my apartment, and figuring that it’s easier to just be that guy than to try to work out some scheme to share the work.

Last night, the night before garbage day, I went to put stuff out to the curb. There was the one trash bin full. Closely adjacent to it, probably an inch away, were a computer case and one of those plastic storage bins. Adjacent to those were a sorta homemade-looking plastic/wood thingy and another storage bin. I got the flashlight and inspected more closely. One bin seemed to be mostly empty bottles and stuff. Don’t recall exactly what was in the other bin. Some of this stuff I had seen in a pile elsewhere in the carport for a few weeks.

Overall, my thought process was: the whole list above constitutes a ‘pile’ by virtue of transitive adjacency. On a scale from 0-10, 0 = obvious treasure, 10 = obvious trash, this stuff was all in the 4-9 range by my visual judgment. Nothing in the pile was something I hadn’t seen in trash piles or thrown out myself. Since I’d seen it piled elsewhere, and now it was all in the candidate trash pile, it was probably a final status transition from ‘maybe we need it and will take it into the house some day’ to ‘nah, it’s trash’. So, the candidate trash pile is now a certified trash pile, and I’ll move it all to the curb.

Later that night, I heard some bumping about outside, and when I looked out, all that stuff was gone from the curb. The scavengers in the neighborhood have been known to be fast, but not that thorough, so I kinda figured it must be the neighbors recovering the stuff. This morning, I heard the neighbors leaving, so I ran out and asked, and indeed, I had incorrectly judged the pile, but they got everything back.

This all makes me quite curious about the semantics of trash piles. Trash in general, too. In some web searches, I came across Purity and Danger: An Analysis of Concepts of Pollution and Taboo, which looks pretty interesting. See, I’m not so crazy for being somewhat fascinated by this stuff; some book author is too.

Reading open source

I haven’t done a lot of it, but I’ve done a little, and per my recent thoughts about the value of recording accumulated learning in the form of code, I’m wondering whether the wide availability of open source for so many types of software will make it a lot more prevalent and show a whole facet of the value of open source that hasn’t been addressed too often.

What I mean by ‘it’ is reading source code in order to learn about the field in which the code is used. With good code and skill at navigating and reading it, I bet a lot of answers to questions about a field can be found, along with related structures and details and exceptions that aren’t always captured effectively in books or expert answers. Hmm, I say.

“Software as capital”

This book is pretty interesting so far: Software as capital: an economic perspective on software engineering. Baetjer, the author, has a nice statement of bit rot in economic terms, for example.

I think one thing that really appeals to me about this book is that it’s helping paint one of the pieces of the puzzle of my life.

If I’m anything, I’m a creature of learning. Baetjer points out quite explicitly something I’ve been coming to realize slowly over my career: software development is, more than anything, a process of social learning. A software developer enters the world of the user and helps establish a process to draw out all the strange little bits of knowledge hidden in the corners of every sort of human endeavor. Code is a way to structure this learning so that it can be shared, studied, remembered, and, maybe most importantly, incrementally accumulated as those bits of knowledge come out.

When I was in school, I could never really take notes. I found that it was easier to pretend to take notes than to just sit there; partly, that kept me awake, partly, it was a social ritual. But it was never much of a learning tool. How could I engage with my notes any better than I could engage with the lecture? But code is a different sort of record of learning. Maybe it’s that compilers and the actual execution of the code keep us more honest and force us to be more thorough. Maybe it’s the fact that the visual structure of a computer language on the screen fits its semantic structure a lot better than with human languages. Those aspects and the discipline that grows around them make it a lot easier to slowly build an effective learning repository that elicits the abstract structures of the knowledge while not losing any of the little details.

I love the feeling I get when I’m in the middle of a big software system… I’m not sure what’s a good analogy to help explain what that means; perhaps everyone’s familiar with the feeling of really knowing the geography of the city they live in. The feeling that, when someone asks you where something is, half of your brain activates instantly and simultaneously with a complete map of the route from here to there, with alternate routes and associative connections to attractions and landmarks along the way and estimates of how long it will take to get there. The feeling that it’s inside you as much as you’re inside it. Those sorts of feelings arose only by accident in school; software development is a pretty reliable methodology for generating this very deep sense of understanding.

So surely that has something to do with why I continue to be a software developer…