Category Archives: General

Catch-all category

General

Annals of weird little problems, part 12

It’s fascinating to me sometimes to describe a problem that comes up in software development, outside its full context, just to remind myself of how weird and deep it sounds in isolation. Makes me feel a little better about how long it takes me to solve said problem, maybe.

In this case, in the full context we just call it the ‘slicer’, but that one word hides a lot of complexity. What the slicer does is: given a document in XML format A and a set of start points and offsets of selected regions of formatted text (both specified by character counts as the user sees the document), extract the given regions from the document in XML format C, rebuilding any necessary start/end tags and preserving internal formatting. Another process does the translation from format A to B to C. (These translation layers can add extra characters that are not part of the document as the user sees it.)
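
To make the shape of the problem concrete, here is a minimal sketch of the core idea only, under heavy assumptions: a single XML format (no A-to-B-to-C translation, so no extra characters to account for), one selected region, and Python's standard `xml.etree.ElementTree`. The name `slice_xml` is hypothetical and is not the real slicer.

```python
import xml.etree.ElementTree as ET

def slice_xml(root, start, length):
    """Copy `root`, keeping only visible characters [start, start + length)."""
    end = start + length
    pos = 0  # running count of visible characters seen so far

    def clip(text):
        nonlocal pos
        text = text or ""
        lo, hi = pos, pos + len(text)
        pos = hi
        # Keep the part of this text run that overlaps [start, end).
        return text[max(start - lo, 0):max(min(end, hi) - lo, 0)]

    def walk(elem):
        # Rebuild the start/end tag (and attributes) around whatever survives.
        new = ET.Element(elem.tag, dict(elem.attrib))
        new.text = clip(elem.text)
        for child in elem:
            kept = walk(child)
            tail = clip(child.tail)
            if kept is not None:
                kept.tail = tail
                new.append(kept)
            elif len(new):
                new[-1].tail = (new[-1].tail or "") + tail
            else:
                new.text += tail
        # Drop elements that contribute no selected text.
        if not new.text and not len(new):
            return None
        return new

    return walk(root)

doc = ET.fromstring("<p>Hello <b>bold <i>and italic</i></b> world</p>")
print(ET.tostring(slice_xml(doc, 8, 7), encoding="unicode"))
# <p><b>ld <i>and </i></b></p>
```

Even this toy version has to track text and tail runs separately, which hints at why the real thing, with translation layers in between, takes a while to get right.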

One cloud ain’t enough

One thing I’ve been learning over the last few months is that if your app is pretty beefy, you might well need more than one cloud provider to deploy it. With the major differences in approach between Amazon EC2 and Google AppEngine, they each have a variety of strengths and weaknesses with respect to a given application, and the weaknesses are such that you can’t really just live with them. So you partition the app into pieces and spread ’em around…

I’d like to be able to claim that that gives you a more robust solution, but of course, if you require two clouds to be operational in order for your app to work, you’re less robust. Both EC2 and GAE have been OK, but not great, in terms of uptime. I expect both to get better over time, though.

Stupid, but effective, hack for dev_appserver task queues

If you work with Google AppEngine and use task queues much, you know it can be annoying to have to press the ‘Run’ button on each task to make it actually run. Here’s a Greasemonkey script to push the button for you. It mindlessly pushes the first Run button it finds immediately on page load. Since the page reloads, that should eventually press them all away.

Have fun, but there are no guarantees that this won’t run tasks you didn’t really want to run or somesuch.

Dumb script

Digit frequency in pi

Hmmm, pi is a little bumpier than I thought. (It could just be that my statistical intuition is off, though.)

Each bar plot below represents the number of occurrences of a digit in the decimal expansion of pi. The y-axis is a window index: for row y, the frequency x is counted over digits y*20 to y*20+400, so each window covers 400 digits and advances by 20. I thought 400 would be a long enough window to make these graphs pretty flat. Longer windows make them flatter, of course, but still not to the degree that seems ‘right’ to me.

I guess I could calibrate my perception by using a uniformly distributed sequence of digits…

Pi digit frequencies

Update: huh. I guess it is just me. Here’s the same sort of graph but with uniformly random digits (at least, assuming that RAND’s book, which I lazily selected as my source, is indeed uniform). Looks equally bumpy to me. Ah well, I’ll leave the post up as a reminder of my folly…

Random digit frequencies
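
For anyone who wants to repeat the experiment, the windowed counting described above can be sketched roughly like this. The function name `windowed_digit_counts` is made up for illustration, and the demo uses Python's pseudo-random generator as a stand-in digit source rather than pi or the RAND table.

```python
import random
from collections import Counter

def windowed_digit_counts(digits, window=400, step=20):
    """For each window start y*step, count how often each digit 0-9 occurs."""
    rows = []
    for start in range(0, len(digits) - window + 1, step):
        counts = Counter(digits[start:start + window])
        # One row per window: counts for '0' through '9', in order.
        rows.append([counts.get(str(d), 0) for d in range(10)])
    return rows

# Stand-in input: pseudo-random digits with a fixed seed.
rng = random.Random(0)
sample = "".join(str(rng.randrange(10)) for _ in range(2000))

for y, row in enumerate(windowed_digit_counts(sample)[:5]):
    print(y, row)
```

With a 400-digit window the expected count per digit is 40, so the bumpiness is just how far each row strays from that.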

asc-gzip/.xfd decompression

Scott Stafford was nice enough to post code for asc-gzip/.xfd decompression to go with my asc-gzip/.xfd compression code. See this comment. I’m also reposting it here because the comment formatting is a little worse than the main-post formatting.


Thanks for your post. Of course, I needed the opposite, I had one I needed to decompress. So I backwarded your algorithm and here is the result:

import base64
import struct
import zlib

def decompress(fc):
    """Decode an asc-gzip XFDL document (bytes or str) into raw bytes."""
    if isinstance(fc, str):
        fc = fc.encode("ascii")
    # Skip the header line; could verify that it's asc-gzip here if we wanted to...
    body = b"".join(fc.splitlines(True)[1:])
    unb64 = base64.standard_b64decode(body)

    ctr = 0
    ret = []
    while ctr < len(unb64):
        # Each chunk starts with two big-endian 16-bit lengths:
        # the compressed length, then the uncompressed length.
        compressedchunklen, chunklen = struct.unpack_from(">HH", unb64, ctr)
        ctr += 4

        compressedchunk = unb64[ctr:ctr + compressedchunklen]
        ctr += compressedchunklen

        chunk = zlib.decompress(compressedchunk)
        assert len(chunk) == chunklen
        ret.append(chunk)

    return b"".join(ret)

XSLT and the wonders thereof

It’s always interesting for me to dig into XSLT. I don’t use it a whole lot, so when I do it’s all fun and new. This time around, I’m using 2.0, which wasn’t really implemented the last time I was doing anything much with the language. Loving the new features; so far I’ve used several to good effect.

It occurred to me today that document conversion is an interesting niche wherein a ‘pure functional’ paradigm is useful. I still can’t see myself using a pure functional approach in many other areas, but I do have to deal with document conversion often enough that I’m glad XSLT exists.

I find the tree-transformation model of computation an interesting mind-bender, though when I think about it, it’s really only mind-bending in combination with the functional approach. And thinking further, I’d suppose that a functional program of any size would tend toward the nature of a tree transformation… Hmmm, something to ponder some more.
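
As a rough illustration of that combination (in Python rather than XSLT, and only as an analogy), here is a sketch of a tree transformation written in a side-effect-free style: each call builds a new node from an input node and never touches the input. The `RENAME` table and the tag names in it are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from source tags to output tags, for illustration only.
RENAME = {"doc": "html", "para": "p", "em": "i"}

def transform(node):
    """Return a new element built from `node`; the input tree is not modified."""
    out = ET.Element(RENAME.get(node.tag, node.tag), dict(node.attrib))
    out.text = node.text
    out.tail = node.tail
    # The output children are just the transformed input children, in order.
    out.extend([transform(child) for child in node])
    return out

src = ET.fromstring("<doc><para>Hi <em>there</em>!</para></doc>")
print(ET.tostring(transform(src), encoding="unicode"))
# <html><p>Hi <i>there</i>!</p></html>
```

The recursion mirrors the document structure, which is roughly what an XSLT template rule plus apply-templates gives you without having to write the recursion yourself.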

Dumb ways to lose files, part 18

Here’s a bizarre way to lose some files:
rm foo.x `
(didn’t notice I’d accidentally hit backtick)
ls
(always do this after file operations, it’s a weird little habit)
(oops, didn’t work. Oh, there’s a backtick in operation.)
`
(a bunch of errors about some files and directories that can’t be removed.)
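(CRAP! The closing backtick finished a command substitution, so rm was handed the output of ls as extra arguments.)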

Running a Bacula restore job as I type this…

Lost clicks in Flash Player, Eclipse

I’ve been having some weirdness in Flash Player and Eclipse, where mouse clicks seem to get lost and so I can’t click certain buttons. Turns out it has something to do with the new GTK client-side windows. It can be fixed by setting the environment variable GDK_NATIVE_WINDOWS=1 before launching the affected application. If you need more info, ask; I’m mainly posting this as a reminder to myself, but maybe it’ll catch a few web searches…

Dirt/hardwood/carpet

Hmmm, one difference between hardwood floors (which I have now) and carpeted floors (which I’ve lived with most of my life) that I didn’t really anticipate: I tend to perceive hardwood floors as ‘dirtier’, in the sense that they’re more likely to give up their dirt to my sticky clothing and skin surfaces when I touch them.

That, of course, is the flip side of the ‘easier to clean’ thing, because they’ll give up their dirt to the sweeper, vacuum and dry mop more easily, too. I guess you could say it’s a more open relationship between me and housedirt. Weirder analogies are sitting in my brain, but aren’t going to get out.

asc-gzip/.xfd compression

If you should ever find yourself in the position of having to figure out how to compress XFDL files with the asc-gzip encoding, and I don’t wish it on you, here’s Python code to do it. Obviously, you’ll still want some error handling and optimization.

This thread gave me pointers to figure this out; Bryan was just a little off because he was using a gzip library rather than a zlib one.

import base64
import struct
import zlib

# compress according to wacky XFDL compression scheme
def compress(fc):
    """Encode raw bytes as an asc-gzip XFDL document (returned as bytes)."""
    CHUNK_SIZE = 60000  # small enough that both lengths fit in 16 bits
    out = []
    for i in range(0, len(fc), CHUNK_SIZE):
        chunk = fc[i:i + CHUNK_SIZE]
        compressedchunk = zlib.compress(chunk)
        # Chunk header: compressed length, then uncompressed length,
        # each as a big-endian 16-bit integer.
        out.append(struct.pack(">HH", len(compressedchunk), len(chunk)))
        out.append(compressedchunk)

    b64 = base64.standard_b64encode(b"".join(out))

    lines = [b'application/x-xfdl;content-encoding="asc-gzip"\n']
    # Wrap the base64 payload at 76 characters per line.
    for i in range(0, len(b64), 76):
        lines.append(b64[i:i + 76] + b"\r\n")
    return b"".join(lines)
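
On the gzip-versus-zlib point above: the two standard-library modules wrap the same deflate stream in different containers, which is easy to see by compressing the same bytes both ways. This is just a small sketch of that difference, not anything specific to XFDL; the sample data and the helper name `is_zlib_readable` are made up.

```python
import gzip
import zlib

data = b"hello, XFDL " * 10

zlib_bytes = zlib.compress(data)  # zlib container (RFC 1950)
gzip_bytes = gzip.compress(data)  # gzip container (RFC 1952)

print(zlib_bytes[:1].hex())  # 78
print(gzip_bytes[:2].hex())  # 1f8b

def is_zlib_readable(blob):
    """True if a plain zlib.decompress call accepts the blob."""
    try:
        zlib.decompress(blob)
        return True
    except zlib.error:
        return False

print(is_zlib_readable(zlib_bytes))  # True
print(is_zlib_readable(gzip_bytes))  # False
```

So a chunk written with a gzip library carries a header that a plain zlib reader rejects, which fits with that approach being only slightly off.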