10.08.09

asc-gzip/.xfd compression

Posted in General at 4:24 pm by Steven

If you ever find yourself having to compress XFDL files with the asc-gzip encoding (and I don’t wish it on you), here’s Python code to do it. Obviously, it’s bare-bones: it needs its standard-library imports in place and could use error handling and optimization.

This thread gave me pointers to figure this out; Bryan was just a little off because he was using a gzip library rather than a zlib one.
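To make the gzip-versus-zlib distinction concrete, here is a quick sketch of my own (not taken from the thread), assuming Python 3's standard library and a made-up sample payload. Both modules use DEFLATE underneath, but `zlib.compress` emits a zlib stream while `gzip.compress` wraps the data in a gzip container with a different header, so `zlib.decompress` with its default settings rejects it.

```python
import gzip
import zlib

payload = b"<XFDL>example form data</XFDL>" * 10  # made-up sample input

zlib_stream = zlib.compress(payload)
gzip_stream = gzip.compress(payload)

# A zlib stream (RFC 1950) from zlib.compress starts with the byte 0x78
# (deflate with a 32K window).
zlib_starts_with_78 = zlib_stream[0] == 0x78
# A gzip stream (RFC 1952) starts with the magic bytes 0x1f 0x8b.
gzip_has_magic = gzip_stream[:2] == b"\x1f\x8b"

# With default settings, zlib.decompress only accepts the zlib wrapper.
try:
    zlib.decompress(gzip_stream)
    gzip_accepted = True
except zlib.error:
    gzip_accepted = False

print(zlib_starts_with_78, gzip_has_magic, gzip_accepted)
```

So despite the name asc-gzip, the chunks the code below writes are plain zlib streams.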

# compress according to wacky XFDL compression scheme
# (expects bytes and returns bytes, so it also runs on Python 3)
import base64
import io
import zlib

def compress(fc):
    CHUNK_SIZE = 60000
    out = bytearray()
    for i in range(0, len(fc), CHUNK_SIZE):
        chunk = fc[i:i + CHUNK_SIZE]
        chunklen = len(chunk)
        compressedchunk = zlib.compress(chunk)
        compressedchunklen = len(compressedchunk)
        # 4-byte chunk header: compressed length, then original length,
        # each as a big-endian 16-bit value
        out.append(compressedchunklen // 256)
        out.append(compressedchunklen % 256)
        out.append(chunklen // 256)
        out.append(chunklen % 256)
        out += compressedchunk

    f = io.BytesIO()
    f.write(b'application/x-xfdl;content-encoding="asc-gzip"\n')
    b64 = base64.standard_b64encode(bytes(out))

    # wrap the base64 text at 76 characters per line
    for i in range(0, len(b64), 76):
        f.write(b64[i:i+76])
        f.write(b'\r\n')
    ret = f.getvalue()
    f.close()
    return ret
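For reference, the per-chunk layout that function produces can be sketched on its own. This is a minimal illustration assuming Python 3; `frame_chunk` and `parse_frames` are hypothetical helper names of mine, and `struct.pack(">HH", ...)` is just a compact way to write the same two big-endian 16-bit lengths that the divide-by-256 and modulo-256 lines build by hand.

```python
import struct
import zlib

def frame_chunk(chunk):
    """Hypothetical helper: zlib-compress one chunk and prefix both lengths."""
    compressed = zlib.compress(chunk)
    # Two big-endian unsigned 16-bit values: compressed length, then original length.
    return struct.pack(">HH", len(compressed), len(chunk)) + compressed

def parse_frames(data):
    """Hypothetical helper: yield the original chunks back out of framed data."""
    pos = 0
    while pos < len(data):
        compressed_len, chunk_len = struct.unpack_from(">HH", data, pos)
        pos += 4
        chunk = zlib.decompress(data[pos:pos + compressed_len])
        if len(chunk) != chunk_len:
            raise ValueError("chunk length does not match its header")
        pos += compressed_len
        yield chunk

framed = frame_chunk(b"hello ") + frame_chunk(b"world")
print(b"".join(parse_frames(framed)))  # prints b'hello world'
```

Both lengths have to fit in 16 bits, which is presumably why the chunk size is 60000 rather than something rounder: it leaves headroom below 65536 in case zlib makes an incompressible chunk slightly bigger.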

6 Comments

  1. anna said,

    October 21, 2009 at 12:11 pm

    ahhh, code beautiful code! Remember the night I “watched you write code”? OO

  2. Scott Stafford said,

    December 10, 2009 at 11:08 am

    Thanks for your post. Of course, I needed the opposite: I had a file I needed to decompress. So I reversed your algorithm, and here is the result:

    import base64
    import zlib

    def decompress(fc):
        # fc is the raw file content as bytes; the first line is the MIME-style header
        lines = fc.splitlines(True)
        b64 = b"".join(lines[1:]) # could verify that it's asc-gzip here if we wanted to...
        unb64 = bytearray(base64.standard_b64decode(b64))

        ctr = 0
        ret = []
        while ctr < len(unb64):
            # 4-byte chunk header: compressed length, then original length,
            # each as a big-endian 16-bit value
            compressedchunklen = unb64[ctr] * 256 + unb64[ctr + 1]
            chunklen = unb64[ctr + 2] * 256 + unb64[ctr + 3]
            ctr += 4

            compressedchunk = bytes(unb64[ctr:ctr + compressedchunklen])
            ctr += compressedchunklen

            chunk = zlib.decompress(compressedchunk)
            assert len(chunk) == chunklen
            ret.append(chunk)

        return b"".join(ret)
    
  3. Steven’s weblog » asc-gzip/.xfd decompression said,

    December 10, 2009 at 11:30 am

    [...] to post code for asc-gzip/.xfd decompression to go with my asc-gzip/.xfd compression code. See this comment. I’m also reposting it here because the comment formatting is a little more bad than the [...]

  4. Zachary D. Skelton said,

    February 7, 2011 at 11:38 am

    Last year, I led a group of open source devs in creating an online utility to create XFDL files en masse from a database. In all our Google searching, we never came across this post, and what a shame, as our page was run by Python! Currently, I’m working on a Linux-based XFDL/XFD file viewer and am wondering if you’ve dealt with these files in C? It is far easier in Python, but I’m going with C right now to use wx and create an app that can be compiled on Windows, Mac, and Linux with only minor changes. I plan to reference this post in my blog, but thanks for your research!!

  5. Steven said,

    February 7, 2011 at 11:45 am

    Sorry, no, I haven’t done anything more with these files. The particular chunk of code I wrote was replaced by a completely different system, and I haven’t been involved.

    (You’re aware of http://wxpython.org/ , right? Just want to make sure you aren’t punishing yourself with C unnecessarily. :-) )

  6. Neil said,

    February 16, 2011 at 3:42 pm

    Thanks, Steven.
    I linked to you as well: http://stackoverflow.com/questions/6811/how-can-i-encode-xml-files-to-xfdl-base64-gzip/5021875#5021875
