Archive for May, 2008

Get Satisfied

Thursday, May 15th, 2008

Along with making awesome software, we strive to provide the highest quality support to all of our customers. From the beginning, we’ve done 100% developer based support. We, the developers, are held directly accountable and are pushed by our users for new features and fixes.

For the first year, all support traffic ran through support@sftpdrive.com. You could usually expect a response within a few minutes (or at worst, a few hours). People love a quick response from the lead developer of a product they just shelled out $39 for. Predictably, as the volume of e-mail grew, we had to switch over to some method that would allow us to consolidate some of this effort.

Next came the Magnetk Support Forum. Forum-based support works great. We avoid answering the same questions by making public all our previous support interactions. Users help each other and answer questions for us. Some users really buy into the forum and provide all sorts of interesting tips and tricks that we wouldn’t have thought of on our own. That’s awesome. In addition, a forum builds a searchable ad-hoc-knowledge-base, where anyone can search for an answer without ever having to ask the question. But it’s not some super-lame knowledge base where some chump in the “support” department decided what questions you wanted answers to. Man, I hate corporate knowledge bases.

Still, the forum isn’t perfect. The idea of a message board really turns some people off, and it is hard to categorize or organize support in any meaningful way. Enter Get Satisfaction. Get Satisfaction is a great small company that is squarely focused on helping companies like Magnetk provide support to their customers. Get Satisfaction reduces the amount of friction required for a user to ask a question and makes it even easier to receive notifications of a response.

Now our users can also quickly follow the responses and progress of a problem they also have by clicking the “I have this problem, too!” button. Get Satisfaction also excels in helping categorize Questions, Ideas, Problems, or pure discussion. In addition, it provides some more of that modern-web-2.0-application feel that most of our customers have come to really appreciate in other places. I must admit, it’ll be nice to be able to tag posts with meaningful information so we can help build up a good search index later on.

Get Satisfaction is being used successfully by hundreds of companies: Timbuk2, Twitter and Pownce are just some of our favorites. They have a huge number of active users providing support to each other and to the companies. I have a feeling this will work great.

Packing It All In: Distributing Python With an App

Tuesday, May 13th, 2008

Python has lovely built-in distribution tools. They’re great to use if you need a nice, repeatable, easy way to distribute your source code and have it install cleanly on a platform that has its $PATH set up correctly. However, if you want to distribute Python as part of a commercial software package, to platforms that may not even have Python installed, the procedure is not as clean or clear-cut. We devised a way to do it that mostly works, though we have to tweak it somewhat for each release. Here I’ll show you our method, using the Snakefood program for dependency extraction and a custom script to fill in the gaps that Snakefood can’t quite bridge.

Python is an interpreted language, which means, very basically, that it will not compile down to something that will run natively on any platform. The standard way to get Python to operate is to use the CPython interpreter, a program written in C that reads Python code and performs the actions it describes (called “interpreting” it). There are other options, too, like Jython and IronPython, which do basically the same thing as CPython except that they run the Python code on the Java and .NET runtimes, respectively. We stick with C. After all, the whole reason we’re doing any of this is that we can’t count on Python being installed. We certainly can’t count on Java or .NET being installed.

As a very basic step one, we need to bundle the CPython interpreter with our app. It’s only about 15MB and is highly compressible, so we can easily include the interpreter, but the standard libraries in Python make for a fairly large installation: the estimated size of Python 2.5.2 is about 180MB. Even if we compress that, it’s still a huge download and a not-so-inconsequential amount of hard drive space. The good news is that we don’t use all of the standard libraries. The even better news is that there’s a pretty simple way of extracting only the files you do need and packaging them into a much smaller distribution. The trick up our sleeve is a small program written in Python called Snakefood. It’s not perfect, but I’ll show ways to get the most out of it.

The first step, of course, is getting Snakefood and installing it. If Python is in your $PATH, just extract the source, then run:

% python setup.py install

from the Snakefood directory, which will install Snakefood to wherever your current Python installation is. You can then run it with:

% python sfood <target file>

from any directory. The target file is the main script of your program. With just that command, it will pull the dependencies from the ‘import’ statements in your main script. That’s probably not good enough, so use the option --follow, which follows all the import statements in each of the imported modules to their leaves. That gets most of what you need.

The output of running Snakefood on a target is not entirely intuitive. It is a list of tuples like the following:

((<source_package_root>, <source_file.py>), (<dest_package_root>, <dest_file.py>))

But sometimes the entry looks like this:

((<source_package_root>, <source_file.py>), (None, None))

It may be tempting, but you can’t skip these lines.

The format of the dependencies tells you that <source_file.py> depends on <dest_file.py>, so you need to preserve it in your pared-down distribution. For us, this is as simple as making a new directory called dist/, and copying the file at path os.path.join(<dest_package_root>, <dest_file.py>) into it. You can make a list of these files directly from the Snakefood output (piped from stdin) with the following script:

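import sys
import os

# Each line of Snakefood output is a tuple literal:
# ((source_root, source_file), (dest_root, dest_file))
files = set()
for dep in map(eval, sys.stdin):
    if dep[1][0] is not None:
        # Normal entry: keep the file being depended on.
        path = os.path.join(dep[1][0], dep[1][1])
    else:
        # (None, None) entry: keep the source file itself.
        path = os.path.join(dep[0][0], dep[0][1])
    files.add(path)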
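If calling eval on whatever arrives over stdin makes you nervous, here is a minimal sketch of the same loop using ast.literal_eval, which only accepts plain literals such as tuples, strings, and None. It is written for Python 3, and the function name dependency_files and the sample paths are made up for illustration; it assumes every line of the output is a tuple literal in the format shown above.

```python
import ast
import os


def dependency_files(lines):
    """Collect file paths from Snakefood-style dependency lines."""
    files = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        # literal_eval parses literals only, so nothing in the input can run code.
        (src_root, src_file), (dst_root, dst_file) = ast.literal_eval(line)
        if dst_root is not None:
            # Normal entry: keep the file being depended on.
            files.add(os.path.join(dst_root, dst_file))
        else:
            # (None, None) entry: keep the source file itself.
            files.add(os.path.join(src_root, src_file))
    return files


# Hypothetical sample input, in the same shape as the tuples described above.
sample = [
    "(('/opt/py/lib', 'app/main.py'), ('/opt/py/lib', 'os.py'))",
    "(('/opt/py/lib', 'app/main.py'), (None, None))",
]
found = dependency_files(sample)
```

To use it on real output, pass sys.stdin instead of the sample list.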

Now take this set of files and copy them into your new directory. Preserving the directory hierarchy is nontrivial, but not that hard. Hopefully, you have already created a custom Python installation so that all of the relevant files are in one place anyway. From there, you must find the root of the dependency tree. My custom Python installation is at /Users/matthewmoskwa/ExpanDrive/python, so on each path in the file set, I split on 'python' and copy the new path into the dist/ directory (making sure to create new directory nodes first):

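import shutil
for fi in files:
    # Split on the first 'python' only, and drop the leading separator:
    # os.path.join() throws away 'dist' if the second argument is absolute.
    relPath = fi.split("python", 1)[1].lstrip(os.sep)
    distPath = os.path.join('dist', relPath)
    if not os.path.exists(os.path.dirname(distPath)):
        os.makedirs(os.path.dirname(distPath))
    shutil.copy(fi, distPath)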

At this point, the writer of Snakefood claims 99% accuracy. I haven’t measured that claim, but I have found a major drawback: Snakefood misses all __init__.py files, and therefore any import statements in those files. Rather than being smart about it, I just use os.walk() to find all the __init__.py files and copy them into dist/. I then run my code from dist/ and look for ImportErrors. When I see one, I modify my script to manually copy the missing file to dist/. Not perfect, but it works, and it’s still much faster than doing the whole thing by hand.
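For the curious, the os.walk() pass might look something like the sketch below. This is not our actual build script: the function name copy_init_files and both arguments are placeholders, it is written for Python 3, and it assumes dist_root lives outside src_root so the walk does not wander into its own output.

```python
import os
import shutil


def copy_init_files(src_root, dist_root):
    """Copy every __init__.py under src_root to the same relative spot in dist_root."""
    copied = []
    for dirpath, dirnames, filenames in os.walk(src_root):
        if '__init__.py' not in filenames:
            continue
        # Rebuild the package directory relative to the installation root.
        rel_dir = os.path.relpath(dirpath, src_root)
        target_dir = os.path.normpath(os.path.join(dist_root, rel_dir))
        os.makedirs(target_dir, exist_ok=True)
        target = os.path.join(target_dir, '__init__.py')
        shutil.copy(os.path.join(dirpath, '__init__.py'), target)
        copied.append(target)
    return copied
```

This copies more __init__.py files than you strictly need, but they are tiny, and it saves a round of ImportErrors.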

The final step is to compile all of the files down to .pyo and remove all the .py and .pyc files. We use a Python script called compileall.py, located in the standard library, to compile, and then

% find . -type f \( -name '*.py' -o -name '*.pyc' \) -print0 | xargs -0 rm -f

to remove the files. Make sure to run compileall.py under python -OO (the flag belongs to the interpreter, not to the script) to get rid of docstrings and other unnecessary stuff.
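If you would rather do the compile-and-strip step from Python itself, here is a rough sketch for Python 3. Two caveats: Python 3.5 and later no longer write .pyo files (optimized bytecode goes into .pyc), and the function name compile_and_strip is made up for this example. The legacy=True argument asks compileall to write each .pyc next to its source rather than into __pycache__, so the result can still be imported once the sources are gone.

```python
import compileall
import os


def compile_and_strip(dist_root):
    """Byte-compile dist_root at optimization level 2, then delete the .py sources."""
    # optimize=2 matches -OO: asserts and docstrings are stripped.
    ok = compileall.compile_dir(dist_root, quiet=1, legacy=True, optimize=2)
    if not ok:
        # Leave the sources alone if anything failed to compile.
        raise RuntimeError('some files failed to compile; sources left in place')
    for dirpath, dirnames, filenames in os.walk(dist_root):
        for name in filenames:
            if name.endswith('.py'):
                os.remove(os.path.join(dirpath, name))
```

Run it on a copy of dist/ first; deleting the sources is not something you want to discover you did to the wrong directory.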

Until someone writes an OS in Python or all OSes are guaranteed to have Python installed, this is a pretty good way to distribute Python code to the masses. The next step, actually getting it to run like an application, is up to you, though py2app and py2exe can certainly help.

ExpanDrive Version 1.2

Monday, May 12th, 2008

Fresh off the press, out today, come and get it while it’s hot. Since 1.2 seems to be the magic number, that’s what we’re calling ours too.

Big ticket items: free space remaining now displays correctly on servers that support python. A filter field’s been added to the Drive Manager for those of us that have oh-so-many drives. Public key support is far more robust - in addition, encrypted private keys are also now supported.

Also, you might want to try a little Dino Run.

Monopoles: Enjoying the Warmer Weather

Thursday, May 8th, 2008

“Architecture astronauts take over”
We’re not going to name names, but a lot of famous bloggers named Gruber and 37signals (for example) linked non-ironically (sometimes called “sincerely”) to this Joel on Software post. The first 900 words are just stupid; it’s the last paragraph where he really goes off the deep end. Joel can’t find talent because MS and Google pay so much that it’s “unethical”. Maybe the problem is that MS and Google have projects that are more exciting than web based bug tracking and home brewed forks of VBScript.

Headline of the Week

“Visible Borders in Designs”
There’s an internal dispute at Magnetk right now concerning the remodeling of our web presence. Someone at the company, a self-described “modernist”, is a big fan of the page-within-a-page. Someone else, a minimalist, thinks it’s a stupid metaphor. Look for the resolution to this dispute in days to come, right here, in the margins of Magnanimous. (ps: if you’re a design professional with good arguments against page-in-a-page, email them to jonshea at this domain.)

Photos of the Chaiten Eruption in Chile
Click through the first few photos to get to the real Mordor / Mt. Doom stuff.

“On (not) Seeing Red”
I’m “colorblind”. I use quotation marks because people with anomalous trichromacy, like myself, have a different, but not necessarily inferior (and certainly not “blind”) color perception than most people. Props to Dean, though, for looking out for us.

Free Gas: Colbert takes on Gas Tax Breaks
With a “glowing review” from my favorite economist.

Activating ExpanDrive from the Command Line

Wednesday, May 7th, 2008

In version 1.15 we’ve included a little script called expan that lets you connect and eject drives right from the command line. Because nobody wants to have to hike all the way over to the GUI when they’re already cranking on their keyboard in the Terminal. Am I right, or am I right?

You can install expan with just a button press (and a password entry) from the ExpanDrive preferences window. It works exactly like you’re probably already guessing:

 expan connect driveName
 expan eject driveName

The script will connect and eject every drive that has driveName in its URL or as part of its Drive Name. If you want to connect all your drives, then something like expan connect . will probably do the trick.
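To make the matching rule concrete, here is a toy sketch in Python. It is not the expan source: the drive list, the dictionary keys, and the function name matching_drives are all invented, and it assumes the match is a plain substring test against the drive name and URL.

```python
def matching_drives(drives, pattern):
    """Return the drives whose name or URL contains pattern (substring match)."""
    return [d for d in drives if pattern in d['name'] or pattern in d['url']]


# Hypothetical saved drives.
drives = [
    {'name': 'work', 'url': 'sftp://jeff@work.example.com'},
    {'name': 'home', 'url': 'sftp://jeff@home.example.org'},
]

work_only = matching_drives(drives, 'work')
everything = matching_drives(drives, '.')
```

Under that assumption, a lone dot matches any drive whose URL has a dotted hostname, which is why it only “probably” does the trick.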

Finally, because even desktop apps can be Web 2.0, we’ve made a screencast so you can see expan in action.

ExpanDrive 1.15 is now available!

Wednesday, May 7th, 2008

Features and fixes include:

  • sftp:// URLs are now handled by ExpanDrive - clicking on sftp://username@server will add a session into the drive manager and make a connection
  • Easy control of ExpanDrive from the Terminal using the command expan. It allows you to connect using simple commands like expan connect drive or expan eject drive. The command can be installed in General Preferences.
  • Fixed bug where “error: -36” would sometimes interrupt large transfers in Finder on high latency connections
  • Fixed bug which would require the user to enter admin credentials and then still fail a copy in certain situations
  • Auto update screen now displays correctly in all locales - some were seeing an all-white screen previously
  • ExpanDrive now handles expandrivelicense:// style urls for registration
  • Drive Manager window position is now remembered between sessions
  • Many small bug fixes

As always, the release notes are here.

Jeff on MacFUSE at CocoaHeads Boston

Wednesday, May 7th, 2008

I’m going to be giving an informal talk about MacFUSE tomorrow, May 8th, at the CocoaHeads Boston meeting. Along with an overview of MacFUSE, I’ll try to conjure up some interesting tidbits about ExpanDrive development and why we think developing filesystems is more interesting than making web applications. Stop by if you’re around: MIT building e51, room 149 - 7pm.

Finessing international characters out of Python

Tuesday, May 6th, 2008

Whilst we whittled our filesystem problems down to a remaining few and sent our first Release Candidate out into the wild, we discovered we had another specter on the horizon to deal with: International Filename Support. Python generally handles this pretty well: if you receive a string encoded in the web standard, UTF-8, and your terminal expects UTF-8, Python will print the correct representation upon your call to “print”. No other work is necessary. This does not go so smoothly if the string you get is not encoded in UTF-8 (or ASCII, since it is a true subset of UTF-8). We learned this limitation, and how to overcome it, over the course of two frustrating days.

In our testing, we used another commercial SFTP Client to put some files with international characters in their names onto our test server (to wit: the files were called Québécois and Dvořàk). Unbeknownst to us, the client we used defaulted to Latin-1, aka ISO-8859-1 encoding. However, at this point, we also did not know about encoding in python, so we just output the strings as we received them. What we saw was Qu?b?cois and Dvo??k from the Terminal, and even worse in Finder, Qu? and Dvo? (more on why this was so later).

Python does not auto-detect encodings. You can get some third-party modules to get Python to try and do this.

We knew we had international characters, and we also knew that Mac OS X likes its characters to be encoded as UTF-8 (sort of).

So we tried this:

output_string = input_string.encode('utf-8')

Exception!

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)

It looks like python is guessing the string is ASCII. We think it’s UTF-8, so let’s try it again:

output_string = input_string.decode('utf-8').encode('utf-8')

Exception!

UnicodeDecodeError: 'utf8' codec can't decode bytes in position 2-4: invalid data

Oh dear. At this point, I insisted the client we were using was definitely not encoding filenames as UTF-8 data, but Jeff insisted that it had to be (it’s the standard, after all). Then we had an argument about the semantics of decoding vs. encoding. On a whim, I tried decoding the string using ‘latin-1’ as an argument. Ta da! No more Unicode exception! We came to the following conclusion about Python encoding/decoding: Python converts strings to and from an internal, canonical representation (a unicode object), and unless told otherwise it assumes the bytes it is handed are ASCII.

In short, python does this with every incoming string:

canonical_string = decode(input_string, 'ascii')

output_string = encode(canonical_string, 'ascii')

If the incoming strings are not ASCII-encoded, you must explicitly call decode() on them with the appropriate codec as an argument. Our codec in this case is Latin-1 (aka ISO-8859-1); so far so good.
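Here is a small sketch of that trial and error, written for Python 3, where byte strings and text are separate types and nothing gets decoded implicitly. The helper name try_decode is made up; the bytes are what a Latin-1 client would send for Québécois.

```python
def try_decode(data, codec):
    """Return the decoded text, or None if the bytes are invalid for codec."""
    try:
        return data.decode(codec)
    except UnicodeDecodeError:
        return None


# 'Québécois' as sent by a client that encodes filenames as Latin-1.
raw = b'Qu\xe9b\xe9cois'

as_ascii = try_decode(raw, 'ascii')      # fails: 0xe9 is outside ASCII
as_utf8 = try_decode(raw, 'utf-8')       # fails: 0xe9 starts a sequence that 'b' cannot continue
as_latin1 = try_decode(raw, 'latin-1')   # succeeds: every byte is valid Latin-1
```

Note that Latin-1 never fails, since it maps all 256 byte values to characters. A successful decode therefore proves nothing about whether Latin-1 was the right guess.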

Now that we have our string object, we must call encode() on it with ‘utf-8’ as an argument, since UTF-8 is almost what Mac OS X expects. I say “almost” because there are two possibilities for how the characters are represented before they are encoded: “Composed Form” (NFC) and “Decomposed Form” (NFD). The difference is in how characters with diacritics, like à or é, are transmitted. Mac OS X uses decomposed form, which simply means that à is transmitted as two characters, a followed by a combining grave accent, which are then combined for display. Decoding Latin-1 gives us composed characters, so before we re-encode the strings as UTF-8, we’ve got to make this switch.

import unicodedata
decomposed_string = unicodedata.normalize('NFD',
   input_string.decode('latin-1'))

Now we can finish up the task.

output_string = decomposed_string.encode('utf-8')
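To see the difference between the two forms on the wire, here is a tiny Python 3 sketch using à. The variable names are just for illustration.

```python
import unicodedata

composed = '\u00e0'                                  # à as one code point
decomposed = unicodedata.normalize('NFD', composed)  # 'a' plus U+0300 combining grave

composed_bytes = composed.encode('utf-8')            # b'\xc3\xa0'
decomposed_bytes = decomposed.encode('utf-8')        # b'a\xcc\x80'

# Normalizing back to NFC recovers the single code point.
round_trip = unicodedata.normalize('NFC', decomposed)
```

Both byte strings are valid UTF-8 and both display as à, but they are different filenames as far as a byte-for-byte comparison is concerned.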

Hooray! We’re done.

But wait… what happens if some other client uses a different encoding? Well, of course the characters will display incorrectly. We need some sort of fallback that will always work. We saw above that UTF-8 alone will not do, since there are byte sequences in Latin-1 (and probably other codecs) that are invalid in UTF-8. We settled on falling back to ASCII with the 'replace' error handler. This is acceptable in all cases because decoding this way never raises an exception: every byte from 0x00 to 0x7F is valid ASCII, and every byte above 0x7F is swapped for a replacement character. So while the character à does not have an encoding in ASCII, its UTF-8 byte sequence, \xc3\xa0, still gets through, though it will usually just print as ?? since both of those bytes are greater than 0x7F.
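A quick Python 3 sketch of that fallback, for illustration only. The function name decode_with_fallback is hypothetical; the behavior it shows is the standard 'replace' error handler, which substitutes U+FFFD for each byte it cannot decode.

```python
def decode_with_fallback(data, codec='utf-8'):
    """Decode with codec; if that fails, fall back to ASCII with replacement."""
    try:
        return data.decode(codec)
    except UnicodeDecodeError:
        # Never raises: each undecodable byte becomes U+FFFD.
        return data.decode('ascii', 'replace')


good = decode_with_fallback(b'\xc3\xa0')           # valid UTF-8 for à
mangled = decode_with_fallback(b'Qu\xe9b\xe9cois')  # Latin-1 bytes, invalid as UTF-8
```

The mangled result keeps every ASCII character and marks each bad byte, so the filename is ugly but the program keeps running.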

Putting it all together, this is basically the function we use to handle these strings.

import unicodedata

def re_encode(input_string, decoder='utf-8', encoder='utf-8'):
   try:
     # Decode with the expected codec, decompose (NFD), then re-encode.
     output_string = unicodedata.normalize('NFD',
        input_string.decode(decoder)).encode(encoder)
   except UnicodeError:
     # Fallback: ASCII with 'replace' never raises; bad bytes are replaced.
     output_string = unicodedata.normalize('NFD',
        input_string.decode('ascii', 'replace')).encode(encoder)
   return output_string

And that’s really all there is to it. Python wins the game. By defaulting to ASCII encoding, you won’t get any unhandled exceptions, and you’ll also know pretty quickly that something is wrong (just look for the ???????s). For a much lengthier discussion of what Unicode is and does, see Joel Spolsky’s verbose take on the matter.

Monopoles: Caffeine Overload

Thursday, May 1st, 2008

“Mail 2.0-style split views”
A free Cocoa NSSplitView subclass by Kayembi London which lets you achieve the Mail.app 2.0 style split view, complete with a single-pixel separator and a draggable thumb.

“Priest Vanishes on Balloon Flight”
“A Roman Catholic priest who floated off under hundreds of helium party balloons was missing Monday off the southern coast of Brazil… A video of Carli posted on the G1 Web site of Globo TV showed the smiling 41-year-old priest slipping into a flight suit, being strapped to a seat attached to a huge column of green, red, white and yellow balloons, and soaring into the air to the cheers of a crowd.…”

“The Crazy Baseball Fan Rule”
If my livelihood depended on people paying money to watch baseball, then I too would try to keep the fans from realizing what it felt like to watch something exciting.

1979 Video Game Pitch Meeting
“You’re a truck driver and you try to kill frogs that cross the road”
“No. You are the frog” cheers
“You just blew my mind. He hops on alligators; he eats flies; and he makes it with lady frogs. Keep it coming…”
Also, worst online video experience ever.

“Not Funny, Lenovo”
“I honor the place where my work and your imitation become one.”

“Git Push: Just The Tip”
“Wherein our hero explores git-push. Just for a second. Just to see how it feels.”