John Smith's Blog

In praise of help() in Python's REPL

Posted by John Smith on Thu 17 May 2012

For various reasons, I'm doing a bit of JavaScript/CoffeeScript work at the moment, which involves use of some functions in the core libraries which I'd not really used in the past. A minor aspect of this involves logarithmic values, and I was a bit surprised and then disappointed that JavaScript's Math.log() isn't as flexible as its Python near-namesake math.log(): # Python >>> math.log(256,2) # I want the result for base 2 8.0 versus # CoffeeScript coffee> Math.log(256,2) Math.log(256,2) 5.545177444479562

Now, it's probably unreasonable of me to expect the JavaScript version of this function to behave exactly the same as the Python version, especially as the (presumably) underlying C function only takes a single value argument. (Although it might have been nice to get a warning about the ignored second argument, rather than silence...)

On the other hand though, it reminded me of how much more civilized Python is compared to JavaScript. When I'm hacking around, I almost always have a spare window open with a running REPL process, that allows me to quickly check and test stuff, and can very easily pull up the docs via the help() function if I need further info. In contrast, to do the same in JavaScript I have to move over to a browser window and search for info on sites like MDN, or resort to my trusty copies of The Definitive Guide, neither of which are anywhere near as convenient.

After a brief bit of Googling and a plea for help on Twitter, I was unable to find any equivalent to this functionality in the JavaScript world - and let's face it, help() is pretty basic stuff when compared to what the likes of IPython and bpython offer the fortunate Python developer.

I'd love to be corrected on this, and be told about some nice CLI-tool for JavaScript that can help me out. (But not some overblown IDE that would require me to radically change my established development environment, I hasten to add!) I'm not expecting this to happen though - Python's help() relies heavily on docstrings, and I'm not aware that anything such as JsDoc is in common usage in the JavaScript community?

App Engine: What the docs don't tell you about processing inbound mail

Posted by John Smith on Mon 10 January 2011

For a while, I've wanted to add functionality to this blog to allow me to submit content via email; initially just photos, but eventually actual posts as well. As seems par for the course for any code involving email, stuff you'd expect to be simple and straightforward turns out to be anything but :-(

It doesn't help that the App Engine docs on receiving email gloss over a lot of stuff, so this is my attempt to try to fill in the gaps and cover the gotchas, so that others don't have to go through as much hassle as I did.

dev_appserver doesn't simulate inbound attachments

First off, whilst the current dev_appserver (1.4.1) does allow you to simulate sending a mail in, it doesn't have any explicit functionality for email attachments. This means you have the joy of doing your testing on the real App Engine. Now, luckily for me, I was able to (a) test this code in isolation without affecting the public functionality, and (b) do my deployments without any of the hanging that App Engine has a habit of doing every now and again, but it's still a painful way of evolving and testing code.

(Theoretically I imagine it's possible to cut-and-paste in the "raw" email bodies with Content-Type, Content-Disposition, base64 encoded data etc, to test attachments in the dev_appserver but I haven't tried it personally.)

Sender addresses aren't (just) e-mail addresses

As part of the protection against spam (or worse), I have a whitelist of acceptable senders; mails from anyone else get ignored. My first attempt at code for this was along the line of: if mail_msg.sender not in VALID_SENDERS: logging.error("...") return However, the sender property contains the full value of the Sender: header, so it's likely to be set to something like Fred Bloggs <fred@bloggs.com>. Whilst code to support this isn't exactly difficult, it's something that you wouldn't realize you needed to do when doing pseudo-mails on dev_appserver. Here's my code: is_valid = False for valid_sender in blog_settings.VALID_MAIL_SENDERS: if mail_msg.sender.find(valid_sender) >= 0: is_valid = True break if not is_valid: logging.error("Received mail from invalid sender '%s' - ignoring" % mail_msg.sender) return Now, this isn't perfect by any means - it should probably look for an exact match within the angle brackets, so that it doesn't get fooled by an email address in the "real name part" - but given how easy it is to fake a sender, I'm not too concerned; I have other protections in place, this is just a basic filter.

If there aren't any attachments, the attachments property doesn't exist, rather than being None

It's covered in this short thread but in summary: rather than having the attachments property be None or [] if an email lacks attachments, it doesn't actually exist, and so you have to use a try/except handler. Again, this is nothing difficult, but it is something you wouldn't necessarily realize until it bit you. try: logging.debug("Mail from %s has %d attachments" % (mail_msg.sender, len(mail_msg.attachments))) except AttributeError, e: logging.warning("Mail from %s has zero attachments - ignoring" % (mail_msg.sender)) return

You have to work out the attachment MIME type for yourself

The attachments property (if it exists) is a list of 2-member tuples. The first part of the tuple is the filename, the second the content. It would be nice if App Engine provided another member containing the MIME type that's defined in the Content-Type header where the filename is also specified, but unfortunately not :-( Instead you have to work it out for yourself, whether from the filename suffix, doing a magic number check on the file or using the original property to parse the message yourself.

Now, it's true that what a sender says the file type is shouldn't be blindly trusted to be legit or correct. However, it wouldn't hurt to have that information to use in an initial check for the >99% of cases that it is OK.

If you're going to trust the file extension (which is probably easier to fake than the MIME type...), you might want to look at google.appengine.api.mail, which has an EXTENSION_MIME_MAP dictionary. I've not used it personally - I'm currently only interested in a handful of common image formats - but it might be a reasonable base for working out the MIME type.

Attachments need decoding

The second member of the tuple in the attachments list is a google.appengine.api.mail.EncodedPayload. This has to be decoded using something along the lines of: for att in mail_msg.attachments: filename, encoded_data = att data = encoded_data.payload if encoded_data.encoding: data = data.decode(encoded_data.encoding) ... That class doesn't seem to support the len() function, so I'm not sure how you might protect yourself against a huge attachment that either can't be decoded before the timeout hits, or takes up more memory than App Engine is prepared to give you. I'm also assuming that the .decode method covers all the encodings that you might potentially receive. (Although I'm yet to see anything that isn't base64 in my own tests.)

Plain text bodies need decoding as well

You can explicitly request the plain-text message bodies (as opposed to any HTML bodies), but somewhat surprisingly, these aren't actually plain text! Instead they are EncodedPayload objects, and need decoding in a similar manner to the attachments. for b in mail_msg.bodies("text/plain"): body_type, pl = b try: if pl.encoding: logging.debug("Body: %s" % (pl.payload.decode(pl.payload.encoding))) else: logging.debug("Body: %s" % (pl.payload)) except Exception, e: logging.debug("Body: %s" % (pl)) (It wouldn't surprise me if the above code might have Unicode issues on certain content, but that's unlikely to be an issue in my own personal use.)

Email processing does retry if the code bombs (I think)

I'm not 100% sure on this one, and IMHO it's more of a positive feature than a gotcha, but it doesn't seem to be in the docs, so it's worth mentioning - the mail processing seems to work similar to task queue jobs, in that if a failure occurs, there are retries at gradually increasing intervals.

I'm sure there are other nasties involved in processing incoming emails, but my code seems to work fine now, so hopefully the above lessons might be of use to anyone else about to venture into this area. (Doubtless about 5 minutes after posting this I'll find that either I've been doing this all wrong, or that all of the above is fully documented somewhere that I haven't seen...)

Ramblings (mostly) about technical stuff

In praise of help() in Python's REPL

App Engine: What the docs don't tell you about processing inbound mail

dev_appserver doesn't simulate inbound attachments

Sender addresses aren't (just) e-mail addresses

If there aren't any attachments, the attachments property doesn't exist, rather than being None

You have to work out the attachment MIME type for yourself

Attachments need decoding

Plain text bodies need decoding as well

Email processing does retry if the code bombs (I think)

About this blog

About the author

Popular tags

Other sites I've built or been involved with

Work

Personal/fun/experimentation