John Smith's Blog

Ramblings (mostly) about technical stuff

Visualization of, and musings on, recent Hacker News threads about liked and disliked languages

Posted by John Smith on

For a while now, I've been itching to find an excuse to do something in SVG again, so when there were a couple of threads last week on Hacker News about people's most liked and most disliked languages, it felt like an ideal opportunity.

You can view a wider, more legible, version of the scatter plot via this link. I've used logarithmic scaling, because with a regular linear scale there was just a huge mess in the bottom left corner.

'Like' votes are measured horizontally, 'dislikes' vertically - so the ideal place to be is low down on the right, and the worst is high up on the left. The results are as captured at 2012/03/28 - I'd taken a copy a couple of days earlier, and there had been some changes in the interim, but only by single-digit percentages.
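For anyone curious how the axes described above could be computed, here is a minimal sketch in Python. It is not the code behind the plot: the function names, the 800x600 canvas and the choice of log10(votes + 1) are all assumptions made for illustration.

```python
import math


def log_position(votes, max_votes, size):
    """Map a vote count onto 0..size pixels using a log10 scale."""
    # log10(votes + 1) keeps a count of zero at position 0 rather than -infinity.
    return size * math.log10(votes + 1) / math.log10(max_votes + 1)


def svg_point(likes, dislikes, max_votes, width=800, height=600):
    """Return (x, y) in SVG coordinates for one language (hypothetical layout)."""
    x = log_position(likes, max_votes, width)
    # SVG's y axis grows downwards, so flip it: more dislikes means higher up.
    y = height - log_position(dislikes, max_votes, height)
    return x, y


def like_ratio(likes, dislikes):
    """Percentage of all votes for a language that were 'likes'."""
    return 100.0 * likes / (likes + dislikes)
```

With this layout, a language with many likes and no dislikes lands at the bottom right, and one with many dislikes and no likes lands at the top left.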

Some thoughts and observations:

  • The poll this data comes from is somewhat imperfect, as already mentioned in the comments in the thread itself. I should also point out that another poster on that thread did a similar like vs dislike analysis, but I didn't see that post until I'd already started on this.
  • HN is a very pro-Python place - just compare all the threads related to PyCon 2012 versus the lack of noise after most other conferences - so it's hardly surprising who the "winner" is in such a voter base. I do find it odd though that Python doesn't seem to have such a good showing in other corners of the HN world. e.g. of the (relatively few) HN London events I've been to, I don't recall hearing many (any?) of the speakers using Python for their projects/companies - whereas "losers" such as Java and PHP do get namechecked fairly often.
  • I'm amused that CoffeeScript is liked at exactly the same ratio as JavaScript - 76%.
  • I was tempted to do some sort of colour-coding by language type (interpreted vs compiled), age etc - but at initial glance, I don't see any real trends that might indicate why a certain school/group of languages do well or badly.

Alphabetically sorted list of pure Python stdlib modules

Posted by John Smith on

(This is a bit of a lame post - 99% was generated by a script - but I wanted an online copy for my own future reference.)

I was reading the notes about the new stuff in Python 3.3, and it struck me that I didn't know anything about a couple of the modules mentioned. (For the record, they were abc and sched - hopefully my ignorance of them isn't too shameful ;-)

This has motivated me to go through the Python standard library and make sure I have at least a cursory knowledge of all the modules - I'm aiming to do one per day. There is a list on python.org, but it is grouped by theme, and I'd rather have a bit of a change from one day to the next, which hopefully an alphabetically sorted list should have a fair chance of achieving.

To this end, I knocked up a basic script to churn through the stdlib directory, which I can then use as a tick list. Maybe it could be of use to someone else too? Important: the list omits libraries which are written in C - these have __doc__ properties formatted differently from the pure Python libraries, and I think the pure libraries are enough for me to be going on with for now :-)
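A rough sketch of how such a script might look is below. It is not the original script: it targets Python 3 rather than 2.7, the function names are made up for illustration, and unlike the list below it skips directories that have no __init__.py. Be warned that importing every module has side effects (antigravity opens a browser, this prints the Zen of Python).

```python
import importlib
import os
import sysconfig


def first_doc_line(doc):
    """Return the first non-blank line of a docstring, or a placeholder."""
    if not doc or not doc.strip():
        return "{Undocumented}"
    return doc.strip().splitlines()[0].strip()


def pure_python_names(stdlib_dir):
    """Yield names of .py modules and packages found directly in stdlib_dir."""
    for entry in sorted(os.listdir(stdlib_dir), key=str.lower):
        path = os.path.join(stdlib_dir, entry)
        if entry.endswith(".py"):
            yield entry[:-3]
        elif os.path.isdir(path) and os.path.exists(os.path.join(path, "__init__.py")):
            yield entry


def describe(name):
    """Import a module by name and summarise its __doc__ in one line."""
    try:
        module = importlib.import_module(name)
    except Exception as exc:
        return "{Not importable - %s}" % type(exc).__name__
    return first_doc_line(module.__doc__)


if __name__ == "__main__":
    # sysconfig reports where this interpreter keeps its pure Python stdlib.
    stdlib_dir = sysconfig.get_paths()["stdlib"]
    for name in pure_python_names(stdlib_dir):
        print(name)
        print(describe(name))
```

Taking only the first line of __doc__ is what makes a few entries come out oddly when a module formats its docstring differently from the rest.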

BTW, after I'd written the script to generate this list, I found that there's a similar (but more nicely formatted) list on Doug Hellmann's site, which annoyingly didn't show up in my Google search queries when I started out on this. It does have references for the C libraries, but I also notice a few libraries in the list below that aren't on that page e.g. ast, bdb, code. As I don't (currently!) know what those libraries are, I don't know if there's a particular reason for their omission.

abc
Abstract Base Classes (ABCs) according to PEP 3119.
_abcoll
Abstract Base Classes (ABCs) for collections, according to PEP 3119.
aifc
Stuff to parse AIFF-C and AIFF files.
antigravity
{Undocumented}
anydbm
Generic interface to all dbm clones.
argparse
Command-line parsing library
ast
ast
asynchat
A class supporting chat-style (command/response) protocols.
asyncore
Basic infrastructure for asynchronous socket service clients and servers.
atexit
allow programmer to define multiple exit functions to be executed upon normal program termination.
audiodev
Classes for manipulating audio devices (currently only for Sun and SGI)
base64
RFC 3548: Base16, Base32, Base64 Data Encodings
BaseHTTPServer
HTTP server base class.
Bastion
Bastionification utility.
bdb
Debugger basics
binhex
Macintosh binhex compression/decompression.
bisect
Bisection algorithms.
bsddb
Support for Berkeley DB 4.1 through 4.8 with a simple interface.
calendar
Calendar printing functions
cgi
Support module for CGI (Common Gateway Interface) scripts.
CGIHTTPServer
CGI-savvy HTTP Server.
cgitb
More comprehensive traceback formatting for Python scripts.
chunk
Simple class to read IFF chunks.
cmd
A generic class to build line-oriented command interpreters.
code
Utilities needed to emulate Python's interactive interpreter.
codecs
Python Codec Registry, API and helpers.
codeop
Utilities to compile possibly incomplete Python source code.
collections
{Undocumented}
colorsys
Conversion functions between RGB and other color systems.
commands
Execute shell commands via os.popen() and return status, output.
compileall
Module/script to "compile" all .py files to .pyc (or .pyo) file.
compiler
Package for parsing and compiling Python source code
config
{Not importable - ImportError}
ConfigParser
Configuration file parser.
contextlib
Utilities for with-statement contexts. See PEP 343.
Cookie
Here's a sample session to show how to use this module. At the moment, this is the only documentation.
cookielib
HTTP cookie handling for web clients.
copy
Generic (shallow and deep) copying operations.
copy_reg
Helper to provide extensibility for pickle/cPickle.
cProfile
Python interface for the 'lsprof' profiler. Compatible with the 'profile' module.
csv
CSV parsing and writing.
ctypes
create and manipulate C data types in Python
curses
curses
dbhash
Provide a (g)dbm-compatible interface to bsddb.hashopen.
decimal
This is a Py2.3 implementation of decimal floating point arithmetic based on the General Decimal Arithmetic Specification:
Demo
{Not importable - ImportError}
difflib
helpers for computing deltas between objects.
dircache
Read and cache directory listings.
dis
Disassembler of Python byte code into mnemonics.
distutils
distutils
Doc
{Not importable - ImportError}
doctest
a framework for running examples in docstrings.
DocXMLRPCServer
Self documenting XML-RPC Server.
dumbdbm
A dumb and slow but simple dbm clone.
dummy_thread
Drop-in replacement for the thread module.
dummy_threading
Faux ``threading`` version using ``dummy_thread`` instead of ``thread``.
email
A package for parsing, handling, and generating email messages.
encodings
Standard "encodings" Package
filecmp
Utilities for comparing files and directories.
fileinput
Helper class to quickly write a loop over all standard input files.
fnmatch
Filename matching with shell patterns.
formatter
Generic output formatting.
fpformat
General floating point formatting functions.
fractions
Rational, infinite-precision, real numbers.
ftplib
An FTP client class and some helper functions.
functools
Tools for working with functions and callable objects
__future__
Record of phased-in incompatible language changes.
genericpath
Path operations common to more than one OS Do not use directly. The OS specific modules import the appropriate functions from this module themselves.
getopt
Parser for command line options.
getpass
Utilities to get a password and/or the current user name.
gettext
Internationalization and localization support.
glob
Filename globbing utility.
gzip
Functions that read and write gzipped files.
hashlib
hashlib module - A common interface to many hash functions.
heapq
Heap queue algorithm (a.k.a. priority queue).
hmac
HMAC (Keyed-Hashing for Message Authentication) Python module.
hotshot
High-performance logging profiler, mostly written in C.
htmlentitydefs
HTML character entity references.
htmllib
HTML 2.0 parser.
HTMLParser
A parser for HTML and XHTML.
httplib
HTTP/1.1 client library
idlelib
{Undocumented}
ihooks
Import hook support.
imaplib
IMAP4 client.
imghdr
Recognize image file formats based on their first few bytes.
importlib
Backport of importlib.import_module from 3.x.
imputil
Import utilities
inspect
Get useful information from live Python objects.
io
The io module provides the Python interfaces to stream handling. The builtin open function is defined in this module.
json
JSON (JavaScript Object Notation) is a subset of JavaScript syntax (ECMA-262 3rd edition) used as a lightweight data interchange format.
keyword
Keywords (from "graminit.c")
lib-dynload
{Not importable - SyntaxError}
lib-tk
{Not importable - SyntaxError}
lib2to3
{Undocumented}
linecache
Cache lines from files.
locale
Locale support.
logging
Logging package for Python. Based on PEP 282 and comments thereto in comp.lang.python, and influenced by Apache's log4j system.
_LWPCookieJar
Load / save to libwww-perl (LWP) format files.
macpath
Pathname and path-related operations for the Macintosh.
macurl2path
Macintosh-specific module for conversion between pathnames and URLs.
mailbox
Read/write support for Maildir, mbox, MH, Babyl, and MMDF mailboxes.
mailcap
Mailcap file handling. See RFC 1524.
markupbase
Shared support for scanning document type declarations in HTML and XHTML.
md5
{Undocumented, with warnings - possibly deprecated?}
mhlib
MH interface -- purely object-oriented (well, almost)
mimetools
Various tools used by MIME-reading or MIME-writing programs.
mimetypes
Guess the MIME type of a file.
MimeWriter
Generic MIME writer.
mimify
Mimification and unmimification of mail messages.
modulefinder
Find modules used by a script, using introspection.
_MozillaCookieJar
Mozilla / Netscape cookie loading / saving.
multifile
A readline()-style interface to the parts of a multipart message.
multiprocessing
{Undocumented}
mutex
Mutual exclusion -- for use with module sched
netrc
An object-oriented interface to .netrc files.
new
Create new objects of various types. Deprecated.
nntplib
An NNTP client class based on RFC 977: Network News Transfer Protocol.
ntpath
Common pathname manipulations, WindowsNT/95 version.
nturl2path
Convert a NT pathname to a file URL and vice versa.
numbers
Abstract Base Classes (ABCs) for numbers, according to PEP 3141.
opcode
opcode module - potentially shared between dis and other modules which operate on bytecodes (e.g. peephole optimizers).
optparse
A powerful, extensible, and easy-to-use option parser.
os
OS routines for Mac, NT, or Posix depending on what system we're on.
os2emxpath
Common pathname manipulations, OS/2 EMX version.
pdb
A Python debugger.
__phello__.foo
{Undocumented}
pickle
Create portable serialized representations of Python objects.
pickletools
"Executable documentation" for the pickle module.
pipes
Conversion pipeline templates.
pkgutil
Utilities to support packages.
platform
This module tries to retrieve as much platform-identifying data as possible. It makes this information available via function APIs.
plat-linux2
{Not importable - SyntaxError}
plistlib
a tool to generate and parse MacOSX .plist files.
popen2
Spawn a command with pipes to its stdin, stdout, and optionally stderr.
poplib
A POP3 client class.
posixfile
Extended file operations available in POSIX.
posixpath
Common operations on Posix pathnames.
pprint
Support to pretty-print lists, tuples, & dictionaries recursively.
profile
Class for profiling Python code.
pstats
Class for printing reports on profiled python code.
pty
Pseudo terminal utilities.
pyclbr
Parse a Python module and describe its classes and methods.
py_compile
Routine to "compile" a .py file to a .pyc (or .pyo) file.
pydoc
Generate Python documentation in HTML or text for interactive use.
pydoc_data
{Undocumented}
_pyio
Python implementation of the io module.
Queue
A multi-producer, multi-consumer queue.
quopri
Conversions to/from quoted-printable transport encoding as per RFC 1521.
random
Random variable generators.
re
Support for regular expressions (RE).
repr
Redo the builtin repr() (representation) but with limits on most sizes.
rexec
Restricted execution facilities.
rfc822
RFC 2822 message manipulation.
rlcompleter
Word completion for GNU readline 2.0.
robotparser
runpy
locating and running Python code using the module namespace
sched
A generally useful event scheduler class.
sets
Classes to represent arbitrary sets (including sets of sets).
sgmllib
A parser for SGML, using the derived class as a static DTD.
sha
{Undocumented, with warnings - possibly deprecated?}
shelve
Manage shelves of pickled objects.
shlex
A lexical analyzer class for simple shell-like syntaxes.
shutil
Utility functions for copying and archiving files and directory trees.
SimpleHTTPServer
Simple HTTP Server.
SimpleXMLRPCServer
Simple XML-RPC Server.
site
Append module search paths for third-party packages to sys.path.
site-packages
{Not importable - SyntaxError}
smtpd
An RFC 2821 smtp proxy.
smtplib
SMTP/ESMTP client class.
sndhdr
Routines to help recognizing sound files.
socket
This module provides socket operations and some related functions. On Unix, it supports IP (Internet Protocol) and Unix domain sockets. On other systems, it only supports IP. Functions specific for a socket are available as methods of the socket object.
SocketServer
Generic socket server classes.
sqlite3
{Undocumented}
sre
This file is only retained for backwards compatibility. It will be removed in the future. sre was moved to re in version 2.5.
sre_compile
Internal support module for sre
sre_constants
Internal support module for sre
sre_parse
Internal support module for sre
ssl
This module provides some more Pythonic support for SSL.
stat
Constants/functions for interpreting results of os.stat() and os.lstat().
statvfs
Constants for interpreting the results of os.statvfs() and os.fstatvfs().
string
A collection of string operations (most are no longer used).
StringIO
File-like objects that read from or write to a string buffer.
stringold
Common string manipulations.
stringprep
Library that exposes various tables found in the StringPrep RFC 3454.
_strptime
Strptime-related classes and functions.
struct
Functions to convert between Python values and C structs represented as Python strings. It uses format strings (explained below) as compact descriptions of the lay-out of the C structs and the intended conversion to/from Python values.
subprocess
Subprocesses with accessible I/O streams
sunau
Stuff to parse Sun and NeXT audio files.
sunaudio
Interpret sun audio headers.
symbol
Non-terminal symbols of Python grammar (from "graminit.h").
symtable
Interface to the compiler's internal symbol tables
sysconfig
Provide access to Python's configuration information.
tabnanny
The Tab Nanny despises ambiguous indentation. She knows no mercy.
tarfile
Read from and write to tar format archives.
telnetlib
TELNET client class.
tempfile
Temporary files.
test
{Undocumented}
textwrap
Text wrapping and filling.
this
The Zen of Python, by Tim Peters
threading
Thread module emulating a subset of Java's threading model.
_threading_local
Thread-local objects.
timeit
Tool for measuring execution time of small code snippets.
toaiff
Convert "arbitrary" sound files to AIFF (Apple and SGI's audio format).
token
Token constants (from "token.h").
tokenize
Tokenization help for Python programs.
Tools
{Not importable - ImportError}
trace
program/module to trace Python program or function execution
traceback
Extract, format and print information about Python stack traces.
tty
Terminal utilities.
types
Define names for all type symbols known in the standard interpreter.
unittest
Python unit testing framework, based on Erich Gamma's JUnit and Kent Beck's Smalltalk testing framework.
urllib
Open an arbitrary URL.
urllib2
An extensible library for opening URLs using a variety of protocols
urlparse
Parse (absolute and relative) URLs.
user
Hook to allow user-specified customization code to run.
UserDict
A more or less complete user-defined wrapper around dictionary objects.
UserList
A more or less complete user-defined wrapper around list objects.
UserString
A user-defined wrapper around string objects
uu
Implementation of the UUencode and UUdecode functions.
uuid
UUID objects (universally unique identifiers) according to RFC 4122.
warnings
Python part of the warnings subsystem.
wave
Stuff to parse WAVE files.
weakref
Weak reference support for Python.
_weakrefset
{Undocumented}
webbrowser
Interfaces for launching and remotely controlling Web browsers.
whichdb
Guess which db package to use to open a db file.
wsgiref
a WSGI (PEP 333) Reference Library
xdrlib
Implements (a subset of) Sun XDR -- eXternal Data Representation.
xml
Extended XML support for Python
xmllib
A parser for XML, using the derived class as static DTD.
xmlrpclib
An XML-RPC client interface for Python.
zipfile
Read and write ZIP files.

There are a few entries that are slightly odd, such as robotparser and ast, which are due to the __doc__ property of those modules being formatted differently from the rest, and me being too idle to fix them.

Caveat: the list was generated in Python 2.7 running on Fedora 15, so it's possible my stdlib isn't completely standard.

Thoughts on Windows 8 Consumer Preview

Posted by John Smith on

Microsoft released a "Consumer Preview" of Windows 8 last week, and I thought I'd download it and take a look, as it's the first version of Windows that I've ever had any curiosity about, mainly due to the new Metro UI.

I've only spent a few hours playing around in a fairly aimless manner, so this is by no means a thorough review. In general, I agree with most of the points made in this Orlowski piece at The Register, but this post will cover a few things I found of note.

(Just for background, I'm far from being a Windows aficionado or regular user - whilst I have a desktop, laptop and netbook with Windows 7, those machines spend most of their lives running some form of Linux, whether natively via dual-boot, or in a virtual machine. With regard to the Metro UI, I've never used Windows Phone 7 - in fact, I've only ever seen it being used once in the wild - and I really don't like how it has been implemented in the latest Xbox 360 dashboard update. In fairness, most of the problems I have with the Xbox 360 implementation are far more to do with how MS have prioritized ads and general media over games, which doesn't have anything to do with Metro per se, and would easily be resolved if the dashboard was configurable.)

  • I've only tried Win8 in a VirtualBox VM running atop Windows 7. For some reason I'm only able to run it in a limited number of resolutions, none of which are the native resolution of my monitor. Not quite sure whether this is the fault of Win8 or VirtualBox - it's the first time I've used the latter, as normally I use VMware for all my virtualized environments. (In a similar vein, I was unable to get USB memory sticks or external hard drives to be recognized, and I don't know where the fault lies.)
  • The login page confuses the hell out of me. It's super-minimal, which isn't a problem, but most of the time when I click the mouse on the login screen, all that happens is that the screen scrolls up and then back down by about half-an-inch. The same happens if I double-click, long-click, middle-click or right-click. Nothing happens if I hit the Windows key (which is used heavily in Win8, see later point). I've just discovered that pressing the Ctrl key, or rotating the mouse wheel, brings up the password prompt - prior to that point I'd just been randomly moving the mouse around and clicking until I triggered some magical gesture.
    Screengrab of the Windows 8 login screen
  • MS seem to push you towards using authentication based on Windows Live/Hotmail/Microsoft Live/whatever-they-brand-it-this-week accounts. This isn't necessarily a bad idea, but one thing that I'm not a big fan of is that they suggest that people might want to create a Windows account with the same name as their regular email account. From my experience on a project using Google accounts, where we suggested people might want to create a Google account named "joebloggs@hotmail.com" or "fred@myisp.com", this just leads to user confusion, as people mentally associate a particular account with a particular service. (Theoretically the same should apply to stuff like Amazon accounts, but the same issue doesn't really apply for various reasons. Probably something for a different post...)
  • MS seem to be really pushing Metro over the "traditional" Windows UI, but I'm really not sure how it's going to scale. I did a completely clean install, and just added Firefox, Opera, Safari+QuickTime and TortoiseSVN, and already the Start screen is full of crap and has more items than will fit on screen at once:
    Screengrab of Windows 8 start screen
    Note that in the above shot, I'd already reduced some of the boxes that default to double width (such as Weather and Calendar) down to single width. I'm not sure why the "packer" automatically moved some items into the space that was freed when I did that, but hasn't moved Music - the items can be manually dragged, but it seems odd that it sometimes works automatically and sometimes not.
  • That "submenu" items such as those for TortoiseSVN or Apple Software Update have appeared in the top menu seems incredibly lame. Again, they can be manually removed from the Start screen, but (a) I don't know why users should have to manually get rid of all the crap that a newly installed application might have added without asking, and (b) I'm not sure how easy it would be to find/restore such deleted items. (There doesn't seem to be any sort of application-specific context menu associated with each box.)
  • Metro applications launch full screen, and have no window controls. This meant that I was scratching my head trying to work out how to escape from an application. In the end I had to resort to a Google search, and found that (a) I wasn't alone in being confused, and that (b) the answer is to press the Windows key. I imagine the proper version of Windows 8 will have some sort of introductory tutorial that explains this to new users, but I foresee a lot of confused people stood at demo units in PC World wondering what the hell they're supposed to do next...
  • By comparison, losing the "start" button in the regular UI is actually less painful than I was expecting - with one caveat. I'm sure that it'll probably be fine on a regular desktop, but as I've had to run Win8 in a VM in a window much smaller than my overall screen, the experience was a tad fiddly.
  • One other minor point about Metro pushing the Windows key - MS seem to be very pleased with the new Windows logo they've come up with, and compared to some of the gouge-your-eyes-out rebranding monstrosities that come out, I'd consider it perfectly OK. However, they're up against millions (billions?) of existing keyboards that are sending a very different message about what the Windows logo is. If I had to do tech support over the phone for non-technical people such as my parents, I'd expect to have to describe to them what "the Windows key" is, and the first description is "it looks like a flag", which the new logo doesn't.
  • Probably the most useful thing for me in Win8 is having IE10 to test. As yet, I haven't actually used it very much, so I don't know how comparable it is to the rest of the browser market. (Personally I consider IE9 a very weak release, far behind the rest of the pack - probably closer to Firefox 2 than 3. The summary page at caniuse.com agrees with me.) What is a bit odd though, is that in many ways there are two IE10 browsers: the one for the "traditional" Windows UI, and the one for Metro.
    Screen grab of IE10 browser running in the 'regular' Windows UI on Windows 8
    Moving the browser chrome down to the bottom in Metro is a non-issue. Moving the "forward" button to the far right, rather than being adjacent to the "back" button, and losing the "home" & "bookmarks" buttons, are very questionable. But refusing to play Flash content on a machine that has Flash installed and working is absolutely batshit insane.
    Screen grab of IE10 browser running in the Metro UI on Windows 8
    As an avowed Flash-hater, on one level I do want to welcome yet another nail in its coffin. However, this puritanical refusal to do something that the machine/OS/browser is clearly capable of seems very user-hostile. I was aware that Flash and other plug-ins were not going to be available in some versions of Windows 8, but I'd assumed it was just going to be the ARM/mobile versions, which makes perfect sense. I guess that MS have decided though that they want to try to have a consistent experience across all Metro platforms, which is admirable in some respects. But given the failure of non-iPad tablets, I would expect the ARM/mobile part of the overall Win8 user base to be a drop in the ocean for the foreseeable future, so making the experience worse for the 9x% of people on desktop/laptop machines, just so the tiny fraction of people on ARM/mobile don't feel left out, strikes me as misguided.
  • There are also a few other "WTF?" things with IE10, that I've not personally experienced, but which are documented here. I do find it telling that that piece is (at best) neutral in tone, whereas Thurrott is normally "rah rah, isn't this great" about the vast majority of stuff that MS do...
  • Not that I was expecting anything, but the continued absence of any bread-and-butter tools like ssh or even telnet is very lame. I suppose that they are too Unixy for MS, and they'd rather not let on that there's an alternative to the world of Windows out there ;-)

Obviously, this is just a preview release, and it makes sense for MS to try out new ideas that might not work out, and which they can easily pull in the official release. Certainly Metro looks nice, and feels less cliched than OS X's brushed steel. (Although given that I mostly use Linux and Xfce, and turn off desktop wallpapers, fancy transitions (e.g. compiz) etc, my opinions on aesthetics probably shouldn't be paid too much heed ;-) Personally, it offends me far less than GNOME 3, the 2011 Google redesign, or OS X Lion - but that's probably because I don't have any great investment in the world of Windows. I do think that such radical changes for a product such as Windows are very "brave", but that's a subject I might elaborate on in another post, as this one is already more than long enough.

Some initial thoughts about Windows 8 Consumer Preview

Posted by John Smith on

Microsoft released a "Consumer Preview" of Windows 8 last week, and I thought I'd download it and take a look, as it's the first version of Windows that I've ever had any curiosity about, mainly due to the new Metro UI.

I've only spent a few hours playing around in a fairly aimless manner, so this is by no means a thorough review. In general, I agree with most of the points made in this Orlowski piece at The Register, but this post will cover a few things I found of note.

(Just for background, I'm far from being a Windows aficionado or regular user - whilst I have a desktop, laptop and netbook with Windows 7, those machines spend most of their lives running some form of Linux, whether natively via dual-boot, or in a virtual machine. With regard to the Metro UI, I've never used Windows Phone 7 (I've only ever seen it being used once in the wild), and I really don't like how it has been implemented in the latest Xbox 360 dashboard update - although most of the problems I have with it are far more to do with how MS have prioritized ads and general media over games, which doesn't have anything to do with Metro per se, and would easily be resolved if the dashboard was configurable.)

  • I've only tried Win8 in a VirtualBox VM running atop Windows 7. For some reason I'm only able to run it in a limited number of resolutions, none of which are the native resolution of my monitor. Not quite sure whether this is the fault of Win8 or VirtualBox - it's the first time I've used the latter, as normally I use VMware for all my virtualized environments. (In a similar vein, I was unable to get USB memory sticks or external hard drives to be recognized, and I don't know where the fault lies.)
  • The login page confuses the hell out of me. It's super-minimal, which isn't a problem, but most of the time when I click the mouse on the login screen, all that happens is that the screen scrolls up and then back down by about half-an-inch. The same happens if I double-click, long-click, middle-click or right-click. Nothing happens if I hit the Windows key (which is used heavily in Win8, see later point). I've just discovered that pressing the Ctrl key, or rotating the mouse wheel, brings up the password prompt - prior to that point I'd just been randomly moving the mouse around and clicking until I triggered some magical gesture.
  • MS seem to push you towards using authentication based on Windows Live/Hotmail/Microsoft Live/whatever-they-brand-it-this-week accounts. This isn't necessarily a bad idea, but one thing that I'm not a big fan of is that they suggest that people might want to create a Windows account with the same name as their regular email account. From my experience on a project using Google accounts, where we suggested people might want to create a Google account named "joebloggs@hotmail.com" or "fred@myisp.com", this just leads to user confusion, as people mentally associate a particular account with a particular service. (Theoretically the same should apply to stuff like Amazon accounts, but the same issue doesn't really apply for various reasons. Probably something for a different post...)
    Screengrab of the Windows 8 login screen

Artificial pagination of articles on news sites - thanks, but no thanks

Posted by John Smith on

TL;DR: the bazillionth whinge about content sites with paginated articles, but picking on businessweek.com as they seem to be doing something particularly iniquitous that I haven't seen before. (Although it could just be that I'm unobservant or behind-the-times, and that this is old news.)

I was in a branch of W H Smith yesterday, and had a quick flick through the latest issue of Bloomberg Business Week magazine. As usual, there were a number of articles that looked like they'd be worth reading, so rather than hand over £3.30 for the dead tree edition, I made a mental note to visit businessweek.com later to digest those articles properly.

I got round to visiting the site just now, and it looked like they might have had a minor redesign since I last visited - nothing particularly outrageous though. The first article I checked was this week's cover story about Twitter. Reading down the first page, it was quite a good piece, if not telling me anything I hadn't previously been aware of.

As I scrolled down to the bottom of the page, there was a pagination nav that indicated the article had been split into four pages. As I clicked on "next >" to go to the following page, I was surprised about how quickly the second page appeared - suspiciously quick, in fact.

Screen grab of a browser showing the bottom of an article at businessweek.com, specifically the page navigation

Being a nosey bugger, I viewed the HTML source of the page, and found that in fact, all of the article content is present in the "first" page, and it isn't even sectionalized in any way.

Screen grab of a browser 'View Source' window, showing that there is article content beyond that shown to the user on a page of a businessweek.com article

It wasn't a big surprise to me to find out that if I turned off JavaScript, my browser would now show the whole story on a single page, with no pagination controls visible anywhere. On my 1050x1680 portrait display, the full story runs for 8 screens in Firefox, which doesn't strike me as especially long and in need of breaking up. (NB: I have Ghostery blocking Disqus comments in Firefox; when I viewed the regular paginated site in a browser without that blocking, I found that just the first of the four pages ran to 6 screens' worth, of which around 50% were the - mostly inane - Disqus user comments.)

Screen grab of the businessweek.com story with JavaScript disabled, now showing all the article content on a single page

I haven't bothered to check the site's JavaScript code, but I assume there's something that measures the height of the <p> elements, and once the height goes above a certain point, starts splitting/hiding them into pages, and inserts the page navigation controls.
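
As a rough illustration of that guess (this is not the site's actual code; the function name, the sample heights and the 600px limit are all made up), height-based splitting could be as simple as:

```javascript
// Hypothetical sketch: group paragraphs into "pages" by their rendered
// heights, so that no page exceeds maxPageHeight (unless a single paragraph
// is taller than the limit on its own).
function splitIntoPages(paragraphHeights, maxPageHeight) {
  const pages = [];
  let current = [];
  let currentHeight = 0;
  paragraphHeights.forEach((height, index) => {
    // Start a new page if this paragraph would overflow the current one.
    if (current.length > 0 && currentHeight + height > maxPageHeight) {
      pages.push(current);
      current = [];
      currentHeight = 0;
    }
    current.push(index);
    currentHeight += height;
  });
  if (current.length > 0) pages.push(current);
  return pages;
}

// In a browser the heights might come from something like:
//   Array.from(document.querySelectorAll('article p'), (p) => p.offsetHeight)
// with the paragraphs on every page but the current one then hidden.
const pages = splitIntoPages([200, 300, 250, 400, 100], 600);
console.log(pages); // → [[0, 1], [2], [3, 4]]
```

The return value is a list of pages, each holding the indexes of the paragraphs that belong on it, which is all a "next >" link would need in order to show and hide the right elements.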

Now, pagination of online content is a complicated subject, and I'm no UX guru, conversion wizard or SEO charlatan who can confidently spout chapter-and-verse about what you should or shouldn't do when building a content site. What I do know is that as a user, I don't like having to continually click-scroll-click-scroll-click-scroll to get through an article that could have easily been scrolled through. And I'm pretty sure I'm not alone.

The usual excuse for pagination is that it increases the number of page impressions or ads that can be shown, but I don't think that's valid here:

  • As there's no new page being loaded when I click 'next', a page hit in the traditional sense isn't occurring. I'm sure there'll be some JavaScript analytics code sending something back to the server when I navigate to another page, but surely this could be done by handling onscroll events, similar to how people such as Twitter (ironically enough) implement infinite scrolling.
  • The ad space on a long single page is pretty much the same as on multiple short pages, so the same number of ads could be run. Now, after viewing the story on BusinessWeek a few times, it looks very much like there are only a very small number of ads being repeated on each sub-page of the story, and having the same ads repeatedly shown on a longer single page would look pretty dumb, but this seems to me to be more a failure of their ad sales or syndication systems, than anything else.
  • Most sites that use pagination - Ars Technica is the whipping boy that usually comes to mind - do at least keep up the pretence of having to download new content (whether by a traditional page load, or via Ajax), but this is the first time I've been aware of a site using JavaScript to IMHO actively make things worse for end users.
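
To make the first point above a little more concrete, here is a minimal sketch of scroll-based tracking. Everything in it is an assumption for illustration: the function name, the idea of counting one "virtual page" per screenful, and the report callback that would send something back to the server. It is not how BusinessWeek or Twitter actually do it.

```javascript
// Hypothetical sketch: report each screenful ("virtual page") of a long
// article the first time the reader scrolls down as far as it.
function createScrollTracker(viewportHeight, report) {
  const seen = new Set();
  return function onScroll(scrollTop) {
    // 0-based virtual page that the top of the viewport is currently in.
    const deepest = Math.floor(scrollTop / viewportHeight);
    for (let page = 0; page <= deepest; page++) {
      if (!seen.has(page)) {
        seen.add(page);
        report(page);
      }
    }
  };
}

const reported = [];
const onScroll = createScrollTracker(800, (page) => reported.push(page));
onScroll(0);    // reports page 0
onScroll(300);  // still on page 0, nothing new to report
onScroll(1700); // jumped down to page 2, so pages 1 and 2 are reported
console.log(reported); // → [0, 1, 2]

// In a browser this would be wired up along the lines of:
//   window.addEventListener('scroll', () => onScroll(window.scrollY));
```

Each virtual page is only ever reported once, so the server would see roughly the same number of "impressions" as with click-through pagination, without the reader having to click anything.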

Slightly odd MS Bing job spam

Posted by John Smith on

I received the following message via LinkedIn tonight:

Screengrab of an email sent to me by a Microsoft recruiter

I'm sure they bash out thousands of these every day, but I'm mildly curious why they decided I should be a lucky recipient of one:

  • I'm not sure what the "CIS" in the message subject refers to. If MS thinks I'm from the former USSR, that doesn't say much for their CV analyzing abilities. I did look for alternate definitions, but other than the meaninglessly generic "Computer Information Systems", I don't see anything that jumps out as being relevant.
  • What's with the weird intermittent capitalization of BING/Bing?
  • Other than a couple of years of DOS development nearly 20 years ago, a proper look at my CV or LinkedIn profile would show I have very minimal experience or interest in Microsoft technologies. A glance at Twitter posts would in fact show that taking the p*** out of MS is one of my favourite entertainments.

Now, I'm pretty sure that the reason they've contacted me is fairly obvious - I have the magic word "Google" in my CV/profile. One might have thought that even MS realize that not all of the ~30k people who work at Google are on the search teams. (Not that I ever counted as part of that number, being a lowly "red badge" contractor.)

Don't get me wrong, I'm quite happy for recruiters to send me unsolicited offers (within reason). But spamming me about roles that are clearly not a suitable match for me is a waste of everyone's time.

Man pages are not optional

Posted by John Smith on

Spent a few hours today playing with Puppet, which I'd been meaning to have a look at for ages.

I followed a beginner's tutorial, which didn't work - Puppet failed to stop or start ntp in the way the tutorial describes. In and of itself, this didn't bother me - in fact, I often find it useful when things screw up, as it forces you to start digging in and investigating, at which point you start learning how things really work.

What was a bit annoying though, was the lack of any obvious warnings or errors, either in the output, or in the log files. (Puppet logs to both /var/log/messages and files in /var/log/puppet/, but the former had nothing useful, and the latter just contained incomprehensible HTTP request URLs.)

Never mind, I thought, I'll just check the man page. This is what I got:

Screengrab of the unhelpful output of 'man puppet'

It doesn't look to be an issue specific to my distro either.

I don't give a damn how good any project's online documentation is; something that is going to be interacted with from a *nix command line - especially when aimed at a system admin audience - should have a manpage that at least covers some of the basics in a modicum of detail. I don't care for the GNU stuff that pushes you towards the info command that I've never seen anyone actually use, but compared to Puppet, they're wonderful.

Obviously Unix manpages are a long way from being the new hotness, and the lack of stuff like proper hyperlinks is a bit annoying in this day and age, but the fact that they have survived this long shows how useful they are. If the author(s) of a tool can't be bothered to spend a few hours putting together a half-decent manpage, then I'm not sure I feel inclined to spend any time researching and learning that tool.

EDIT: I've just seen that you can do puppet describe {string} to get something that's roughly comparable to a proper manpage, just inferior in pretty much every respect. (e.g. no nice formatting on a terminal that supports bold or underline, having to pipe into more/less/etc if you want to page it) Too late though, I'm moving on...

EDIT#2: On further investigation, I'm not sure what puppet describe or puppet doc actually do, but they're certainly not providing docs about the subcommands such as "agent", "apply", "cert", etc. as mentioned in the manpage.

Detecting if a webpage is running in a background browser tab

Posted by John Smith on

TL;DR version: A hacky way of detecting if a web page is running in a browser tab which isn't currently active. Possibly useful for not autoplaying video or audio. Currently only works in Firefox and Chrome. A rough demo is here.

(This post refers to subject matter covered in more detail in my previous blog post. You don't really need to read that to understand the core of what this post is about though.)

I'm probably very weird in the way I browse, but something that I experience on a more-than-daily basis is:

  1. Go to a site that has a bunch of items/stories/articles, many of which will have links. I'm thinking of things like a Twitter feed, Hacker News, a Gawker Media site, the B3TA newsletter, etc
  2. As I skim down the page, I r-click->open in new tab any items which look interesting and have links. I prefer to do this rather than to hop backwards and forwards between the original page and the linked pages.
  3. Unfortunately, some of those links might be to YouTube or other video services - and this might not be immediately obvious, especially with embedded videos or if URL shorteners are in use. You can then end up with one or more videos starting to autoplay, and you're forced to start going through the tabs to pause the videos until you're ready to watch them.

It struck me that it would be much more user-friendly if videos didn't autoplay if the page was in a window or tab that wasn't currently active or visible. Now, either my webdev and Google skills are deficient (quite possible), or this isn't as easy as you might expect.

Browsers do have focus and blur event triggers, but these don't really help - a page opened in a tab won't fire either of these until the tab is clicked on, and only Firefox fires a focus event when a page loads in an active tab/window.

As far as I can determine, there's no property of the Window object along the lines of 'focusState' that would make all of this dead easy to determine.

I then remembered reading a few posts about requestAnimationFrame, which has been introduced into the latest generation of browsers. The intended use-case for this functionality is to stop browsers burning CPU on animations/game cycles that a user can't actually see or interact with. This seemed like something which would help provide a solution to the problem I was thinking about.

Unfortunately, things aren't straightforward. First off, only Firefox and Chrome currently have proper requestAnimationFrame support. Now, there are easy-to-use shims to keep things working on other browsers - unfortunately they do this by just doing the animations as normal on a page in an inactive tab, which doesn't help in my scenario.

Furthermore, the behaviour in the two browsers that do implement it differs - and from a bit of reading, I get the impression that they're not likely to converge to identical implementations. The differences are covered in the previous blog post - suffice to say we need to apply a bit of a fudge-factor to be able to support both browsers.

So, here's an overview of the JavaScript code you'd need to add to a page with video, to make it work in a "civilized manner":

  1. Ensure the video is set to not autoplay
  2. On page load, initialize the following variables:
    • animationCounter = 0
    • stopAnimations = false
    • windowState = undefined
  3. Set up some onFocus() and onBlur() handlers. The handlers will set windowState to "focussed" or "blurred" respectively.
  4. Use requestAnimationFrame to call an incrementCounter() function. This function - unsurprisingly - increments animationCounter, and if stopAnimations is not true, uses requestAnimationFrame to call itself again
  5. Use setTimeout() to call a determineState() function a second after page load
  6. When determineState() is called, set stopAnimations to true, clear the onFocus() and onBlur() handlers, and do one of the following:
    • If windowState is "focussed", set the video to autoplay, as the user can definitely see it
    • If windowState is "blurred", don't autoplay the video, as the user definitely can't see it. (I don't believe this state will ever happen, unless the user is juggling between tabs, but it makes sense to cover it for completeness.)
    • If windowState is still undefined (i.e. "unknown", because neither handler has fired), fall back on animationCounter:
      • If animationCounter is more than 3, the user probably has the page active, so autoplay the video
      • If animationCounter is 2 or less, the user probably has the page inactive, so don't autoplay the video
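
The steps above can be sketched roughly as follows. This is a minimal sketch rather than the code from the demo: decideAutoplay and onDecision are made-up names, it assumes an unprefixed requestAnimationFrame on the window object (the moz/webkit prefixed versions would need substituting where necessary), and since the list doesn't say what to do with a count of exactly 3, it treats 3 or more frames as "active".

```javascript
// Hypothetical sketch of steps 2-6. `win` is the browser window (or a
// stand-in with the same methods); `onDecision` receives true if the video
// should be autoplayed, false if not.
function decideAutoplay(win, onDecision, delayMs = 1000, minFrames = 3) {
  let animationCounter = 0;
  let stopAnimations = false;
  let windowState; // stays undefined until a focus or blur event fires

  const onFocus = () => { windowState = 'focussed'; };
  const onBlur = () => { windowState = 'blurred'; };
  win.addEventListener('focus', onFocus);
  win.addEventListener('blur', onBlur);

  // Counts how many animation frames the browser is prepared to give us.
  function incrementCounter() {
    animationCounter += 1;
    if (!stopAnimations) win.requestAnimationFrame(incrementCounter);
  }
  win.requestAnimationFrame(incrementCounter);

  function determineState() {
    stopAnimations = true;
    win.removeEventListener('focus', onFocus);
    win.removeEventListener('blur', onBlur);
    if (windowState === 'focussed') return onDecision(true);
    if (windowState === 'blurred') return onDecision(false);
    // No focus/blur information, so fall back on the frame count
    // (assumption: exactly minFrames counts as active).
    return onDecision(animationCounter >= minFrames);
  }
  win.setTimeout(determineState, delayMs);
}
```

On a real page this might be used as decideAutoplay(window, (ok) => { if (ok) video.play(); }), where video is the page's video element. Passing the window in as a parameter is only there to make the logic easy to exercise outside a browser.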

Disparity in requestAnimationFrame behaviour between Chrome and Firefox

Posted by John Smith on

This is a brief prelude to another post I hope to make in a couple of days or so, once I've solved my problem to my satisfaction. In the meantime, here's a related curio that I hadn't seen documented online before I had to start digging...

requestAnimationFrame is something that's been pushed in the last year or two as a more efficient way of doing animations in JavaScript than the traditional technique of using setInterval. In particular, it aims to avoid having your machine burn CPU on executing animations in a tabbed page that's not currently visible.

At time of writing, only Firefox and Chrome seem to actually support this function, albeit with moz and webkit vendor prefixes. caniuse.com doesn't have too much information about future support in other browsers - it'll appear in IE10, but it's unclear about Safari or Opera. Certainly the Opera Next 12.0 I downloaded yesterday doesn't appear to have it.

Now, for the most part this isn't the end of the world, as there are published shims to implement a workable alternative using setInterval() or setTimeout(). Unfortunately, these will just churn away as normal in a background tab, whereas what I wanted to do was to see how things were different in a background tab.
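
For reference, the shims in question are usually some variation on the following. This is a minimal sketch under my own naming (getRequestAnimationFrame and the isNative flag are made up, not part of any published shim); the flag is there because, for this particular problem, it matters whether the frame callbacks come from the browser itself or from a timer.

```javascript
// Hypothetical sketch: use a native (possibly vendor-prefixed)
// requestAnimationFrame if the browser has one, otherwise fall back to a timer.
function getRequestAnimationFrame(win) {
  const native = win.requestAnimationFrame ||
                 win.mozRequestAnimationFrame ||
                 win.webkitRequestAnimationFrame;
  if (native) {
    return { raf: native.bind(win), isNative: true };
  }
  // Roughly 60 frames a second. As noted above, a timer-based fallback
  // doesn't get the background-tab treatment that the native versions do.
  return {
    raf: (callback) => win.setTimeout(() => callback(Date.now()), 1000 / 60),
    isNative: false,
  };
}
```

In a browser you'd call getRequestAnimationFrame(window) once and then use the returned raf function wherever you would otherwise have called requestAnimationFrame directly.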

It turns out that the two implementations we have so far differ in their behaviour. Chrome comes to a dead stop when a page is in a background tab, which is probably what you'd naively expect to happen. Firefox on the other hand does some gradual throttling - you'll get one frame in the first second of being backgrounded, then another after a further two seconds, then a further four seconds, eight seconds, sixteen seconds, etc.
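
To put some numbers on the Firefox behaviour, here is a small sketch that works out when the frames would arrive if the gap simply doubles each time. The function name is made up, and treating the first frame as landing exactly on the one-second mark is an approximation of what I observed, not something taken from a spec.

```javascript
// Hypothetical sketch: seconds after being backgrounded at which each frame
// fires, assuming gaps of 1, 2, 4, 8, ... seconds between frames.
function backgroundFrameTimes(frameCount) {
  const times = [];
  let elapsed = 0;
  let gap = 1;
  for (let i = 0; i < frameCount; i++) {
    elapsed += gap;
    times.push(elapsed);
    gap *= 2;
  }
  return times;
}

console.log(backgroundFrameTimes(5)); // → [1, 3, 7, 15, 31]
```

So after half a minute in a background tab, a page would have been given only five frames, compared with the thousand or more it would get in the foreground at around 60 frames a second.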

I knocked up a very rough demo for this, so that you can see for yourself - take a look here, and see what happens when you r-click the link on the page and open it in another tab - the function called via requestAnimationFrame() updates the page title, so you can see how often it gets called from the text in the tab.

I'm not completely clear why Mozilla have implemented this the way they have - I've not dug out any official specs, but going by the year-old Chromium issue to add this functionality, I don't expect this behaviour to show up in Chrome/Chromium.

In the next post I'll elaborate on the problem I've been trying to solve - suffice to say, using requestAnimationFrame was a bit of a hacky way of trying to achieve something that I'd have thought should have been extremely straightforward...


About this blog

This blog (mostly) covers technology and software development.

Note: I've recently ported the content from my old blog hosted on Google App Engine using some custom code I wrote, to a static site built using Pelican. I've put in place various URL manipulation rules in the webserver config to try to support the old URLs, but it's likely that I've missed some (probably meta ones related to pagination or tagging), so apologies for any 404 errors that you get served.

RSS icon, courtesy of www.feedicons.com RSS feed for this blog

About the author

I'm a software developer who's worked with a variety of platforms and technologies over the past couple of decades, but for the past 7 or so years I've focussed on web development. Whilst I've always nominally been a "full-stack" developer, I feel more attachment to the back-end side of things.

I'm a web developer for a London-based equities exchange. I've worked at organizations such as News Corporation, Google and BATS Global Markets. Projects I've been involved in have been covered in outlets such as The Guardian, The Telegraph, the Financial Times, The Register and TechCrunch.

Twitter | LinkedIn | GitHub | My CV | Mail
