I learned a lot of vocabulary just by going through typing lessons in fourth grade. Words such as “alfalfa”, “lads”, and “sass” were very common in the first few exercises, due to their exclusive use of the home row keys. I found most of these home-row words strange and esoteric, as if they were contrived solely for the purpose of teaching students how to type.
Fast forward to seven months ago, when I was learning the Dvorak layout. Home-row words actually made sense in this world! Instead of words about vegetables, one could type common words such as “the”, “an”, “ones”, and “thin” with just the home row. Then I wondered, how much more advantageous, if any, is the Dvorak home row versus the QWERTY home row?
To find out, I wrote a small Python script to go through a list of words (in this case, a dictionary) and select those that could be typed using only the home row. The script itself was very easy to write, thanks in part to Python’s fantastic list comprehension abilities. The “Methodology” section of this post details the code used in the analysis.
After running the scripts, the difference became obvious right away. Dvorak absolutely blows QWERTY out of the water when it comes to home-row typing. While QWERTY has a mere 822 words on the home-row list, Dvorak has over nine thousand. That’s over ten times more home-row words than QWERTY. Even when disregarding acronyms and proper nouns, the distinction is clear.
Not just obscure words
Out of the top 100 words in the English language, how many can be typed on the home row of each keyboard layout? To find out, I used Peter Norvig’s analysis of the Google Trillion Word Corpus. His website offers extensive analysis derived from this and other corpora, along with a wealth of resources.
Using a text file containing the 1/3 million most common words in the English language, here is what I found:
| Out of the top n words | Dvorak contains… | QWERTY contains… | Dvorak to QWERTY ratio |
Dvorak trumps QWERTY for its use of the home row, containing more frequently used words than QWERTY can even dream of. This property allows Dvorak users to type with more ease, whereas on the QWERTY keyboard, much of one’s time is spent on the top row.
Methodology
The dictionary I used was the Unix dictionary, available on most Linux systems by typing
With Python’s comprehensions and set operations, it was very easy to determine whether the characters of a particular string formed a subset of the characters on the home row.
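    import sys

    # Usage: homerow.py <word-list> <layout>
    # sys.argv[1] is the path to the word list; sys.argv[2] selects the
    # layout to analyze: "q" for QWERTY, anything else for Dvorak.
    home = set("asdfghjkl;'") if sys.argv[2] == "q" else set("aoeuidhtns-")

    with open(sys.argv[1], "r") as f:
        for line in f:
            word = line.strip()
            # Keep non-empty words whose characters all sit on the home row.
            if word and set(word.lower()) <= home:
                print(word)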
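To see the subset test in action without a dictionary file, here is a minimal sketch that wraps it in a function and tries it on the words mentioned at the top of this post. The names `on_home_row`, `QWERTY_HOME`, and `DVORAK_HOME` are made up for this illustration and are not necessarily what the script on GitHub uses.

```python
# Home-row characters for each layout, as used in the script above.
QWERTY_HOME = set("asdfghjkl;'")
DVORAK_HOME = set("aoeuidhtns-")


def on_home_row(word, home):
    """Return True if `word` is non-empty and every character is in `home`."""
    return bool(word) and set(word.lower()) <= home


# Words taken from the introduction of this post.
for w in ["alfalfa", "lads", "sass", "the", "an", "ones", "thin"]:
    print(w, on_home_row(w, QWERTY_HOME), on_home_row(w, DVORAK_HOME))
```

The first three words pass only on QWERTY and the last four pass only on Dvorak, which matches the intuition from the typing lessons.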
The result can be redirected to a file from the command line, e.g. using py homerow.py unix-dict.txt q > row-q.txt. Afterwards, row-q.txt can be analyzed against a text file of the frequently used words.
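One way that comparison could look is sketched below. It assumes the frequency list is sorted from most to least common, with the word as the first whitespace-separated field on each line (for example a word followed by a tab and a count). The helper names `load_words`, `read_ranked`, and `count_in_top_n` are hypothetical, and the tiny in-memory lists merely stand in for the real files.

```python
def load_words(lines):
    """Return a set of lowercased, non-empty words, one per input line."""
    return {line.strip().lower() for line in lines if line.strip()}


def read_ranked(lines):
    """Return the first field of each non-blank line, preserving order.

    Assumes the lines are already sorted from most to least frequent.
    """
    return [line.split()[0].lower() for line in lines if line.strip()]


def count_in_top_n(ranked_words, home_row_words, n):
    """Count how many of the first n ranked words are home-row words."""
    return sum(1 for w in ranked_words[:n] if w in home_row_words)


# Toy data standing in for the frequency list and a home-row word list.
ranked_demo = read_ranked(["the\t100", "of\t90", "and\t80", "an\t70"])
home_demo = load_words(["the", "an", "ones"])
print(count_in_top_n(ranked_demo, home_demo, 4))  # prints 2
```

With real data, the same functions can be fed open file objects, e.g. `load_words(open("row-q.txt"))`, and called once per layout and per value of n to fill in a table like the one above.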
The source code for the entire analysis is available on my GitHub at home-row-explorer.