Blue Scholars and Tuba Man

First – probably for today only, the Blue Scholars’ new album, Cinemetropolis, is available for $5 from Amazon MP3. Everyone should buy this. If you’re unimpressed by the beats in the first cut, you need to check your pulse.

While listening to their homage to Seattle’s neighborhoods and unsung heroes, Slick Watts, I went looking for information on Slick and found this short-form video they made for the song, featuring the ex-Sonic.

The video seemed a little weird to me because I can’t pretend I’ve got the sports or local roots to fully appreciate it. But toward the end of the video, while they’re sitting around a gambling table reflecting on the games, Saba nails it:

…out of all of that – nothing’s gonna compare to seeing Tuba Man playing outside the game.

Tuba Man (Ed McMichael) was practically a Seattle institution. He was this awesome, friendly guy who you’d find outside almost every sports event in the city and usually inside the game, too. Before the games, he would sit outside the Kingdome, Key Arena, or Safeco gently tooting his tuba. A lot of the time it didn’t really sound like music, but that was well beside the point – his bleating was the perfect soundtrack to a dark Seattle night before an NBA game or even in the summer. He played for money and was always happy to talk with anyone who approached. If I was on closer terms with him he might have shared some of the giant jugs of juice he always seemed to surround himself with, I don’t know, but I definitely remember introducing him to my dog, Io, one winter while he was playing near the fountain at Seattle Center. Io was VERY curious what animal was making the noises coming from Tuba Man’s horn and wanted to go up and say hello but as soon as he heard Tuba Man’s booming voice he decided they wouldn’t have a lasting friendship. Tuba Man laughed about this and kept playing.

My other memory about Tuba Man is that trying to find him inside the games was always like a real life “Where’s Waldo?” He was usually there – I remember during Mariners games he had a big foam M he would put on his head and whenever the home team scored he would get up and dance (march) in his seat until they stopped playing the “we scored” music over the PA. He was awesome. His story ends sadly and if you’re interested, you can read about his murder on the Wikipedia page. I think he’d be happier to be remembered as he was in life, like in this video:


Kim Jong Il is dead

Perhaps you’ve been living under a rock, or perhaps you just haven’t been connected to something electronic for the past hour, but Kim Jong Il died today at the age of 69. A couple months ago I started researching North Korea (or the Democratic People’s Republic of Korea / DPRK) and found the whole thing fascinating. Here are some semi-structured collected thoughts on some of the things I found (and conclusions I drew).

  • North Korea is a wildly crazy country, unlike possibly anywhere else on the planet right now. My sense is that the control of information, propaganda, and quasi-abuse of the people living there is, on an ongoing basis, just about as bad as anywhere in the world. Consider that there is essentially no internet connectivity in the country and there are approximately 1 million telephone lines in a nation of about 24 million.  The “no internet” thing seems a little over the top, if not entirely surprising considering the amount of control over information that most people realize the government exercises, but the idea that roughly 23 in 24 people do not have a telephone line is just hard to fathom.
  • Vice TV, which is apparently some offshoot of MTV, went there a couple years ago and produced a multipart Guide to North Korea. I feel this is very much worth the time investment to watch. You can’t simply decide “I’d like to go to North Korea” and book a flight on Delta to Pyongyang, but it is possible for Americans to visit (and at least a couple flickr users have). There are a couple sites that describe how this works, but my impression is that the Vice guide has it right: applications are screened heavily, people who are likely to cause trouble are usually rejected (it’s a surprise Vice got in), and the itinerary is highly, highly controlled.
  • Pyongyang, despite its population of >3 million, seems like a ghost town.  By all accounts, if you walk around the city at any time of day, you’ll encounter no one.  People just don’t seem to go from place to place, out for lunch, out to shops or restaurants, walk pets, or socialize.  “Accounts” are sparse, so maybe this isn’t quite like it seems, but if you consider that this is probably a bigger city than Chicago and the capital of a state with nuclear bombs, I find it a little surprising and alarming.
  • People who visit North Korea seem to all wind up on a very tightly controlled and scripted tour of the country. I haven’t researched this too closely in a couple years, but it seemed like a couple organizations would help field your application that would go to the government and it seemed that most of the people who visited brought back artifacts that indicated that most of them had the same tours as one another and the same tour as is documented in the Vice video series. The tour inevitably leads to an incredible performance of “mass games” (emphasizing what can happen when millions of people perform in unison or some kind of Marxist dystopia / Stalinist wet dream) with hundreds of thousands of North Koreans performing for a couple international tourists.  It’s wildly, wildly crazy to think “this didn’t just happen ‘some time’ – one of these performances might be going on right now and there is an entire nation of people raising children whose greatest life memory might be a performance in one of these shows.”
  • Finally – you can’t get much information on North Korea.  For instance, what’s the hotel that every international tourist stays at?  Well, it’s here on this (easily controlled) island in the middle of the river in Pyongyang – but where is that?  Why can’t you find any hits for “Hilton Pyongyang” if you search Google maps?  In part, it’s because North Korea is essentially the only place on the planet where there is no information of this sort in the public domain. Even the Gaza Strip and Monrovia have some street data, but as soon as you get to the border between South Korea and the North, it’s like you hit the astral plane and no one knows what’s there.  There is a project, though, at nkeconwatch where you can download a huge Google Earth database of roads, place names, and place markers of sites within North Korea.
This is enough for now.  This is a fascinating place on the planet right now from a social, political, military, and technological perspective. If you’re interested, I hope you’ll find some of these links helpful for further research.


Why is gmail so good at spam filtering?

Today I posted a message indicating gmail’s spam filters are good and Greg mentioned that it sounds like gmail is much better at this than the University of Washington.  This reminded me I like blogging and I wanted to write a short background on what I know about this topic (which isn’t an incredible amount, but it’s a fair deal and more than most people probably want to know). Before talking about why gmail is good at spam filtering, it’s worth identifying a couple entities involved in junk email, or spam.

  • Spam is unsolicited junk email.
  • Ham is the email that gets through spam filters. Not all of this is email you want – it’s just what gets through the filters.
  • A false positive in spam filtering means something gets tagged as spam and gets filtered from your regular view of email but it was email that you wanted to see.
  • A false negative is a miss in spam detection.
  • A spam filter is a system used for sifting through your incoming email, applying a set of rules, and identifying its likelihood of being spam or not and taking an action based on that.
The title of this post asserts that gmail is good at filtering spam and I think most people who use it would agree with that.  Before switching to gmail, I maintained my own POP3 server with a private hosting company and immediately learned that doing this without some spam filtering system yields a totally unacceptable email experience.  It’s absolutely necessary for anyone who wants to use email (and not get overwhelmed with junk email and doesn’t try very, very hard to live “off the grid” in some sense) to have some spam filtering.
So at the time I used SpamAssassin – it was very good. Most spam filters evaluate email messages against a set of rules that give the message a score indicating its likelihood to be spam or ham. These scores are evaluated with Bayes’ theorem to get some aggregate likelihood that the message is spam or not and a tolerance is defined in that system for ultimately deciding whether the message is shown.  I may be oversimplifying some details, but that’s the general approach and I suspect something like it is at least a part of gmail’s spam filtering (if not all of it).
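To make the "rules plus Bayes’ theorem plus a tolerance" idea concrete, here is a minimal sketch in Python. It is a toy naive Bayes combiner, not SpamAssassin’s or gmail’s actual scoring; the rule names, the probabilities, and the 0.9 threshold are all made up for illustration.

```python
import math


def spam_probability(rule_hits, rule_stats, prior_spam=0.5):
    """Combine the rules that fired into P(spam | rules) using Bayes' theorem.

    rule_stats maps a rule name to (P(rule fires | spam), P(rule fires | ham)).
    Rules are treated as independent, which is the "naive" assumption.
    """
    # Work in log-odds so multiplying many ratios stays numerically stable.
    log_odds = math.log(prior_spam / (1.0 - prior_spam))
    for rule in rule_hits:
        p_given_spam, p_given_ham = rule_stats[rule]
        log_odds += math.log(p_given_spam / p_given_ham)
    # Convert log-odds back to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical rules with invented probabilities.
RULE_STATS = {
    "mentions_lottery": (0.30, 0.01),
    "all_caps_subject": (0.20, 0.05),
    "known_mailing_list": (0.02, 0.30),
}

THRESHOLD = 0.9  # the "tolerance": above this, the message is hidden


def is_spam(rule_hits):
    return spam_probability(rule_hits, RULE_STATS) > THRESHOLD
```

With these invented numbers, a message that trips both "mentions_lottery" and "all_caps_subject" lands well above the threshold, while one that only matches "known_mailing_list" is pushed toward ham.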
So I mentioned SpamAssassin was good – why move away from it?  I don’t really think there is a good reason and if I were still maintaining my own mailserver, I would almost definitely continue to use it. But I’m not, and I don’t want to and there are tons of great engineers who work at Google who are trying to tempt me to not care about stuff like this and let them do that work for me and I let them.
Now to get to the point – why is gmail’s spam filtering good and why might it be better than a lot of other systems out there?
  • When you use gmail, you agree to give google a LOT of your personal information.  And they are very good at taking semi-unstructured data (like multiple GB of email) and finding patterns in it that can be useful for building rules that simpler systems don’t have access to.
  • Your mail and contacts are one. In most personal or hosted mail systems, your address book might seem like it’s in the server, but it might not be.  It might be stored on another server that sits right next to the server that your mail is on, but the spam filters might only have access to your email and not know who are the people in your personal address book that you want to always allow to send you email.  Google and gmail definitely know this, so even if your brother sends you a message that fires 10 alarms that make it look like spam to most spam filtering systems, Google might be able to have the “contact” rule trump those other rules.
  • Google has all the other email in gmail to use to identify spam, too. Say some spammer crafts a clever message and it gets through every spam filter in existence.  Now 5,000 gmail members all see it and mark it in their inboxes as spam – you are customer number 5,001.  I don’t know that google/gmail *do* this, but they could certainly use that as a filter, too, to retroactively identify the message as spam and yank it from your inbox and push it to the spam folder.
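Here is a rough sketch of how those extra signals could be layered on top of a content score. I don’t know how gmail actually orders or weighs any of this; the function name, the thresholds, and the precedence (contacts first, then other users’ reports, then content) are my assumptions.

```python
def classify(sender, content_score, contacts, spam_reports,
             score_threshold=5.0, report_threshold=1000):
    """Return "inbox" or "spam" for one message (hypothetical policy)."""
    # The "contact" rule trumps everything else: mail from someone in
    # your address book is always delivered.
    if sender in contacts:
        return "inbox"
    # Enough other users marking the same message as spam overrides a
    # clean-looking content score.
    if spam_reports >= report_threshold:
        return "spam"
    # Otherwise fall back to the ordinary content-based score.
    return "spam" if content_score >= score_threshold else "inbox"
```

For example, `classify("brother@example.com", 10.0, {"brother@example.com"}, 0)` returns "inbox" even though the content score alone would have flagged it.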
To summarize: Google have tons of engineers working on this.  They’re good at aggregating data.  They have a lot of data about you to pull from beyond simply “what’s in the email” to determine whether a message is probably spam.  And they have a lot of data from other people, too, to tell whether something is spam.  All of that adds up to, for me, almost never seeing spam and almost never having legitimate messages flagged as spam.

Adventures in stupidity

OPENING SCENE : our protagonist sits at his desk, working

ENTRANCE cheap RC helicopter and operator

OPERATOR (in the style of Butt-Head)

huh huh ... huh huh ... huh

The helicopter comes to rest on the edge of the protagonist's workspace.


If that gets within arm's reach, I'm going to break it.



Here’s the thing: I’m not good at hiding when I’m annoyed. And my body language is so clear that sometimes people shield their eyes when they walk past. But this fucking idiot can’t pick up on it even though I offered to break his toy.


Marathon pacers matter (Seattle Marathon 2007 to 2011)

Update 12/3/2011: Unfortunately, the information below can’t quite be trusted. I’ve taken a closer look at the results I pulled from the Seattle Marathon site at the time and the results today and I can definitely say that the data I pulled was not official and the results today are also incomplete. I’ll need to make an update to this post and all this research some time when official, complete results are available. What’s wrong? In the data I initially pulled, I see things like women’s winner, Trisha Steidl’s splits as 1:10:29 for the first half and 1:34:09 for the second half for a 3:03:38 (which is wrong and doesn’t add to that final chip time) – today the results say 1:29:32 and 1:34:09 (which seems right). This suggests today’s results are closer, however today the pacer chips are missing from the results. Anyway – I’ll work on this again when I can…

I’ve started collecting the data from the Seattle Marathon, 2007 to present and am doing some analysis on it, specifically from the perspective of the marathon pacers since I organized the pacers this year and we just finished the race. If I find the time to keep analyzing the data, this will probably be the first of many posts on the subject. If I don’t, this might be the first of one.

Assuming I’ll keep writing on this – here are the methods I’m using for the data source…

  1. I pulled down all full and half, male and female results from 2007-2011 and dumped the data into Excel.  This represents over 45,000 finishers over that period between the races.
  2. I did a little data cleansing – many records contained no data for the first split and I just turned these into 0’s for processing in an Excel PivotTable.
  3. I used an Excel 5 minute rounding function to approximate the pacer that some finisher would be behind (e.g. a full finisher crossing the finish line at 4:13:35 evaluates to a 4:15 pacer, a full participant crossing the midway point at 2:02:41 would be behind the 4:10 full pacer, and so on).
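For anyone who would rather not do step 3 in Excel, here is a minimal Python sketch of the same rounding. The function names are mine, and I am assuming "rounding" means rounding up to the next 5-minute mark, which matches both examples above.

```python
def parse_hms(text):
    """Convert an "H:MM:SS" string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def pacer_for(total_seconds, step_minutes=5):
    """Round a finish time UP to the next pacer slot, returned as "H:MM"."""
    step = step_minutes * 60
    # Integer ceiling division avoids floating point surprises.
    rounded = -(-total_seconds // step) * step
    hours, remainder = divmod(rounded, 3600)
    return f"{hours}:{remainder // 60:02d}"


def pacer_at_finish(chip_time):
    return pacer_for(parse_hms(chip_time))


def pacer_at_half(half_split):
    # Project the finish by doubling the first-half split.
    return pacer_for(2 * parse_hms(half_split))
```

`pacer_at_finish("4:13:35")` gives "4:15" and `pacer_at_half("2:02:41")` gives "4:10", the same answers as the examples in step 3.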

Through 2010, the Seattle Marathon only offered full pacers for 3:30, 3:45, 4:00, and 4:45 (unhappy with this, in 2010 I lobbied for us to add 3:10 and paced that myself).  In 2011 I organized the pacing and changed the pacer structure to offer more times (3:00, 3:10, 3:15, 3:20, 3:30, 3:40, 3:45, 4:00, 4:15, 4:30, and 4:45).

When trying to process this in the past, I had frequently tried to look at the finish result. This is pretty impossible to make any conclusions off of because (if you hadn’t heard) a marathon is hard and there are all sorts of reasons people do or do not make their results.  While it’s a lot more important to ultimately answer “did people make their goals?” without a questionnaire that’s fairly impossible to tell.  It *is* pretty easy to tell from the first half split, though, where people were setting their goals, and looking at some of that data, I see a clear indication that the pacers and pace groups matter.

The following chart shows a plot of full finishers over these 5 years of races and highlights the pacers that were offered for the races in those years.  The data shown is based on what 5 minute group they were running with at the first half split (not the finish) and the red rings highlight the 5 minute segments for which we had a pacer.

  • In 2007-2010, most years show a large group clustered around the pacer segments. Sometimes the spike is a little outside the circled block, but I believe there is some pretty clear visual correlation (this includes 2010 when I had a group on track for 3:10 at the half).
  • 2007-2010 looking at the distribution of the field outside the pace groups shows a fairly smooth distribution of finishers. I think this further suggests that when there isn’t a pacer to associate with, runners tend to just distribute themselves more evenly.
  • In 2011, the distribution is much more choppy with more clusters of runners in the race with most clusters inside a pace group and most of the rarefied sections outside the pace groups.
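The counts behind a chart like this can be reproduced outside a PivotTable. Below is a small sketch that buckets runners by projected finish (twice the half split, rounded up to 5 minutes) and compares the average head count in paced versus unpaced buckets. The function names are hypothetical and the rounding direction is my assumption; splits recorded as 0 are skipped because that is how missing data was stored.

```python
from collections import Counter


def half_split_histogram(half_splits_seconds, step_minutes=5):
    """Count runners per projected-finish bucket (bucket key is in seconds)."""
    step = step_minutes * 60
    counts = Counter()
    for split in half_splits_seconds:
        if split <= 0:  # a 0 means the first split was missing
            continue
        projected = 2 * split
        bucket = -(-projected // step) * step  # round up to the bucket edge
        counts[bucket] += 1
    return counts


def mean_count(counts, buckets):
    """Average number of runners across the given buckets."""
    buckets = list(buckets)
    return sum(counts.get(bucket, 0) for bucket in buckets) / len(buckets)
```

Calling `mean_count` once with the buckets that had a pacer and once with the rest gives a crude numeric version of the clustering that the chart shows visually.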

This doesn’t help understand whether people are achieving tougher goals and there is no sophisticated analysis in here at all (maybe I’ll get to some of that in a later post) but I believe that it definitely indicates that runners will choose to run with a pace group if one is offered in the race.


Life in the northwest

A couple weeks ago as I was leaving my house and walking down the steps I felt and heard the familiar, distinctive, and disgusting sound of a snail shell being crushed under my shoe. At the time I had no reason to doubt my intuition that: “This was the single most disgusting thing I will experience all month.”

Fast forward to going out on a cold winter night and finding a fresh, live slug “incorporating” some of the crushed remains. Fast forward a couple minutes later to the time where I forgot that replacement slug was devouring its ancestor.

I was wrong.


emacs – the only guide you’ll ever need

I use emacs every day and for as much work as I can on a computer and have for about 9 years. It was not easy to learn, though, and I used it casually for about 8 years before starting to use it seriously and all the time in about 2002.

Learning was harder than I think it should have been – primarily because the main tutorial (invoked with C-h t) focuses on lesson after lesson of basic file and editing operations instead of trying to teach you just a couple very basic and core lessons about emacs itself. So, I attempt to present:

The only emacs tutorial you’ll ever need

Emacs does a lot and new users definitely needn’t try to understand all of it. I really ramped up dramatically faster in my learning curve once I discovered and mastered a very short list of basic functions that help explain the major interactions with the software.  Before this, I was very, very often feeling trapped by it and it convinced me (many times) to turn away (to vim, TextPad, WinEdit, notepad, and other software). Now I can’t imagine trying to use something else to get work done.

The short version: I believe that if you start by learning describe-function, describe-key (and where-is), apropos, modes (and describe-mode), and ctrl+g, you will ramp up on emacs much, much more quickly than if you do not.

  1. Every key press in emacs executes a function. Whether you press the “a” key or some key sequence involving the control (“C-“) or Meta (“M-“, usually by pressing ALT or the Escape key) keys, you are running some function. This is probably different from most software you normally work with.
  2. Every function has documentation. You can see this documentation by executing the function “describe-function” and typing the name of the function you want to get documentation on.
  3. Many functions can be invoked by name. You do this by pressing “M-x” and entering the function name in the minibuffer.  For example, if you type “M-x describe-function [ENTER]” emacs runs “describe-function” which asks you for a function name. Type “describe-function” and you will see the documentation on “describe-function”.  I said “many” and not “all” functions can be invoked by name – in the function’s definition it must be declared to be interactive for this to work. Emacs has a lot of non-interactive functions (e.g. basic lisp functions like car) which cannot be executed interactively.
  4. “describe-key” (and its close sibling “where-is”) can help you explore keymappings. I mentioned that when you press “a” it runs a function – to see what function that key sequence runs, type “M-x describe-key [ENTER] a”. This tells you pressing “a” executes “self-insert-command” and shows the documentation of self-insert-command (that it will “Insert the character you type.”). Similarly you could use “M-x describe-key [ENTER] M-x” to see that M-x is bound to execute-extended-command (which opens up the minibuffer and asks you for a function to run). Cool!  So let’s say that you know there is a function called “goto-line” which lets you jump to a specific line in a file.  You’re lazy, though, and don’t want to type that whole thing out whenever you want to use it.  “M-x goto-line” – so much typing!  Instead, you can type “M-x where-is [ENTER] goto-line [ENTER]” and emacs will tell you what keysequences goto-line is mapped to. In my setup, they are: M-g g, M-g M-g, <menu-bar> <edit> <goto> <go-to-line> – so I have three ways to get to it.  Another invocation of “where-is” and I learn that “describe-key” is bound to “C-h k” – so the quick way to do the first operation in this section (“what function is run when I press ‘a’?”) is: “C-h k a”.
  5. “apropos” can help you find (or remember) useful functions. Say you didn’t know that goto-line was the function to jump to a line in a file. If you type “M-x apropos [ENTER] goto” you’ll get a list of (interactive) functions that include “goto” in their name. Personally, I find this more useful to remind myself of a function I can’t quite remember than to find a function I don’t know at all, but it’s very useful. (short way: “C-h a goto”)
  6. Your major mode sets up a number of default behaviors about your interaction with emacs. All interactions take place in a single major mode and you can see this mode in the modeline (it might be “Lisp Interaction”, “Apropos”, “Shell”, or others).  Depending on your mode, your keys will behave differently!  This can be very confusing to new emacs users.  For instance, when I press “C-h k <TAB>” (to inspect what the TAB key does) in Lisp Interaction mode it runs indent-for-tab-command (to indent some line for lisp programming), in Shell mode it runs comint-dynamic-complete (to try to tab-complete a function or file name), and in Apropos mode it runs forward-button (to navigate to the next linked entry in the apropos output).  “describe-mode” will tell you what mode you are in (and what minor modes are enabled) and what many of the major keybindings are for that mode. (short way: “C-h m”)
  7. Minor modes can be mixed in to add more customizations. Most of your keymap will be defined by the major mode you’re in, but there are some editing conveniences that can be put on top of this that may transcend any particular mode.  A pretty good example is “folding” – a behavior that lets you collapse large sections of a document and see a larger structure.
  8. Ctrl+g runs “keyboard-quit”. You may find yourself locked in the minibuffer or with emacs trying to get you to complete some command you don’t understand – ctrl+g can frequently get you out of this.  (note: it’s not perfect – you might wind up in recursive edit but that’s another story).
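If it helps to see the mental model in code, here is a toy Python sketch of the ideas in the list above: keys map to functions, functions carry their own documentation, and the major mode changes what a key does. This is only an illustration and not how emacs is implemented; the Python names mirror the emacs commands mentioned above (with underscores instead of hyphens), and the docstrings other than the one for self-insert-command are my paraphrases.

```python
def self_insert_command(buffer, key):
    """Insert the character you type."""
    buffer.append(key)


def keyboard_quit(buffer, key):
    """Cancel whatever command is in progress (paraphrased doc)."""


def indent_for_tab_command(buffer, key):
    """Indent the current line (paraphrased doc)."""
    buffer.append("    ")


def forward_button(buffer, key):
    """Move to the next linked entry (paraphrased doc)."""


# Keys that behave the same everywhere, plus per-mode overrides.
GLOBAL_MAP = {"a": self_insert_command, "C-g": keyboard_quit}
MODE_MAPS = {
    "lisp-interaction": {"TAB": indent_for_tab_command},
    "apropos": {"TAB": forward_button},
}


def active_keymap(mode):
    """Global bindings with the major mode's bindings layered on top."""
    keymap = dict(GLOBAL_MAP)
    keymap.update(MODE_MAPS.get(mode, {}))
    return keymap


def describe_key(mode, key):
    """Which function does this key run, and what does it do?"""
    command = active_keymap(mode)[key]
    return f"{key} runs {command.__name__}: {command.__doc__}"


def where_is(mode, command_name):
    """Which keys run this function in the given mode?"""
    return sorted(key for key, command in active_keymap(mode).items()
                  if command.__name__ == command_name)


def apropos(pattern):
    """Which known functions have this text in their name?"""
    names = {command.__name__ for command in GLOBAL_MAP.values()}
    for keymap in MODE_MAPS.values():
        names.update(command.__name__ for command in keymap.values())
    return sorted(name for name in names if pattern in name)
```

In this toy, `describe_key("lisp-interaction", "TAB")` and `describe_key("apropos", "TAB")` report different functions for the same key, which is exactly the mode-dependent behavior that confused me for years.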

These are the things I wish I knew before I started any of the tutorials.  The tutorials *are* good and the reference cards *are* handy, but I was frequently frustrated and confused why the keyboard didn’t react in the ways I wanted (I didn’t understand modes in general, least of all the one I was in), I didn’t understand how keys worked anyway (didn’t know about describe-key), didn’t know how to increase my proficiency once I started getting a little more comfortable (didn’t know about where-is or apropos), and didn’t know how to learn more about many of the functions (didn’t know about describe-function or apropos). Those are commands I still use every day when using emacs today.


New music, November 2, 2011 edition

Music is my life. Well, then again, not really – there’s friends, family, pets, computers and running.  But music is way up there.  And lately I’ve got a few things I’m newly into.  Here’s a short rundown – in no particular order. Every link is to a song that I think is worth listening to.

  • Male Bonding – I just posted the youtube clip of their incredible track – Bones – from their most recent album. I was on a training run about two weeks ago listening to their new album for the first time when I first heard it and it’s one of those incredible experiences when you first hear a song and it just stuns you. Previously I’d seen their video for Year’s Not Long, which I guess would probably be called gay-positive in the sense that it winds up with all the guys in the video making out with each other.  But Bones – *6 minutes* of pretty serious (if poppy) thrashing. There’s not a lot of complexity to these cats and you’ll probably immediately know you love them or they’ll bore you to tears.  I saw them play at Chop Suey as part of City Arts Fest and they were great, but it was a little strange to see a show so poorly attended (I’d say there were 50-100 people there and we basically all fit on the main floor).
  • Jay Reatard – died ahead of his time.  He looks and acts like a reject from the carny, and the “pool-party-gone-wrong” theme of It Ain’t Gonna Save Me is an inspiring testament to someone I wish I’d gotten to see live.
  • Frank Turner – speaking of testaments – Eulogy is easily the most perfect <1 minute song I’ve ever heard (I was never a big D Boon fan). I saw him at Neumos and then, like in the linked clip, they led into “Try This at Home” which has some of the most perfect sing-along choruses I’ve heard in years. By the end of the show, he insisted on and succeeded in getting every member of the audience to sing along to Photosynthesis – and it was magic.
  • Carissa’s Wierd – It’s hard to know what to say about this band. Listen to Heather Rhodes and lines like “saw someone today who looked exactly like you – it’s funny how the years go by” or One Night Stand and “please don’t ask me what my thoughts are cause I don’t care about yours” and you’ll find tragic desperation that is just destined to be the soundtrack for sad memories and for the discount bins. Which is really unfortunate because they made incredible music and S still is.
  • Pajo – Keeping that thread going, David Pajo played guitar for Slint and apparently he’s still making music but as far as I can tell pretty much flying beneath the radar of everyone.  At least I just found a copy of “1968” used at Sonic Boom in Ballard and it had been getting marked down for the past 3 years.  When I listen to his cover of Where Eagles Dare or basically anything from 1968, I think “this must be what people got out of Elliott Smith.”
  • The Gglitch – this is hard to write about because this is the band that my excellent and incredibly talented cousin was in before he died of cancer. I just visited with his brother and he travelled a little this summer and was pursuing an excellent effort to try to get their last album into some public libraries. Anyway, my cousin’s keyboards on the lead track from their last album (which is Angeldust if you have Spotify) show their amazing range. I don’t even know what style to call it, but I know that I love good, passionate music and that beyond missing my cousin – I believe this is it.
  • Jay-Z and Kanye – somewhere this post turned very melancholy and I want to turn it to an uplifting note and that comes from the Frank Ocean cut off Watch the Throne – Made in America. I could listen to the layers they put down on this over and over – and have. And I can do all that and look past the Big Ghost Chronicles review which trashes this track pretty hard because even Big Ghost has to eventually concede that “its still a pretty tight project son”

Give me some advice on what to listen to next!


Pacer seeding

The 2011 Seattle Marathon is just a couple weeks away. I’m organizing the pacers this year…

Aside: being pacer organizer is good but there are some weird experiences. I got email from some guy in Toronto who wanted to pace and another email from someone interested in pacing but who wouldn’t be in Seattle for the next year or two, so how about pacing then? But I digress…

…and in past years in the start area there have not been any signs helping the starters line up by pace.  There’s just one giant start area for both the half and the full, though the races start at very different times.  I tried to lobby to get some signs set up on one side of the start chute for the half with minute/mile pace markers and signs on the other side for the full (there are >3x as many half finishers as full finishers, so the pace groups for those paces will definitely be very different). It seems like we’re not going to get that, so I ran some numbers.

I went to the Seattle Marathon’s website for results and pulled down results from 2008-2010 for the half and full (men’s and women’s).  I sorted results by chip time to figure out where people really *ought* to line up given how fast they finish the race, and here are the results I found:

The key part for where we should line up with our signs is (assuming we can approximate how far back “the back” is and that the crowd is uniformly distributed):

  • 1:45 half / 3:30 full should be about 1/10 of the way back from the start.
  • 2:07 half / 4:15 full should be about 1/2 way back from the start.
  • A 2:30 half / 5:00 full would be ~4/5 back from the start (however Team in Training are going to provide 5+ hour pacers this year)
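The calculation behind those fractions is simple enough to sketch in Python: sort the chip times, then count how many finishers were faster than the target time. The function name and the sample numbers in the usage note are made up, and this is not the spreadsheet I actually used.

```python
from bisect import bisect_left


def fraction_ahead(sorted_chip_seconds, target_seconds):
    """Fraction of finishers whose chip time beats the target.

    sorted_chip_seconds must already be sorted ascending.
    """
    if not sorted_chip_seconds:
        raise ValueError("need at least one finisher")
    # bisect_left returns how many times are strictly less than the target.
    faster = bisect_left(sorted_chip_seconds, target_seconds)
    return faster / len(sorted_chip_seconds)
```

With ten invented finish times of 100, 200, ..., 1000 seconds, `fraction_ahead(times, 550)` is 0.5, meaning a runner targeting 550 should line up about halfway back.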



This is my favorite song for 22 October 2011:

