Word Play

August 15th, 2008

The other day I happened to come across a nifty new feature on the data sharing site “Many Eyes” called “Wordles”. As you may know from my last post, I’ve written about Many Eyes before (here and here), as well as about another data sharing site, Swivel, here.

But Wordles are something new: highly customizable tag clouds. Many Eyes already has tag cloud functionality, of course; I was an especially big fan of their tag cloud comparison tool. But tag clouds don’t have the best reputation. Some people outright hate them. And the people at Many Eyes admit they’re a bit of a “toy”. (A toy for nerds, anyway.) Why bother adding another tag cloud tool, then?

Here’s what Many Eyes has to say about the issue: “People have reported finding value beyond entertainment in creating these word clouds. Teachers have used Wordles in classrooms as conversation catalysts; others have created them to express their identities, and scholars have used them to visualize the output of statistical explorations of texts.” So clearly they have some novelty value as visualizations. Let’s look a little deeper, though.

One thing you can use Wordles for is to analyze texts you’re already familiar with. In fact, I thought you could use them as a sort of “Cliffs Notes” for blog posts. To test this idea, I took four sample posts of mine (1, 2, 3, and 4) and made Wordles of all of them: 1, 2, 3, and 4.

If you want to experiment along with me, feel free to read those posts first, if you haven’t. Then, try looking at each of those visualizations and seeing how well they summarize those posts. Alternatively, you could try just looking at my Wordles first and see how they do (especially if you’ve already read the posts). I’ll give you a short break here, if you need it.

INTERMISSION

Back to our regularly scheduled programming. Personally I found Wordles worked fairly well as an “emotional Cliffs Notes” for a post. I’m not sure you get a lot of analytical content out of these things, but it seems like the major concepts and the overall “feel” of a post are conveyed quite well by a Wordle. If a picture is worth a thousand words, I’d say a Wordle is worth a few hundred. Wordles definitely hit the highlights of jargon-heavy posts like 1. They also did a good job at giving you the gist of new technology posts like 2, which are often highly focused. Lastly, I think Wordles captured the feel of cuter, human-interest style posts like 3 and 4 quite well.

I don’t know if I could tell you exactly what a blog post was about after seeing a Wordle for it, but I could probably tell you the major concepts and how important they were, relatively speaking, and I’d also probably have a rough idea about the post’s topic. Seems like a good thing to look at just before or just after reading a post proper, either to prepare you beforehand or reinforce what you’ve just seen. Wordles are kind of like an artistic way of taking notes, I guess.

Of course, I wrote all those posts, so perhaps my objectivity is lacking. Maybe I already knew those posts too well to give an accurate account of what Wordles can do. To account for this, I decided to make a Wordle for something I knew intimately (but didn’t write) and a Wordle for something I only had a vague idea about to compare.

For something I knew intimately, I used Lady Van Tassel’s ending monologue from the climax of the movie “Sleepy Hollow” (directed by Tim Burton). I don’t quite know why, but I’m semi-obsessed with this movie. Actually, I probably do know. After several years, I figured out the movie involved an almost stereotypical love story between - in MBTI personality theory terms - a Rational (Ichabod Crane) and an Idealist (Katrina Van Tassel). (Also, I love the science/religion dichotomy and Tim Burton in general, too.) If you want some background on MBTI theory, feel free to check out my post about it here.

The main point, though, is that I know “Sleepy Hollow” well. I know it so well that after the third or fourth viewing (I’ve seen the movie over 20 times) I found a plot hole shortly after the monologue in question. I later found out the hole wasn’t there in the original script, which was changed for reasons I can’t understand. (The idea that Lady Van Tassel can, without explanation, return to Sleepy Hollow and collect her inheritance after every other heir has died and she herself was presumed dead is totally preposterous. So much for her plan. It’s not obvious the first time you watch it, though.)

As for something I knew less well, I used Lyndon Johnson’s “Great Society” speech. Lyndon Johnson’s been on my mind ever since I started playing the excellent board game 1960: The Making of the President, which simulates the 1960 presidential election between Kennedy and Nixon. (Johnson, of course, was Kennedy’s vice president.) Knowing roughly what the “Great Society” was, I had the vaguest idea what the speech was about, but it’s nothing I really studied.

I made Wordles for both speeches here and here. Again, feel free to try this experiment along with me. (And leave a comment about your experience, if you’re so inclined.)

INTERMISSION

I must say, Wordles definitely gave me the gist of both speeches and really conveyed the emotional content well. All the right words were really popping out, and whether I knew the content beforehand or not, the Wordles seemed to do the job admirably.

For example, you can easily see the most important happenings, names, and events in the Sleepy Hollow monologue. And in Lyndon Johnson’s speech, the biggest concepts and buzzwords really jump out at you. Word frequency, when done well, really does seem to work like an “emotional Cliffs Notes”.
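For the curious, here’s a minimal sketch of the general idea behind a word-frequency cloud: count the words, drop the boring ones, and make the common ones bigger. To be clear, this is my own illustration and not Wordle’s actual algorithm. The function name `word_weights`, the tiny stop-word list, and the font-size range are all assumptions made up for the example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch);
# a real tool would use a much longer one.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it",
              "that", "for", "on", "we", "our", "will"}


def word_weights(text, top_n=10, min_size=12, max_size=48):
    """Return (word, count, font_size) tuples for the top_n most frequent words.

    Font sizes are scaled linearly between min_size and max_size based on
    each word's count relative to the other words that made the cut.
    """
    # Lowercase the text and pull out runs of letters (with an optional apostrophe part).
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    top = counts.most_common(top_n)
    if not top:
        return []

    highest = top[0][1]   # most_common sorts by count, descending
    lowest = top[-1][1]
    span = highest - lowest

    result = []
    for word, count in top:
        if span == 0:
            # Every word is equally frequent, so give them all the same size.
            size = max_size
        else:
            size = min_size + (count - lowest) * (max_size - min_size) / span
        result.append((word, count, round(size)))
    return result
```

Feed it the text of a speech or blog post and you get back the biggest words along with how large each one should be drawn. Laying the words out on the page without overlapping is the hard (and pretty) part, and this sketch doesn’t attempt it.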

Of course, you can make Wordles that focus more on style, at the expense of content. I purposely used only four distinct colors and horizontal orientations for words so that the words were as clear as possible. If you use different color shadings and orientations, everything gets a little harder to read, but it might look cooler. I also purposely chose a serif font so that the words would be spaced farther apart and be maximally clear. Normally, a sans-serif font is better to read online (as mentioned in that Wikipedia link) due to less on-screen clutter, but in the particular case of Wordles serifs appear to actually reduce the clutter, since words can appear anywhere in relation to each other, often near the serifs. I also chose the calm default colors, since they have a lot of contrast without being “loud”.

The customization is definitely something that can be a problem - you can easily make text that’s downright unreadable if you want. There are some other issues I had with Wordles as well. I’ll admit that Many Eyes has definitely improved the uploading interface since the last time I used it (uploading is often one of the worst parts of these data-sharing websites). Still, I often found myself going back to the source to read the data I uploaded because it was hard to read on the site, and the interface was a bit clunky. And the uploading process is still not something “your Grandma could do”, in the words of the Data Basin folks. And though it’s a small thing, for some reason Wordle also gives you semi-random settings each time you make a Wordle. I’d much rather have it remember the settings I used last time, or at least offer a default option.

The worst thing I found, though, was the showstopping Firefox 3 bug in the Wordle application. It will often, without warning, crash Firefox 3 upon loading. That’s bad. Especially since I have almost never had this happen on any other page in any recent release of Firefox. It’s happened to me at least 15-20 times, so I know it’s not a fluke. And it’s apparently happened to other people as well. I don’t know what’s causing this, but hopefully they fix it soon. (Wordle crashed Firefox the last time I made a visualization, actually. It caused me to lose part of this post. OK, deep breath, exhale. Deep breath, exhale… that’s better.)

But I don’t want to be too negative. I think Wordles are actually pretty cool, and if you use my method (I’m sure there are others) or some of the uses mentioned by Many Eyes, maybe they’re more than just “a self-described ‘toy’”. Obviously, the next logical step is tag cloud comparisons for Wordles, right? Well, actually, I’m not sure that’s the best idea.


One Response to “Word Play”

  1. Q Says:

    The way they’ve made tag clouds more flexible is really cool. I bet they could combine it with something like Amazon.com’s statistically improbable phrases or semantic maps to come up with something really useful. In any event, I can imagine this being a great tool for graphic designers who want to convey a very broad sense of what the content is. Heck, I wouldn’t be surprised if someone has already used this as the inspiration for the cover art of a book.

    I think the one limitation might be that without some kind of factor for structure or relationships or context the clouds do not always tell us much. In particular the speeches by politicians will always have the same key words yet won’t really help clarify the philosophies or policies. I bet you could take Obama and McCain and input a bunch of speeches from them and get nearly identical clouds simply because they are talking about the same issues. Unfortunately, you’d only be seeing the topics and not the message. Of course, to a great degree the topics really *are* the message, but I hope you can take my point that this format paints with a very broad brush. I think the statistically improbable phrases approach, by contrast, may be too specific. There are pros and cons to both. Maybe clouds can adopt the best of both. Or maybe there will always be a variety of clouds for different purposes.

    That’s a really interesting parallel between tag clouds and those statistically improbable phrases. I think that’s a good insight. I too think there’s artistic potential there.

    I also agree Wordles are short on “analytic” content. And politicians are trained to always be on message, yes.

    Thanks for the great comment!

    - Dave
