Naming A.D.
March 14th, 2008
There’s an awful lot that’s been written over the years about what to name your kids, in the U.S. especially. A good recent example is from a book I mention frequently in here, Freakonomics. In it, the authors discovered that your kid’s name doesn’t correlate at all with how they do in life, and talk about what the most popular baby names were and where they come from. (For the record, my name, David, has always been popular. It’s nice to be popular in at least one thing in my life.)
Another good example was in the New York Times, where they discussed bad names in a recent article. Turns out it’s a popular topic - I found an entire website and book dedicated just to it. I admit I’m personally guilty here of abusing bad names. I used to know a kid named Justin Case, and I actually thought it was kind of funny at the time. (You have to cut me some slack here - I was in elementary school and still thought fart jokes were the epitome of humor back then.)
From what I read, apparently it’s mostly fathers who give kids these stupid names, and their waning influence in naming has reduced the bad naming trend somewhat. But with people out there choosing names like “Trout Fishing in America” for themselves (check the “Legacy” section), bad names will probably never completely go away.
That’s not all. There’s even a “Baby Name Wizard”, with an accompanying book. I happen to own the book, and it’s excellent, I must say. It uses pseudo-sparklines for the popularity of baby names over time that would make Edward Tufte proud.
But nowhere in those links do they discuss the origin of the names in detail. What I was really curious about was the proportion of names (in the U.S.) that come originally from the Bible. I’d always read that an absurdly high number of U.S. male names were Biblical in origin, but I never saw any evidence to back that up. I managed to root up an article where some people discuss naming your kids after people in the Bible, but as usual there wasn’t a lot of hard data on the subject.
That meant that Dave the Data Vulture had to go back into action. I figured additionally that this would be a good opportunity to finally write about something religious in here, as comparative religious study has been a long-term interest of mine. I recently read both the Bible and the Koran, but I was struggling to find a way to incorporate those here without causing a lot of controversy.
Plus, data-related angles on either book are not easy to find. What sorts of data would you use? Miracles per page? Number of times the word “Muhammad” appears? These kinds of measures risk being flippant and glib with these highly revered holy books, which is certainly not something I’m trying to do. So it all works out, in a way. (Still not sure how I’ll incorporate the Koran, though. Though I doubt there are many devout Islamic statisticians reading, if you’re out there, feel free to drop me a comment!)
Luckily for me, I quickly found a site - Biblical Baby Names.com - that would give me the data I wanted. You can easily view the top 200 male and female baby names by decade (and each year for more recent data), and whether they’re from the Bible or not. And if the names are from the Bible, you can see what their specific origin is for each of them. Pretty cool, if you ask me. (By the way, they use the Social Security Administration as their source, so you know their data is on the level. Here’s some background on the SSA’s methodology if you’re curious.)
I looked quickly through the names and I could tell they did their homework on the sources. They even include nickname variants like “Dave” as Biblical. (Maybe that’s how King Saul referred to David back when they were Biblical buddies.) I also found out names like Jack (derivative of John) and Eric (as Erick) were Biblical in nature, even though I always thought of them as Anglo-Saxon and Scandinavian in origin, respectively. (However, many of the non-Biblical names do seem to fit these categories, so I don’t think it was too bad of a guess.)
Still, the data wasn’t quite in the format I needed. As I’ve discovered, being a data vulture often means heavily processing the data you find. (Does that correspond to the cast-iron stomachs of scavengers? Maybe I’m taking the analogy a little too far here…)
( *** Feel free to skip to the next section if data processing isn’t your thing. *** )
After some playing around, I found out I could copy and paste the tables that Biblical Baby Names used into Excel. 19 copy-and-pastes later (!), I had all the data in a single spreadsheet. But then the only way I could tell which names were Biblical was the fact they had hyperlinks (even in Excel). Alas, there’s no standard way you can detect whether a cell has a link in Excel. So I used a function in the extremely useful (and free) ASAP Utilities to convert those links into proper HYPERLINK formulas.
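For anyone who would rather script this step than click through Excel, here is a minimal sketch in Python of the same idea: a name counts as Biblical if its table cell contains a link. This is not what I actually did (I used Excel and ASAP Utilities), and the sample markup, the `LinkedNameParser` class, and the `flag_linked_names` helper are all made up for illustration. The real site's HTML may be laid out differently.

```python
from html.parser import HTMLParser


class LinkedNameParser(HTMLParser):
    """Collect the text of each table cell and note whether it held a link."""

    def __init__(self):
        super().__init__()
        self.rows = []          # list of (name, is_linked) tuples
        self._in_cell = False
        self._has_link = False
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            # Start of a new cell: reset the per-cell state.
            self._in_cell = True
            self._has_link = False
            self._text = []
        elif tag == "a" and self._in_cell and dict(attrs).get("href"):
            # An anchor with an href inside the cell marks the name as linked.
            self._has_link = True

    def handle_data(self, data):
        if self._in_cell:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._in_cell:
            name = "".join(self._text).strip()
            if name:
                self.rows.append((name, self._has_link))
            self._in_cell = False


def flag_linked_names(html):
    """Return (name, is_linked) pairs for every non-empty table cell."""
    parser = LinkedNameParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical sample markup, not copied from the real site.
SAMPLE = """
<table>
  <tr><td><a href="/names/david">David</a></td></tr>
  <tr><td>Trout</td></tr>
  <tr><td><a href="/names/jack">Jack</a></td></tr>
</table>
"""

for name, is_linked in flag_linked_names(SAMPLE):
    print(name, "Biblical" if is_linked else "not Biblical")
```

The point of the design is the same as the Excel trick: the link is the only signal available, so the code turns "has a link" into an explicit true/false column that can be counted later.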
Then I used this user-defined function I found in VBA to detect whether the cells had formulas or not. The Biblical names had the HYPERLINK formulas, whereas the others didn’t have any formulas at all. After I autofilled my formulas (and the years) across all the data, I had a proper data set. Since the 2000 decade data was split up into 7 years (2000-2006), I had to average all that data into one decade figure to make it comparable to all the other decades. Then I just used some tricks involving the SUBTOTAL function in Excel, and I was good to go. It was harder than usual to wrangle the data, so I was actually pretty proud of myself for scraping all that data off the website and putting it all together.
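The decade-averaging step can be sketched in a few lines of Python as well. This is only an illustration under my own assumptions, not the spreadsheet formulas I used: the `biblical_share` and `decade_share` helpers are hypothetical names, and the toy data has four names per year where the real lists have 200.

```python
from statistics import mean


def biblical_share(flags):
    """Percent of names flagged Biblical in one top-names list."""
    return 100.0 * sum(flags) / len(flags)


def decade_share(yearly_flags):
    """Average the per-year shares into a single decade-level figure."""
    return mean(biblical_share(flags) for flags in yearly_flags.values())


# Hypothetical toy data: True means the name was flagged Biblical.
toy_years = {
    2000: [True, False, True, False],
    2001: [True, True, True, False],
    2002: [True, False, False, False],
}

print(decade_share(toy_years))
```

Averaging the yearly percentages (instead of pooling every name from every year) treats each year as one equal observation, which is what makes the yearly data comparable to the single figure available for each earlier decade.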
( *** Start reading again here if you skipped the last part. *** )
What, then, were the results? Historically (from 1880 to 2006), 48.5% of boys got Biblical names, compared with 56.8% of girls. Thus, 52.6% of U.S. kids in total got Biblical names. (Remember, only the top 200 names from each decade, for both boys and girls, were counted.)
Today, things are a bit different. So far this decade, 57.6% of boys got Biblical names compared with 58.6% of girls. In total, that means 58.1% of kids born in the U.S. from 2000-2006 had names from the Bible.
Obviously, a lot more kids have Biblical names these days, which surprised me. However, the overall figure in each decade surprised me as well. I expected many more kids on the whole (especially boys) to have names from the Bible. Here are the same results by decade in chart form:
As you can see, until recently boys have always had fewer Biblical names than girls. (Which is ironic, considering there are probably 5-10 times as many male names as female ones in the Bible.) But for some reason, in the ’40s, the ’70s, and the ’90s, the number of boys with Biblical names shot up. I don’t really have an explanation here. Maybe WWII, Vietnam, and the Gulf War caused surges in Biblical naming interest? That’s a total guess, really. (At least we know there were probably fewer boys in America named Saddam.) It’s an interesting trend, nevertheless.
Anyway, largely because of those surges, boys and girls now have about the same proportion of Biblical names overall (which have grown as a whole). Girls by themselves have experienced the same upward trend, but a much more muted one. There have always been a lot of girls with Biblical names, and that hasn’t changed much over the last 120 years; there were spikes in the ’40s, ’60s, and 2000s, but they weren’t nearly as large as the ones for boys. (Not a lot of female dictators out there to mess up the stats, I guess.)
Granted, the Biblical naming trend hasn’t always risen. The number of Biblical boys’ names dropped heavily in the 1890s and 1900s. Girls saw big drops in the 1890s too, and much later on in the 1990s. I have no great explanation for these either, “Trout Fishing in America” notwithstanding.
Given that the U.S. has been the only major industrialized nation to experience an increase in religious fervor over the last century, you might indeed expect such a trend in names. But is this trend accurate? You might wonder if there’s unfair weight being given to the less popular names that still are in the Top 200. After all, I’m assuming that each name carries equal weight in my analysis, but in fact, the current top male name, Jacob, accounts for 1 percent of boys’ names, and Emily, the current top female name, accounts for around 1 percent of girls’. The numbers drop off very quickly from there.
Thus, if you can see a trend between the popularity of baby names and whether they’re Biblical in origin, my assumption that all of the top 200 names should have equal weight would be wrong. We’d have to weight certain naming segments differently to account for the fact that a lot more (or a lot fewer) babies have Biblical names total. Thankfully for my results, there doesn’t seem to be a trend between the popularity of a baby name and it being Biblical in nature. Here’s a quick chart to demonstrate:
I certainly don’t see a trend over the years between the percentage of kids with a Biblical name and how popular each name was (regardless of gender). Thus, it seems reasonable to assume that we should weight each name equally, regardless of popularity.
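If you want to see what the weighting question looks like in practice, here is a rough sketch in Python that computes the share both ways: once counting every name equally, and once weighting each name by how many babies received it. The names, counts, and function names are all hypothetical placeholders, not figures from my data set.

```python
def equal_weight_share(names):
    """Percent of names that are Biblical, counting every name once."""
    biblical = sum(1 for _, _, is_biblical in names if is_biblical)
    return 100.0 * biblical / len(names)


def popularity_weighted_share(names):
    """Percent of babies with a Biblical name, weighting by baby count."""
    total = sum(count for _, count, _ in names)
    biblical = sum(count for _, count, is_biblical in names if is_biblical)
    return 100.0 * biblical / total


# Hypothetical toy list of (name, number of babies, is_biblical).
toy_names = [
    ("NameA", 400, True),
    ("NameB", 300, False),
    ("NameC", 200, True),
    ("NameD", 100, False),
]

print(equal_weight_share(toy_names))         # 50.0
print(popularity_weighted_share(toy_names))  # 60.0
```

In this toy list the Biblical names happen to be the more popular ones, so the two figures differ. When popularity and Biblical origin are unrelated, as the chart suggests, the two numbers should land close together, which is why equal weighting is a reasonable shortcut.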
In summary, I think I’ve pretty much answered my original question (and then some). Like in that famous fight against Goliath, the problem was bigger than I expected, but with the right tools and a little thought I’ve managed to do all right. As long as I don’t meet any women named Bathsheba, that is.
(As usual, here’s the data set I used, minus the fancy Excel tricks and the linked cells.)
March 26th, 2008 at 2:38 pm
Pretty neat trick to determine whether or not the cells contained hyperlinks. I’ve never encountered a situation where that information was vital, so I had no clue how to tackle that issue. I would have thought that there was a way to use paste special or a format-based function to either strip out the hyperlinks or determine whether or not a cell’s text was underlined. I tinkered with it for about fifteen minutes before I realized it’s just not that easy. I wound up giving up on built-in functionality and wrote a quick macro that was able to do it. It’s not something common enough to warrant a user-defined function, but I was thinking about doing it just for silly fun.
Thanks, Chris. It was the first time I’ve ever run across that situation either. I don’t think there is a built-in way to tackle the situation, like you say. I figured it was easier to let someone else write that functionality for me, so I used copy/paste code and the ASAP Utilities. Writing the code is always a solution, though. Thanks for the feedback!
- Dave