20 December 2009

Names: How many of each?

Names, names. How many of each name are there and how do the respondents identify in regard to their ethnicity or race?

A Facebook blog entry comparing the diversity of its members with data from the US Census Bureau's Genealogy Project was the focus of a Wall Street Journal article.

How do Facebook members compare to the Genealogy Project listing the most popular surnames?
Facebook data team researchers Lars Backstrom, Jonathan Chang, Cameron Marlow and Itamar Rosenn wrote that they were interested in finding out more about the composition of race and ethnicity among Facebook users. So they compared Facebook users’ surnames with data from the US Census Bureau's Genealogy Project, which measured the frequency of the most popular 150,000 surnames in the United States along with the race and ethnicity associated with the person who had each one.
The Wall Street Journal story appeared in Digits, its blog on technology news and insights, which referred readers to the U.S. Census Bureau’s Genealogy Project.

That link provides three files. The first is the methodology and definitions. The second: an Excel list of the top 1,000 names; the third: a list of names appearing 100 times or more, for a total of more than 151,000 names.

To read more about what the data scientists at Facebook found, read the links above.

Tracing the Tribe did its own look-see on the third list of names for DARDASHTI and TALALAY, as I certainly didn't expect to see them among the 1,000 most popular surnames.

DARDASHTI ranked 112,365 with a count of 145 individuals.

That was surprising as we have more than 1,000 relatives in Greater Los Angeles with this name. If it were only 145 individuals, family weddings - and even Passover seders - would be much smaller.

The name occurs 0.05 times in every 100,000 people; 80.69% identified as White, 0% as Black, 0% as Asian and 17.93% as two races, and data was suppressed for American Indian/Alaska Native and Hispanic.

This is understandable as Persians believe they are Asian (Iran is in Asia), but the Los Angeles School District and similar city and federal lists put Asian under Caucasian. And there are a few marriages in the family that include other races.

TALALAY did not appear - it is a rare name anywhere, and none of the 35 original spellings of the name appeared either. However, reflecting the name change in 1905 by my immigrant great-grandfather, the adopted surname TOLLIN was in the list and ranked 84,310 with a count of 207 individuals.

While most TALALAY became TOLLIN, so did others named TOLCHINSKY - as I've discovered over the years. Not all TOLLIN are TALALAY. The name occurs 0.08 times in every 100,000 individuals; 94.2% identified as White, 3.38% as Black and 0% as American Indian/Alaska Native, and data was suppressed for two races and Hispanic.
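For readers who want to sanity-check the per-100,000 figures, here is a minimal sketch of the arithmetic. The total of roughly 270 million people covered by the list is an assumption, not a number taken from the files discussed here, and `per_100k` is just an illustrative helper name.

```python
# Assumption: the surname list covers roughly 270 million people.
# This total is NOT from the post; adjust it if the methodology file says otherwise.
ASSUMED_POPULATION = 270_000_000


def per_100k(count, population=ASSUMED_POPULATION):
    """Return how many people per 100,000 carry a surname held by `count` people."""
    return count / population * 100_000


# Counts quoted in the post for the two surnames.
for surname, count in [("DARDASHTI", 145), ("TOLLIN", 207)]:
    print(f"{surname}: {per_100k(count):.2f} per 100,000")
```

Under that assumed total, the rounded results line up with the 0.05 and 0.08 figures quoted for the two names.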

Have you checked your own names of interest?
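If you would rather not scroll through more than 151,000 rows by hand, a short script can do the checking. This is only a sketch: the column names (`name`, `rank`, `count`) and the helper `find_surnames` are assumptions for illustration, and the two sample rows simply repeat the figures quoted above rather than reading the real file.

```python
import csv
import io

# Hypothetical excerpt in CSV form. The real file's column names may differ;
# the values below are the ones quoted in this post.
SAMPLE = """name,rank,count
DARDASHTI,112365,145
TOLLIN,84310,207
"""


def find_surnames(csv_text, wanted):
    """Return {surname: row} for every wanted surname present in the CSV text."""
    wanted_upper = {w.strip().upper() for w in wanted}
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["name"]: row for row in reader
            if row["name"].upper() in wanted_upper}


names_of_interest = ["Dardashti", "Talalay", "Tollin"]
hits = find_surnames(SAMPLE, names_of_interest)
for name in (n.upper() for n in names_of_interest):
    row = hits.get(name)
    if row is None:
        print(f"{name}: not in the list")
    else:
        print(f"{name}: rank {int(row['rank']):,}, count {row['count']}")
```

To run it against a downloaded copy, replace `SAMPLE` with the file's contents and adjust the column names to match its header row.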
