Dataclysm explores the insights we can glean about human behavior from massive data sets. The author, a co-founder of OkCupid, one of the longest-running dating sites, discusses his role in analyzing the vast amounts of data collected by the platform. It’s astonishing to consider the revelations derived from the data generated by a dating website: the number of messages exchanged before contact information is shared, the types of people an individual messages, how they rate others, and with whom they ultimately connect.
Let’s begin with some intriguing statistics derived from the data. Women, on average, rate men lower when assessing their looks. When asked about their preferred age range for partners, they tend to choose men within 3-4 years of their own age.
In contrast, men consistently rate women in their early 20s the highest, yet they also tend to give higher ratings to women overall. This suggests two things: 1) women may reach peak perceived beauty around their early 20s, and 2) men are generally less critical of looks.
Another fascinating and potentially useful observation is the phenomenon of extreme message reception. While it’s logical that highly-rated (conventionally attractive) individuals receive more messages, those with significantly lower ratings also experience a surge in messages. This is often due to a distinctive or polarizing trait that elicits strong reactions, either positive or negative. A quirky feature, such as blue hair, can generate nearly as many messages as those received by conventionally attractive individuals.
The book delves into numerous aspects of human nature, far too many to list here. One challenge in reading Dataclysm is that it was published in 2014, making some of its predictions feel dated. For instance, sections discussing the prediction of states likely to legalize same-sex marriage based on Google Trends and Facebook relationship statuses, and another about flu outbreaks, reflect topics of that time, but are less relevant now. However, the book accurately predicted the rise of Twitter mobs.
Other startling revelations gleaned from the data include:
- The ability to determine ethnicity and gender based on the most frequently used words in Facebook posts and OkCupid messages.
- The prediction of sexual orientation based on Facebook friends and later, Facebook likes.
- The potential to predict IQ with sufficient data and analytical tools.
One particularly interesting finding concerned bisexuality. Although approximately 5% of OkCupid users registered as bisexual, most primarily messaged individuals of one gender. Only 2.3% actively messaged both genders.
As a computer scientist and former CEO, I found this book incredibly insightful. It underscores the advancements in data mining that must have occurred since its publication 11 years ago.
5 stars.