The Data on Datamatch

20-year-old algorithm matches thousands of undergraduates

Over half of Harvard undergraduates awoke on Valentine's Day morning last Friday to find a personalized list of the students with whom a Harvard Computer Society algorithm determined they are most compatible.

A total of 3,672 students, 2,074 women and 1,598 men, participated in this year's Datamatch, which 20 years after its founding still uses the same algorithm.

“The pairing is actually pretty complicated,” HCS President William S. Xiao ’16 said.


HCS meets the night before Datamatch results come out to sort all of the questions from the survey into ten personality trait-based categories. Every answer is then plotted based on the traits it tests for.

Questions are then weighted based on the intuition of HCS members, distribution of answers, and the correlation between the answers of survey participants who indicated that they were in a relationship with another participant.
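HCS has not published the algorithm, so the following is only an illustrative sketch of how trait-based, weighted matching of this general kind could work, not the Datamatch method itself. The trait names, the per-answer trait scores, the question weights, and the use of cosine similarity are all assumptions made for the example.

```python
from math import sqrt

# Hypothetical trait categories (the article mentions ten; three are shown here).
TRAITS = ["humor", "adventurousness", "studiousness"]

def trait_vector(answers, answer_traits, question_weights):
    """Sum the weighted trait scores of one participant's answers."""
    vec = [0.0] * len(TRAITS)
    for question, answer in answers.items():
        weight = question_weights.get(question, 1.0)
        for i, score in enumerate(answer_traits[question][answer]):
            vec[i] += weight * score
    return vec

def cosine_similarity(a, b):
    """Cosine of the angle between two trait vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_matches(name, vectors, k=3):
    """Return the k other participants most similar to `name`."""
    scored = [
        (cosine_similarity(vectors[name], vec), other)
        for other, vec in vectors.items()
        if other != name
    ]
    # Highest similarity first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:k]]

# Made-up survey data: each answer maps to assumed scores on the traits above.
answer_traits = {
    "q1": {"a": [1, 0, 0], "b": [0, 1, 0]},
    "q2": {"a": [0, 0, 1], "b": [1, 0, 0]},
}
question_weights = {"q1": 2.0, "q2": 1.0}  # assumed weights
responses = {
    "alice": {"q1": "a", "q2": "a"},
    "bob": {"q1": "a", "q2": "b"},
    "carol": {"q1": "b", "q2": "a"},
}
vectors = {
    person: trait_vector(ans, answer_traits, question_weights)
    for person, ans in responses.items()
}
print(top_matches("alice", vectors, k=2))
```

In this toy data, the heavier weight on the first question pulls together the two participants who answered it the same way, which is the kind of effect weighting is meant to have.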

“A lot of people ask if it’s random, and it’s not random,” said Xiao, who added that there is “some secret sauce in it that I can’t reveal.”

Questions for the survey are generated less scientifically. “With the questions we just think about anything that will be funny, and we think about the implications of the questions later,” Xiao said.

Questions commonly contain Harvard cultural references, some of which Xiao said are generated from The Crimson’s Ten Stories That Shaped 2013. There are also generic questions, like “what would you like to see in a Valentine,” which also have comical answers.

Datamatch appealed most to freshmen this year, with 1,007 taking the survey. By contrast, 903 sophomores, 741 juniors, and 750 seniors participated.

One freshman, Maggie H. Schell ’17, who completed Datamatch with friends, said, “It was funny. We did it as a joke to see who we would get.”

Schell had not yet contacted or been contacted by any of her matches.

