Tuesday, July 27, 2010

Data Sorting World Record Falls



It's hard to imagine sorting a terabyte of data in one minute, but that's what computer scientists at the Jacobs School did, and for their efforts they earned a world record at the Sort Benchmark competition. (Check out the full Sort Benchmark / UC San Diego press release.) But before you go, consider that data sorting is a big deal for a variety of reasons. Facebook ads and Amazon product suggestions are generated thanks to heavy-duty data sorting techniques. Companies across the world are turning to data sorting to sift through the mountains of potentially relevant data that are piling up: data analytics in action.

The lead computer science graduate student on the project, Alex Rasmussen (pictured below), told me during our photo shoot in a Calit2 server room that data sorting is a good way to flex a whole bunch of computing / networking / systems muscles. He put it this way:



“Sorting is also an interesting proxy for a whole bunch of other data processing problems. Generally, sorting is a great way to measure how fast you can read a lot of data off a set of disks, do some basic processing on it, shuffle it around a network and write it to another set of disks,” explained Rasmussen. “Sorting puts a lot of stress on the entire input/output subsystem, from the hard drives and the networking hardware to the operating system and application software.”
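For readers who want to see the shape of the pipeline Rasmussen describes (read, do some basic processing, shuffle, write), here is a minimal in-memory sketch in Python. It is a toy illustration and not the team's record-setting system: the function name `partition_sort`, the `num_nodes` parameter, and the idea of routing records by their first byte are all assumptions made for this example, and plain lists stand in for the disks and the network.

```python
def partition_sort(records, num_nodes, key_space=256):
    """Toy range-partitioned sort over non-empty bytes records.

    Hypothetical helper for illustration only: each list in `buckets`
    stands in for one machine in a cluster.
    """
    # "Shuffle" step: route every record to the bucket that owns its key
    # range. The first byte (0..255) decides the destination, so bucket 0
    # holds the smallest keys and the last bucket holds the largest.
    buckets = [[] for _ in range(num_nodes)]
    for rec in records:
        node = rec[0] * num_nodes // key_space
        buckets[node].append(rec)

    # Local step: each "node" sorts only its own share of the data.
    # "Write" step: because the key ranges do not overlap, concatenating
    # the buckets in order yields a globally sorted result.
    output = []
    for bucket in buckets:
        bucket.sort()
        output.extend(bucket)
    return output
```

Calling `partition_sort(records, 4)` on a list of `bytes` records returns the same ordering as Python's built-in `sorted(records)`. In a real cluster, the routing loop is where the data crosses the network, which is one reason sorting stresses the whole input/output path rather than just the CPU.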

For anyone following along, this is the follow-up to the 10,318 seconds post.

Meet the (Electrical) Engineer in the San Diego Union-Tribune

Electrical and computer engineering professor Gert Lanckriet was profiled in the "Meet the Engineer/Scientist" feature in the July 26 "Quest" section of the San Diego Union-Tribune.

So check it out before the paper makes its way to the recycling bin. An image of the story is below.