Teaching Computers to Listen


In this second installment of a three-part series, Dartmouth Now profiles faculty appointed to newly endowed chairs this year. See photos of them all, including the six faculty appointed to established chairs, at Dartmouth’s Flickr page.

Dartmouth’s first endowed chair, the John Phillips Chair in Divinity, was established in 1787. Since then, endowed chairs have helped keep Dartmouth at the forefront of higher education by honoring and supporting faculty who advance the discovery of knowledge and return the gift of learning to their students.

What would it take to make searching for related songs in a music archive as straightforward as finding related phrases in an online library? Professor Michael Casey, James Wright Professor and chair of the Department of Music, is enlisting computers to gather and label the data that will underlie that sort of search.


Professor of Music Michael Casey works on programming computers to extract useful information from sound. He is also applying his computer algorithms to film with a grant from the NEH. (photo by Joseph Mehling ’69)

Casey works in machine listening—programming computers to extract useful information from sound. Machines listen much faster than humans can, notes Casey, who also directs the Bregman Music and Audio Research Studio at Dartmouth, “and they can listen on a large scale.” After the equivalent of millions of hours of human listening, the computers can tell us details about musical culture and how it has evolved over the last 100 years, he explains. “Who was the first person to put a straight eight beat in a pop song as opposed to a swing beat? Where did the first reggae back-beat start? Where did certain styles of singing start? When you have access to archives that consist of most of the world’s recorded output, you can start to ask really interesting questions about culture.”

Thanks to a recent Faculty Research Award from Google, Casey and his research group are teaching machines to listen for the different parts of a song, so that the music can be searched by rhythm (groove) and sound (timbre). “Search by Groove” will involve close collaboration among Dartmouth’s Department of Music and Department of Computer Science, and Google Labs. Companies like Google, Casey says, are interested in such technologies for future search engines and music services. “There’s a lot of interest in helping people to find music and in connecting people to music that connects them to other people,” he observes.
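To make the idea of searching by groove or timbre concrete, here is a minimal sketch of similarity search, not the method used by Casey's group. It assumes each track has already been reduced to a short list of numbers describing its rhythm; the descriptors, track names, and function names below are all hypothetical. Tracks are ranked by cosine similarity to a query.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a silent (all-zero) descriptor matches nothing
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, archive):
    """Return track names sorted from most to least similar to the query."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in archive.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored]


# Hypothetical archive: invented four-number "groove" descriptors,
# one accent strength per beat of the bar. Real systems use far richer features.
archive = {
    "swing_tune": [1.0, 0.2, 0.7, 0.1],
    "straight_rock": [1.0, 1.0, 1.0, 1.0],
    "reggae_backbeat": [0.1, 1.0, 0.1, 1.0],
}

query = [0.9, 0.9, 1.0, 0.8]  # an evenly accented bar
ranking = rank_by_similarity(query, archive)
print(ranking)
```

In this toy archive the evenly accented query ranks closest to the "straight_rock" descriptor. A search over millions of recordings would need audio feature extraction and an index built for fast lookup, both of which this sketch leaves out.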

But the implications of being able to search by sound and find related recordings reach far beyond the ability to track down a haunting melody. Casey is at the helm of a suite of such projects, leading research teams at the forefront of music information retrieval, machine hearing, and audio-visual archive search and navigation.

Searching Cinema

Casey is applying his computer algorithms to film through a National Endowment for the Humanities (NEH) Digital Humanities project called ACTION: Audio-visual Cinematic Toolbox for Interaction, Organization, and Navigation. “The main idea is to take the kind of computational tools that we’ve been using for a while now on collections of music to see if we can apply that to cinema as well,” he says. He is collaborating on this project with Mark Williams, associate professor in the Department of Film and Media Studies, and the two will use machine listening and machine vision software to analyze such things as camera shot changes and soundtrack usage.

The pair will start their research with the films of Alfred Hitchcock and then progress to the Hollywood greats, pursuing the evolution of film’s visual language. Along the way, they’ll also be laying new ground rules and methodologies for the emerging field of digital humanities.

Opening the Archive

Casey is currently starting a project with the BBC, the United Kingdom’s national and international broadcasting service, in collaboration with Professors Dan Rockmore and Scott Pauls in the Department of Mathematics, and with Goldsmiths College at the University of London. The project aims to deploy algorithms that listen to the BBC archives and reveal connections within them that aren’t currently well known, he explains. “Such archives are huge, hundreds of thousands of hours of music and audio, millions of documents. Only a tiny fraction of them are currently visible,” says Casey, who grew up listening to the BBC in Leicester, England. One of the possible uses of this technology is to “recommend items from recorded archives that overlap with your current listening or viewing habits, thereby extending your knowledge horizon.”

Casey’s inaugural appointment to the James Wright Professorship is a further nod to his habit of working across traditional academic boundaries. Established by the Sherman Fairchild Foundation in 2009 to honor Dartmouth’s sixteenth president upon his retirement from the post, the chair’s purpose is to further the College’s commitment to maintaining an excellent faculty and to foster an interdisciplinary learning environment. “There are so many smart people at Dartmouth, and I’m continually learning by working with other people who have different insights,” says Casey. “Those insights improve your own focus and research.”

Making Connections

Casey, who received his doctorate from the MIT Media Laboratory, is a musician as well as a scholar: a trombonist who has performed with the Barbary Coast Jazz Ensemble and an internationally honored composer of electronic music.

“It used to be, when I was a boy, that I’d save up my pocket money, and I’d go to the record store maybe once a week and buy one or two songs,” Casey says. “I had access to a hundred or so tracks each year. And now there are millions and millions of tracks to choose from, thanks to the Internet.”

“One of the prevalent trends in music in the last few years is mash-ups, a few seconds of 30 or 40 tracks put all together as one track,” Casey continues. Woven together, he notes, a mash-up is “a palimpsest, a new document from fragments of lots of other documents within it, and those other documents have meaning. I think there are some interesting questions to be asked about that.”

“But,” Casey predicts, “I think one of the ways that people will consume culture is they won’t just want to watch or just listen: they’ll want to know what it is connected to.” Thanks to the work of Casey and his colleagues, searching for the answers to that sort of question is becoming more and more possible.