CS547 Human-Computer Interaction Seminar  (Seminar on People, Computers, and Design)

Fridays 12:50-2:05 · Gates B01 · Open to the public
Panos Ipeirotis · NYU Stern
Crowdsourcing: Achieving Data Quality with Imperfect Humans
October 26, 2012

Crowdsourcing is a great tool to collect data and support machine learning -- it is the ultimate form of outsourcing. But crowdsourcing introduces budget and quality challenges that must be addressed to realize its benefits. In this talk, I will discuss the use of crowdsourcing for building robust machine learning models quickly and under budget constraints. I'll operate under the realistic assumption that we are processing imperfect labels that reflect random and systematic error on the part of human workers. I will also describe how our "beat the machine" system engages humans to improve a machine learning system by discovering cases where the machine fails, and fails while confident that it is correct. I'll draw my examples from classification problems that arise in online advertising. Finally, I'll discuss our latest results showing that mice and Mechanical Turk workers are not that different after all.
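For readers unfamiliar with the repeated-labeling setting the abstract alludes to, the sketch below shows one common baseline: collecting redundant labels from several workers and aggregating them by majority vote. The input format and function name are illustrative assumptions only; the methods discussed in the talk, which model random and systematic worker error explicitly, go well beyond this baseline.

```python
from collections import Counter, defaultdict

def aggregate_by_majority(labels):
    """Collapse repeated, possibly noisy worker labels into one label per item.

    `labels` is an iterable of (item_id, worker_label) pairs -- a hypothetical
    input format chosen for this sketch, not the talk's own data model.
    """
    votes = defaultdict(Counter)
    for item_id, label in labels:
        votes[item_id][label] += 1
    # Majority vote per item; ties are broken arbitrarily by Counter ordering.
    return {item_id: counts.most_common(1)[0][0]
            for item_id, counts in votes.items()}

# Example: three workers each label two web pages for an ad-safety task.
raw = [
    ("page1", "safe"), ("page1", "safe"), ("page1", "unsafe"),
    ("page2", "unsafe"), ("page2", "unsafe"), ("page2", "safe"),
]
print(aggregate_by_majority(raw))  # {'page1': 'safe', 'page2': 'unsafe'}
```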


Panos Ipeirotis is an Associate Professor and George A. Kellner Faculty Fellow in the Department of Information, Operations, and Management Sciences at the Leonard N. Stern School of Business of New York University. He is also the Chief Scientist at Tagasauris, and in 2012-2013 serves as "academic-in-residence" at oDesk Research. His recent research interests focus on crowdsourcing and on mining user-generated content on the Internet. He has received multiple best paper awards and is also a recipient of a CAREER award from the National Science Foundation. In his spare time, he writes about crowdsourcing and various other topics on his blog, "A Computer Scientist in a Business School," an activity that seems to generate more interest and recognition than any of the above.