Max Kaufmann ’12 stepped off the plane from a 14-hour flight to Calcutta, with no cash in his pocket and no experience navigating international travel. So what brought the independent major from Spring, Texas, to this diverse country?
Linguistics and Twitter.
Kaufmann came to Grinnell considering an art major, but when he took Introduction to General Linguistics, he was hooked. The linguistics concentration at Grinnell is interdisciplinary, with core courses in psychology, philosophy, English, computer science, and anthropology. Kaufmann enjoyed the concentration courses but wanted more intensive instruction, especially in computational linguistics, the use of computers to understand language.
So Kaufmann applied for an independent major. Independent majors create, with their advisers, a program of study tailored to their individual interests. One of Kaufmann’s advisers, computer scientist John Stone, suggested that he look for a summer opportunity to augment his interest in computational linguistics.
Kaufmann secured a summer research position at the University of Colorado, Colorado Springs, through the National Science Foundation’s Research Experiences for Undergraduates (REU) program. His three-month research project focused on making the abbreviated language of Twitter “less noisy” and more readable through natural language processing (NLP).
“In the field of linguistics, we call Twitter a ‘noisy text,’ because there’s a lot of stuff in there that you don’t want. I was focused on stripping that stuff out using natural language processing and making the messages more readable,” Kaufmann said. His research involved using computer translation tools to turn Twitter English into normal English, comparing human and machine translations.
The outcome of his short summer experience was a research paper, which he submitted to the International Conference on Natural Language Processing (ICON) with the encouragement of his summer mentor, Dr. Jugal Kalita.
Although the conference has only a 20 percent acceptance rate, Kaufmann’s paper was accepted for presentation in December in Kharagpur, India. “Once I found out I had gotten accepted, I still had to deal with getting to India somehow,” he said. Stone suggested Kaufmann request funding from the College’s Center for International Studies (CIS), where director David Harrison guided him through the grant application process.
“I got experience doing research, getting published and applying for funding all at once,” Kaufmann said, referring to experiences often reserved for graduate students. With funding from CIS, the Division of Student Affairs, and personal resources, Kaufmann met his mentor in Calcutta to make the arduous cross-country trip to Kharagpur.
“India is a linguistically diverse place, and very little English is spoken in Kharagpur,” Kaufmann said. “It was a huge culture shock to be one of only two or three Caucasians at the conference. But we were there to solve problems with NLP that could help solve problems in the country.” Kaufmann’s research is of particular interest in India, where some relatively common languages have small digital footprints. The recent increase in mobile phones and texting offers linguists studying these languages a rich source of data.
During the spring semester Kaufmann will study at the Institute for the International Education of Students in Salamanca, Spain, where he will focus on syntax and semantics as part of his linguistics studies. Just don’t expect any Twitter posts about the Iberian experience: he’s not a Twitter user. “I think it’s pretty silly from a social standpoint, but from a research standpoint, it’s interesting and exciting,” he said … in fewer than 140 characters.