Calvin prof using AI to hear whisper in Twitter's whirlwind
When looking at Twitter, professor Keith Vander Linden formerly saw noise: a continuous roar of chaotic 280-character messages. From this tumult, however, he now discerns meaningful patterns: "If you look at enough tweets," says Vander Linden, "with the right kind of statistical models, you can derive a signal from that, you can find out information about what people are saying about stuff, and from that you can infer what they are thinking."
An international effort
Vander Linden and Roy Adams, a Calvin senior majoring in computer science and mathematics, are analyzing Twitter to assess the Australian public's stance towards mining companies. The duo is working with colleagues at the Commonwealth Scientific and Industrial Research Organization (Australia's national research labs) and the University of Tokyo. Specifically, they hope to determine whether mining companies have what Vander Linden calls a "social license to operate," or public support for their actions.
"[The mining companies] are a big deal in Australia," Vander Linden said. "Australia has significant reserves of natural resources, precious metals, coal, natural gas. [The mining companies] are the Microsofts, the Facebooks, the IBMs; they're very influential, but they're controversial, so we're looking at Twitter to figure out what people think about them."
Assisted by artificial intelligence
The analysis is performed by an artificial intelligence (AI) program that Vander Linden and his colleagues are designing. The program "reads" through a 600,000-tweet database, analyzes each tweet word by word, and then classifies each tweet by the stance with which its word usage is associated. Vander Linden and Adams can then evaluate the quantity and content of tweets for or against a given issue.
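The word-by-word classification described above can be illustrated with a minimal sketch. This is not the researchers' program: it assumes a simple naive Bayes bag-of-words model, and the labels ("for", "against"), the function names, and the four toy tweets are all hypothetical, invented only to show the general idea of associating word usage with a stance.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Split a tweet into lowercase words (a deliberately crude tokenizer).
    return text.lower().split()


def train(labeled_tweets):
    # Count how often each word appears under each stance label.
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in labeled_tweets:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocab


def classify(text, model):
    # Pick the stance whose word statistics best explain the tweet,
    # using log probabilities with add-one (Laplace) smoothing.
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training data, not drawn from the project's database.
TOY_TWEETS = [
    ("mining brings jobs and growth", "for"),
    ("proud of our local mine and its jobs", "for"),
    ("mining destroys rivers and farmland", "against"),
    ("stop the mine protect our farmland", "against"),
]

model = train(TOY_TWEETS)
print(classify("jobs and growth", model))
print(classify("protect rivers and farmland", model))
```

Once every tweet carries a predicted label, counting the labels gives the kind of for-and-against tally the article mentions. A real system would need far more training data and a more careful model than this sketch.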
The AI techniques used by the program have only become effective in the last ten years: "This paradigm shift happened in 2012," said Vander Linden, "this shift from what were called symbolic AI systems to statistical AI systems; everybody's moved to statistical, mathematical models." He added, "What we are doing is riding the wave of that huge shift to the statistical mechanisms."
Adams said that he joined the project as a student researcher because of the project's focus on statistical machine learning, which allowed him to apply many of the skills he had developed in recent mathematics courses, like linear algebra. He added, "This project has been really interesting and enjoyable, especially because it's such a hot field right now."
Seeking patterns in creation
Vander Linden and his colleagues are aware of the ethical issues that surround big data analysis. Having worked in computational linguistics for three decades, Vander Linden is observing real change: "The industry is slowly policing itself, and we are following those ethics guidelines as they develop." He and Adams are taking steps to prevent data misuse: they collect only relevant tweets and tweet information, and they do not publish user information.
Vander Linden and Adams' work in the field is ultimately driven by more than a desire to understand public opinion: "There are statistical patterns in this gift of language, it's not random, there is a signal there, it's part of the created order," said Vander Linden. "God gave us language, and language has meaning; we're looking for those meanings and stances hidden in the text."