Project description: MixedEmotions developed innovative multilingual, multi-modal Big Data analytics applications that brought together emotion analyses of user behaviour from multiple input channels: multilingual text data sources, A/V signal input (multilingual speech, audio, video), social media (social networks, comments), and structured data. Commercial applications (implemented as pilot projects) were in Social TV, Brand Reputation Management and Call Centre Operations. Making sense of accumulated user interaction from different data sources, modalities and languages is challenging and had not previously been fully explored in an industrial context.
Introduction: The Insight part of this project dealt with analysing text for emotion, using data from social media networks such as Twitter and Facebook. The aim was for the system to automatically recognise and annotate the emotions present in a collection of tweets or Facebook statuses. The purpose of the research was to provide businesses with customer sentiment feedback that they could act on to improve. This was an enterprise-led research project.
The project itself had limited ethical issues. It used the data of Twitter and Facebook users, but this data was publicly available, and users agree to terms and conditions that allow its use in research.
Privacy and anonymisation
However, researchers did raise data privacy issues. Posts are public, and one could argue that there is no privacy issue because people are putting this information on the web themselves. On the other hand, these public posts can contain a lot of identifying data.
Key question: Does using social media data lead to privacy issues?
Key question: Can social media data be properly anonymised?
Who owns the data published on Facebook and Twitter? This is unclear. For researchers, publishing anything derived from this publicly available data can be problematic.
Researchers would like to take the data and apply their annotations. In the particular case of the MixedEmotions project, they take a tweet, say, and annotate it to say this person is angry, happy, upset, and so on. Ideally they would like a very simple corpus in which they simply publish each tweet with an emotion attached to it. This would be a very valuable resource for researchers, but they are not allowed to do this. It is not just a data privacy issue; it is also a copyright issue.
If you tweet, you retain the copyright and you license that material to Twitter. The terms and conditions state that you give the service provider a license to use that material as it chooses. Twitter does not own your tweet, but Twitter has the right to do anything it likes with it.
Twitter could extend that license to researchers, but in general it does not; it has done so in a few cases, but not as a whole.
Researchers generally get around this by publishing a permanent link to each tweet or Facebook status, together with a script, so that anyone who wants to use the corpus has to download the actual text themselves. This seems to be the only way to proceed given all the privacy and copyright issues that arise.
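The workaround described above can be sketched in code. This is a minimal, hypothetical illustration (not the project's actual script): the distributed corpus carries only a tweet ID and an emotion label per record, and each user supplies their own fetcher (e.g. backed by their own Twitter API credentials) to retrieve the text locally.

```python
# Sketch of a "link-only" emotion corpus and a hydration step.
# The corpus file contains one JSON record per line with only an
# ID and an emotion label -- no tweet text is ever redistributed.
# fetch_text is a caller-supplied function (hypothetical here); in
# practice it would look the tweet up via the Twitter API.

import json


def load_corpus(lines):
    """Parse one JSON record per line: {"id": ..., "emotion": ...}."""
    return [json.loads(line) for line in lines if line.strip()]


def hydrate(records, fetch_text):
    """Attach tweet text to each record using the supplied fetcher.

    Records whose tweet can no longer be retrieved (e.g. deleted
    posts, for which fetch_text returns None) are dropped.
    """
    hydrated = []
    for rec in records:
        text = fetch_text(rec["id"])
        if text is not None:
            hydrated.append({**rec, "text": text})
    return hydrated
```

Because every collaborator runs the fetch step with their own credentials, only the annotations travel with the corpus; the copyrighted text stays under the platform's own terms. A side effect is that the corpus decays over time as posts are deleted.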
Key question: Are copyright and licensing issues impeding collaborative research?
Key question: Is there a better solution to this problem that researchers are encountering?
Consent and intent
Social media data is freely available and published online. Social media providers such as Twitter allow the information published on their sites to be used for research purposes, but there is a question of intent: do the owners of the information intend for it to be used for research? Is intent something that should be considered? The use of the data is legal because of the terms and conditions that the social media users have agreed to, but the question arises about the ethics of that. Does ticking a box to agree to a complex set of terms and conditions constitute consent?
Key question: Does intent matter? Should companies step up and ensure that social media users are aware that their data may be used for research purposes?
While the ethical issues involved in the MixedEmotions project were few, researchers raised some concerns about issues that might arise in the future, where this research enables some problematic actions.
For example, similar technology could be used to make decisions about giving customers credit cards. The tools are wrong about as often as they are right, so that is a problem. This really needs to be done carefully and correctly, and there is a chance that this is not how the technology would be used. On the other hand, businesses need to be able to make realistic assessments of risk. That is an ethical question in itself.
Key question: Could this technology be used in a way that compromises social media users?