RENh-4k Dataset

Introduction

Emotions lie at the core of what makes us human, which makes them highly useful for modeling human behavior. Today, people abundantly express and share emotions through social media. Such platforms let users share opinions or express specific emotions towards what others have shared, mainly as text. This raises an interesting question for analysis: is there a disconnect between the writer’s intended emotion and the reader’s perception of textual content?

In this context, we procure a Readers’ Emotion News dataset using the social news network Rappler and its award-winning Mood Meter widget. Mood Meter lets readers cast emotion votes across several categories (Afraid, Amused, Angry, Annoyed, Don’t care, Happy, Inspired, and Sad) and records the percentage of total votes obtained for each emotion. We choose Rappler over other sources for its simplicity, its popularity, and the ease of organizing news articles under multiple genres with their associated emotion profiles. We manually collect only popular news articles, identified by a high number of emotion votes on the Rappler Mood Meter, to ensure that the selected articles have a high social reach.

RENh-4k is a short-text dataset of 4000 news documents with associated readers’ emotion profiles. Each document combines a news headline with its abstract/snippet, and the corresponding readers’ emotion profile is obtained from readers’ votes on Mood Meter for the emotion classes Afraid, Angry, Happy, Inspired, and Sad. We also assign each document to one of three categories (Health & well-being, Social issues, or Others) after manually verifying the news genre.

Dataset Sample

News Headline: Countries ban China arrivals as virus death toll hits 213
News Abstract: Nearly 10,000 people have been infected in China by the new coronavirus and new cases are found abroad, with more than ...
News Content: BEIJING, China – Countries stepped up travel restrictions on arrivals from China on Friday, January 31, after a global health emergency was declared over a viral epidemic that has killed 213 people. Nearly 10,000 people have been infected in China by the new coronavirus and ...
News Category: Health & well-being
Readers' Emotion:
Anger = 5%
Fear = 75%
Joy = 0%
Sadness = 20%
Surprise = 0%
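
For readers who want to work with records like the sample above, the following is a minimal Python sketch of how one record might be represented. It is an illustration only: the class name `Record`, its field names, the space-joined document text, and the label order in `EMOTIONS` are assumptions made here, and the released files may use a different layout and different labels. The abstract is kept truncated exactly as it appears in the sample.

```python
from dataclasses import dataclass
from typing import Dict, List

# Assumed label order, copied from the sample record; not confirmed by the dataset files.
EMOTIONS = ("Anger", "Fear", "Joy", "Sadness", "Surprise")


@dataclass
class Record:
    """Hypothetical container for one RENh-4k style entry."""
    headline: str
    abstract: str
    category: str
    votes: Dict[str, float]  # emotion label -> percentage of reader votes

    def document(self) -> str:
        # The page says headline and abstract are combined into one document;
        # joining them with a single space is an assumption of this sketch.
        return f"{self.headline} {self.abstract}"

    def profile(self) -> List[float]:
        # Rescale the vote percentages so the profile sums to 1.
        total = sum(self.votes.get(label, 0.0) for label in EMOTIONS)
        if total <= 0:
            raise ValueError("record has no emotion votes")
        return [self.votes.get(label, 0.0) / total for label in EMOTIONS]


sample = Record(
    headline="Countries ban China arrivals as virus death toll hits 213",
    abstract=("Nearly 10,000 people have been infected in China by the new "
              "coronavirus and new cases are found abroad, with more than ..."),
    category="Health & well-being",
    votes={"Anger": 5, "Fear": 75, "Joy": 0, "Sadness": 20, "Surprise": 0},
)

print(sample.document())
print(dict(zip(EMOTIONS, sample.profile())))
```

Representing the profile as a vector of proportions fits the multi-target regression setting described in the related publication, where each emotion is a separate real-valued target.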

People

Anoop K, University of Calicut, Kerala, India. (anoopk_dcs@uoc.ac.in)
Deepak P, Queen’s University Belfast, Northern Ireland, UK. (deepaksp@acm.org)
Savitha Sam Abraham, School of Science and Technology, Örebro University, Örebro, Sweden.
Lajish V L, University of Calicut, Kerala, India.
Manjary P Gangan, University of Calicut, Kerala, India.

Related Publication

Anoop K., Deepak P., Savitha Sam Abraham, Lajish V. L., Manjary P. Gangan. Readers’ affect: predicting and understanding readers’ emotions with deep learning. J Big Data June 2022, 9:82, Springer Nature, ISSN: 2196-1115, DOI: https://doi.org/10.1186/s40537-022-00614-2

Abstract: Emotions are highly useful to model human behavior being at the core of what makes us human. Today, people abundantly express and share emotions through social media. Technological advancements in such platforms enable sharing opinions or expressing any specific emotions towards what others have shared, mainly in the form of textual data. This entails an interesting arena for analysis; as to whether there is a disconnect between the writer’s intended emotion and the reader’s perception of textual content. In this paper, we present experiments for Readers’ Emotion Detection through multi-target regression settings by exploring a Bi-LSTM-based Attention model, where our major intention is to analyze the interpretability and effectiveness of the deep learning model for the task. To conduct experiments, we procure two extensive datasets REN-10k and RENh-4k, apart from using a popular benchmark dataset from SemEval-2007. We perform a two-phase experimental evaluation, first being various coarse-grained and fine-grained evaluations of our model performance in comparison with several baselines belonging to different categories of emotion detection, viz., deep learning, lexicon based, and classical machine learning. Secondly, we evaluate model behavior towards readers’ emotion detection assessing attention maps generated by the model through devising a novel set of qualitative and quantitative metrics. The first phase of experiments shows that our Bi-LSTM+Attention model significantly outperforms all baselines. The second analysis reveals that emotions may be correlated to specific words as well as named entities.

RENh-4k Dataset Download

Acknowledgements

Dataset Download Request Form