Introduction

Identifying fake news is increasingly recognized as an important computational task with high potential social impact. Misinformation is routinely injected into almost every domain of news, including politics, health, science, and business; among these, fake news in the health domain poses a serious risk to health and well-being in modern societies. Most fake news datasets focus on microblogging websites in the political domain, which makes them less suitable for the content-focused misinformation identification that the health domain warrants. We therefore curated a dataset of fake and legitimate news articles on the topic of health and well-being. For legitimate news, we crawled 500 health and well-being articles from reputable sources such as CNN, NYTimes, New Indian Express, and many others, and manually double-checked them for truthfulness. For fake news, we crawled 500 articles on similar topics from well-reported misinformation websites such as BeforeItsNews, Nephef, MadWorldNews, and many others.


Dataset Statistics

Dataset                        Class   Documents in Class   Avg. Words per Document   Avg. Sentences per Document   Total Words
Health and Well Being (HWB)    Real    500                  724                       31                            362117
Health and Well Being (HWB)    Fake    500                  578                       28                            289477
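
As a quick sanity check on the table, the per-document word averages can be recomputed from the totals. The short Python sketch below does only that arithmetic; the helper name `average_words` is hypothetical, and the observation that the reported averages look like the quotient with the fraction dropped is an inference from the numbers, not something the page states.

```python
# Totals and document counts copied from the Dataset Statistics table.
stats = {
    "Real": {"documents": 500, "total_words": 362117},
    "Fake": {"documents": 500, "total_words": 289477},
}


def average_words(total_words, documents):
    """Hypothetical helper: mean number of words per document."""
    return total_words / documents


for label, s in stats.items():
    avg = average_words(s["total_words"], s["documents"])
    # Print the exact quotient next to its integer part for comparison
    # with the table (724 for Real, 578 for Fake).
    print(label, round(avg, 2), int(avg))
```

The integer parts agree with the reported 724 and 578, which suggests the table drops the fractional part rather than rounding.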


Examples of health fake news headlines and excerpts

Russian Scientist Captures Soul Leaving Body; Quantifies Chakras
It uses a small electrical current that is connected to the fingertips and takes less than a millisecond to send signals from. When these electric charges are pulsed through the body, our bodies naturally respond with a kind of ‘electron cloud’ made up of light photons. Korotkov also used a type of Kirlian photography to show the exact moment someone’s soul left their body at the time of death! He says there is a blue life force you can see leaving the body. He says the navel and the head are the first parts of us to lose their life force and the heart and groin are the last. In other cases, he’s noted that the soul of people who have had violent or unexpected deaths can manifest in a state of confusion and their consciousness doesn’t actually know that they have died.


People

  1. Anoop K, University of Calicut, Kerala, India. (anoopk_dcs@uoc.ac.in)
  2. Deepak P, Queen’s University Belfast, Northern Ireland, UK. (deepaksp@acm.org)
  3. Lajish V L, University of Calicut, Kerala, India. (lajish@uoc.ac.in)

Related Publication

Anoop K, Deepak P, and Lajish V L. 2020. Emotion Cognizance Improves Health Fake News Identification. In 24th International Database Engineering & Applications Symposium (IDEAS 2020), August 12–14, 2020, Seoul, Republic of Korea. ACM, New York, NY, USA, Article 12, 1-10, https://doi.org/10.1145/3410566.3410595.



Abstract: Identifying fake news is increasingly being recognized as an important computational task with high potential social impact. Misinformation is routinely injected into almost every domain of news including politics, health, science, business, etc., among which, the fake news in the health domain poses serious risk and harm to health and well-being in modern societies. In this paper, we consider the utility of the affective character of news articles for fake news identification in the health domain and present evidence that emotion cognizant representations are significantly more suited for the task. We outline a simple technique that works by leveraging emotion intensity lexicons to develop emotion-amplified text representations and evaluate the utility of such a representation for identifying fake news relating to health in various supervised and unsupervised scenarios. The consistent and notable empirical gains that we observe over a range of technique types and parameter settings establish the utility of the emotional information in news articles, an often overlooked aspect, for the task of misinformation identification in the health domain.
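
To make the idea of an emotion-amplified text representation more concrete, here is a minimal Python sketch. It assumes one plausible reading of the abstract: words that an emotion intensity lexicon scores above a threshold get their emotion label appended to the text. The toy lexicon, its scores, the function name `amplify_emotions`, and the `threshold` parameter are all hypothetical illustrations and are not taken from the paper or from any real lexicon.

```python
import re

# Hypothetical toy lexicon: word -> (emotion label, intensity in [0, 1]).
# These entries are invented for illustration; a real emotion intensity
# lexicon would be loaded from a file instead.
TOY_LEXICON = {
    "death": ("sadness", 0.92),
    "violent": ("anger", 0.84),
    "confusion": ("fear", 0.45),
    "cure": ("joy", 0.55),
}


def amplify_emotions(text, lexicon, threshold=0.5):
    """Append the emotion label after each word whose intensity >= threshold."""
    tokens = re.findall(r"[a-z']+", text.lower())
    amplified = []
    for token in tokens:
        amplified.append(token)
        entry = lexicon.get(token)
        if entry is not None and entry[1] >= threshold:
            # Insert the emotion label right after the emotional word.
            amplified.append(entry[0])
    return " ".join(amplified)


print(amplify_emotions("Violent death can cause confusion", TOY_LEXICON))
# prints: violent anger death sadness can cause confusion
```

The amplified text could then be passed to any ordinary text representation or classifier in place of the raw article; the threshold controls how strongly emotional a word must be before it is amplified.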


HWB Dataset Download

Please follow the steps below to download the HWB Fake News Dataset.

Step 1: Please fill in the request form.
Step 2: You will be sent a download link within a few days of submitting the request.
Step 3: If you use this dataset in your research, please acknowledge the Health & Well Being (HWB) Fake News Dataset and its authors using the citation below:

Cite the work:
Anoop K, Deepak P, and Lajish V L. 2020. Emotion Cognizance Improves Health Fake News Identification. In 24th International Database Engineering & Applications Symposium (IDEAS 2020), August 12–14, 2020, Seoul, Republic of Korea. ACM, New York, NY, USA, Article 12, 1-10, https://doi.org/10.1145/3410566.3410595


Other Related Publications

1. Anoop K., Affect-oriented Fake News Detection using Machine Learning, AWSAR Awarded Popular Science Stories By Scientists for the People, ISBN: 978-81-7480-337-5, Published by Vigyan Prasar (An Autonomous Organization of Department of Science and Technology), pp. 402-404, Augmenting Writing Skills for Articulating Research (AWSAR) Awards - 2019, Instituted by the Department of Science and Technology, Govt. of India, [article]

2. Iknoor Singh, Deepak P., Anoop K. (2020). On the Coherence of Fake News Articles. ECML PKDD 2020 Workshops. ECML PKDD 2020. Communications in Computer and Information Science, vol 1323. Springer, Cham. https://doi.org/10.1007/978-3-030-65965-3_42 [paper]

3. Anoop K, Manjary P Gangan, Deepak P, & Lajish V L (2019). Leveraging heterogeneous data for fake news detection, Linking and Mining Heterogeneous and Multi-view Data. Unsupervised and Semi-Supervised Learning. pp. 229-264. Springer, Cham. https://doi.org/10.1007/978-3-030-01872-6_10. [paper]


Media Coverage for the Work
[The Hindu - Print], [The Hindu - Web], [Deshabhimani - Print], [Mathrubhumi - Print], [Mathrubhumi - Web], [New Indian Express - Print], [Keesa - Web]

Dataset Download Request Form

Please enter a working email address.
We will use this email to send the confirmation.

HWB Terms of Use

Copyright © 2020 by the Computational Intelligence and Data Analytics Lab, Department of Computer Science, University of Calicut, Kerala, India.
This dataset is free to use for non-commercial purposes, such as academic research.
If you use this dataset in your research, please acknowledge the Health & Well Being (HWB) Fake News Dataset and its authors using the citation below:
Anoop K, Deepak P, and Lajish V L. 2020. Emotion Cognizance Improves Health Fake News Identification. In 24th International Database Engineering & Applications Symposium (IDEAS 2020), August 12–14, 2020, Seoul, Republic of Korea. ACM, New York, NY, USA, Article 12, 1-10, https://doi.org/10.1145/3410566.3410595
