The Hackernews dataset

This corpus was extracted from The Hacker News website (https://thehackernews.com), a cybersecurity (CS) news platform that attracts over 8 million readers monthly. The site is updated daily with the latest CS news and provides in-depth reports on current and future CS trends. We developed a web crawler and scraper for this website that retrieves, extracts, collects and normalises only the text of each posted news article. The scraping task has been performed bimonthly since 25/03/2022, so the dataset is updated regularly.
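The extraction and normalisation step described above can be illustrated with a minimal sketch. It is not the authors' crawler: the names `extract_text` and `normalise`, the choice to skip `<script>` and `<style>` contents, and the use of Unicode NFKC plus whitespace collapsing are all assumptions made for illustration. It uses only the Python standard library and works on an HTML string, so no network access is involved.

```python
import re
import unicodedata
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are outside skipped elements.
        if self._skip_depth == 0:
            self.chunks.append(data)


def normalise(text):
    """Hypothetical normalisation: NFKC, then collapse runs of whitespace."""
    text = unicodedata.normalize("NFKC", text)
    return re.sub(r"\s+", " ", text).strip()


def extract_text(html):
    """Return only the normalised text content of an HTML document."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return normalise(" ".join(parser.chunks))


# Illustrative input, not taken from the dataset.
sample = (
    "<html><head><style>p{color:red}</style></head><body>"
    "<h1>Title</h1><p>First&nbsp;paragraph.</p>"
    "<script>var x=1;</script></body></html>"
)
print(extract_text(sample))
```

A real scraper would also need to fetch pages, respect the site's terms of use, and isolate the article body from navigation and advertising markup; those parts are left out here.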

Data and Resources
  • the-hackenews

Personal Data Attributes

Description: Personal Data related Information

Field: Value
Anonymised: Pseudo Anonymized
ChildrenData: No
Cross Border Authorised: Yes
Data Protection Impact Assessment: No
Ethics Committee Approval: No
General Data: Yes
Informed Consent Template: No
Personal Data: No
Personal data was manifestly made public by the data subject: No
Sensitive Data: No
Additional Info
Field: Value
Accessibility: Both
Accessibility Mode: Download
Availability: On-Line
Basic rights: Download
Creation Date: 2023-07-31
Creator: Silvestri, Stefano, stefano.silvestri@icar.cnr.it, orcid.org/0000-0002-9890-8409
Dataset Citation: Silvestri, S.; Islam, S.; Papastergiou, S.; Tzagkarakis, C.; Ciampi, M. A Machine Learning Approach for the NLP-Based Analysis of Cyber Threats and Vulnerabilities of the Healthcare Ecosystem. Sensors 2023, 23, 651. https://doi.org/10.3390/s23020651
Dataset Re-Use Safeguards: None
Field/Scope of use: Non-commercial research only
Group: Pervasive Intelligence in Cyber-Physical Systems for Future Society
Language: eng, English
License term: 2023-09-28 / 2027-09-28
Manifestation Type: Virtual
Processing Degree: Primary
Retention Period: 2023-09-28 / 2027-09-28
SoBigData Node: SoBigData EU
SoBigData Node: SoBigData IT
Sublicense rights: No
Territory of use: World Wide
Thematic Cluster: Web Analytics [WA]
Time Coverage: 2022-03-25 / 2023-07-31
system:type: Dataset
Management Info
Field: Value
Author: SILVESTRI STEFANO
Maintainer: SILVESTRI STEFANO
Version: 1
Last Updated: 11 December 2023, 15:38 (CET)
Created: 28 September 2023, 11:42 (CEST)