67 items found

Tags: Text mining

  • Dataset

    Amazon reviews

    This dataset (provided as a link) contains product reviews and metadata from Amazon, including 142.8 million reviews spanning May 1996 to July 2014. This dataset includes reviews...
    • HTML
      The resource 'Julian McAuley's repository' is not accessible as a guest user. You must log in to access it.
  • Experiment

    Private Semantic annotation of municipal resolutions

    Proof-of-concept project on using text mining techniques on public administration documents to improve transparency and access to information by...
  • Method

    Distance Calculator

    The program calculates semantic distances between input texts. As a command-line script, it takes a list of tab-separated text pairs (one pair per line) and returns...
    • The resource 'Code' is not accessible as a guest user. You must log in to access it.
  • TrainingMaterial

    Private Jupyter Notebooks

    King’s College London has developed complete stories around Jupyter Notebooks that form easy recipes for reproducible methods in social data science. Jupyter...
  • TrainingMaterial

    Private Efficiency - Effectiveness Trade-offs in Learning to Rank

    This tutorial provides an 'Introduction to Learning to Rank' and focuses on 'Dealing with the Efficiency/Effectiveness trade-off in Web Search'. Moreover, it provides two...
  • TrainingMaterial

    Private Text Analytics and Opinion Mining Module

    The goals of this module are: - Have a general knowledge of text mining problems and methods. - Recognize situations in which Sentiment Analysis techniques can solve...
  • TrainingMaterial

    Private Database Module

    The 'Database Module' aims to introduce database analysis, focusing on DBMS architecture, Relational Models, SQL language and SQL nested queries. It is part of the Master in...
  • TrainingMaterial

    Private Interactive Learning Environments

    King’s College London developed a variety of data science materials based on R and Python. R is a de facto standard in statistical computing and visualisation, while our...
  • TrainingMaterial

    Private SOS Online Abuse of Politicians

    The material is a 20-minute video presentation describing the GATE team's work investigating online abuse of UK politicians. The video comprises slides and a voice-over. The...
  • TrainingMaterial

    Private Data Journalism and Story Telling

    The module aims to teach how to present the knowledge extracted from big data using multimedia story telling. It also shows some of the most recent and meaningful experiences...
  • TrainingMaterial

    Private High Performance and Scalable Analytics Module

    Mining with big data or big data mining has become an active research area. Running current analytical methodologies and software tools on a single personal computer cannot...
  • TrainingMaterial

    Private Data Management for Business Intelligence Module

    This module provides an introduction to information storage and management performed in order to support business decisions of organizations. It is part of the Master in Big...
  • Dataset

    Wikipedia Word Embeddings

    Embeddings were created by applying word2vec skip-gram to a corpus of non-stub Wikipedia articles from a December 2015 English dump with the following parameters: -cbow 0...
    • The resource 'Embeddings' is not accessible as a guest user. You must log in to access it.
  • TrainingMaterial

    Private Data Mining and Machine Learning for Social Science

    An introductory course for data mining and machine learning for social science. The course focuses on presenting typical data mining and machine learning techniques by using a...
  • TrainingMaterial

    Private Data Mining and Machine Learning Module

    The module provides an introduction to the basic concepts of data mining and the knowledge extraction process, introducing analytical models and algorithms for clustering,...
  • TrainingMaterial

    Private GATE Course

    The material is the 2017 version of a week-long training course delivered annually by the GATE team. Over almost ten years, this course has been developed to provide basic and...
  • Dataset

    WIRE dataset

    This dataset consists of 503 pairs of Wikipedia entities drawn from the New York Times dataset, each with a human-assigned relatedness score. The domain experts based their...
    • HTML
      The resource 'WikipediaRelatedness' is not accessible as a guest user. You must log in to access it.
    • CSV
      The resource 'WIRE dataset' is not accessible as a guest user. You must log in to access it.
  • TrainingMaterial

    Introduction to Data Curation

    This course is an introduction to data collection, data preparation and transformation, and data analysis. It contains the essential concepts for a researcher in order to...
    • PDF
      The resource 'Introduction to Data Curation' is not accessible as a guest user. You must log in to access it.
  • Method

    Dictionary creator

    This tool creates a dictionary with inverse document frequency (idf) values from the Google NGrams dataset.
    • The resource 'Source code' is not accessible as a guest user. You must log in to access it.
  • Method

    ArchiveSpark

    ArchiveSpark is an Apache Spark framework for easy data access, processing, extraction, and derivation for Web archives and archival collections. It has a simple and...
    • The resource 'ArchiveSpark on GitHub' is not accessible as a guest user. You must log in to access it.
You can also access this registry using the API (see API Docs).