3 items found

Licenses: Creative Commons Attribution 4.0 Groups: sobigdata-eu Tags: Text mining

Filter Results
  • Dataset

    DNA 12-mers

    A 179 MB dataset containing all the ~14M unique 12-mers in the DNA sequences available in the Pizza&Chili Corpus (https://pizzachili.dcc.uchile.cl/texts.html). This dataset...
    • ZIP
      The resource: 'DNA 12-mers' is not accessible as guest user. You must login to access it!
  • Dataset

    BioTAGME: A comprehensive platform for biological knowledge network analysis

    This Network was built through BioTAGME, a system that combines TAGME, an entity-annotation framework based on Wikipedia corpus with a network-based inference methodology (i.e.,...
  • Dataset

    Broad Twitter Corpus

    The Broad Twitter Corpus is a named entity-annotated dataset of tweets, collected in order to capture temporal, spatial and social diversity. The goal of the corpus is to...
    • JSON
      The resource: 'Broad Twitter Corpus' is not accessible as guest user. You must login to access it!
You can also access this registry using the API (see API Docs).