  1. crfsuite::airbnb
    Dutch-language reviews by AirBnB customers of Brussels address locations, available at www.insideairbnb.com
  2. crfsuite::airbnb_chunks
    Dutch-language reviews by AirBnB customers of Brussels address locations, manually tagged with entities
  3. doc2vec::be_parliament_2020
    Corpus of questions asked in the Belgian Federal Parliament in 2020
  4. nametagger::europeananews
    Tagged newspaper articles from Europeana
  5. ruimtehol::dekamer
    Dataset from 2017 with questions and answers from the Belgian Federal Parliament
  6. ruimtehol::dekamer_theme_terminology
    Dataset containing relevant terminology for each theme of the 'dekamer' dataset
  7. textplot::example_btm
    Example Biterm Topic Model
  8. textplot::example_embedding
    Example word embedding matrix
  9. textplot::example_embedding_clusters
    Example words emitted in an ETM text clustering model
  10. textplot::example_udpipe
    Example annotation of text using udpipe
  11. textrank::joboffer
    The text of a job offer, annotated with the package udpipe
  12. tokenizers.bpe::belgium_parliament
    Dataset from 2017 with questions asked in the Belgian Federal Parliament
  13. udpipe::brussels_listings
    Brussels AirBnB address locations available at www.insideairbnb.com
  14. udpipe::brussels_reviews
    Reviews by AirBnB customers of Brussels address locations, available at www.insideairbnb.com
  15. udpipe::brussels_reviews_anno
    AirBnB customer reviews that have been tokenised, POS-tagged and lemmatised
  16. udpipe::brussels_reviews_w2v_embeddings_lemma_nl
    An example matrix of word embeddings (a matrix with 2687 rows)
  17. udpipe::udpipe_annotation_params
    List with training options set by the UDPipe community when building models based on the Universal Dependencies data