  1. doc2vec::be_parliament_2020
    Corpus of questions asked in the Belgian Federal Parliament in 2020
  2. nametagger::europeananews
    Tagged newspaper articles from Europeana
  3. textplot::example_btm
    Example Biterm Topic Model
  4. textplot::example_embedding
    Example word embedding matrix
  5. textplot::example_embedding_clusters
    Example words emitted in an ETM text clustering model
  6. textplot::example_udpipe
    Example annotation of text using udpipe
  7. textrank::joboffer
    The text of a job offer, annotated with the package udpipe
  8. tokenizers.bpe::belgium_parliament
    Dataset from 2017 with questions asked in the Belgian Federal Parliament
  9. topicmodels.etm::ng20
    Bag of words sample of the 20 newsgroups dataset
  10. udpipe::brussels_listings
    Brussels AirBnB address locations available at www.insideairbnb.com
  11. udpipe::brussels_reviews
    Reviews of AirBnB customers on Brussels address locations available at www.insideairbnb.com
  12. udpipe::brussels_reviews_anno
    Reviews of AirBnB customers, tokenised, POS-tagged and lemmatised
  13. udpipe::brussels_reviews_w2v_embeddings_lemma_nl
    An example matrix of word embeddings (matrix, 2687 rows)
  14. udpipe::udpipe_annotation_params
    List with training options set by the UDPipe community when building models based on the Universal Dependencies data