Package: udpipe 0.8.11
udpipe: Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing with the 'UDPipe' 'NLP' Toolkit
This natural language processing toolkit provides language-agnostic 'tokenization', 'parts of speech tagging', 'lemmatization' and 'dependency parsing' of raw text. Next to text parsing, the package also allows you to train annotation models based on data of 'treebanks' in 'CoNLL-U' format as provided at <https://universaldependencies.org/format.html>. The techniques are explained in detail in the paper: 'Tokenizing, POS Tagging, Lemmatizing and Parsing UD 2.0 with UDPipe', available at <doi:10.18653/v1/K17-3009>. The toolkit also contains functionality for commonly used data manipulations on texts which are enriched with the output of the parser, namely functions and algorithms for collocations, token co-occurrence, document term matrix handling, term frequency inverse document frequency calculations, information retrieval metrics (Okapi BM25), handling of multi-word expressions, keyword detection (Rapid Automatic Keyword Extraction, noun phrase extraction, syntactical patterns), sentiment scoring and semantic similarity analysis.
udpipe_0.8.11.tar.gz
udpipe_0.8.11.zip (r-4.5), udpipe_0.8.11.zip (r-4.4), udpipe_0.8.11.zip (r-4.3)
udpipe_0.8.11.tgz (r-4.4-x86_64), udpipe_0.8.11.tgz (r-4.4-arm64), udpipe_0.8.11.tgz (r-4.3-x86_64), udpipe_0.8.11.tgz (r-4.3-arm64)
udpipe_0.8.11.tar.gz (r-4.5-noble), udpipe_0.8.11.tar.gz (r-4.4-noble)
udpipe_0.8.11.tgz (r-4.4-emscripten), udpipe_0.8.11.tgz (r-4.3-emscripten)
# Install 'udpipe' in R:
install.packages('udpipe', repos = c('https://bnosac.r-universe.dev', 'https://cloud.r-project.org'))
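Once the package is installed, the annotation workflow named in the description (tokenization, tagging, lemmatization, dependency parsing) can be sketched roughly as follows. This is a minimal sketch, not the package's recommended setup: `annotate_text` is a hypothetical helper introduced here for illustration, the choice of the "english" model and a temporary model directory are assumptions, and downloading a model requires network access.

```r
library(udpipe)

# Hypothetical helper (not part of udpipe): download a pretrained model,
# load it, and annotate a character vector of raw text.
annotate_text <- function(txt, language = "english", model_dir = tempdir()) {
  # Fetch a pretrained model for the requested language (needs network access).
  info <- udpipe_download_model(language = language, model_dir = model_dir)
  # Load the downloaded model file into memory.
  model <- udpipe_load_model(file = info$file_model)
  # Tokenize, tag, lemmatize and parse; one document id per input element.
  anno <- udpipe_annotate(model, x = txt, doc_id = paste0("doc", seq_along(txt)))
  # Convert the CoNLL-U output to a data.frame with one row per token.
  as.data.frame(anno)
}
```

Calling `annotate_text("The toolkit parses raw text.")` should give a data.frame with one row per token, including columns such as `token`, `lemma`, `upos` and `dep_rel`.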
Bug tracker: https://github.com/bnosac/udpipe/issues
- brussels_listings - Brussels AirBnB address locations available at www.insideairbnb.com
- brussels_reviews - Reviews of AirBnB customers on Brussels address locations available at www.insideairbnb.com
- brussels_reviews_anno - Reviews of the AirBnB customers which are tokenised, POS tagged and lemmatised
- brussels_reviews_w2v_embeddings_lemma_nl - An example matrix of word embeddings
- udpipe_annotation_params - List with training options set by the UDPipe community when building models based on the Universal Dependencies data
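The bundled datasets make it possible to try the text-mining helpers without downloading a model. The sketch below uses `brussels_reviews_anno` to build a document term matrix and a co-occurrence table. The filter on Dutch reviews and the `"NN"` tag is an illustrative choice, not something this page prescribes.

```r
library(udpipe)

# Load the pre-annotated AirBnB reviews shipped with the package.
data(brussels_reviews_anno)

# Keep only Dutch-language tokens tagged as nouns (xpos "NN" in this dataset).
x <- subset(brussels_reviews_anno, language == "nl" & xpos %in% c("NN"))

# Count how often each lemma occurs per document, then cast to a sparse matrix.
freq <- document_term_frequencies(x[, c("doc_id", "lemma")])
dtm <- document_term_matrix(freq)
dim(dtm)

# Count how often pairs of noun lemmas occur within the same review.
cooc <- cooccurrence(x, group = "doc_id", term = "lemma")
head(cooc)
```

Working from lemmas rather than raw tokens keeps inflected forms of the same word in a single matrix column.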
conll, dependency-parser, lemmatization, natural-language-processing, nlp, pos-tagging, r-pkg, rcpp, text-mining, tokenizer, udpipe
Last updated 2 years ago from: 6a974c52fe. Checks: OK: 1, NOTE: 8. Indexed: yes.
Target | Result | Date |
---|---|---|
Doc / Vignettes | OK | Nov 06 2024 |
R-4.5-win-x86_64 | NOTE | Nov 06 2024 |
R-4.5-linux-x86_64 | NOTE | Nov 06 2024 |
R-4.4-win-x86_64 | NOTE | Nov 06 2024 |
R-4.4-mac-x86_64 | NOTE | Nov 06 2024 |
R-4.4-mac-aarch64 | NOTE | Nov 06 2024 |
R-4.3-win-x86_64 | NOTE | Nov 06 2024 |
R-4.3-mac-x86_64 | NOTE | Nov 06 2024 |
R-4.3-mac-aarch64 | NOTE | Nov 06 2024 |
Exports: as_conllu, as_cooccurrence, as_fasttext, as_phrasemachine, as_word2vec, cbind_dependencies, cbind_morphological, collocation, cooccurrence, document_term_frequencies, document_term_frequencies_statistics, document_term_matrix, dtm_align, dtm_cbind, dtm_chisq, dtm_colsums, dtm_conform, dtm_cor, dtm_rbind, dtm_remove_lowfreq, dtm_remove_sparseterms, dtm_remove_terms, dtm_remove_tfidf, dtm_reverse, dtm_rowsums, dtm_sample, dtm_svd_similarity, dtm_tfidf, keywords_collocation, keywords_phrases, keywords_rake, paste.data.frame, phrases, strsplit.data.frame, txt_collapse, txt_contains, txt_context, txt_count, txt_freq, txt_grepl, txt_highlight, txt_next, txt_nextgram, txt_overlap, txt_paste, txt_previous, txt_previousgram, txt_recode, txt_recode_ngram, txt_sample, txt_sentiment, txt_show, txt_tagsequence, udpipe, udpipe_accuracy, udpipe_annotate, udpipe_download_model, udpipe_load_model, udpipe_read_conllu, udpipe_train, unique_identifier, unlist_tokens
Dependencies: data.table, lattice, Matrix, Rcpp
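Several of the exported `txt_*` helpers work on plain character vectors, so they can be tried without a model or dataset. The following is a small sketch using `txt_freq` and `txt_nextgram`; the token vector is made up for illustration.

```r
library(udpipe)

# A made-up token sequence, used only for illustration.
tokens <- c("natural", "language", "processing", "with",
            "natural", "language", "toolkits")

# Frequency table of the distinct tokens.
freq <- txt_freq(tokens)
freq

# Bigrams: each token pasted to the one that follows it.
# The last element has no follower and is NA.
bigrams <- txt_nextgram(tokens, n = 2, sep = " ")
bigrams
```

The same helpers are typically applied to the `token` or `lemma` column of an annotated data.frame.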
UDPipe Natural Language Processing - Text Annotation
Rendered from udpipe-annotation.Rmd using knitr::rmarkdown on Nov 06 2024.
Last update: 2021-06-01
Started: 2017-08-30
UDPipe Natural Language Processing - Basic Analytical Use Cases
Rendered from udpipe-usecase-postagging-lemmatisation.Rmd using knitr::rmarkdown on Nov 06 2024.
Last update: 2021-06-01
Started: 2018-02-06
UDPipe Natural Language Processing - Model Building
Rendered from udpipe-train.Rmd using knitr::rmarkdown on Nov 06 2024.
Last update: 2021-06-01
Started: 2017-08-31
UDPipe Natural Language Processing - Parallel
Rendered from udpipe-parallel.Rmd using knitr::rmarkdown on Nov 06 2024.
Last update: 2021-06-01
Started: 2019-05-17
UDPipe Natural Language Processing - Topic Modelling Use Cases
Rendered from udpipe-usecase-topicmodelling.Rmd using knitr::rmarkdown on Nov 06 2024.
Last update: 2021-06-01
Started: 2018-03-06
UDPipe Natural Language Processing - Try it out
Rendered from udpipe-tryitout.Rmd using knitr::rmarkdown on Nov 06 2024.
Last update: 2020-10-09
Started: 2018-01-15
UDPipe Natural Language Processing - Universe
Rendered from udpipe-universe.Rmd using knitr::rmarkdown on Nov 06 2024.
Last update: 2021-12-02
Started: 2020-10-09