Package: tokenizers.bpe 0.1.3
tokenizers.bpe: Byte Pair Encoding Text Tokenization
Unsupervised text tokenizer focused on computational efficiency. Wraps the 'YouTokenToMe' library <https://github.com/VKCOM/YouTokenToMe> which is an implementation of fast Byte Pair Encoding (BPE) <https://aclanthology.org/P16-1162/>.
Authors:
tokenizers.bpe_0.1.3.tar.gz
tokenizers.bpe_0.1.3.zip (r-4.5), tokenizers.bpe_0.1.3.zip (r-4.4), tokenizers.bpe_0.1.3.zip (r-4.3)
tokenizers.bpe_0.1.3.tgz (r-4.4-x86_64), tokenizers.bpe_0.1.3.tgz (r-4.4-arm64), tokenizers.bpe_0.1.3.tgz (r-4.3-x86_64), tokenizers.bpe_0.1.3.tgz (r-4.3-arm64)
tokenizers.bpe_0.1.3.tar.gz (r-4.5-noble), tokenizers.bpe_0.1.3.tar.gz (r-4.4-noble)
tokenizers.bpe_0.1.3.tgz (r-4.4-emscripten), tokenizers.bpe_0.1.3.tgz (r-4.3-emscripten)
tokenizers.bpe.pdf | tokenizers.bpe.html
tokenizers.bpe/json (API)
NEWS
# Install 'tokenizers.bpe' in R:
install.packages('tokenizers.bpe', repos = c('https://bnosac.r-universe.dev', 'https://cloud.r-project.org'))
Bug tracker: https://github.com/bnosac/tokenizers.bpe/issues
- belgium_parliament - Dataset from 2017 with Questions asked in the Belgium Federal Parliament
bpe, byte-pair-encoding, text-mining, tokenization
Last updated 1 year ago from: 72ecec49fe. Checks: OK: 9. Indexed: yes.
Target | Result | Date |
---|---|---|
Doc / Vignettes | OK | Oct 11 2024 |
R-4.5-win-x86_64 | OK | Oct 11 2024 |
R-4.5-linux-x86_64 | OK | Oct 11 2024 |
R-4.4-win-x86_64 | OK | Oct 11 2024 |
R-4.4-mac-x86_64 | OK | Oct 11 2024 |
R-4.4-mac-aarch64 | OK | Oct 11 2024 |
R-4.3-win-x86_64 | OK | Oct 11 2024 |
R-4.3-mac-x86_64 | OK | Oct 11 2024 |
R-4.3-mac-aarch64 | OK | Oct 11 2024 |
Exports: bpe, bpe_decode, bpe_encode, bpe_load_model
Dependencies: Rcpp
Readme and manuals
Help Manual
Help page | Topics |
---|---|
Dataset from 2017 with Questions asked in the Belgium Federal Parliament | belgium_parliament |
Construct a Byte Pair Encoding model | bpe |
Decode Byte Pair Encoding sequences to text | bpe_decode |
Tokenise text alongside a Byte Pair Encoding model | bpe_encode |
Load a Byte Pair Encoding model | bpe_load_model |
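
The four exported functions listed above can be combined into a train, encode, decode, reload round trip. The following is a minimal sketch, not an authoritative reference: the argument names (`coverage`, `vocab_size`, `threads`, `model_path`, `type`) and the `language` and `text` columns of `belgium_parliament` are assumptions based on the package documentation, and the temporary model path and example sentences are hypothetical choices made for illustration.

```r
library(tokenizers.bpe)

# Example data shipped with the package. The 'language' and 'text'
# columns are assumed here; check ?belgium_parliament to confirm.
data(belgium_parliament, package = "tokenizers.bpe")
french <- subset(belgium_parliament, language == "french")

# Train a BPE model. The model file is written to a temporary path
# (a hypothetical location chosen for this sketch).
path <- file.path(tempdir(), "youtokentome.bpe")
model <- bpe(french$text, coverage = 0.999, vocab_size = 5000,
             threads = 1, model_path = path)

text <- c("L'appartement est grand & vraiment bien situe en plein centre",
          "Proportion de femmes dans les situations de famille monoparentale.")

# Encode to subword strings and to integer ids, then decode the ids.
subwords <- bpe_encode(model, x = text, type = "subwords")
ids      <- bpe_encode(model, x = text, type = "ids")
decoded  <- bpe_decode(model, ids)

# Reload the saved model from disk and encode again with it.
reloaded     <- bpe_load_model(path, threads = 1)
ids_reloaded <- bpe_encode(reloaded, x = text, type = "ids")
```

Writing the model to `tempdir()` keeps the working directory clean. For a model that should persist between sessions, pass a stable path to `model_path` and hand the same path to `bpe_load_model` later.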