Package: text 1.9

Oscar Kjell

text: Analyses of Text using Transformers Models from HuggingFace, Natural Language Processing and Machine Learning

Links R with Transformers from Hugging Face to transform text variables into word embeddings. The word embeddings can then be used to statistically test the mean difference between sets of texts, compute semantic similarity scores between texts, predict numerical variables, and visualize statistically significant words along various dimensions. For more information see <https://www.r-text.org>.

Authors: Oscar Kjell [aut, cre], Salvatore Giorgi [aut], Andrew Schwartz [aut]

text_1.9.tar.gz
text_1.9.zip (r-4.7) | text_1.9.zip (r-4.6) | text_1.9.zip (r-4.5)
text_1.9.tgz (r-4.6-any) | text_1.9.tgz (r-4.5-any)
text_1.9.tar.gz (r-4.7-any) | text_1.9.tar.gz (r-4.6-any)
text_1.9.tgz (r-4.6-emscripten)
manual.pdf | manual.html
DESCRIPTION | NEWS
card.svg | card.png
text/json (API)

# Install 'text' in R:
install.packages('text', repos = c('https://oscarkjell.r-universe.dev', 'https://cloud.r-project.org'))
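
Because the package calls Python libraries from R, installing the R package is normally followed by a one-time Python setup. The sketch below is a minimal illustration, assuming `textrpp_install()` and `textrpp_initialize()` (both listed under Exports) are called with their defaults apart from `save_profile = TRUE`. The wrapper name `setup_text_python` is hypothetical, and the Extended Installation Guide should be consulted for platform-specific steps.

```r
# Hypothetical helper; wraps the two setup functions exported by 'text'.
setup_text_python <- function() {
  if (!requireNamespace("text", quietly = TRUE)) {
    stop("Install the 'text' package first (see the install.packages() call above).")
  }
  # Install the Python packages that 'text' requires into a conda environment.
  text::textrpp_install()
  # Initialize that environment; save_profile = TRUE is assumed to store the
  # choice so that later R sessions pick it up automatically.
  text::textrpp_initialize(save_profile = TRUE)
  invisible(TRUE)
}

# Run once per machine, in an interactive session:
# setup_text_python()
```

The calls are wrapped in a function so that sourcing the snippet does not start a lengthy installation by accident.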

Bug tracker: https://github.com/oscarkjell/text/issues

Pkgdown/docs site: https://r-text.org

deep-learning | machine-learning | nlp | transformers

12.99 score 160 stars 1 packages 936 scripts 1.2k downloads 9 mentions 63 exports 149 dependencies

Last updated from: 1fa93c87b9. Checks: 9 OK. Indexed: yes.

Target | Result | Time
linux-devel-x86_64 | OK | 262
source / vignettes | OK | 308
linux-release-x86_64 | OK | 253
macos-release-arm64 | OK | 227
macos-oldrel-arm64 | OK | 226
windows-devel | OK | 182
windows-release | OK | 172
windows-oldrel | OK | 220
wasm-release | OK | 201

Exports: find_textrpp_env, textAssess, textBERTopics, textCentrality, textCentralityPlot, textClassify, textClean, textCleanNonASCII, textDescriptives, textDiagnostics, textDimName, textDistance, textDistanceMatrix, textDistanceNorm, textDomainCompare, textEmbed, textEmbedLayerAggregation, textEmbedRawLayers, textEmbedReduce, textEmbedStatic, textExamples, textFindNonASCII, textFineTuneDomain, textFineTuneTask, textGeneration, textLBAM, textModelLayers, textModels, textModelsRemove, textNER, textPCA, textPCAPlot, textPlot, textPredict, textPredictAll, textPredictTest, textProjection, textProjectionPlot, textQA, textrpp_initialize, textrpp_install, textrpp_install_virtualenv, textrpp_uninstall, textSimilarity, textSimilarityMatrix, textSimilarityNorm, textSum, textTokenize, textTokenizeAndCount, textTopics, textTopicsReduce, textTopicsTest, textTopicsTree, textTopicsWordcloud, textTrain, textTrainLists, textTrainN, textTrainNPlot, textTrainRandomForest, textTrainRegression, textTranslate, textWordPrediction, textZeroShot

Dependencies: base64enc, bit, bit64, bslib, cachem, class, cli, clipr, clock, codetools, colorspace, commonmark, cowplot, cpp11, crayon, curl, data.table, diagram, dials, DiceDesign, digest, dplyr, evaluate, farver, fastmap, fastmatch, float, fontawesome, fs, furrr, future, future.apply, GauPro, generics, ggforce, ggplot2, ggrepel, ggwordcloud, globals, glue, gower, gridtext, gtable, gtools, hardhat, here, highr, hms, htmltools, ipred, isoband, ISOcodes, jpeg, jquerylib, jsonlite, KernSmooth, knitr, labeling, lattice, lava, lbfgs, lgr, lifecycle, listenv, litedown, lubridate, magrittr, markdown, MASS, Matrix, MatrixExtra, memoise, mime, mixopt, mlapi, modelenv, nnet, numDeriv, parallelly, parsnip, patchwork, pillar, pkgconfig, png, polyclip, prettyunits, prodlim, progress, progressr, purrr, quanteda, R6, rappdirs, RColorBrewer, Rcpp, RcppArmadillo, RcppEigen, RcppProgress, RcppTOML, readr, recipes, reticulate, RhpcBLASctl, rlang, rmarkdown, rpart, rprojroot, rsample, rsparse, RSpectra, S7, sass, scales, sfd, shape, slider, SnowballC, sparsevctrs, splitfngr, SQUAREM, stopwords, stringi, stringr, survival, systemfonts, tailor, text2vec, textmineR, tibble, tidyr, tidyselect, timechange, timeDate, tinytex, topics, tune, tweenr, tzdb, utf8, vctrs, viridisLite, vroom, warp, withr, workflows, xfun, xml2, yaml, yardstick

HuggingFace Transformers in R: Word Embeddings Defaults and Specifications
textEmbed: Reflecting standards and the state of the art | textEmbed() | The Language model | The Layers | textEmbedRawLayers: Get tokens and all the layers | textEmbedLayerAggregation: Testing different layers

Last update: 2026-06-13
Started: 2022-07-13

Extended Installation Guide
Set up a Python environment for the text-package | System-level dependencies | Windows | Microsoft C++ Build Tools | Terms of Service conflict -- Anaconda forge channel | MacOS | 1. Homebrew & libomp | 2. Install libomp | Ubuntu | General troubleshooting | 1. Check if you have install permissions | 2. Remember to initialize the Python environment | 3. Install the development version from GitHub | 4. Force reinstallation of the environment | 5. Install the Python environment using reticulate | 6. Inspect diagnostic information | Virtual environments | Solving OMP errors and R/RStudio crashes

Last update: 2026-06-13
Started: 2025-07-29

Getting started
textEmbed(): mapping text to numbers using HuggingFace language models | textTrain(): Examine the relationship between text and numeric variables | Plot statistically significant words | textProjection(): Pre-process data for plotting | textProjectionPlot(): A two-dimensional word plot | Articles using the text-package | Other relevant references

Last update: 2025-12-02
Started: 2019-12-30
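
The first step in this vignette's outline, mapping text to numbers, can be sketched as follows. This is a minimal illustration rather than the vignette's own code: it assumes `textEmbed()` accepts a character vector through `texts` and a Hugging Face model name through `model`, and the helper name `embed_example_texts` and the example sentences are hypothetical.

```r
# Hypothetical helper; not run automatically because textEmbed() needs the
# Python environment and downloads the language model on first use.
embed_example_texts <- function(texts = c("I feel great!",
                                          "Hello, how are you doing?"),
                                model = "bert-base-uncased") {
  stopifnot(is.character(texts), requireNamespace("text", quietly = TRUE))
  # Returns the word embeddings produced by the chosen Hugging Face model.
  text::textEmbed(texts = texts, model = model)
}

# Interactive use, once the Python environment is initialized:
# embeddings <- embed_example_texts()
```

The resulting embeddings are the kind of input that the later steps in the outline, such as `textTrain()`, relate to numeric variables.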

The Language-Based Assessment Model (L-BAM) Library
Overview of L-BAM pipelines | References

Last update: 2025-11-14
Started: 2024-11-01

Creating a Singularity Container to Run HuggingFace Transformers Models in R
Code to build a singularity container with HuggingFace models in R

Last update: 2025-09-01
Started: 2022-07-13

Pre-registration and Researcher Degrees of Freedom
Flexibility: The double-edged sword in data analyses | Specific pre-registration requirements for text analyses | References

Last update: 2025-08-29
Started: 2022-12-01

Installing and Managing Python Environments with reticulate
Overview | Step 1: Install reticulate | Step 2: Create a Conda Environment | Step 3: Install Python Packages | Step 4: Activate the Environment | Show available conda environments | Remove the conda environment

Last update: 2025-07-21
Started: 2025-07-21
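
The steps in this outline map onto a handful of reticulate functions. The sketch below is an illustration under stated assumptions, not the vignette's own code: the helper name `manage_conda_env`, the environment name `text_demo_env`, and the package list are all hypothetical placeholders.

```r
# Step 1 (run once): install.packages("reticulate")

# Hypothetical helper following the outline; the defaults are illustrative
# assumptions, not values taken from the vignette.
manage_conda_env <- function(envname = "text_demo_env",
                             packages = c("torch", "transformers")) {
  stopifnot(requireNamespace("reticulate", quietly = TRUE))
  # Step 2: create a conda environment.
  reticulate::conda_create(envname = envname)
  # Step 3: install Python packages into it (pip = TRUE installs via pip).
  reticulate::conda_install(envname = envname, packages = packages, pip = TRUE)
  # Step 4: activate the environment for the current R session.
  reticulate::use_condaenv(envname, required = TRUE)
  # Show available conda environments.
  print(reticulate::conda_list())
  invisible(envname)
}

# Remove the conda environment when it is no longer needed:
# reticulate::conda_remove(envname = "text_demo_env")
```

Wrapping the calls in a function keeps the snippet safe to source; nothing is created or installed until `manage_conda_env()` is called explicitly.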

Implicit Motives Tutorial

Last update: 2025-02-17
Started: 2024-11-25

L-BAM Tutorial

Last update: 2024-11-26
Started: 2024-11-25

Psychological Methods: the Text Tutorial

Last update: 2022-12-01
Started: 2022-12-01

How to best manage computationally heavy analyses
Your system's capacity | References

Last update: 2022-07-13
Started: 2022-07-13

HuggingFace language models are downloaded in .cache

Last update: 2022-07-13
Started: 2022-07-13

Readme and manuals

Help Manual

Help page | Topics
Example data for plotting a Semantic Centrality Plot. | centrality_data_harmony
Data for plotting a Dot Product Projection Plot. | DP_projections_HILS_SWLS_100
Example text and numeric data. | Language_based_assessment_data_3_100
Text and numeric data for 10 participants. | Language_based_assessment_data_8
Example data for plotting a Principal Component Projection Plot. | PC_projections_satisfactionwords_40
Word embeddings from the textEmbedRawLayers function | raw_embeddings_1
Semantic similarity scores between single words' embeddings and an aggregated word embedding | textCentrality
Plots words from textCentrality() | textCentralityPlot
Cleans text from standard personal information | textClean
Clean non-ASCII characters | textCleanNonASCII
Compute descriptive statistics of character variables. | textDescriptives
Run diagnostics for the text package | textDiagnostics
Change dimension names | textDimName
Semantic distance | textDistance
Semantic distance across multiple word embeddings | textDistanceMatrix
Semantic distance between a text variable and a word norm | textDistanceNorm
Compare two language domains | textDomainCompare
textEmbed() extracts layers and aggregates them to word embeddings, for all character variables in a given dataframe. | textEmbed
Aggregate layers | textEmbedLayerAggregation
Extract layers of hidden states | textEmbedRawLayers
Pre-trained dimension reduction (experimental) | textEmbedReduce
Apply static word embeddings | textEmbedStatic
Identify language examples. | textExamples
Detect non-ASCII characters | textFindNonASCII
Domain Adapted Pre-Training (EXPERIMENTAL - under development) | textFineTuneDomain
Task Adapted Pre-Training (EXPERIMENTAL - under development) | textFineTuneTask
Text generation | textGeneration
The LBAM library | textLBAM
Number of layers | textModelLayers
Check downloaded, available models. | textModels
Delete a specified model | textModelsRemove
Named Entity Recognition. (experimental) | textNER
textPCA() | textPCA
textPCAPlot | textPCAPlot
Plot words | textPlot
textPredict, textAssess and textClassify | textAssess, textClassify, textPredict
Predict from several models, selecting the correct input | textPredictAll
Significance testing for model prediction performance | textPredictTest
Supervised Dimension Projection | textProjection
Plot Supervised Dimension Projection | textProjectionPlot
Question Answering. (experimental) | textQA
Initialize text required python packages | textrpp_initialize
Install text required python packages in conda or virtualenv environment | textrpp_install, textrpp_install_virtualenv
Uninstall textrpp conda environment | textrpp_uninstall
Semantic Similarity | textSimilarity
Semantic similarity across multiple word embeddings | textSimilarityMatrix
Semantic similarity between a text variable and a word norm | textSimilarityNorm
Summarize texts. (experimental) | textSum
Tokenize text-variables | textTokenize
Tokenize and count | textTokenizeAndCount
BERTopic | textBERTopics, textTopics
textTopicsReduce (EXPERIMENTAL) | textTopicsReduce
Wrapper for topicsTest function from the topics package | textTopicsTest
textTopicsTest (EXPERIMENTAL) to get the hierarchical topic tree | textTopicsTree
Plot word clouds | textTopicsWordcloud
Trains word embeddings | textTrain
Train lists of word embeddings | textTrainLists
Cross-validated accuracies across sample sizes | textTrainN
Plot cross-validated accuracies across sample sizes | textTrainNPlot
Trains word embeddings using random forest | textTrainRandomForest
Train word embeddings to a numeric variable. | textTrainRegression
Translation. (experimental) | textTranslate
Compute word-level prediction scores for plotting with textProjectionPlot(). | textWordPrediction
Zero Shot Classification (Experimental) | textZeroShot
Word embeddings for 4 text variables for 40 participants | word_embeddings_4
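
Several of the help topics above concern semantic similarity between word embeddings. As a rough illustration of the idea, the base-R sketch below computes cosine similarity, a common choice for such scores, on made-up toy vectors. It does not call the package, the function name `cosine_similarity` and all numbers are hypothetical, and the method actually used by `textSimilarity()` should be checked in its help page.

```r
# Cosine similarity between two numeric vectors of equal length.
cosine_similarity <- function(a, b) {
  stopifnot(is.numeric(a), is.numeric(b), length(a) == length(b))
  sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

# Toy 4-dimensional "embeddings" (invented values); embeddings from a
# language model typically have hundreds of dimensions.
emb_happy   <- c(0.8, 0.1, 0.3, 0.5)
emb_joyful  <- c(0.7, 0.2, 0.4, 0.4)
emb_invoice <- c(-0.2, 0.9, -0.5, 0.1)

# Vectors pointing in similar directions score closer to 1.
sim_close <- cosine_similarity(emb_happy, emb_joyful)
sim_far   <- cosine_similarity(emb_happy, emb_invoice)

print(c(close = sim_close, far = sim_far))
```

With these toy values the first pair scores higher than the second, which is the pattern one would expect for texts with related versus unrelated meaning.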