R-universe - psychbruce (Bruce H. W. S. Bao)

bruceR - Broadly Useful Convenient and Efficient R Functions

Broadly useful, convenient, and efficient R functions that bring users concise and elegant R data analyses. This package includes easy-to-use functions for (1) basic R programming (e.g., set working directory to the path of the currently opened file; import/export data from/to files in any format; print tables to Microsoft Word); (2) multivariate computation (e.g., compute scale sums/means/... with reverse scoring); (3) reliability analyses and factor analyses; (4) descriptive statistics and correlation analyses; (5) t-tests, multi-factor analysis of variance (ANOVA), simple-effect analysis, and post-hoc multiple comparisons; (6) tidy reports of statistical models (to R Console and Microsoft Word); (7) mediation and moderation analyses (PROCESS); and (8) an additional toolbox for statistics and graphics.
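To illustrate what "scale means with reverse scoring" refers to, here is a minimal Python sketch of the arithmetic. It is not bruceR code and does not mirror the package's interface; the function names `reverse_score` and `scale_mean`, the 1-5 response range, and the sample responses are all assumptions made for this example.

```python
def reverse_score(value, low, high):
    """Flip a response on a bounded scale so that low <-> high."""
    return low + high - value


def scale_mean(responses, reverse_items=(), low=1, high=5):
    """Mean of item responses after reverse-scoring the chosen items.

    `reverse_items` holds zero-based positions of reverse-keyed items
    (a hypothetical convention chosen for this sketch).
    """
    scored = [
        reverse_score(v, low, high) if i in reverse_items else v
        for i, v in enumerate(responses)
    ]
    return sum(scored) / len(scored)


# Four items on a 1-5 scale; the items at positions 1 and 3 are reverse-keyed.
responses = [4, 2, 5, 1]
print(scale_mean(responses, reverse_items={1, 3}))
```

Reverse-keyed items are flipped before averaging so that every item points in the same direction; a scale sum follows the same pattern with the final division dropped.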

anova, data-analysis, data-science, linear-models, linear-regression, multilevel-models, statistics, toolbox

8.70 score 191 stars 3 dependents 500 scripts 2.3k downloads

ChineseNames - Chinese Name Database 1930-2008

A database of Chinese surnames and given names (1930-2008). This database contains nationwide frequency statistics of 1,806 Chinese surnames and 2,614 Chinese characters used in given names, covering about 1.2 billion Han Chinese people (96.8 percent of the Han Chinese household-registered population born from 1930 to 2008 and still alive in 2008). This package also contains a function for computing multiple indices of Chinese surnames and given names for social science research (e.g., name uniqueness, name gender, name valence, and name warmth/competence). Details are provided at <https://psychbruce.github.io/ChineseNames/>.

big-data, chinese, chinese-name, chinese-names, database, name, names

5.44 score 183 stars 10 scripts 333 downloads

FMAT - The Fill-Mask Association Test

The Fill-Mask Association Test ('FMAT') <doi:10.1037/pspa0000396> is an integrative, probability-based social computing method using Masked Language Models to measure conceptual associations (e.g., attitudes, biases, stereotypes, social norms, cultural values) as propositional semantic representations in natural language. Supported language models include 'BERT' <doi:10.48550/arXiv.1810.04805> and its variants available at 'Hugging Face' <https://huggingface.co/models?pipeline_tag=fill-mask>. Methodological references and installation guidance are provided at <https://psychbruce.github.io/FMAT/>.

ai, artificial-intelligence, bert, bert-model, bert-models, contextualized-representation, fill-in-the-blank, fill-mask, huggingface, language-model, language-models, large-language-models, masked-language-models, natural-language-processing, natural-language-understanding, nlp, pretrained-models, transformer, transformers

4.75 score 16 stars 7 scripts 567 downloads

PsychWordVec - Word Embedding Research Framework for Psychological Science

An integrative toolbox for word embedding research that provides: (1) a collection of 'pre-trained' static word vectors in the '.RData' compressed format <https://psychbruce.github.io/WordVector_RData.pdf>; (2) a group of functions to process, analyze, and visualize word vectors; (3) a range of tests to examine conceptual associations, including the Word Embedding Association Test <doi:10.1126/science.aal4230> and the Relative Norm Distance <doi:10.1073/pnas.1720347115>, with permutation tests of significance; and (4) a set of training methods to locally train (static) word vectors from text corpora, including 'Word2Vec' <doi:10.48550/arXiv.1301.3781>, 'GloVe' <doi:10.3115/v1/D14-1162>, and 'FastText' <doi:10.48550/arXiv.1607.04606>.
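As a rough illustration of the kind of conceptual-association measure mentioned above, the following Python sketch computes cosine similarity and the per-word association score that the Word Embedding Association Test builds on: the mean similarity of a word to one attribute set minus its mean similarity to another. This is not PsychWordVec code; the function names and the two-dimensional toy vectors are assumptions made purely for illustration, and the permutation test is omitted.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(word_vec, attr_a, attr_b):
    """Mean cosine similarity to attribute set A minus that to set B."""
    mean_a = sum(cosine(word_vec, a) for a in attr_a) / len(attr_a)
    mean_b = sum(cosine(word_vec, b) for b in attr_b) / len(attr_b)
    return mean_a - mean_b


# Toy 2-d "embeddings" (made up): the word is aligned with set A
# and orthogonal to set B, so the score is positive.
word = [1.0, 0.0]
set_a = [[1.0, 0.0]]
set_b = [[0.0, 1.0]]
print(association(word, set_a, set_b))
```

A positive score means the word sits closer to attribute set A than to set B in the embedding space; real analyses would use pre-trained vectors with hundreds of dimensions and several words per set.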

bert, cosine-similarity, fasttext, glove, gpt, language-model, natural-language-processing, nlp, pretrained-models, psychology, semantic-analysis, text-analysis, text-mining, tsne, word-embeddings, word-vectors, word2vec

4.57 score 25 stars 9 scripts 354 downloads

DPI - The Directed Prediction Index for Causal Direction Inference from Observational Data

The Directed Prediction Index ('DPI') is a causal discovery method for observational data designed to quantify the relative endogeneity of outcome (Y) versus predictor (X) variables in regression models. By comparing the coefficients of determination (R-squared) between the Y-as-outcome and X-as-outcome models while controlling for sufficient confounders and simulating k random covariates, it can quantify relative endogeneity, providing a necessary but insufficient condition for causal direction from a less endogenous variable (X) to a more endogenous variable (Y). Methodological details are provided at <https://psychbruce.github.io/DPI/>. This package also includes functions for data simulation and network analysis (correlation, partial correlation, and Bayesian Networks).
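The R-squared comparison at the core of this description can be sketched in Python as follows. This is only an illustration of the idea of contrasting the Y-as-outcome and X-as-outcome models under a shared covariate. It is not the DPI formula or the package's code: it omits the simulation of k random covariates and any significance weighting, and the helper names, the simulated data, and the coefficients 0.5 and 0.8 are assumptions made for this example.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def r_squared(outcome, predictors):
    """R-squared of an OLS regression of `outcome` on `predictors` plus intercept."""
    n = len(outcome)
    rows = [[1.0] + [p[i] for p in predictors] for i in range(n)]
    k = len(rows[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * y for r, y in zip(rows, outcome)) for a in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * rj for bj, rj in zip(beta, r)) for r in rows]
    mean_y = sum(outcome) / n
    ss_res = sum((y - f) ** 2 for y, f in zip(outcome, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in outcome)
    return 1.0 - ss_res / ss_tot


# Simulated data (assumed for illustration): X and a covariate C both drive Y.
rng = random.Random(42)
n = 500
c = [rng.gauss(0, 1) for _ in range(n)]
x = [rng.gauss(0, 1) for _ in range(n)]
y = [0.5 * xi + 0.8 * ci + rng.gauss(0, 1) for xi, ci in zip(x, c)]

r2_y = r_squared(y, [x, c])  # Y-as-outcome model: Y ~ X + C
r2_x = r_squared(x, [y, c])  # X-as-outcome model: X ~ Y + C
delta_r2 = r2_y - r2_x
print(r2_y, r2_x, delta_r2)
```

Without any covariate the two models would have identical R-squared values (both equal the squared correlation of X and Y), which is why the comparison only becomes informative once covariates are controlled for. In this simulation Y is the more endogenous variable, so the Y-as-outcome model explains more variance.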

causal-discovery, causal-inference, causality, directed-acyclic-graphs, directed-networks, influence, linear-regression, prediction, simulation, statistics

4.34 score 4 stars 4 scripts 259 downloads