
bruceR - Broadly Useful Convenient and Efficient R Functions
Broadly useful, convenient, and efficient R functions that bring users concise and elegant R data analyses. This package includes easy-to-use functions for (1) basic R programming (e.g., set working directory to the path of the currently opened file; import/export data from/to files in any format; print tables to Microsoft Word); (2) multivariate computation (e.g., compute scale sums/means/... with reverse scoring); (3) reliability analyses and factor analyses; (4) descriptive statistics and correlation analyses; (5) t-tests, multi-factor analysis of variance (ANOVA), simple-effect analyses, and post-hoc multiple comparisons; (6) tidy reports of statistical models (to R Console and Microsoft Word); (7) mediation and moderation analyses (PROCESS); and (8) an additional toolbox for statistics and graphics.
anova, data-analysis, data-science, linear-models, linear-regression, multilevel-models, statistics, toolbox
8.61 score 192 stars 3 dependents 354 scripts 2.7k downloads
ChineseNames - Chinese Name Database 1930-2008
A database of Chinese surnames and given names (1930-2008). This database contains nationwide frequency statistics of 1,806 Chinese surnames and 2,614 Chinese characters used in given names, covering about 1.2 billion Han Chinese people (96.8 percent of the Han Chinese household-registered population born from 1930 to 2008 and still alive in 2008). This package also contains a function for computing multiple indices of Chinese surnames and given names for social science research (e.g., name uniqueness, name gender, name valence, and name warmth/competence). Details are provided at <https://psychbruce.github.io/ChineseNames/>.
big-data, chinese, chinese-name, chinese-names, database, name, names
5.43 score 180 stars 6 scripts 254 downloads
FMAT - The Fill-Mask Association Test
The Fill-Mask Association Test ('FMAT') <doi:10.1037/pspa0000396> is an integrative, probability-based social computing method using Masked Language Models to measure conceptual associations (e.g., attitudes, biases, stereotypes, social norms, cultural values) as propositional semantic representations in natural language. Supported language models include 'BERT' <doi:10.48550/arXiv.1810.04805> and its variants available at 'Hugging Face' <https://huggingface.co/models?pipeline_tag=fill-mask>. Methodological references and installation guidance are provided at <https://psychbruce.github.io/FMAT/>.
ai, artificial-intelligence, bert, bert-model, bert-models, contextualized-representation, fill-in-the-blank, fill-mask, huggingface, language-model, language-models, large-language-models, masked-language-models, natural-language-processing, natural-language-understanding, nlp, pretrained-models, transformer, transformers
4.75 score 16 stars 7 scripts 511 downloads
PsychWordVec - Word Embedding Research Framework for Psychological Science
An integrative toolbox for word embedding research that provides: (1) a collection of 'pre-trained' static word vectors in the '.RData' compressed format <https://psychbruce.github.io/WordVector_RData.pdf>; (2) a group of functions to process, analyze, and visualize word vectors; (3) a range of tests to examine conceptual associations, including the Word Embedding Association Test <doi:10.1126/science.aal4230> and the Relative Norm Distance <doi:10.1073/pnas.1720347115>, with permutation tests of significance; and (4) a set of training methods to locally train (static) word vectors from text corpora, including 'Word2Vec' <doi:10.48550/arXiv.1301.3781>, 'GloVe' <doi:10.3115/v1/D14-1162>, and 'FastText' <doi:10.48550/arXiv.1607.04606>.
bert, cosine-similarity, fasttext, glove, gpt, language-model, natural-language-processing, nlp, pretrained-models, psychology, semantic-analysis, text-analysis, text-mining, tsne, word-embeddings, word-vectors, word2vec
4.70 score 25 stars 9 scripts 209 downloads
DPI - The Directed Prediction Index for Causal Direction Inference from Observational Data
The Directed Prediction Index ('DPI') is a causal discovery method for observational data designed to quantify the relative endogeneity of outcome (Y) versus predictor (X) variables in regression models. By comparing the coefficients of determination (R-squared) between the Y-as-outcome and X-as-outcome models while controlling for sufficient confounders and simulating k random covariates, it can quantify relative endogeneity, providing a necessary but insufficient condition for causal direction from a less endogenous variable (X) to a more endogenous variable (Y). Methodological details are provided at <https://psychbruce.github.io/DPI/>. This package also includes functions for data simulation and network analysis (correlation, partial correlation, and Bayesian Networks).
causal-discovery, causal-inference, causality, directed-acyclic-graphs, directed-networks, influence, linear-regression, prediction, simulation, statistics
4.45 score 4 stars 4 scripts 160 downloads