NEWS

FMAT 2024.7 (2024-07-29)

Added the DOI link for the online published JPSP article: https://doi.org/10.1037/pspa0000396.

Fixed bugs: Now only BERT_download() connects to the Internet; all other functions run offline.
Improved installation guidance for Python packages.

Added BERT_info().
Added add.tokens and add.method parameters for BERT_vocab() and FMAT_run(): An experimental feature that adds new tokens (e.g., out-of-vocabulary words, compound words, or even phrases) as [MASK] options. This novel practice still needs validation (one of my ongoing projects), so for now please use it only at your own risk until my validation work is published.
All functions except BERT_download() now import local model files only, without automatically downloading models. Users must first use BERT_download() to download models.
Deprecated FMAT_load(): Use FMAT_run() directly instead.
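
The download-then-run workflow described in this release might look like the sketch below. The model name, query sentence, and [MASK] words are illustrative assumptions, and the argument shapes passed to FMAT_query() are assumed rather than confirmed here; please check ?FMAT_query and ?FMAT_run for the exact signatures.

```r
library(FMAT)

# Illustrative model name (assumption, not taken from this changelog).
models <- c("bert-base-uncased")

# Step 1: download the model files once.
# This is the only step that connects to the Internet.
BERT_download(models)

# Step 2: build the query data.
# The sentence and the named list of [MASK] words are hypothetical examples.
query <- FMAT_query(
  "[MASK] is a nurse.",
  MASK = list(Male = "He", Female = "She")
)

# Step 3: run the fill-mask pipeline on the local model files.
# No FMAT_load() call is needed beforehand.
data <- FMAT_run(models, query)
```

If BERT_download() has not been run for a model, FMAT_run() is expected to fail rather than fetch the model silently.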

Added BERT_vocab() and ICC_models().
Improved summary.fmat(), FMAT_query(), and FMAT_run() (significantly faster: all [MASK] options for each unique query sentence are now estimated simultaneously, so running time depends only on the number of unique queries, not on the number of [MASK] options).
If you use the reticulate package version ≥ 1.36.1, update FMAT to ≥ 2024.4; otherwise, out-of-vocabulary [MASK] words may not be identified and marked. FMAT_run() now matches [MASK] words directly against the model vocabulary and token IDs. To check whether a [MASK] word is in the model vocabulary, use BERT_vocab().
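
A vocabulary check with BERT_vocab() could be sketched as follows. The model name and candidate words are illustrative assumptions, and passing the words as the second argument is also an assumption; see ?BERT_vocab for the actual parameter names.

```r
library(FMAT)

# Illustrative model name (assumption); its files must already be available locally.
models <- c("bert-base-uncased")

# Hypothetical candidate [MASK] words to look up in the model vocabulary.
words <- c("doctor", "nurse", "physiotherapist")

# Assumed call shape: models first, then the words to check.
vocab <- BERT_vocab(models, words)
print(vocab)
```

Words that are missing from the vocabulary are the ones FMAT_run() would treat as out-of-vocabulary [MASK] options.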

The FMAT methodology paper has been accepted (March 14, 2024) for publication in the Journal of Personality and Social Psychology: Attitudes and Social Cognition (DOI: 10.1037/pspa0000396)!
Added BERT_download() (downloading models to the local cache folder "%USERPROFILE%/.cache/huggingface") to differentiate it from FMAT_load() (loading saved models from the local cache). Note, however, that FMAT_load() also downloads models silently if they have not been downloaded yet.
Added gpu parameter (see Guidance for GPU Acceleration) in FMAT_run() for specifying the NVIDIA GPU device on which the fill-mask pipeline will be allocated. A GPU runs the fill-mask pipeline roughly 3x faster than a CPU. By default, FMAT_run() automatically detects and uses any available GPU if a CUDA-supported Python torch package is installed; otherwise, it uses the CPU.
Added running speed information (queries/min) for FMAT_run().
Added device information for BERT_download(), FMAT_load(), and FMAT_run().
Deprecated the parallel parameter in FMAT_run(): FMAT_run(model.names, data, gpu=TRUE) is the fastest.
FMAT_run() now displays a progress bar by default.