| Title: | Interactive Validation App for 'quallmer' |
|---|---|
| Description: | Companion package to 'quallmer' providing an interactive 'shiny' application for manual coding, reviewing large language model (LLM) generated annotations, and computing inter-rater reliability metrics. Supports three modes: blind manual coding, LLM output validation, and agreement calculation. Computes standard reliability metrics including Krippendorff's alpha (Krippendorff 2019 <doi:10.4135/9781071878781>), Cohen's kappa, Fleiss' kappa (Fleiss 1971 <doi:10.1037/h0031619>), intraclass correlation coefficient (ICC), and percent agreement for nominal, ordinal, interval, and ratio data. Also computes gold-standard validation metrics including accuracy, precision, recall, and F1 scores following Sokolova and Lapalme (2009 <doi:10.1016/j.ipm.2009.03.002>). |
| Authors: | Seraphine F. Maerz [aut, cre] (ORCID: <https://orcid.org/0000-0002-7173-9617>), Kenneth Benoit [aut] (ORCID: <https://orcid.org/0000-0002-0797-564X>) |
| Maintainer: | Seraphine F. Maerz <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.0 |
| Built: | 2026-06-07 10:38:39 UTC |
| Source: | https://github.com/quallmer/quallmer.app |
Starts the Shiny app for manual coding, LLM checking, and validation / agreement calculation.
qlm_app(base_dir = getwd())
| base_dir | Base directory for saving uploaded files and progress. Defaults to the current working directory. Use … |
In LLM mode, you can also select metadata columns.
In Validation mode, select unit ID and coder columns (no text column), and optionally specify a gold-standard coder.
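As a rough illustration of the input shape Validation mode expects, the sketch below builds a data frame with one unit ID column and several coder columns (no text column) and saves it as an `.rds` file, the same file type the package example uploads. The column names `unit_id`, `human`, `llm_a`, and `gold`, the values, and the file name are all hypothetical; the exact formats the upload dialog accepts are not stated here.

```r
# Hypothetical validation-mode input: one row per unit, one column per coder.
# All column names and values are made up for illustration.
codings <- data.frame(
  unit_id = 1:4,
  human   = c("positive", "negative", "neutral", "positive"),
  llm_a   = c("positive", "negative", "positive", "positive"),
  gold    = c("positive", "negative", "neutral", "positive"),
  stringsAsFactors = FALSE
)

# Write to a temporary directory so the working directory stays untouched
rds_path <- file.path(tempdir(), "codings.rds")
saveRDS(codings, rds_path)

# Read the file back to confirm it holds the same data frame
restored <- readRDS(rds_path)
```

In the app, `unit_id` would then be chosen as the unit ID, `human` and `llm_a` as coder columns, and `gold` as the optional gold-standard coder.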
A shiny.appobj
if (interactive()) {
  # Launch the app
  qlm_app()

  # Use a temporary directory (useful for testing)
  qlm_app(base_dir = tempdir())
}
A sample dataset for demonstrating the quallmer app's validation functionality. Contains movie review texts coded by multiple coders with a gold standard.
sample_data
A data frame with 20 rows and 6 variables:
Unique identifier for each text
Movie review text
Gold standard sentiment label (positive, negative, neutral)
Human coder's sentiment classification
LLM coder's sentiment classification (moderate accuracy)
Another LLM coder's sentiment classification (lower accuracy)
This dataset is useful for:
Testing inter-rater reliability calculations (using coder1, coder2, coder3)
Testing gold-standard validation (using gold_sentiment as reference)
Learning how to use the quallmer app
Demonstrating nominal measurement level metrics
if (interactive()) {
  # Option 1: Use the pre-made sample file from the package
  # Get the path to the sample data file
  sample_file <- system.file("extdata", "sample_data.rds", package = "quallmer.app")
  # Launch the app and upload this file through the UI
  qlm_app()

  # Option 2: Load the data and save your own copy
  data(sample_data)
  saveRDS(sample_data, "my_sample.rds")
  # Then load my_sample.rds in qlm_app()
}
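To make the nominal-level metrics listed above concrete, here is a minimal base R sketch of accuracy against a gold standard, pairwise percent agreement, and Cohen's kappa for two coders. It runs on a small toy data frame that only borrows the column names `gold_sentiment`, `coder1`, `coder2`, and `coder3`; the values are invented and are not the contents of `sample_data`. The helper `cohen_kappa()` is hypothetical and is not claimed to be how the app computes its metrics.

```r
# Toy data with invented values (NOT the packaged sample_data)
toy <- data.frame(
  gold_sentiment = c("positive", "negative", "neutral", "positive", "negative"),
  coder1 = c("positive", "negative", "neutral", "positive", "neutral"),
  coder2 = c("positive", "negative", "positive", "positive", "negative"),
  coder3 = c("negative", "negative", "neutral", "neutral", "negative"),
  stringsAsFactors = FALSE
)
coders <- c("coder1", "coder2", "coder3")

# Gold-standard validation: share of units where a coder matches the gold label
accuracy <- sapply(coders, function(cl) mean(toy[[cl]] == toy$gold_sentiment))

# Inter-rater reliability: percent agreement for every pair of coders
coder_pairs <- combn(coders, 2)
pairwise_agreement <- apply(coder_pairs, 2, function(p) {
  mean(toy[[p[1]]] == toy[[p[2]]])
})
names(pairwise_agreement) <- apply(coder_pairs, 2, paste, collapse = "-")

# Hypothetical helper: Cohen's kappa for two coders on nominal labels,
# kappa = (p_o - p_e) / (1 - p_e)
cohen_kappa <- function(a, b) {
  lv  <- union(a, b)                          # shared label set for both coders
  tab <- table(factor(a, levels = lv), factor(b, levels = lv))
  p_o <- sum(diag(tab)) / sum(tab)            # observed agreement
  p_e <- sum(rowSums(tab) * colSums(tab)) / sum(tab)^2  # chance agreement
  (p_o - p_e) / (1 - p_e)
}
kappa_12 <- cohen_kappa(toy$coder1, toy$coder2)
```

Percent agreement ignores agreement expected by chance, which is why kappa for `coder1` and `coder2` comes out lower than their raw agreement on the same toy data.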