Unseen Words
unseen_words( code, unweighted = TRUE, include_handcoded = FALSE, exclude_matched_by = c("excerpts", "word") )
code | Code object |
---|---|
unweighted | logical. If TRUE (default), the document matrix is binarized, so multiple occurrences of a single word in one line are counted only once |
include_handcoded | logical. If FALSE (default), words from the handcoded set are not used. If TRUE, the handcoded set is used |
exclude_matched_by | character, either "excerpts" (default) or "word". "excerpts" removes full excerpts that match an expression; "word" removes the matching words from the WDM so that they are not included in the excerpt sums |
Excerpts that contain unseen words
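
The effect of `unweighted` and `exclude_matched_by` can be illustrated with a small base R sketch. This is not the package's implementation: the excerpts, the pattern, and all variable names below are hypothetical, and the way the matrix is built is an assumption made only to show the idea of binarizing counts and of dropping matched excerpts versus matched words.

```r
# Hypothetical data, not taken from the package.
excerpts <- c("the cat sat on the cat mat", "a dog barked", "the bird sang")
pattern  <- "cat"  # stands in for an expression belonging to a code

# Build a small matrix of word counts (rows = excerpts, columns = words).
tokens <- strsplit(tolower(excerpts), "\\s+")
vocab  <- sort(unique(unlist(tokens)))
wdm <- t(vapply(
  tokens,
  function(x) tabulate(match(x, vocab), nbins = length(vocab)),
  integer(length(vocab))
))
colnames(wdm) <- vocab

# unweighted = TRUE (assumed behaviour): every count above zero becomes 1,
# so a word repeated within one excerpt is counted once.
wdm_bin <- (wdm > 0) * 1L

weighted_counts <- colSums(wdm)      # "cat" is counted twice
binary_counts   <- colSums(wdm_bin)  # "cat" is counted once

# exclude_matched_by = "excerpts" (assumed behaviour):
# drop every excerpt that matches the pattern.
matched_rows <- grepl(pattern, excerpts, fixed = TRUE)
by_excerpt   <- wdm_bin[!matched_rows, , drop = FALSE]

# exclude_matched_by = "word" (assumed behaviour):
# keep all excerpts but drop the matching word columns.
matched_words <- grepl(pattern, vocab, fixed = TRUE)
by_word       <- wdm_bin[, !matched_words, drop = FALSE]

# Remaining words per excerpt once matched words are removed.
words_per_excerpt <- rowSums(by_word)
```

In this sketch the "excerpts" option leaves two of the three excerpts, while the "word" option keeps all three and only removes the `cat` column before summing.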