Computes speakers' term usage rates

fit_term_usage(
  x,
  speaker,
  terms,
  smooth,
  term_weights,
  fill_method,
  fill_weight,
  weight_varname
)

Arguments

x

Text vector. May be a corpus_frame object

speaker

Vector of speaker labels. Should be the same length as x

terms

Vocabulary for the document-term matrix

smooth

Numeric value used to smooth term frequencies

term_weights

Data frame of distances (or any other weights) per word in the vocabulary. It should have one column, word, and a second column, named by weight_varname, containing the weight for each word

fill_method

If "value" (default), fill_weight is used to fill any terms with NA weight. If "mean", the mean of the term weights is used as the fill value

fill_weight

Numeric value used as the weight for any term that has no weight specified in term_weights

weight_varname

Name of the column in term_weights containing the weights
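
The interaction between term_weights, weight_varname, fill_method and fill_weight can be illustrated with a minimal base R sketch. This is not the function's actual implementation; the data, the column name dist, and the intermediate variables w, fill_value and imputed are hypothetical and only mirror the behaviour described above.

```r
# Hypothetical inputs; names mirror the documented arguments.
terms <- c("apple", "banana", "cherry")
term_weights <- data.frame(
  word = c("apple", "cherry"),
  dist = c(0.2, 0.6),          # assumed weight column, selected via weight_varname
  stringsAsFactors = FALSE
)
weight_varname <- "dist"
fill_method <- "mean"
fill_weight <- 0

# Align weights to the vocabulary; terms without an entry become NA.
w <- term_weights[[weight_varname]][match(terms, term_weights$word)]
names(w) <- terms

# Pick the fill value: the mean of the known weights, or fill_weight.
fill_value <- if (fill_method == "mean") mean(w, na.rm = TRUE) else fill_weight

# Record which terms were imputed, then fill them.
imputed <- terms[is.na(w)]
w[is.na(w)] <- fill_value
```

Here "banana" has no entry in term_weights, so it receives the mean of the two known weights. With fill_method = "value" it would receive fill_weight instead.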

Value

Named list containing: the terms, a vector of the number of tokens uttered by each speaker, the smoothing value, the term weights (NULL if term_weights = NULL), the terms whose weights were imputed (NULL if term_weights = NULL), the fill_weight used to fill missing weights (NULL if term_weights = NULL), and the (smoothed) term usage rate matrix
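
To give a sense of what a per-speaker token count and a smoothed term usage rate matrix might look like, here is a minimal base R sketch. The exact smoothing formula used by fit_term_usage is not stated on this page, so additive smoothing is assumed here purely for illustration. The whitespace tokenizer and the variables counts, n_tokens and rates are likewise hypothetical.

```r
# Hypothetical inputs; names mirror the documented arguments.
x <- c("a b a", "b c", "a a c c")
speaker <- c("s1", "s2", "s1")
terms <- c("a", "b", "c")
smooth <- 1

# Naive whitespace tokenization (an assumption; the real tokenizer may differ).
tokens <- strsplit(x, " ", fixed = TRUE)

# Count occurrences of each vocabulary term per speaker.
counts <- t(vapply(
  split(tokens, speaker),
  function(tok) tabulate(match(unlist(tok), terms), nbins = length(terms)),
  integer(length(terms))
))
colnames(counts) <- terms

# Tokens per speaker (only in-vocabulary tokens are counted in this sketch).
n_tokens <- rowSums(counts)

# Assumed additive smoothing: (count + smooth) / (total + smooth * vocabulary size).
rates <- (counts + smooth) / (n_tokens + smooth * length(terms))
```

Under this assumed formula each row of rates sums to 1, and a term a speaker never used still gets a small nonzero rate.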