Fit speaker_model to a corpus

The main function in stylest, `stylest_fit`, fits a model using a corpus of texts labeled by speaker.

stylest_fit(
  x,
  speaker,
  terms = NULL,
  filter = NULL,
  smooth = 0.5,
  term_weights = NULL,
  fill_method = "value",
  fill_weight = 0,
  weight_varname = "mean_distance"
)

Arguments

x	Text vector. May be a `corpus_frame` object
speaker	Vector of speaker labels. Should be the same length as `x`
terms	If not `NULL`, terms to be used in the model. If `NULL`, use all terms
filter	If not `NULL`, a text filter to specify the tokenization. See `corpus` for more information about specifying `filter`
smooth	Numeric value used to smooth term frequencies. Default is `0.5`
term_weights	Data frame of distances (or any other weights) per word in the vocabulary. It should have one column, `word`, and a second column, named by `weight_varname`, containing the weight for each word. See the vignette for details
fill_method	If `"value"` (default), `fill_weight` is used to fill any terms with `NA` weight. If `"mean"`, the mean of the weights in `term_weights` is used as the fill value
fill_weight	Numeric value used as the weight for any term that has no weight specified in `term_weights`. Default is `0.0`, which drops any words without weights
weight_varname	Name of the column in `term_weights` containing the weights. Default is `"mean_distance"`
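
The interaction between `term_weights`, `fill_method`, and `fill_weight` can be easier to follow with a small base-R sketch. The helper `fill_weights()` below is hypothetical and is not part of stylest; it only mimics the behavior described in the argument list, and it assumes that the `"mean"` fill is taken over the weights of vocabulary terms that have one. The words and weight values are made up for illustration.

```r
# Hypothetical weights table; the column names "word" and "mean_distance"
# follow the argument descriptions, the values are made up.
term_weights <- data.frame(
  word = c("the", "and", "whale"),
  mean_distance = c(0.10, 0.30, 0.80),
  stringsAsFactors = FALSE
)

# Hypothetical model vocabulary; "harpoon" has no entry in term_weights.
vocab <- c("the", "and", "whale", "harpoon")

# Illustrative helper (not a stylest function): look up each vocabulary
# term's weight, then fill missing weights by value or by the mean weight.
fill_weights <- function(vocab, term_weights, fill_method = "value",
                         fill_weight = 0, weight_varname = "mean_distance") {
  w <- term_weights[[weight_varname]][match(vocab, term_weights$word)]
  fill <- if (fill_method == "mean") mean(w, na.rm = TRUE) else fill_weight
  w[is.na(w)] <- fill
  names(w) <- vocab
  w
}

# "harpoon" receives fill_weight (0), so it contributes nothing.
w_value <- fill_weights(vocab, term_weights)

# "harpoon" receives the mean of the three known weights.
w_mean <- fill_weights(vocab, term_weights, fill_method = "mean")
```

With the default `fill_weight = 0`, unweighted terms are effectively dropped; `fill_method = "mean"` keeps them at an average weight instead.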

Value

An S3 `stylest_model` object containing:

speakers	Vector of unique speakers
filter	`text_filter` used
terms	Terms used in fitting the model
ntoken	Vector of number of tokens per speaker
smooth	Smoothing value
weights	If not `NULL`, a named matrix of weights for each term in the vocabulary
rate	Matrix of speaker rates for each term in the vocabulary
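
To give a sense of what a smoothed speaker-by-term rate matrix looks like, here is a minimal base-R sketch using additive smoothing. The formula is an assumption made for illustration (this page does not state the exact computation stylest performs), and the counts are made up.

```r
# Made-up term counts: rows are speakers, columns are vocabulary terms.
counts <- matrix(
  c(4, 0, 1,
    2, 3, 0),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("speaker_a", "speaker_b"), c("the", "whale", "ship"))
)

smooth <- 0.5              # same value as the default of the `smooth` argument
ntoken <- rowSums(counts)  # number of tokens per speaker
nterm  <- ncol(counts)     # vocabulary size

# Additive smoothing (assumed formula): each count is shifted by `smooth`,
# so terms a speaker never used still get a small nonzero rate, and each
# speaker's rates sum to 1.
rate <- (counts + smooth) / (ntoken + smooth * nterm)

print(round(rate, 3))
```

Larger values of `smooth` pull every speaker's rates toward a uniform distribution over the vocabulary; smaller values stay closer to the raw relative frequencies.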

Details

The user may specify `terms` to restrict the model vocabulary. If `terms` is not specified, all terms will be used.

Examples

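library(stylest)

# Load the example corpus shipped with the package
data(novels_excerpts)

# Fit a model with one text per row, labeled by author
speaker_mod <- stylest_fit(novels_excerpts$text, novels_excerpts$author)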
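
As a possible follow-up, the sketch below refits the model with an explicit `smooth` value and inspects a few of the components listed under Value. It assumes the returned object can be indexed with `$` like a list, and the choice of `smooth = 1` is arbitrary. The call is guarded so that nothing runs when stylest is not installed.

```r
# Guarded so the snippet is a no-op when stylest is not installed.
if (requireNamespace("stylest", quietly = TRUE)) {
  data("novels_excerpts", package = "stylest")

  # Same call as above, with the smoothing value set explicitly
  # (1 is an arbitrary illustrative choice).
  mod <- stylest::stylest_fit(novels_excerpts$text, novels_excerpts$author,
                              smooth = 1)

  print(mod$speakers)   # unique speaker labels
  print(mod$ntoken)     # number of tokens per speaker
  print(dim(mod$rate))  # dimensions of the rate matrix
}
```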
