See Available Models for possible values.
Set to true to enable streaming. See Chat Streaming.
Non-negative number representing the temperature to use for generation. Higher values make the output more uniformly random (more creative); lower values make it more deterministic. 0.0 means greedy decoding. Defaults to 0.4.
Parameter that restricts the model, at each step, to the top_k tokens
with the highest probabilities. Defaults to 1024.
Parameter used for nucleus sampling: at each step, only the smallest set of most probable tokens
whose cumulative probability reaches top_p is considered. Defaults to 0.95.
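To illustrate how the temperature, top_k, and top_p parameters interact, here is a minimal Python sketch. It is not the service's actual implementation: the function name `filter_distribution` is hypothetical, and the order of operations (temperature, then top_k, then top_p measured against the unfiltered probabilities) is an assumption that real decoders may handle differently. The default arguments mirror the defaults documented above.

```python
import math


def filter_distribution(logits, temperature=0.4, top_k=1024, top_p=0.95):
    """Return the renormalized next-token distribution as {token_id: prob}.

    Hypothetical helper for illustration only; the order of the three
    filters is an assumption, not documented API behavior.
    """
    if temperature == 0.0:
        # Greedy decoding: all probability mass goes to the argmax token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return {best: 1.0}

    # Temperature scaling followed by a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # top_k: keep only the k most probable tokens.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    ranked = ranked[:top_k]

    # top_p: keep the smallest prefix whose cumulative probability
    # reaches top_p (nucleus sampling).
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}
```

A sampler would then draw the next token from the returned dictionary, for example with `random.choices(list(d), weights=list(d.values()))`. Lowering top_k or top_p shrinks the candidate set, while lowering the temperature concentrates probability on the most likely tokens within it.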