See Available Models for possible values.
Set to true to enable streaming. See Chat Streaming.
Non-negative number representing the temperature to use for generation. Higher values make the output more uniformly random, and therefore more creative. 0.0 means greedy decoding. Defaults to 0.4.
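As a rough illustration of what temperature does, the sketch below rescales a list of raw token scores before converting them to probabilities. It shows the standard technique, not necessarily how this service implements it; the function name `apply_temperature` and the example scores are hypothetical.

```python
import math


def apply_temperature(logits, temperature):
    """Turn raw token scores into probabilities, scaled by temperature.

    Hypothetical helper for illustration only; not part of the API.
    """
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    if temperature == 0.0:
        # Greedy decoding: all probability mass goes to the top-scoring token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


scores = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
low = apply_temperature(scores, 0.4)
high = apply_temperature(scores, 2.0)
```

With the lower temperature, most of the probability concentrates on the first token; with the higher one, the three probabilities move closer together.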
Forces the model to consider only the top_k highest-probability tokens at the next step. Defaults to 1024.
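A minimal sketch of top-k filtering, assuming the candidate probabilities are available as a plain list. The helper name `top_k_filter` is hypothetical, and the service's actual implementation may differ.

```python
def top_k_filter(probs, k):
    """Keep the k most probable tokens, zero out the rest, and renormalize.

    Hypothetical helper for illustration only; not part of the API.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    # Indices of the k largest probabilities (all of them if k exceeds the list).
    keep = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    keep_set = set(keep)
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep_set else 0.0 for i in range(len(probs))]


filtered = top_k_filter([0.5, 0.3, 0.2], 2)  # the third token is dropped
```

A k larger than the number of candidates leaves the distribution unchanged, so a large default such as 1024 only prunes the long tail of a large vocabulary.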
Parameter used for nucleus sampling, i.e. only the most probable tokens whose cumulative probability reaches top_p are considered at the next step. Defaults to 0.95.
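Nucleus sampling can be sketched as follows: sort tokens by probability, keep the smallest prefix whose cumulative probability reaches top_p, and renormalize. The helper name `top_p_filter` and the example distribution are hypothetical; this shows the general technique rather than the service's exact behavior.

```python
def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability >= p.

    Hypothetical helper for illustration only; not part of the API.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in the interval (0, 1]")
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = []
    cumulative = 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break  # the nucleus is complete
    kept_set = set(kept)
    total = sum(probs[i] for i in kept)
    return [probs[i] / total if i in kept_set else 0.0 for i in range(len(probs))]


nucleus = top_p_filter([0.5, 0.3, 0.15, 0.05], 0.7)  # keeps the first two tokens
```

Unlike top_k, the number of tokens kept varies from step to step: a confident distribution yields a small nucleus, a flat one yields a large nucleus.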
Whether to consider using a search engine to complete the request. Note that even if this is set to True, the model might decide not to use search.