Parallel Thinking
Reka Research supports a version of test time compute called Parallel Thinking for tasks and applications that require the highest quality results.
Example Usage
To enable this, you need to set mode
to either "high"
or "low"
. This controls the amount of compute used to generate the answer. The default, "none"
, is equivalent to the default Reka Research behavior.
Parallel Thinking requests are not supported when streaming is enabled.
The response is in the exact same format as a response from a standard Reka Research request. This allows you to seamlessly transition between the two modes and decide dynamically when you need the extra compute and quality that parallel thinking gives you.
Performance Impact
Under the hood, Reka Research is using more compute to generate a higher quality response. This means that there will be a slight performance impact on the request.