Reward-Based Token Modelling with Selective Cloud Assistance

Nov 25, 2024

This method reduces traffic to the cloud LLM, which lowers cost, and it lets the reward score threshold be tuned to control response quality: a stricter threshold sends more tokens to the cloud, a looser one keeps more of them local.
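
The threshold mechanism can be illustrated with a minimal sketch. It assumes that a small local model proposes each token, a reward model scores the proposal, and the cloud LLM is queried only when the score falls below the threshold. The function names (`local_next_token`, `cloud_next_token`, `reward_score`) and the toy stand-ins are hypothetical and are not taken from the original work.

```python
from typing import Callable, List, Tuple

Tokens = List[str]


def generate_with_selective_cloud(
    prompt: Tokens,
    local_next_token: Callable[[Tokens], str],     # hypothetical local small model
    cloud_next_token: Callable[[Tokens], str],     # hypothetical cloud LLM call
    reward_score: Callable[[Tokens, str], float],  # hypothetical reward model
    threshold: float,
    max_new_tokens: int = 16,
    eos: str = "<eos>",
) -> Tuple[Tokens, int]:
    """Return the generated tokens and the number of cloud calls made."""
    tokens = list(prompt)
    cloud_calls = 0
    for _ in range(max_new_tokens):
        candidate = local_next_token(tokens)
        # Assumption: a low reward means the local token is not trusted,
        # so the cloud model supplies the token instead.
        if reward_score(tokens, candidate) < threshold:
            candidate = cloud_next_token(tokens)
            cloud_calls += 1
        tokens.append(candidate)
        if candidate == eos:
            break
    return tokens, cloud_calls


# Toy stand-ins, for illustration only.
def toy_local(context: Tokens) -> str:
    # Pretend the local model is unsure whenever the context length is odd.
    return "?" if len(context) % 2 == 1 else "a"


def toy_cloud(context: Tokens) -> str:
    return "b"


def toy_reward(context: Tokens, token: str) -> float:
    return 0.2 if token == "?" else 0.9


if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0):
        out, calls = generate_with_selective_cloud(
            ["x"], toy_local, toy_cloud, toy_reward, threshold=t, max_new_tokens=4
        )
        print(f"threshold={t}: tokens={out}, cloud_calls={calls}")
```

In this toy setup, raising the threshold increases the number of cloud calls and lowering it keeps more tokens local, which is the cost and quality trade-off described above.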

Related Research & Thoughts