Reinforcement Learning - Interview Questions
What's the difference between on-policy and off-policy evaluation?
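On-policy evaluation estimates the value of a policy from data generated by that same policy: you run the policy in the environment and measure the rewards it collects. This is the most direct form of evaluation in reinforcement learning. Off-policy evaluation estimates the value of one policy (the target policy) from data generated by a different policy (the behavior policy), without running the target policy itself. Because the logged actions follow a different distribution than the target policy would produce, the estimate has to be corrected, for example with importance sampling. Off-policy evaluation is less common, but it is useful when running the target policy is costly or risky, or when only previously logged data is available.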
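To make the contrast concrete, here is a minimal sketch on a one-step, two-action problem. Everything in it is an illustrative assumption rather than part of the original answer: the function names `on_policy_value` and `off_policy_value`, the toy reward function, and the choice of ordinary importance sampling as the off-policy correction. Policies are represented as lists of action probabilities.

```python
import random


def on_policy_value(policy, reward_fn, n_episodes, rng):
    """Estimate a policy's value by running that same policy."""
    actions = range(len(policy))
    total = 0.0
    for _ in range(n_episodes):
        # Sample an action from the policy being evaluated.
        action = rng.choices(actions, weights=policy)[0]
        total += reward_fn(action)
    return total / n_episodes


def off_policy_value(logged, target, behavior):
    """Estimate the target policy's value from (action, reward) pairs
    logged under the behavior policy, using ordinary importance sampling."""
    total = 0.0
    for action, reward in logged:
        # Reweight each sample by how much more (or less) likely the
        # target policy is to take the logged action than the behavior policy.
        weight = target[action] / behavior[action]
        total += weight * reward
    return total / len(logged)


def toy_reward(action):
    # Hypothetical environment: action 1 pays 1.0, action 0 pays nothing.
    return 1.0 if action == 1 else 0.0


rng = random.Random(0)
behavior = [0.5, 0.5]  # policy that generated the logged data
target = [0.1, 0.9]    # policy we want to evaluate

# Off-policy: collect data with the behavior policy only.
logged = []
for _ in range(10_000):
    a = rng.choices(range(2), weights=behavior)[0]
    logged.append((a, toy_reward(a)))

print("on-policy estimate: ", on_policy_value(target, toy_reward, 10_000, rng))
print("off-policy estimate:", off_policy_value(logged, target, behavior))
```

Both estimates target the same quantity, the expected reward of the target policy (0.9 in this toy setup), but only the first one ever executes the target policy. The importance weights require that the behavior policy assigns nonzero probability to every action the target policy can take, and they can make the estimate high-variance when the two policies differ a lot.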