Correct Answer : In comparison to SARSA, QL directly learns the optimal policy, whereas SARSA learns a policy that is "near" the optimal
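Explanation : Q-learning (QL) is off-policy: its update bootstraps from the maximum Q-value over the actions available in the next state, so under the usual convergence conditions it estimates the optimal action-value function regardless of the exploratory policy used to collect experience. SARSA is on-policy: its update uses the action the agent actually takes next, so it learns the value of the policy being followed (for example, epsilon-greedy), which is only "near" the optimal as long as the agent keeps exploring.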
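
The difference comes down to the bootstrap target in each update. Below is a minimal sketch, assuming a tabular Q-function stored as a nested dict; the function names, state and action labels, and numeric values are illustrative and not taken from the question.

```python
# Tabular Q-values as a nested dict: q[state][action] -> float (assumed layout).

def q_learning_target(q, reward, next_state, gamma):
    """Off-policy target: bootstrap from the best action in the next state."""
    return reward + gamma * max(q[next_state].values())


def sarsa_target(q, reward, next_state, next_action, gamma):
    """On-policy target: bootstrap from the action actually taken next."""
    return reward + gamma * q[next_state][next_action]


def td_update(q, state, action, target, alpha):
    """Move q[state][action] a fraction alpha of the way toward the target."""
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


# Illustrative values: from state "s2" the greedy action is "right",
# but an exploring agent happens to pick "left".
q = {
    "s1": {"left": 0.0, "right": 0.0},
    "s2": {"left": 1.0, "right": 5.0},
}
reward, gamma = 1.0, 0.5

ql_target = q_learning_target(q, reward, "s2", gamma)        # uses max -> 5.0
sarsa_tgt = sarsa_target(q, reward, "s2", "left", gamma)     # uses "left" -> 1.0
```

With these numbers the Q-learning target ignores the exploratory choice, while the SARSA target reflects it. The two targets coincide whenever the next action is the greedy one, which is why SARSA's learned policy approaches the optimal one as exploration is reduced.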