Correct Answer :   In comparison to SARSA, QL directly learns the optimal policy, whereas SARSA learns a policy that is "near" the optimal
                                        
                                                                                
                                        Explanation : In comparison to SARSA, QL directly learns the optimal policy, whereas SARSA learns a policy that is "near" the optimal.