Correct Answer : On-policy
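Explanation : In an on-policy learning algorithm, the target policy (the policy being evaluated and improved) is the same as the behavior policy (the policy used to select actions while interacting with the environment).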
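
To make the distinction concrete, here is a minimal sketch comparing the update target of SARSA (a standard on-policy method) with that of Q-learning (a standard off-policy method). The table `q`, the helper names, and the values of `epsilon` and `gamma` are illustrative assumptions, not part of the original question.

```python
import random

def epsilon_greedy(q_row, epsilon, rng):
    """Behavior policy: random action with probability epsilon, else greedy."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=lambda a: q_row[a])

def sarsa_target(q, reward, next_state, next_action, gamma):
    """On-policy target: bootstraps from the action the behavior policy picked."""
    return reward + gamma * q[next_state][next_action]

def q_learning_target(q, reward, next_state, gamma):
    """Off-policy target: bootstraps from the greedy action, regardless of behavior."""
    return reward + gamma * max(q[next_state])

# Hypothetical 2-state, 2-action table of action-value estimates.
q = [[0.0, 0.0], [1.0, 5.0]]
rng = random.Random(0)

# The next action is sampled from the same policy that is being learned (on-policy).
next_action = epsilon_greedy(q[1], epsilon=0.5, rng=rng)
on_policy = sarsa_target(q, reward=1.0, next_state=1, next_action=next_action, gamma=0.9)
off_policy = q_learning_target(q, reward=1.0, next_state=1, gamma=0.9)
```

In the SARSA target, the next action comes from the same epsilon-greedy policy that generates experience, so target policy and behavior policy coincide. In the Q-learning target, the `max` corresponds to a greedy target policy that can differ from the exploratory behavior policy.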