Correct Answer : Target policy
Explanation : The target policy is the policy that the agent is trying to learn, as opposed to the behavior policy it follows to generate experience.
Correct Answer : Discount factor
Explanation : Gamma (γ) in the Bellman equation is known as the discount factor. It determines how much future rewards count relative to immediate rewards.
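
As a rough illustration of what the discount factor does, the sketch below computes a discounted return, G = r_0 + γ·r_1 + γ²·r_2 + ..., for a finite list of rewards. The function name `discounted_return` and the sample rewards are assumptions made for this example, not part of the original quiz.

```python
def discounted_return(rewards, gamma):
    """Return r_0 + gamma*r_1 + gamma^2*r_2 + ... for a finite reward list.

    `rewards` and `gamma` are illustrative inputs (hypothetical names).
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma is expected to lie in [0, 1]")
    total = 0.0
    # Work backwards so each step applies G_t = r_t + gamma * G_{t+1}.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


if __name__ == "__main__":
    sample_rewards = [1.0, 1.0, 1.0]  # made-up rewards for illustration
    for gamma in (0.0, 0.5, 1.0):
        print(gamma, discounted_return(sample_rewards, gamma))
```

With γ = 0 only the immediate reward counts, and with γ = 1 all rewards count equally; values in between shrink the weight of later rewards geometrically.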
Correct Answer : Markov state
Explanation : In reinforcement learning, the agent's state is represented as a Markov state.
Explanation : A state S_t is a Markov state if and only if P[S_{t+1} | S_t] = P[S_{t+1} | S_1, ..., S_t]. In other words, the next state depends on the history only through the current state.
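
The Markov condition can be sketched with a small simulated chain in which the sampling function receives only the current state and never the earlier history. The two-state chain, its probabilities, and the names `TRANSITIONS`, `next_state`, and `rollout` are all hypothetical, chosen only for illustration.

```python
import random

# Hypothetical two-state chain; the probabilities are illustrative only.
TRANSITIONS = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}


def next_state(current, rng):
    """Sample S_{t+1} from S_t alone; earlier states are never consulted."""
    options = TRANSITIONS[current]
    return rng.choices(list(options), weights=list(options.values()))[0]


def rollout(start, steps, seed=0):
    """Generate a trajectory of `steps` transitions starting from `start`."""
    rng = random.Random(seed)  # seeded so repeated runs give the same path
    states = [start]
    for _ in range(steps):
        states.append(next_state(states[-1], rng))
    return states


if __name__ == "__main__":
    print(rollout("sunny", 5))
```

Because `next_state` takes only `current` as input, the distribution of the next state cannot depend on anything earlier in the trajectory, which is what the equality above expresses.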