Correct Answer : On-policy
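Explanation : SARSA is an on-policy learning algorithm: it updates its action-value estimates using the action that the current policy actually selects in the next state.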
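The on-policy update can be illustrated with a minimal sketch. The function name `sarsa_update`, the dictionary-based Q-table, and the default values for `alpha` and `gamma` are assumptions chosen for illustration, not part of the original quiz.

```python
from collections import defaultdict


def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.5, gamma=0.9):
    """Apply one SARSA step to the Q-table `q` and return the new estimate.

    The target uses `next_action`, the action the current policy actually
    chose in `next_state`, which is what makes the update on-policy.
    """
    td_target = reward + gamma * q[(next_state, next_action)]
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical states and actions, for illustration only.
q_table = defaultdict(float)
q_table[("s1", "right")] = 2.0
new_value = sarsa_update(q_table, "s0", "left", 1.0, "s1", "right")
```

An off-policy method such as Q-learning would instead take the maximum over all actions in `next_state`, regardless of which action the policy goes on to take.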
Correct Answer : Discount factor
Explanation : Gamma (γ) in the Bellman equation is the discount factor. It takes a value between 0 and 1 and determines how much future rewards count relative to immediate ones.
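The effect of the discount factor can be seen in a small sketch that computes a discounted return for a finite list of rewards. The function name `discounted_return` and the sample rewards are assumptions for illustration.

```python
def discounted_return(rewards, gamma):
    """Return r_0 + gamma * r_1 + gamma^2 * r_2 + ... for a finite reward list."""
    total = 0.0
    # Work backwards so each step applies one more factor of gamma.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical reward sequence, for illustration only.
rewards = [1.0, 1.0, 1.0]
myopic = discounted_return(rewards, 0.0)      # only the first reward counts
balanced = discounted_return(rewards, 0.5)    # later rewards are halved each step
farsighted = discounted_return(rewards, 1.0)  # all rewards count equally
```

With gamma close to 0 the agent is short-sighted; with gamma close to 1 it weights distant rewards almost as much as immediate ones.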
Correct Answer : Markov state
Explanation : The agent state in reinforcement learning is represented by a Markov state.
Explanation : A state S_t is a Markov state if and only if P[S_{t+1} | S_t] = P[S_{t+1} | S_1, ..., S_t]; that is, the next state depends only on the current state and not on the earlier history.
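The Markov condition can be checked numerically on a toy example. The sketch below assumes a hypothetical two-state chain with transition matrix `P` and start distribution `START`; all names and numbers are illustrative and not taken from the quiz.

```python
from itertools import product

# Hypothetical 2-state chain: P[i][j] = P[S_{t+1} = j | S_t = i].
P = [[0.9, 0.1],
     [0.4, 0.6]]
START = [0.5, 0.5]  # assumed distribution of the first state


def path_prob(path):
    """Probability of observing the exact state sequence `path`."""
    prob = START[path[0]]
    for current, nxt in zip(path, path[1:]):
        prob *= P[current][nxt]
    return prob


def cond_given_history(history, nxt):
    """P[S_{t+1} = nxt | S_1, ..., S_t = history]."""
    return path_prob(history + (nxt,)) / path_prob(history)


def cond_given_current(current, nxt, t):
    """P[S_{t+1} = nxt | S_t = current], summing over every possible history."""
    numerator = denominator = 0.0
    for prefix in product(range(len(P)), repeat=t - 1):
        history = prefix + (current,)
        denominator += path_prob(history)
        numerator += path_prob(history + (nxt,))
    return numerator / denominator


# Both conditionals agree, so the current state carries all relevant history.
with_history = cond_given_history((0, 1, 1), 0)
current_only = cond_given_current(1, 0, t=3)
```

In this chain the two quantities match for every history ending in the same state, which is exactly what the condition above requires.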