Correct Answer: Except for the input layer, all units in the other layers should be nonlinear.
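Explanation: To give a network generalization capability, all units except those in the input layer should be nonlinear. If the units in the other layers were linear, the layers would compose into a single linear map, so the network could represent only linear input-output relationships no matter how many layers it had.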
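The point can be checked numerically with a minimal sketch. The weights `W1`, `W2`, the input `x`, and the helpers `matvec` and `matmul` are illustrative assumptions, not part of the original question, and `tanh` stands in for any nonlinear unit. Two stacked linear layers give the same output as one layer with the product weights, while a nonlinear hidden layer does not collapse this way.

```python
import math

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def matmul(A, B):
    """Multiply matrix A by matrix B (both lists of rows)."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

# Hypothetical weights and input, chosen only for illustration.
W1 = [[1.0, -2.0], [0.5, 1.5]]   # hidden layer: 2 inputs -> 2 units
W2 = [[2.0, 1.0]]                # output layer: 2 units -> 1 output
x = [1.0, 2.0]

# Two linear layers applied one after the other.
linear_stack = matvec(W2, matvec(W1, x))

# A single linear layer whose weights are the product W2 * W1.
collapsed = matvec(matmul(W2, W1), x)

# The same network with a nonlinear (tanh) hidden layer.
hidden = [math.tanh(h) for h in matvec(W1, x)]
nonlinear_stack = matvec(W2, hidden)

print(linear_stack, collapsed, nonlinear_stack)
```

The first two printed values agree, which shows that the extra linear layer adds no representational power. The third value differs, because the tanh hidden layer cannot be folded into a single weight matrix.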