As you can see, the LSTM model can become quite complex. In order to still retain the functionality of retaining information across time and yet not make a too complex model, we need GRUs.
Basically, in GRUs, instead of having an additional Forget gate, we combine the input and Forget gates into a single Update Gate :
It is this reduction in the number of gates that makes GRU less complex and faster than LSTM.