DQN

class DQN(model, gamma=None, lr=None)

__init__(model, gamma=None, lr=None)

DQN algorithm

Parameters:
    model (parl.Model) – forward neural network representing the Q function.
    gamma (float) – discount factor used when accumulating reward.
    lr (float) – learning rate.

learn(obs, action, reward, next_obs, terminal): update the Q function (self.model) using the DQN algorithm.
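
The update that learn performs can be illustrated with the standard one-step DQN target. The following is a minimal NumPy sketch, not PARL's implementation: it assumes discrete actions, a mean-squared-error loss, and next-state values taken from the target network. The helper names dqn_targets and dqn_loss and the sample batch are hypothetical.

```python
import numpy as np

def dqn_targets(reward, next_q_target, terminal, gamma):
    """One-step TD targets: r + gamma * (1 - terminal) * max_a' Q_target(s', a')."""
    reward = np.asarray(reward, dtype=np.float64)
    terminal = np.asarray(terminal, dtype=np.float64)
    # Best action value in the next state, according to the target network.
    best_next = np.max(next_q_target, axis=1)
    # Terminal transitions do not bootstrap from the next state.
    return reward + gamma * (1.0 - terminal) * best_next

def dqn_loss(q_values, action, targets):
    """Mean squared error between Q(s, a) of the taken actions and the targets."""
    batch = np.arange(len(action))
    chosen_q = q_values[batch, action]
    return float(np.mean((chosen_q - targets) ** 2))

# Hypothetical batch of two transitions with three discrete actions.
q_values = np.array([[1.0, 2.0, 0.5], [0.0, 1.0, 3.0]])        # Q(obs, .)
next_q_target = np.array([[0.5, 1.5, 1.0], [2.0, 0.0, 1.0]])    # Q_target(next_obs, .)
action = np.array([1, 2])
reward = np.array([1.0, -1.0])
terminal = np.array([0.0, 1.0])  # the second transition ends the episode

targets = dqn_targets(reward, next_q_target, terminal, gamma=0.9)
loss = dqn_loss(q_values, action, targets)
```

In a real training step the loss would be minimized with respect to the training network's parameters at learning rate lr, while the target values are treated as constants.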

predict(obs): use self.model (the Q function) to predict the action values for obs.
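
Since predict returns action values rather than an action, the caller typically picks the action itself. The sketch below assumes the values arrive as a 2-D array of shape (batch, num_actions) and selects the greedy action per row; the helper greedy_action and the sample values are hypothetical and not part of this class.

```python
import numpy as np

def greedy_action(q_values):
    """Return the index of the highest action value for each observation."""
    return np.argmax(q_values, axis=-1)

# Hypothetical action values for two observations and three actions.
q_values = np.array([[0.1, 0.7, 0.2], [1.5, -0.3, 0.9]])
actions = greedy_action(q_values)
```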

sync_target(): copy the parameters of the training network to the target network.
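
Conceptually, synchronizing the target network is a hard copy of every parameter. The sketch below illustrates this with plain dictionaries of NumPy arrays standing in for network parameters; it is an assumption-laden illustration, not how PARL stores or copies weights, and copy_params is a hypothetical helper.

```python
import numpy as np

def copy_params(train_params, target_params):
    """Overwrite each target parameter with a copy of the training parameter."""
    for name, value in train_params.items():
        # Copy so later updates to the training network do not leak into the target.
        target_params[name] = value.copy()

# Hypothetical parameter sets for a tiny network.
train_params = {"w": np.array([1.0, 2.0]), "b": np.array([0.5])}
target_params = {"w": np.zeros(2), "b": np.zeros(1)}

copy_params(train_params, target_params)

# A later change to the training parameters leaves the target untouched.
train_params["w"][0] = 10.0
```

Keeping the target parameters fixed between synchronizations is what stabilizes the bootstrapped targets used during learning.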