DDQN

class DDQN(model, gamma=None, lr=None)

Bases: parl.core.paddle.algorithm.Algorithm

__init__(model, gamma=None, lr=None)

Double DQN (DDQN) algorithm.

Parameters:
  • model (parl.Model) – forward neural network representing the Q function.
  • gamma (float) – discount factor used when computing the cumulative reward.
  • lr (float) – learning rate.
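
To illustrate what gamma controls, the following is a minimal sketch of a discounted cumulative reward. It is plain Python and is not part of PARL; the function name discounted_return and the sample rewards are assumptions made for illustration only.

```python
def discounted_return(rewards, gamma):
    """Return r_0 + gamma * r_1 + gamma^2 * r_2 + ... for a reward sequence.

    Hypothetical helper for illustration; not a PARL function.
    """
    total = 0.0
    # Accumulate from the last reward backwards so each step applies
    # one more factor of gamma to everything that comes after it.
    for r in reversed(rewards):
        total = r + gamma * total
    return total


# Three rewards of 1.0 with gamma = 0.5: 1 + 0.5 + 0.25
g = discounted_return([1.0, 1.0, 1.0], gamma=0.5)
print(g)  # 1.75
```

A gamma close to 1 weights future rewards almost as heavily as immediate ones, while a gamma close to 0 makes the return depend mostly on the next reward.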

learn(obs, action, reward, next_obs, terminal)

Update the Q function (self.model) using the DDQN algorithm.
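
As a rough sketch of the kind of target a Double DQN update regresses toward, the example below computes targets for a small batch in plain Python. It follows the general Double DQN idea (the online network chooses the next action, a separate target network scores it) and is not taken from PARL's source. The function name ddqn_targets, the use of a separate target network, and the list-based inputs are assumptions for illustration.

```python
def ddqn_targets(reward, terminal, next_q_online, next_q_target, gamma):
    """Compute Double DQN regression targets for a batch of transitions.

    Hypothetical helper for illustration; not PARL's implementation.
    next_q_online / next_q_target hold per-action values for next_obs,
    as produced by the online and target networks respectively.
    """
    targets = []
    for r, done, q_online, q_target in zip(
        reward, terminal, next_q_online, next_q_target
    ):
        # The online network selects the greedy next action ...
        best_action = max(range(len(q_online)), key=lambda a: q_online[a])
        # ... and the target network evaluates that selected action.
        bootstrap = q_target[best_action]
        # Terminal transitions do not bootstrap from the next state.
        targets.append(r + (0.0 if done else gamma * bootstrap))
    return targets


targets = ddqn_targets(
    reward=[1.0, 0.5],
    terminal=[False, True],
    next_q_online=[[0.2, 0.9], [0.4, 0.1]],
    next_q_target=[[0.5, 0.3], [0.7, 0.6]],
    gamma=0.9,
)
```

In the first transition the online values pick action 1, so the target uses the target network's value 0.3 for that action rather than its own maximum 0.5. Decoupling selection from evaluation in this way is what distinguishes Double DQN from standard DQN.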

predict(obs)

Use self.model (the Q function) to predict the action values.
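
A caller typically turns predicted action values into an action by taking the index of the largest value. The sketch below shows that step in plain Python on a hand-written list; greedy_action is a hypothetical helper, and the exact type returned by predict (for example a tensor) is not specified here, so the list input is an assumption.

```python
def greedy_action(action_values):
    """Return the index of the highest action value.

    Hypothetical helper for illustration; ties resolve to the lowest index.
    """
    return max(range(len(action_values)), key=lambda a: action_values[a])


# Stand-in for the action values predicted for a single observation.
action = greedy_action([0.1, 0.7, 0.2])
print(action)  # 1
```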