# Create Customized Algorithms¶

Goal of this tutorial:

• Learn how to implement your own algorithms.

## Overview¶

To build a new algorithm, you need to inherit class `parl.Algorithm` and implement three basic functions: `sample`, `predict` and `learn`.

## Methods¶

• `__init__`

As algorithms update weights of the models, this method needs to define some models inherited from `parl.Model`, like `self.model` in this example. You can also set some hyperparameters in this method, like learning_rate, reward_decay and action_dimension, which might be used in the following steps.

• `predict`

This function defines how to choose actions. For instance, you can use a policy model to predict actions.

• `sample`

Based on `predict` method, `sample` generates actions with noises. Use this method to do exploration if needed.

• `learn`

Define loss function in `learn` method, which will be used to update weights of `self.model`.

## Example: DQN¶

This example shows how to implement DQN algorithm based on class `parl.Algorithm` according to the steps mentioned above.

Within class `DQN(Algorithm)`, we define the following methods:

• __init__(self, model, act_dim=None, gamma=None, lr=None)

We define `self.model` and `self.target_model` of DQN in this method, which are instances of class `parl.Model`. And we also set hyperparameters act_dim, gamma and lr here. We will use these parameters in `learn` method.

```def __init__(self,
model,
act_dim=None,
gamma=None,
lr=None):
""" DQN algorithm

Args:
model (parl.Model): model defining forward network of Q function
hyperparas (dict): (deprecated) dict of hyper parameters.
act_dim (int): dimension of the action space
gamma (float): discounted factor for reward computation.
lr (float): learning rate.
"""
self.model = model
self.target_model = copy.deepcopy(model)

assert isinstance(act_dim, int)
assert isinstance(gamma, float)
assert isinstance(lr, float)
self.act_dim = act_dim
self.gamma = gamma
self.lr = lr
```
• predict(self, obs)

We use the forward network defined in `self.model` here, which uses observations to predict action values directly.

```def predict(self, obs):
""" use value model self.model to predict the action value
"""
return self.model.value(obs)
```
• learn(self, obs, action, reward, next_obs, terminal)

`learn` method calculates the cost of value function according to the predict value and the target value. `Agent` will use the cost to update weights in `self.model`.

```def learn(self, obs, action, reward, next_obs, terminal):
""" update value model self.model with DQN algorithm
"""

pred_value = self.model.value(obs)
next_pred_value = self.target_model.value(next_obs)
best_v = layers.reduce_max(next_pred_value, dim=1)
target = reward + (
1.0 - layers.cast(terminal, dtype='float32')) * self.gamma * best_v

action_onehot = layers.one_hot(action, self.act_dim)
action_onehot = layers.cast(action_onehot, dtype='float32')
pred_action_value = layers.reduce_sum(
layers.elementwise_mul(action_onehot, pred_value), dim=1)
cost = layers.square_error_cost(pred_action_value, target)
cost = layers.reduce_mean(cost)
Use this method to synchronize the weights in `self.target_model` with those in `self.model`. This is the step used in DQN algorithm.
```def sync_target(self, gpu_id=None):