tianshou.utils¶
-
class
tianshou.utils.
MovAvg
(size: int = 100)[source]¶ Bases:
object
Class for moving average. Infinite and NaN values are excluded automatically. Usage:
>>> import torch
>>> from tianshou.utils import MovAvg
>>> stat = MovAvg(size=66)
>>> stat.add(torch.tensor(5))
5.0
>>> stat.add(float('inf'))  # inf is ignored, so the average is unchanged
5.0
>>> stat.add([6, 7, 8])
6.5
>>> stat.get()
6.5
>>> print(f'{stat.mean():.2f}±{stat.std():.2f}')
6.50±1.12
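The filtering behaviour described above can be sketched in plain Python. This is a minimal illustration, not the tianshou implementation: the class name SimpleMovAvg is hypothetical, it accepts only Python numbers and iterables of numbers (no tensors), and returning 0.0 for an empty window is an assumption.

```python
import math
from collections import deque
from typing import Iterable, Union

Number = Union[int, float]


class SimpleMovAvg:
    """Hypothetical stand-in for a moving average that skips inf and NaN."""

    def __init__(self, size: int = 100) -> None:
        # A bounded deque drops the oldest value once `size` is exceeded.
        self.cache = deque(maxlen=size)

    def add(self, x: Union[Number, Iterable[Number]]) -> float:
        """Add one number or an iterable of numbers; return the new mean."""
        values = [x] if isinstance(x, (int, float)) else list(x)
        for v in values:
            # math.isfinite is False for inf, -inf and NaN, so those are skipped.
            if math.isfinite(v):
                self.cache.append(float(v))
        return self.get()

    def get(self) -> float:
        """Mean of the stored values (0.0 when empty, by assumption)."""
        return sum(self.cache) / len(self.cache) if self.cache else 0.0

    def std(self) -> float:
        """Population standard deviation of the stored values."""
        if not self.cache:
            return 0.0
        mean = self.get()
        return math.sqrt(sum((v - mean) ** 2 for v in self.cache) / len(self.cache))
```

With a window holding 5, 6, 7 and 8, the mean is 6.5 and the population standard deviation rounds to 1.12, which agrees with the usage output shown for MovAvg.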
-
tianshou.utils.
pre_compile
()[source]¶ Numba compiles a function the first time it is called, so this function calls the commonly used functions once on fake data with the usual argument types to trigger compilation ahead of time. Without this warm-up, the compilation cost is paid during training, and the measured training speed is not comparable with previous results.
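The warm-up pattern can be sketched as follows. This is an illustration under stated assumptions, not the body of pre_compile: the function discounted_sum and the helper warm_up are hypothetical, and the fallback decorator exists only so the sketch still runs when Numba is not installed.

```python
try:
    from numba import njit  # optional dependency
except ImportError:
    def njit(func):
        """Fallback: leave the function uncompiled when Numba is absent."""
        return func


@njit
def discounted_sum(reward, gamma, steps):
    """Sum of reward * gamma**k for k in range(steps) (hypothetical example)."""
    total = 0.0
    discount = 1.0
    for _ in range(steps):
        total += discount * reward
        discount *= gamma
    return total


def warm_up():
    """Call the function once with fake data of the argument types used later.

    With Numba installed, this first call triggers compilation for the
    (float, float, int) signature, so later calls skip that cost.
    """
    discounted_sum(0.0, 0.99, 1)


warm_up()
```

The fake arguments only need to match the types of the real ones; their values are irrelevant because the result of the warm-up call is discarded.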
-
class
tianshou.utils.net.common.
Net
(layer_num: int, state_shape: tuple, action_shape: Optional[Union[tuple, int]] = 0, device: Union[str, torch.device] = 'cpu', softmax: bool = False, concat: bool = False, hidden_layer_size: int = 128, dueling: Optional[Tuple[int, int]] = None, norm_layer: Optional[torch.nn.modules.module.Module] = None)[source]¶ Bases:
torch.nn.modules.module.Module
Simple MLP backbone. For advanced usage (how to customize the network), please refer to Build the Network.
- Parameters
concat (bool) – whether the input shape is the concatenation of state_shape and action_shape. If True, action_shape is not the output shape but affects the input shape.
dueling (Optional[Tuple[int, int]]) – whether to use a dueling network to calculate Q values (for Dueling DQN). Per the signature, the default is None, which disables it.
norm_layer (nn.modules.Module) – which normalization to apply before ReLU, e.g., nn.LayerNorm or nn.BatchNorm1d, defaults to None.
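The effect of concat on the layer dimensions can be sketched without PyTorch. The helpers flat_size and mlp_layer_sizes are hypothetical, and the layer layout (one input layer, layer_num hidden layers, an optional output layer) is an assumption made for illustration; only the rule that concat adds action_shape to the input instead of producing an output layer comes from the parameter description.

```python
from math import prod
from typing import List, Sequence, Tuple, Union

Shape = Union[int, Sequence[int]]


def flat_size(shape: Shape) -> int:
    """Number of scalars described by a shape (an int is taken as-is)."""
    return shape if isinstance(shape, int) else prod(shape)


def mlp_layer_sizes(
    layer_num: int,
    state_shape: Shape,
    action_shape: Shape = 0,
    concat: bool = False,
    hidden_layer_size: int = 128,
) -> List[Tuple[int, int]]:
    """Return (in_features, out_features) pairs for a simple MLP (assumed layout)."""
    input_size = flat_size(state_shape)
    if concat:
        # The action is part of the input, e.g. for a critic Q(s, a).
        input_size += flat_size(action_shape)
    sizes = [(input_size, hidden_layer_size)]
    sizes += [(hidden_layer_size, hidden_layer_size)] * layer_num
    if action_shape and not concat:
        # Without concat, action_shape determines the output layer instead.
        sizes.append((hidden_layer_size, flat_size(action_shape)))
    return sizes
```

For example, a state of shape (3,) and an action of shape (1,) give a 4-dimensional input and no action-sized output layer when concat is True.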
-
training
: bool¶
-
class
tianshou.utils.net.common.
Recurrent
(layer_num, state_shape, action_shape, device='cpu', hidden_layer_size=128)[source]¶ Bases:
torch.nn.modules.module.Module
Simple Recurrent network based on LSTM. For advanced usage (how to customize the network), please refer to Build the Network.
-
forward
(s, state=None, info={})[source]¶ In evaluation mode, s should have shape [bsz, dim]; in training mode, s should have shape [bsz, len, dim]. See the code and comments for more detail.
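One common way for a single forward method to accept both layouts is to treat an evaluation-mode input as a sequence of length 1. The sketch below works on shape tuples only; the helper as_sequence_shape is hypothetical, and whether Recurrent normalizes its input exactly this way is an assumption, so check the source before relying on it.

```python
from typing import Tuple


def as_sequence_shape(shape: Tuple[int, ...]) -> Tuple[int, int, int]:
    """Map [bsz, dim] or [bsz, len, dim] to a uniform [bsz, len, dim] shape."""
    if len(shape) == 2:
        # Evaluation mode: one step per batch element, so insert len = 1.
        bsz, dim = shape
        return (bsz, 1, dim)
    if len(shape) == 3:
        # Training mode: already [bsz, len, dim].
        bsz, length, dim = shape
        return (bsz, length, dim)
    raise ValueError(f"expected 2 or 3 dimensions, got {len(shape)}")
```

A batch of 32 observations with 8 features becomes (32, 1, 8), while a training batch of shape (32, 10, 8) passes through unchanged.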
-
training
: bool¶
-
-
tianshou.utils.net.common.
miniblock
(inp: int, oup: int, norm_layer: torch.nn.modules.module.Module) → List[torch.nn.modules.module.Module][source]¶
-
class
tianshou.utils.net.discrete.
Actor
(preprocess_net, action_shape, hidden_layer_size=128)[source]¶ Bases:
torch.nn.modules.module.Module
For advanced usage (how to customize the network), please refer to Build the Network.
-
training
: bool¶
-
-
class
tianshou.utils.net.discrete.
Critic
(preprocess_net, hidden_layer_size=128)[source]¶ Bases:
torch.nn.modules.module.Module
For advanced usage (how to customize the network), please refer to Build the Network.
-
training
: bool¶
-
-
class
tianshou.utils.net.discrete.
DQN
(c, h, w, action_shape, device='cpu')[source]¶ Bases:
torch.nn.modules.module.Module
For advanced usage (how to customize the network), please refer to Build the Network.
Reference paper: “Human-level control through deep reinforcement learning”.
-
training
: bool¶
-
-
class
tianshou.utils.net.continuous.
Actor
(preprocess_net, action_shape, max_action=1.0, device='cpu', hidden_layer_size=128)[source]¶ Bases:
torch.nn.modules.module.Module
For advanced usage (how to customize the network), please refer to Build the Network.
-
training
: bool¶
-
-
class
tianshou.utils.net.continuous.
ActorProb
(preprocess_net, action_shape, max_action=1.0, device='cpu', unbounded=False, hidden_layer_size=128)[source]¶ Bases:
torch.nn.modules.module.Module
For advanced usage (how to customize the network), please refer to Build the Network.
-
training
: bool¶
-
-
class
tianshou.utils.net.continuous.
Critic
(preprocess_net, device='cpu', hidden_layer_size=128)[source]¶ Bases:
torch.nn.modules.module.Module
For advanced usage (how to customize the network), please refer to Build the Network.
-
training
: bool¶
-
-
class
tianshou.utils.net.continuous.
RecurrentActorProb
(layer_num, state_shape, action_shape, max_action=1.0, device='cpu', unbounded=False, hidden_layer_size=128)[source]¶ Bases:
torch.nn.modules.module.Module
For advanced usage (how to customize the network), please refer to Build the Network.
-
training
: bool¶
-
-
class
tianshou.utils.net.continuous.
RecurrentCritic
(layer_num, state_shape, action_shape=0, device='cpu', hidden_layer_size=128)[source]¶ Bases:
torch.nn.modules.module.Module
For advanced usage (how to customize the network), please refer to Build the Network.
-
training
: bool¶
-