critic


class CriticEnsembleFactory

abstract create_module(envs: Environments, device: str | device, ensemble_size: int, use_action: bool) → Module

create_module_opt(envs: Environments, device: str | device, ensemble_size: int, use_action: bool, optim_factory: OptimizerFactory, lr: float) → ModuleOpt

class CriticEnsembleFactoryContinuousNet(hidden_sizes: Sequence[int])

create_module(envs: Environments, device: str | device, ensemble_size: int, use_action: bool) → Module

class CriticEnsembleFactoryDefault(hidden_sizes: Sequence[int] = (64, 64))

A critic ensemble factory which, depending on the type of environment, creates a suitable MLP-based critic.

DEFAULT_HIDDEN_SIZES = (64, 64)

create_module(envs: Environments, device: str | device, ensemble_size: int, use_action: bool) → Module

class CriticFactory

Represents a factory for the generation of a critic module.

abstract create_module(envs: Environments, device: str | device, use_action: bool, discrete_last_size_use_action_shape: bool = False) → Module

Creates the critic module.

Parameters:
  • envs – the environments

  • device – the torch device

  • use_action – whether to expect the action as an additional input (in addition to the observations)

  • discrete_last_size_use_action_shape – whether, for the discrete case, the output dimension shall use the action shape

Returns:

the module
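The effect of use_action and discrete_last_size_use_action_shape on a critic's shape can be sketched without any deep learning library. The helper below is hypothetical (critic_io_dims is not part of the documented API) and rests on two assumptions that the parameter descriptions suggest but do not spell out: the action, when used, is flattened and concatenated to the flattened observation, and the critic emits a single value unless the flag requests an output sized by the action shape.

```python
from math import prod


def critic_io_dims(
    obs_shape: tuple[int, ...],
    action_shape: tuple[int, ...],
    use_action: bool,
    discrete_last_size_use_action_shape: bool = False,
) -> tuple[int, int]:
    """Return (input_dim, output_dim) of a hypothetical MLP critic."""
    obs_dim = prod(obs_shape)
    action_dim = prod(action_shape)
    # Assumption: the action is appended to the observation when use_action is set.
    input_dim = obs_dim + action_dim if use_action else obs_dim
    # Assumption: one scalar output unless the flag asks for the action shape.
    output_dim = action_dim if discrete_last_size_use_action_shape else 1
    return input_dim, output_dim


# Critic that scores (observation, action) pairs.
q_dims = critic_io_dims((8,), (2,), use_action=True)
# Critic that sees only the observation and emits one output per discrete action.
per_action_dims = critic_io_dims(
    (4,), (3,), use_action=False, discrete_last_size_use_action_shape=True
)
```

Under these assumptions, q_dims describes a network with 10 inputs and 1 output, and per_action_dims one with 4 inputs and 3 outputs.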

create_module_opt(envs: Environments, device: str | device, use_action: bool, optim_factory: OptimizerFactory, lr: float, discrete_last_size_use_action_shape: bool = False) → ModuleOpt

Creates the critic module along with its optimizer for the given learning rate.

Parameters:
  • envs – the environments

  • device – the torch device

  • use_action – whether to expect the action as an additional input (in addition to the observations)

  • optim_factory – the optimizer factory

  • lr – the learning rate

  • discrete_last_size_use_action_shape – whether, for the discrete case, the output dimension shall use the action shape

Returns:

the module together with its optimizer

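Because create_module is abstract while create_module_opt is not, the two methods plausibly form a template-method pair: subclasses decide which module to build, and the base class pairs it with an optimizer. The sketch below illustrates that pattern only. Every name in it (ModuleOptSketch, CriticFactorySketch, DummyCriticFactory, dummy_optim_factory) is a hypothetical stand-in, the envs and device parameters are omitted for brevity, and the assumption that the optimizer factory receives the module and the learning rate is not confirmed by this page.

```python
from abc import ABC, abstractmethod
from collections.abc import Callable
from dataclasses import dataclass
from typing import Any


@dataclass
class ModuleOptSketch:
    """Hypothetical stand-in for ModuleOpt: a module paired with its optimizer."""

    module: Any
    optim: Any


class CriticFactorySketch(ABC):
    @abstractmethod
    def create_module(self, use_action: bool) -> Any:
        """Subclasses decide which critic to build."""

    def create_module_opt(
        self,
        use_action: bool,
        optim_factory: Callable[[Any, float], Any],
        lr: float,
    ) -> ModuleOptSketch:
        # Build the module first, then ask the factory for a matching optimizer.
        module = self.create_module(use_action)
        optim = optim_factory(module, lr)
        return ModuleOptSketch(module, optim)


class DummyCriticFactory(CriticFactorySketch):
    def create_module(self, use_action: bool) -> dict:
        # A plain dict stands in for a real network.
        return {"kind": "critic", "use_action": use_action}


def dummy_optim_factory(module: Any, lr: float) -> dict:
    # A plain dict stands in for a real optimizer bound to the module.
    return {"params_of": module, "lr": lr}


result = DummyCriticFactory().create_module_opt(True, dummy_optim_factory, lr=1e-3)
```

With this layout a concrete factory implements only create_module, and callers that also need an optimizer get both objects from a single call.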
class CriticFactoryContinuousNet(hidden_sizes: Sequence[int], activation: type[torch.nn.Module] = torch.nn.ReLU)

create_module(envs: Environments, device: str | device, use_action: bool, discrete_last_size_use_action_shape: bool = False) → Module

Creates the critic module.

Parameters:
  • envs – the environments

  • device – the torch device

  • use_action – whether to expect the action as an additional input (in addition to the observations)

  • discrete_last_size_use_action_shape – whether, for the discrete case, the output dimension shall use the action shape

Returns:

the module

class CriticFactoryDefault(hidden_sizes: Sequence[int] = (64, 64), hidden_activation: type[torch.nn.Module] = torch.nn.ReLU)

A critic factory which, depending on the type of environment, creates a suitable MLP-based critic.

DEFAULT_HIDDEN_SIZES = (64, 64)

create_module(envs: Environments, device: str | device, use_action: bool, discrete_last_size_use_action_shape: bool = False) → Module

Creates the critic module.

Parameters:
  • envs – the environments

  • device – the torch device

  • use_action – whether to expect the action as an additional input (in addition to the observations)

  • discrete_last_size_use_action_shape – whether, for the discrete case, the output dimension shall use the action shape

Returns:

the module
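Choosing a critic "depending on the type of environment" amounts to a dispatch on the environment type. The following sketch shows that idea under stated assumptions: EnvTypeSketch and select_critic_factory are hypothetical names, and the mapping from continuous and discrete environments to CriticFactoryContinuousNet and CriticFactoryDiscreteNet is inferred from the class names on this page rather than confirmed by it. The function returns the chosen class name as a string so that the example runs without the library installed.

```python
from collections.abc import Sequence
from enum import Enum


class EnvTypeSketch(Enum):
    """Hypothetical environment categories."""

    CONTINUOUS = "continuous"
    DISCRETE = "discrete"


DEFAULT_HIDDEN_SIZES = (64, 64)


def select_critic_factory(
    env_type: EnvTypeSketch,
    hidden_sizes: Sequence[int] = DEFAULT_HIDDEN_SIZES,
) -> tuple[str, tuple[int, ...]]:
    """Return the name of the factory to delegate to and its hidden sizes."""
    if env_type is EnvTypeSketch.CONTINUOUS:
        return "CriticFactoryContinuousNet", tuple(hidden_sizes)
    if env_type is EnvTypeSketch.DISCRETE:
        return "CriticFactoryDiscreteNet", tuple(hidden_sizes)
    # Fail loudly for categories this sketch does not cover.
    raise ValueError(f"unsupported environment type: {env_type}")


continuous_choice = select_critic_factory(EnvTypeSketch.CONTINUOUS)
discrete_choice = select_critic_factory(EnvTypeSketch.DISCRETE, hidden_sizes=[128, 128])
```

Keeping the dispatch in one place lets callers pass a single default factory regardless of the action space.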

class CriticFactoryDiscreteNet(hidden_sizes: Sequence[int], activation: type[torch.nn.Module] = torch.nn.ReLU)

create_module(envs: Environments, device: str | device, use_action: bool, discrete_last_size_use_action_shape: bool = False) → Module

Creates the critic module.

Parameters:
  • envs – the environments

  • device – the torch device

  • use_action – whether to expect the action as an additional input (in addition to the observations)

  • discrete_last_size_use_action_shape – whether, for the discrete case, the output dimension shall use the action shape

Returns:

the module

class CriticFactoryReuseActor(actor_future: ActorFuture)

A critic factory which reuses the actor’s preprocessing component.

This class is for internal use in experiment builders only.

create_module(envs: Environments, device: str | device, use_action: bool, discrete_last_size_use_action_shape: bool = False) → Module

Creates the critic module.

Parameters:
  • envs – the environments

  • device – the torch device

  • use_action – whether to expect the action as an additional input (in addition to the observations)

  • discrete_last_size_use_action_shape – whether, for the discrete case, the output dimension shall use the action shape

Returns:

the module
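Reusing the actor's preprocessing component means that actor and critic hold a reference to the same object instead of two separate copies. The sketch below illustrates only that sharing. Preprocess, Actor and Critic are hypothetical stand-ins built on plain Python lists, and how the real class obtains the actor through ActorFuture is not shown here.

```python
class Preprocess:
    """Hypothetical shared feature extractor with one tunable parameter."""

    def __init__(self, scale: float) -> None:
        self.scale = scale

    def __call__(self, obs: list[float]) -> list[float]:
        return [self.scale * x for x in obs]


class Actor:
    def __init__(self, preprocess: Preprocess) -> None:
        self.preprocess = preprocess

    def __call__(self, obs: list[float]) -> int:
        features = self.preprocess(obs)
        # Pick the index of the largest feature as the action.
        return max(range(len(features)), key=features.__getitem__)


class Critic:
    def __init__(self, preprocess: Preprocess) -> None:
        self.preprocess = preprocess

    def __call__(self, obs: list[float]) -> float:
        # Sum of the shared features serves as a toy value estimate.
        return sum(self.preprocess(obs))


actor = Actor(Preprocess(scale=2.0))
critic = Critic(actor.preprocess)  # reuse the same object, do not copy it

value_before = critic([1.0, 3.0])
actor.preprocess.scale = 0.5  # a change made through the actor ...
value_after = critic([1.0, 3.0])  # ... is visible to the critic as well
```

Because the component is shared, any update to it affects both networks, which is the main design consequence of this kind of reuse.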