tianshou.env

class tianshou.env.BaseVectorEnv(env_fns: List[Callable[[], gym.core.Env]])[source]

Bases: abc.ABC, gym.core.Env

Base class for vectorized environment wrappers. Usage:

env_num = 8
envs = VectorEnv([lambda: gym.make(task) for _ in range(env_num)])
assert len(envs) == env_num

It accepts a list of environment generators. In other words, an environment generator efn for a specific task is a callable such that efn() returns an environment for that task, for example gym.make(task).

All vectorized environment classes must inherit from BaseVectorEnv. Here are some other usages:

envs.seed(2)  # equivalent to the next line
envs.seed([2, 3, 4, 5, 6, 7, 8, 9])  # set specific seed for each env
obs = envs.reset()  # reset all environments
obs = envs.reset([0, 5, 7])  # reset 3 specific environments
obs, rew, done, info = envs.step([1] * 8)  # step synchronously
envs.render()  # render all environments
envs.close()  # close all environments
__len__() → int[source]

Return len(self), which is the number of environments.

abstract close() → None[source]

Close all of the environments.

Environments will automatically close() themselves when garbage collected or when the program exits.

abstract render(**kwargs) → None[source]

Render all of the environments.

abstract reset(id: Union[int, List[int], None] = None)[source]

Reset the state of all the environments and return the initial observations if id is None; otherwise, reset the specific environments with the given id, which can be an int or a list.

abstract seed(seed: Union[int, List[int], None] = None) → List[int][source]

Set the seed for all environments.

Accept None, an int i (which will be extended to [i, i + 1, i + 2, ...], one seed per environment), or a list of ints.

Returns

The list of seeds used in this env’s random number generators.

The first value in the list should be the “main” seed, or the value which a reproducer should pass to seed().
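The int-to-list extension described above can be sketched as follows. This is an illustrative helper, not tianshou's internal code; expand_seed and env_num are hypothetical names:

```python
from typing import List, Optional, Union


def expand_seed(seed: Union[int, List[int], None],
                env_num: int) -> List[Optional[int]]:
    """Expand a scalar seed into one seed per environment (a sketch)."""
    if seed is None:
        return [None] * env_num  # let each environment seed itself
    if isinstance(seed, int):
        # an int i is extended to [i, i + 1, ..., i + env_num - 1]
        return [seed + i for i in range(env_num)]
    assert len(seed) == env_num, "need one seed per environment"
    return list(seed)
```

With env_num = 8, expand_seed(2, 8) yields [2, 3, 4, 5, 6, 7, 8, 9], which is why the two envs.seed calls in the usage example above are equivalent.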

abstract step(action: numpy.ndarray) → Tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray, numpy.ndarray][source]

Run one timestep of all the environments’ dynamics. When the end of an episode is reached, you are responsible for calling reset(id) to reset that environment’s state.

Accept a batch of actions and return a tuple (obs, rew, done, info).

Parameters

action (numpy.ndarray) – a batch of actions provided by the agent.

Returns

A tuple including four items:

  • obs a numpy.ndarray, the agent’s observations of the current environments

  • rew a numpy.ndarray, the amount of rewards returned after previous actions

  • done a numpy.ndarray, whether these episodes have ended, in which case further step() calls will return undefined results

  • info a numpy.ndarray, contains auxiliary diagnostic information (helpful for debugging, and sometimes learning)
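The step/reset contract above can be sketched with a pure-Python for-loop wrapper. ToyEnv and DummyVectorEnv are illustrative stand-ins, not tianshou's internals; tianshou's real classes stack the per-env results into numpy arrays, while this sketch uses plain lists to stay self-contained:

```python
class ToyEnv:
    """Minimal stand-in for a gym.Env: the episode ends after 3 steps."""
    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= 3, {}


class DummyVectorEnv:
    """For-loop sketch of the BaseVectorEnv contract (lists instead of
    numpy arrays, to keep the example dependency-free)."""
    def __init__(self, env_fns):
        self.envs = [fn() for fn in env_fns]

    def __len__(self):
        return len(self.envs)

    def reset(self, id=None):
        if id is None:
            ids = range(len(self.envs))
        elif isinstance(id, int):
            ids = [id]
        else:
            ids = id
        return [self.envs[i].reset() for i in ids]

    def step(self, action):
        results = [e.step(a) for e, a in zip(self.envs, action)]
        obs, rew, done, info = (list(x) for x in zip(*results))
        return obs, rew, done, info


envs = DummyVectorEnv([ToyEnv for _ in range(4)])
envs.reset()
for _ in range(3):
    obs, rew, done, info = envs.step([1] * len(envs))
    if any(done):
        # the caller is responsible for resetting finished environments
        envs.reset([i for i, d in enumerate(done) if d])
```

Note the explicit reset of finished environments in the loop: step() itself never resets, matching the docstring above.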

class tianshou.env.VectorEnv(env_fns: List[Callable[[], gym.core.Env]])[source]

Bases: tianshou.env.vecenv.BaseVectorEnv

Dummy vectorized environment wrapper, implemented with a for-loop: all environments run sequentially in the current process.

See also

Please refer to BaseVectorEnv for more detailed explanation.

close() → None[source]

Close all of the environments.

Environments will automatically close() themselves when garbage collected or when the program exits.

render(**kwargs) → None[source]

Render all of the environments.

reset(id: Union[int, List[int], None] = None) → numpy.ndarray[source]

Reset the state of all the environments and return the initial observations if id is None; otherwise, reset the specific environments with the given id, which can be an int or a list.

seed(seed: Union[int, List[int], None] = None) → List[int][source]

Set the seed for all environments.

Accept None, an int i (which will be extended to [i, i + 1, i + 2, ...], one seed per environment), or a list of ints.

Returns

The list of seeds used in this env’s random number generators.

The first value in the list should be the “main” seed, or the value which a reproducer should pass to seed().

step(action: numpy.ndarray) → Tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray, numpy.ndarray][source]

Run one timestep of all the environments’ dynamics. When the end of an episode is reached, you are responsible for calling reset(id) to reset that environment’s state.

Accept a batch of actions and return a tuple (obs, rew, done, info).

Parameters

action (numpy.ndarray) – a batch of actions provided by the agent.

Returns

A tuple including four items:

  • obs a numpy.ndarray, the agent’s observations of the current environments

  • rew a numpy.ndarray, the amount of rewards returned after previous actions

  • done a numpy.ndarray, whether these episodes have ended, in which case further step() calls will return undefined results

  • info a numpy.ndarray, contains auxiliary diagnostic information (helpful for debugging, and sometimes learning)

class tianshou.env.SubprocVectorEnv(env_fns: List[Callable[], gym.core.Env]])[source]

Bases: tianshou.env.vecenv.BaseVectorEnv

Vectorized environment wrapper based on subprocess.

See also

Please refer to BaseVectorEnv for more detailed explanation.
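A minimal sketch of the worker-and-pipe pattern such a subprocess-based wrapper can use: each child process owns one environment and serves commands over a pipe, so all environments step concurrently. ToyEnv, _worker, and run_subproc_demo are illustrative names, not tianshou's internals, and the "fork" start method is assumed (POSIX):

```python
import multiprocessing as mp


class ToyEnv:
    """Minimal stand-in for a gym.Env: the episode ends after 3 steps."""
    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= 3, {}


def _worker(conn, env_fn):
    """Own one environment in a child process; serve commands over a pipe."""
    env = env_fn()
    env.reset()
    while True:
        cmd, data = conn.recv()
        if cmd == "step":
            conn.send(env.step(data))
        else:  # "close"
            conn.close()
            break


def run_subproc_demo(env_num=2):
    ctx = mp.get_context("fork")  # keeps the sketch simple on POSIX systems
    conns, procs = [], []
    for _ in range(env_num):
        parent, child = ctx.Pipe()
        p = ctx.Process(target=_worker, args=(child, ToyEnv), daemon=True)
        p.start()
        conns.append(parent)
        procs.append(p)
    # send all actions first, then collect: the envs step concurrently
    for conn in conns:
        conn.send(("step", 1))
    results = [conn.recv() for conn in conns]
    for conn in conns:
        conn.send(("close", None))
    for p in procs:
        p.join()
    return results


if __name__ == "__main__":
    print(run_subproc_demo())  # [(1, 1.0, False, {}), (1, 1.0, False, {})]
```

The send-all-then-receive-all order in run_subproc_demo is what buys the parallelism: sending an action does not block on the environment finishing its step.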

close() → None[source]

Close all of the environments.

Environments will automatically close() themselves when garbage collected or when the program exits.

render(**kwargs) → None[source]

Render all of the environments.

reset(id: Union[int, List[int], None] = None) → numpy.ndarray[source]

Reset the state of all the environments and return the initial observations if id is None; otherwise, reset the specific environments with the given id, which can be an int or a list.

seed(seed: Union[int, List[int], None] = None) → List[int][source]

Set the seed for all environments.

Accept None, an int i (which will be extended to [i, i + 1, i + 2, ...], one seed per environment), or a list of ints.

Returns

The list of seeds used in this env’s random number generators.

The first value in the list should be the “main” seed, or the value which a reproducer should pass to seed().

step(action: numpy.ndarray) → Tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray, numpy.ndarray][source]

Run one timestep of all the environments’ dynamics. When the end of an episode is reached, you are responsible for calling reset(id) to reset that environment’s state.

Accept a batch of actions and return a tuple (obs, rew, done, info).

Parameters

action (numpy.ndarray) – a batch of actions provided by the agent.

Returns

A tuple including four items:

  • obs a numpy.ndarray, the agent’s observations of the current environments

  • rew a numpy.ndarray, the amount of rewards returned after previous actions

  • done a numpy.ndarray, whether these episodes have ended, in which case further step() calls will return undefined results

  • info a numpy.ndarray, contains auxiliary diagnostic information (helpful for debugging, and sometimes learning)

class tianshou.env.RayVectorEnv(env_fns: List[Callable[[], gym.core.Env]])[source]

Bases: tianshou.env.vecenv.BaseVectorEnv

Vectorized environment wrapper based on ray. According to our tests, however, it is about two times slower than SubprocVectorEnv.

See also

Please refer to BaseVectorEnv for more detailed explanation.

close() → None[source]

Close all of the environments.

Environments will automatically close() themselves when garbage collected or when the program exits.

render(**kwargs) → None[source]

Render all of the environments.

reset(id: Union[int, List[int], None] = None) → numpy.ndarray[source]

Reset the state of all the environments and return the initial observations if id is None; otherwise, reset the specific environments with the given id, which can be an int or a list.

seed(seed: Union[int, List[int], None] = None) → List[int][source]

Set the seed for all environments.

Accept None, an int i (which will be extended to [i, i + 1, i + 2, ...], one seed per environment), or a list of ints.

Returns

The list of seeds used in this env’s random number generators.

The first value in the list should be the “main” seed, or the value which a reproducer should pass to seed().

step(action: numpy.ndarray) → Tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray, numpy.ndarray][source]

Run one timestep of all the environments’ dynamics. When the end of an episode is reached, you are responsible for calling reset(id) to reset that environment’s state.

Accept a batch of actions and return a tuple (obs, rew, done, info).

Parameters

action (numpy.ndarray) – a batch of actions provided by the agent.

Returns

A tuple including four items:

  • obs a numpy.ndarray, the agent’s observations of the current environments

  • rew a numpy.ndarray, the amount of rewards returned after previous actions

  • done a numpy.ndarray, whether these episodes have ended, in which case further step() calls will return undefined results

  • info a numpy.ndarray, contains auxiliary diagnostic information (helpful for debugging, and sometimes learning)