gomoku_rl.policy.common module
- gomoku_rl.policy.common.get_optimizer(cfg: DictConfig, params: Iterable[Parameter]) → Optimizer
- gomoku_rl.policy.common.make_dataset_naive(tensordict: TensorDict, batch_size: int) → Generator[TensorDict, None, None]
- gomoku_rl.policy.common.make_dqn_actor(cfg: DictConfig, action_spec: TensorSpec, device: device | str | int | None)
- gomoku_rl.policy.common.make_egreedy_actor(actor: TensorDictModule, action_spec: TensorSpec, eps_init: float = 1.0, eps_end: float = 0.1, annealing_num_steps: int = 1000)
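
The `eps_init`, `eps_end`, and `annealing_num_steps` defaults of `make_egreedy_actor` suggest an exploration rate that decays from 1.0 to 0.1 over 1000 steps. The signature alone does not say which schedule is used, so the following is only a minimal sketch that assumes linear annealing. `linear_epsilon` is a hypothetical helper written for illustration and is not part of `gomoku_rl`.

```python
def linear_epsilon(
    step: int,
    eps_init: float = 1.0,
    eps_end: float = 0.1,
    annealing_num_steps: int = 1000,
) -> float:
    """Hypothetical helper (not part of gomoku_rl).

    Returns the exploration rate after `step` annealing steps, assuming
    a linear decay from `eps_init` to `eps_end`.
    """
    if annealing_num_steps <= 0:
        # No annealing window: use the final value immediately.
        return eps_end
    # Clamp the step into [0, annealing_num_steps] so epsilon stays at
    # eps_init before step 0 and at eps_end once annealing is finished.
    clamped = min(max(step, 0), annealing_num_steps)
    fraction = clamped / annealing_num_steps
    return eps_init + fraction * (eps_end - eps_init)


if __name__ == "__main__":
    # Print the assumed schedule at a few points using the default arguments.
    for step in (0, 250, 500, 1000, 2000):
        print(step, round(linear_epsilon(step), 4))
```

Under this assumption, an agent acts uniformly at random with probability `linear_epsilon(step)` and otherwise follows the wrapped actor's greedy action, so exploration is heaviest at the start of training and settles at `eps_end` afterwards.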