gomoku_rl.policy package

Module contents

gomoku_rl.policy.get_policy(name: str, cfg: DictConfig, action_spec: DiscreteTensorSpec, observation_spec: TensorSpec, device='cuda') → Policy

Looks up the policy class registered under the given name and returns an instance built from the supplied configuration, action and observation specifications, and device.

Parameters:

name (str) – The name of the policy to retrieve, which should match a key in the Policy registry.
cfg (DictConfig) – Configuration settings for the policy, typically containing hyperparameters and other policy-specific settings.
action_spec (DiscreteTensorSpec) – The specification of the action space, defining the shape, type, and bounds of actions the policy can take.
observation_spec (TensorSpec) – The specification of the observation space, defining the shape and type of observations the policy will receive from the environment.
device – The computing device ('cuda' or 'cpu') on which the policy's computations are performed. Defaults to 'cuda'.

Returns:

An instance of the requested policy class, initialized with the provided configurations, action and observation specifications, and device.

Return type:

Policy
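
The name-based lookup described above can be illustrated with a small, self-contained sketch. It does not reproduce the gomoku_rl implementation: the registry dictionary, the register_policy decorator, the BasePolicy and ToyPPO classes, and get_policy_sketch are all hypothetical names introduced for illustration, and plain Python values stand in for DictConfig and the tensor specs so the snippet runs without extra dependencies.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

# Hypothetical registry: maps a policy name to the class implementing it.
_POLICY_REGISTRY: Dict[str, type] = {}


def register_policy(name: str) -> Callable[[type], type]:
    """Class decorator that records a policy class under `name`."""
    def decorator(cls: type) -> type:
        _POLICY_REGISTRY[name] = cls
        return cls
    return decorator


@dataclass
class BasePolicy:
    """Stand-in for a policy base class (assumption, not the real Policy)."""
    cfg: Dict[str, Any]
    action_spec: Any
    observation_spec: Any
    device: str = "cuda"


@register_policy("ppo")
class ToyPPO(BasePolicy):
    """Toy policy used only to populate the registry."""


def get_policy_sketch(name, cfg, action_spec, observation_spec, device="cuda"):
    """Look up `name` in the registry and instantiate the matching class."""
    try:
        cls = _POLICY_REGISTRY[name]
    except KeyError:
        known = ", ".join(sorted(_POLICY_REGISTRY))
        raise KeyError(f"unknown policy {name!r}; registered: {known}") from None
    return cls(cfg=cfg, action_spec=action_spec,
               observation_spec=observation_spec, device=device)


# Placeholder values: 225 actions and a (3, 15, 15) observation are
# illustrative numbers for a 15x15 board, not values taken from gomoku_rl.
policy = get_policy_sketch("ppo", {"lr": 3e-4}, action_spec=225,
                           observation_spec=(3, 15, 15), device="cpu")
```

Raising a KeyError that lists the registered names is one way to make a mistyped policy name easy to diagnose; whether get_policy itself behaves this way is not stated in this reference.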

gomoku_rl.policy.get_pretrained_policy(name: str, cfg: DictConfig, action_spec: DiscreteTensorSpec, observation_spec: TensorSpec, checkpoint_path: str, device='cuda') → Policy

Initializes and returns a pretrained policy object based on the specified policy name, configuration, action and observation specifications, checkpoint path, and device.