Documentation for make_env

src.pcgym.pcgym.make_env

Bases: Env

__init__(env_params)

Initialize the environment with given parameters.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `env_params` | `dict` | Environment configuration parameters including model selection, spaces, simulation parameters, constraints, and custom functions. | required |
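As a rough illustration, the configuration might be assembled as a plain dictionary like the one below. The key names (`model`, `N`, `tsim`, `x0`, `o_space`, `a_space`), the values, and the `missing_keys` helper are all assumptions made for this sketch; this page does not confirm the exact schema, so check the library's examples before relying on them.

```python
# Hypothetical env_params layout. Key names and values are assumptions,
# not the confirmed schema of the library.
env_params = {
    "model": "cstr",              # model selection (assumed key and value)
    "N": 100,                     # number of simulation steps (assumed key)
    "tsim": 25.0,                 # simulated time horizon (assumed key)
    "x0": [0.8, 330.0, 0.8],      # initial state (assumed key)
    "o_space": {"low": [0.7, 300.0, 0.8], "high": [1.0, 350.0, 0.9]},
    "a_space": {"low": [295.0], "high": [302.0]},
}


def missing_keys(params, required=("model", "N", "tsim", "x0", "o_space", "a_space")):
    """Return the assumed-required keys that are absent from params."""
    return [key for key in required if key not in params]


# An empty list means every assumed-required key is present.
print(missing_keys(env_params))
```

Checking the dictionary before constructing the environment is one way to surface a misspelled key early rather than partway through a simulation.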

reset(seed=0, **kwargs)

Reset the environment to its initial state.

This method resets the environment's state, time, and other relevant variables. It's called at the beginning of each episode.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `seed` | `int` | Seed for the random number generator. | `0` |
| `**kwargs` | | Additional keyword arguments. | `{}` |

Returns:

| Name | Type | Description |
|---|---|---|
| `tuple` | `tuple[array, dict]` | In order: the initial state observation (`numpy.array`); additional information such as the initial reward (`dict`). |

step(action)

Perform one time step in the environment.

This method takes an action, applies it to the environment, and returns the next state, reward, and other information.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `action` | `array` | The action to be taken in the environment. | required |

Returns:

| Name | Type | Description |
|---|---|---|
| `tuple` | `tuple[array, float, bool, bool, dict]` | In order: the next state observation (`numpy.array`); the reward for the current step (`float`); whether the episode has ended (`bool`); whether the episode was truncated (`bool`); additional information about the step (`dict`). |
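The return shapes of `reset` and `step` suggest the usual episode loop, sketched below. `ToyEnv` is a hypothetical stand-in written for this example, not the real environment: it uses plain lists instead of NumPy arrays and a trivial one-state model, and only mirrors the documented signatures and tuple layouts.

```python
import random


class ToyEnv:
    """Hypothetical stand-in mimicking the documented reset/step return shapes."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0
        self.state = [0.0]

    def reset(self, seed=0, **kwargs):
        rng = random.Random(seed)          # seeded start for reproducibility
        self.t = 0
        self.state = [rng.uniform(-1.0, 1.0)]
        return list(self.state), {}        # (observation, info)

    def step(self, action):
        self.t += 1
        self.state = [self.state[0] + action[0]]
        reward = -abs(self.state[0])       # penalise distance from zero
        terminated = False                 # this toy model never ends early
        truncated = self.t >= self.horizon # stop when the horizon is reached
        return list(self.state), reward, terminated, truncated, {}


env = ToyEnv(horizon=5)
obs, info = env.reset(seed=0)
total_reward, steps, done = 0.0, 0, False
while not done:
    action = [-0.5 * obs[0]]               # simple proportional policy
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    steps += 1
    done = terminated or truncated         # either flag ends the episode

print(steps, total_reward)
```

Treating `terminated` and `truncated` separately keeps the distinction between an episode that ended on its own and one that was cut off at the time limit.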

con_checker(curr_state, inputs)

Check if any constraints are violated for the given states.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `model_states` | `list` | List of state or input names to check. | required |
| `curr_state` | `list` | List of corresponding state or input values. | required |

Returns:

| Name | Type | Description |
|---|---|---|
| `bool` | `bool` | True if any constraint is violated, False otherwise. |
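A check of this kind can be pictured as comparing each named value against its bounds. The sketch below is an illustration under assumptions, not the library's implementation: the `any_violation` function and the `bounds` mapping of names to `(low, high)` pairs are hypothetical.

```python
def any_violation(names, values, bounds):
    """Return True if any named value lies outside its (low, high) bounds."""
    for name, value in zip(names, values):
        if name not in bounds:
            continue                      # no constraint defined for this name
        low, high = bounds[name]
        if value < low or value > high:
            return True
    return False


# Hypothetical bounds for a concentration "Ca" and a temperature "T".
bounds = {"T": (300.0, 350.0), "Ca": (0.0, 1.0)}

print(any_violation(["Ca", "T"], [0.5, 320.0], bounds))  # False: both within bounds
print(any_violation(["Ca", "T"], [0.5, 360.0], bounds))  # True: T exceeds 350.0
```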

constraint_check(state, input)

Check if any constraints are violated in the current step.

This method checks both state and input constraints, as well as any custom constraints defined by the user.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `state` | `array` | The current state of the system. | required |
| `input` | `array` | The current input (action) applied to the system. | required |

Returns:

| Name | Type | Description |
|---|---|---|
| `bool` | `bool` | True if any constraint is violated, False otherwise. |
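Combining state bounds, input bounds, and a user-defined rule amounts to a logical OR over the three checks. The following is a minimal sketch under assumptions: `step_violates`, `out_of_bounds`, the per-index `(low, high)` bound lists, and the `jacket_gap_too_large` rule are all hypothetical and do not reflect how the library stores or evaluates its constraints.

```python
def out_of_bounds(values, bounds):
    """Return True if any value lies outside its (low, high) pair."""
    return any(v < low or v > high for v, (low, high) in zip(values, bounds))


def step_violates(state, action, state_bounds, input_bounds, custom_check=None):
    """Combine state, input, and optional custom checks with a logical OR."""
    if out_of_bounds(state, state_bounds) or out_of_bounds(action, input_bounds):
        return True
    if custom_check is not None:
        return bool(custom_check(state, action))
    return False


def jacket_gap_too_large(state, action):
    """Hypothetical custom rule: state[1] may exceed action[0] by at most 40."""
    return state[1] - action[0] > 40.0


state_bounds = [(0.0, 1.0), (300.0, 350.0)]
input_bounds = [(295.0, 302.0)]

print(step_violates([0.5, 320.0], [300.0], state_bounds, input_bounds))
print(step_violates([0.5, 345.0], [300.0], state_bounds, input_bounds,
                    custom_check=jacket_gap_too_large))
```

The second call passes both bound checks and is flagged only by the custom rule, which is the case a user-defined constraint is meant to cover.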

get_rollouts(policies, reps, oracle=False, dist_reward=False, MPC_params=False, cons_viol=False)

Generate rollouts for the given policies.

This method simulates the environment for multiple episodes using the provided policies.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `policies` | `dict` | Dictionary of policies to evaluate. | required |
| `reps` | `int` | Number of rollouts to perform. | required |
| `oracle` | `bool` | Whether to use an oracle model for evaluation. | `False` |
| `dist_reward` | `bool` | Whether to use reward distribution. | `False` |
| `MPC_params` | `bool` | Whether to use MPC parameters. | `False` |
| `cons_viol` | `bool` | Whether to track constraint violations. | `False` |

Returns:

| Name | Type | Description |
|---|---|---|
| `tuple` | `tuple[policy_eval, dict]` | In order: the policy evaluator object (`policy_eval`); data from the rollouts (`dict`). |
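Conceptually, evaluating a dictionary of policies over `reps` episodes is a nested loop over policies and repetitions. The sketch below illustrates that idea only; it is not the library's evaluator. `ToyEnv`, `collect_returns`, the use of plain callables as policies, and the `{name: [episode returns]}` result layout are all assumptions made for the example.

```python
import random


class ToyEnv:
    """Hypothetical stand-in mimicking the documented reset/step return shapes."""

    def __init__(self, horizon=5):
        self.horizon, self.t, self.state = horizon, 0, [0.0]

    def reset(self, seed=0, **kwargs):
        self.t = 0
        self.state = [random.Random(seed).uniform(-1.0, 1.0)]
        return list(self.state), {}

    def step(self, action):
        self.t += 1
        self.state = [self.state[0] + action[0]]
        reward = -abs(self.state[0])       # penalise distance from zero
        return list(self.state), reward, False, self.t >= self.horizon, {}


def collect_returns(env, policies, reps):
    """Run each policy for `reps` episodes and return {name: [episode returns]}."""
    data = {}
    for name, policy in policies.items():
        returns = []
        for rep in range(reps):
            obs, _ = env.reset(seed=rep)   # same seeds for every policy
            total, done = 0.0, False
            while not done:
                obs, reward, terminated, truncated, _ = env.step(policy(obs))
                total += reward
                done = terminated or truncated
            returns.append(total)
        data[name] = returns
    return data


# Policies are plain callables here (an assumption for this sketch).
policies = {
    "proportional": lambda obs: [-0.5 * obs[0]],
    "idle": lambda obs: [0.0],
}
data = collect_returns(ToyEnv(), policies, reps=3)
print({name: len(returns) for name, returns in data.items()})
```

Reusing the same seeds across policies means each policy faces the same initial states, so differences in return reflect the policy rather than the draw.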

plot_rollout(policies, reps, oracle=False, dist_reward=False, MPC_params=False, cons_viol=False, save_fig=False)

Generate and plot rollouts for the given policies.

This method simulates the environment for multiple episodes using the provided policies and plots the results.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `policies` | `dict` | Dictionary of policies to evaluate. | required |
| `reps` | `int` | Number of rollouts to perform. | required |
| `oracle` | `bool` | Whether to use an oracle model for evaluation. | `False` |
| `dist_reward` | `bool` | Whether to use reward distribution for plotting. | `False` |
| `MPC_params` | `bool` | Whether to use MPC parameters. | `False` |
| `cons_viol` | `bool` | Whether to track constraint violations. | `False` |
| `save_fig` | `bool` | Whether to save the generated figures. | `False` |

Returns:

| Name | Type | Description |
|---|---|---|
| `tuple` | `tuple[policy_eval, dict]` | In order: the policy evaluator object (`policy_eval`); data from the rollouts (`dict`). |