Documentation for Policy Evaluation Metrics
src.pcgym.evaluation_metrics.reproducibility_metric
Bases: metric_base
Class for calculating reproducibility metrics (dispersion, performance, and scalarised performance) over a number of policy rollouts.
__init__(dispersion, performance, scalarised_weight)
Initialize the reproducibility metric.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dispersion | str | The dispersion metric to use ('std' or 'mad'). | required |
performance | str | The performance metric to use ('mean' or 'median'). | required |
scalarised_weight | float | The weight for scalarised performance. | required |
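The string options above suggest a lookup from option name to a statistic. The following is a minimal sketch of that idea, not the library's implementation: the tables `DISPERSION_FUNCS` and `PERFORMANCE_FUNCS`, the helper `resolve`, and the choice of reducing over axis 0 (one row per rollout) are all assumptions made for illustration.

```python
import numpy as np

def _mad(x: np.ndarray) -> np.ndarray:
    """Median absolute deviation across rollouts (axis 0)."""
    med = np.median(x, axis=0, keepdims=True)
    return np.median(np.abs(x - med), axis=0)

# Hypothetical lookup tables keyed by the documented option strings.
DISPERSION_FUNCS = {
    "std": lambda x: np.std(x, axis=0),
    "mad": _mad,
}
PERFORMANCE_FUNCS = {
    "mean": lambda x: np.mean(x, axis=0),
    "median": lambda x: np.median(x, axis=0),
}

def resolve(option: str, table: dict):
    """Return the function for `option`, rejecting unknown strings early."""
    try:
        return table[option]
    except KeyError:
        raise ValueError(
            f"unknown option {option!r}; expected one of {sorted(table)}"
        ) from None

# Assumed layout: rows are rollouts, columns are time steps.
rollouts = np.array([[1.0, 2.0], [3.0, 4.0]])
spread = resolve("std", DISPERSION_FUNCS)(rollouts)
centre = resolve("median", PERFORMANCE_FUNCS)(rollouts)
```

Validating the option string at construction time, as `resolve` does here, surfaces a typo such as `'stdev'` immediately rather than during evaluation.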
evaluate(policy_evaluator, component=None)
Evaluate the given policy using the specified environment.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
policy_evaluator | Any | The policy evaluator to generate data for a number of policy rollouts. | required |
component | Optional[str] | The specific component to evaluate (optional). | None |
Returns:
Type | Description |
---|---|
Dict[str, Dict[str, ndarray]] | The evaluation metric value. |
policy_dispersion_metric(data, component)
Evaluate the dispersion of the policy.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | Dict[str, Dict[str, ndarray]] | Nested dictionary containing policy data. | required |
component | Optional[str] | The specific component to evaluate. | required |
Returns:
Type | Description |
---|---|
Dict[str, Dict[str, ndarray]] | The policy dispersion metric value. |
policy_performance_metric(data, component)
Evaluate the performance of the policy.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | Dict[str, Dict[str, ndarray]] | Nested dictionary containing policy data. | required |
component | Optional[str] | The specific component to evaluate. | required |
Returns:
Type | Description |
---|---|
Dict[str, Dict[str, ndarray]] | The policy performance metric value. |
scalarised_performance(data, component)
Evaluate the scalarised performance of the policy.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | Dict[str, Dict[str, ndarray]] | Nested dictionary containing policy data. | required |
component | Optional[str] | The specific component to evaluate (set to None to scalarise over all components). | required |
Returns:
Type | Description |
---|---|
Dict[str, Dict[str, ndarray]] | The scalarised policy performance metric value. |
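The source does not state how performance and dispersion are combined into one scalar. As an illustration only, the sketch below assumes a weighted combination in which dispersion is penalised, which suits a quantity where larger is better; the function name `scalarise`, the sign convention, and the sample returns are hypothetical.

```python
import numpy as np

def scalarise(performance: np.ndarray, dispersion: np.ndarray,
              weight: float) -> np.ndarray:
    """Assumed form: performance penalised by weight * dispersion.

    For a quantity where smaller is better, the sign would flip.
    """
    return performance - weight * dispersion

# Hypothetical episode returns from three rollouts of one policy.
returns = np.array([10.0, 12.0, 14.0])
performance = np.mean(returns)
dispersion = np.std(returns)
score = scalarise(performance, dispersion, weight=0.5)
```

Under this assumed form, a larger `scalarised_weight` favours policies that are consistent across rollouts over policies that are good on average but erratic.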
determine_op(component)
Determine the operation to be applied based on the component.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
component | str | The component to determine the operation for. | required |
Returns:
Type | Description |
---|---|
Callable[[ndarray], ndarray] | A lambda function representing the operation to be applied. |
src.pcgym.evaluation_metrics.metric_base
Bases: ABC
Abstract base class for policy evaluation metrics.
__init__(scalarised_weight)
Initialize the metric base.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
scalarised_weight | float | The weight for scalarised performance. | required |
evaluate(policy_evaluator)
Evaluate the given policy using the specified environment.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
policy_evaluator | Any | The policy evaluator to generate data for a number of policy rollouts. | required |
Returns:
Type | Description |
---|---|
Any | The evaluation metric value. |
policy_dispersion_metric(data)
Evaluate the dispersion of the policy.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | Dict[str, Any] | Nested dictionary containing policy data. | required |
Returns:
Type | Description |
---|---|
Any | The policy dispersion metric value. |
policy_performance_metric(data)
Evaluate the performance of the policy.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | Dict[str, Any] | Nested dictionary containing policy data. | required |
Returns:
Type | Description |
---|---|
Any | The policy performance metric value. |
scalarised_performance(data)
Evaluate the scalarised performance of the policy.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | Dict[str, Any] | Nested dictionary containing policy data. | required |
Returns:
Type | Description |
---|---|
Any | The scalarised policy performance metric value. |
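The four methods above form the interface a custom metric would implement. The sketch below mirrors that interface with Python's `abc` module. It is a stand-in, not the library's code: `MetricBaseSketch` and `RangeMetric` are hypothetical names, the evaluator is assumed to be a plain callable returning a dictionary of lists, and the scalarisation formula is an assumption.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict

class MetricBaseSketch(ABC):
    """Hypothetical stand-in mirroring the documented metric_base interface."""

    def __init__(self, scalarised_weight: float) -> None:
        self.scalarised_weight = scalarised_weight

    @abstractmethod
    def evaluate(self, policy_evaluator: Any) -> Any: ...

    @abstractmethod
    def policy_dispersion_metric(self, data: Dict[str, Any]) -> Any: ...

    @abstractmethod
    def policy_performance_metric(self, data: Dict[str, Any]) -> Any: ...

    @abstractmethod
    def scalarised_performance(self, data: Dict[str, Any]) -> Any: ...

class RangeMetric(MetricBaseSketch):
    """Toy metric: dispersion is the range, performance is the mean."""

    def evaluate(self, policy_evaluator: Any) -> Any:
        data = policy_evaluator()  # assumption: a callable returning rollout data
        return {
            "dispersion": self.policy_dispersion_metric(data),
            "performance": self.policy_performance_metric(data),
            "scalarised": self.scalarised_performance(data),
        }

    def policy_dispersion_metric(self, data: Dict[str, Any]) -> Any:
        return {key: max(vals) - min(vals) for key, vals in data.items()}

    def policy_performance_metric(self, data: Dict[str, Any]) -> Any:
        return {key: sum(vals) / len(vals) for key, vals in data.items()}

    def scalarised_performance(self, data: Dict[str, Any]) -> Any:
        disp = self.policy_dispersion_metric(data)
        perf = self.policy_performance_metric(data)
        # Assumed combination: penalise dispersion by the weight.
        return {key: perf[key] - self.scalarised_weight * disp[key] for key in data}

result = RangeMetric(scalarised_weight=0.5).evaluate(lambda: {"r": [1.0, 3.0]})
```

Because every method is marked abstract, instantiating `MetricBaseSketch` directly raises `TypeError`, so an incomplete subclass fails at construction rather than midway through an evaluation.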
src.pcgym.evaluation_metrics.standard_deviation
Class for calculating standard deviation.
__init__(data)
Initialize the standard deviation calculator.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | ndarray | Input data for standard deviation calculation. | required |
get_value()
Calculate the standard deviation of the data.
Returns:
Type | Description |
---|---|
ndarray | The standard deviation value. |
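A class with this `__init__(data)` / `get_value()` shape can be sketched in a few lines of NumPy. The class name `StandardDeviationSketch` is hypothetical, and reducing over axis 0 (one row per rollout) is an assumption, since the source does not say which axis is used.

```python
import numpy as np

class StandardDeviationSketch:
    """Hypothetical stand-in with the documented __init__/get_value shape."""

    def __init__(self, data: np.ndarray) -> None:
        self.data = np.asarray(data, dtype=float)

    def get_value(self) -> np.ndarray:
        # Population standard deviation (ddof=0) across rollouts.
        return np.std(self.data, axis=0)

# Assumed layout: two rollouts (rows) by two time steps (columns).
rollouts = np.array([[1.0, 2.0], [3.0, 6.0]])
std_value = StandardDeviationSketch(rollouts).get_value()
```

`np.std` defaults to `ddof=0`; pass `ddof=1` for the sample standard deviation if the rollouts are treated as a sample.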
src.pcgym.evaluation_metrics.median_absolute_deviation
Class for calculating median absolute deviation.
__init__(data)
Initialize the median absolute deviation calculator.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | ndarray | Input data for median absolute deviation calculation. | required |
get_value()
Calculate the median absolute deviation of the data.
Returns:
Type | Description |
---|---|
ndarray | The median absolute deviation value. |
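The median absolute deviation is the median of the absolute deviations from the median. The sketch below computes it with NumPy; the class name `MedianAbsoluteDeviationSketch` is hypothetical, the axis-0 reduction is an assumption, and no normalising constant is applied.

```python
import numpy as np

class MedianAbsoluteDeviationSketch:
    """Hypothetical stand-in with the documented __init__/get_value shape."""

    def __init__(self, data: np.ndarray) -> None:
        self.data = np.asarray(data, dtype=float)

    def get_value(self) -> np.ndarray:
        # Median across rollouts, kept 2-D so it broadcasts against the data.
        med = np.median(self.data, axis=0, keepdims=True)
        return np.median(np.abs(self.data - med), axis=0)

# Assumed layout: three rollouts (rows) by two time steps (columns).
rollouts = np.array([[1.0, 10.0], [2.0, 20.0], [4.0, 40.0]])
mad_value = MedianAbsoluteDeviationSketch(rollouts).get_value()
```

Unlike the standard deviation, this statistic is barely affected by a single extreme rollout, which is the usual reason to prefer `'mad'` over `'std'`.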
src.pcgym.evaluation_metrics.mean_performance
Class for calculating mean performance.
__init__(data)
Initialize the mean performance calculator.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | ndarray | Input data for mean performance calculation. | required |
get_value()
Calculate the mean performance of the data.
Returns:
Type | Description |
---|---|
ndarray | The mean performance value. |
src.pcgym.evaluation_metrics.median_performance
Class for calculating median performance.
__init__(data)
Initialize the median performance calculator.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | ndarray | Input data for median performance calculation. | required |
get_value()
Calculate the median performance of the data.
Returns:
Type | Description |
---|---|
ndarray | The median performance value. |
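The difference between the mean and median performance calculators shows up when one rollout is an outlier. The sketch below uses plain NumPy calls rather than the classes documented here; the sample data and the axis-0 reduction are assumptions.

```python
import numpy as np

# Hypothetical data: three rollouts (rows) of a single quantity,
# where the third rollout is an outlier.
rollouts = np.array([[1.0], [2.0], [100.0]])

mean_value = np.mean(rollouts, axis=0)      # pulled towards the outlier
median_value = np.median(rollouts, axis=0)  # stays at the middle rollout
```

Here the mean is roughly 34.3 while the median is 2.0, so `'median'` gives a more representative figure when a few rollouts diverge.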