niaaml

class niaaml.Factory(**kwargs)

Bases: object

Base class with string mappings to entities.

Date:

2020

Author:

Luka Pečnik

License:

MIT

Attributes:

_entities (Dict[str, any]): Dictionary to map from strings to an instance of anything.

get_name_to_classname_mapping()

Get dictionary of user-friendly name to class name mapping.

Returns:

dict: Dictionary of user-friendly name to class name mapping.

get_result(name)

Get the resulting entity.

Arguments:

name (str): String that represents the entity.

Returns:

any: Entity according to the given name.
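The string-to-entity pattern described above can be sketched as follows. `SketchFactory`, `Apple`, and `Pear` are hypothetical stand-ins, not NiaAML classes. Whether the real `get_result` returns a new instance or the stored object, and how it reacts to unknown names, are assumptions made for illustration.

```python
class Apple:
    pass


class Pear:
    pass


class SketchFactory:
    """Hypothetical stand-in for a Factory subclass (not part of NiaAML)."""

    def __init__(self):
        # Maps user-friendly names to classes, mirroring the _entities attribute.
        self._entities = {"apple": Apple, "pear": Pear}

    def get_result(self, name):
        # Assumption: unknown names raise an error, known names are instantiated.
        if name not in self._entities:
            raise TypeError(f"Passed entity is not defined: {name!r}")
        return self._entities[name]()

    def get_name_to_classname_mapping(self):
        # User-friendly name -> class name.
        return {name: cls.__name__ for name, cls in self._entities.items()}


factory = SketchFactory()
fruit = factory.get_result("apple")
mapping = factory.get_name_to_classname_mapping()
```

Keeping the lookup table in one place lets callers select components by name, which is what allows the optimizer arguments elsewhere on this page to be plain strings.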

class niaaml.Logger(verbose=False, output_file=None, **kwargs)

Bases: object

Class for logging throughout the framework.

Date:

2020

Author:

Luka Pečnik

License:

MIT

log_optimization_error(text)

Log optimization error message.

log_pipeline(text)

Log pipeline info message.

log_progress(text)

Log progress message.
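A rough sketch of how a logger with the `verbose` and `output_file` options might behave. This page does not define either option, so the sketch assumes that `verbose` toggles console output and that `output_file` is a path to append to. `SketchLogger` and the message prefixes are hypothetical.

```python
import os
import tempfile


class SketchLogger:
    """Hypothetical stand-in showing the verbose/output_file idea (not NiaAML's Logger)."""

    def __init__(self, verbose=False, output_file=None):
        self.verbose = verbose
        self.output_file = output_file

    def _emit(self, prefix, text):
        line = f"{prefix}: {text}"
        if self.verbose:
            print(line)  # echo to stdout only when verbose
        if self.output_file is not None:
            # Append so that earlier messages are preserved.
            with open(self.output_file, "a", encoding="utf-8") as handle:
                handle.write(line + "\n")
        return line

    def log_progress(self, text):
        return self._emit("PROGRESS", text)

    def log_pipeline(self, text):
        return self._emit("PIPELINE", text)

    def log_optimization_error(self, text):
        return self._emit("OPTIMIZATION ERROR", text)


log_path = os.path.join(tempfile.mkdtemp(), "sketch.log")
logger = SketchLogger(verbose=False, output_file=log_path)
logger.log_progress("evaluation 1 of 10")
logger.log_optimization_error("classifier failed to fit")

with open(log_path, encoding="utf-8") as handle:
    logged_lines = handle.read().splitlines()
```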

class niaaml.MinMax(min, max)

Bases: object

Class for ParameterDefinition’s value property.

Date:

2020

Author:

Luka Pečnik

License:

MIT

Attributes:

min (float): Minimum number (inclusive).

max (float): Maximum number (exclusive).

See Also:
  • niaaml.utilities.ParameterDefinition
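The inclusive minimum and exclusive maximum describe a half-open interval. The sketch below illustrates that convention with a hypothetical `SketchMinMax` class. The `contains` helper exists only for illustration and is not a documented `MinMax` method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchMinMax:
    """Hypothetical stand-in for a half-open interval [min, max)."""

    min: float
    max: float

    def contains(self, x):
        # Lower bound is included, upper bound is excluded.
        return self.min <= x < self.max


bounds = SketchMinMax(min=0.0, max=10.0)
checks = [bounds.contains(0.0), bounds.contains(9.99), bounds.contains(10.0)]
```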

class niaaml.OptimizationStats(predicted, expected, **kwargs)

Bases: object

Class that holds pipeline optimization result’s statistics. Includes accuracy, precision, Cohen’s kappa and F1-score.

Date:

2020

Author:

Luka Pečnik

License:

MIT

Attributes:

_accuracy (float): Calculated accuracy.

_precision (float): Calculated precision.

_cohen_kappa (float): Calculated Cohen’s kappa.

_f1_score (float): Calculated F1-score.

to_string()

User-friendly representation of the object.

Returns:

str: User-friendly representation of the object.
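For orientation, two of the statistics listed above can be computed from `predicted` and `expected` in plain Python. This is a sketch of the textbook definitions, not NiaAML's implementation. Precision and F1-score are left out because this page does not say which averaging strategy is used for multi-class data.

```python
from collections import Counter


def accuracy(predicted, expected):
    # Fraction of positions where prediction and label agree.
    hits = sum(p == e for p, e in zip(predicted, expected))
    return hits / len(expected)


def cohen_kappa(predicted, expected):
    # kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    # and p_e the agreement expected by chance.
    # Undefined when p_e == 1 (both sequences contain a single, identical class).
    n = len(expected)
    p_o = accuracy(predicted, expected)
    pred_counts = Counter(predicted)
    exp_counts = Counter(expected)
    p_e = sum(pred_counts[label] * exp_counts[label] for label in exp_counts) / (n * n)
    return (p_o - p_e) / (1.0 - p_e)


expected = ["a", "a", "b", "b"]
predicted = ["a", "a", "b", "a"]
acc = accuracy(predicted, expected)
kappa = cohen_kappa(predicted, expected)
```

Here three of four predictions match, so the accuracy is 0.75, while the chance agreement is 0.5 and kappa comes out at 0.5.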

class niaaml.ParameterDefinition(value, param_type=None)

Bases: object

Class for PipelineComponent parameters definition.

Date:

2020

Author:

Luka Pečnik

License:

MIT

Attributes:

value (any): Array of possible parameter values or instance of MinMax class.

param_type (numpy.dtype): Selection output data type.

See Also:
  • niaaml.pipeline_component.PipelineComponent

  • niaaml.utilities.MinMax
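One plausible way an optimizer could turn a number in [0.0, 1.0] into a concrete parameter value is sketched below. The `decode` function is hypothetical. A `(low, high)` tuple stands in for `MinMax`, and a Python type stands in for the `numpy.dtype` in `param_type`. The decoding rule itself is an assumption, since this page does not describe it.

```python
def decode(u, value, param_type=None):
    """Map u in [0.0, 1.0] to a concrete parameter value (illustrative only)."""
    if isinstance(value, tuple):
        # (low, high) stands in for MinMax: low inclusive, high exclusive.
        # Note: u == 1.0 would land on the excluded upper bound; a real
        # implementation has to handle that edge case.
        low, high = value
        result = low + u * (high - low)
    else:
        # A list of candidates: pick the equal-width bin that u falls into.
        index = min(int(u * len(value)), len(value) - 1)
        result = value[index]
    return param_type(result) if param_type is not None else result


kernel = decode(0.7, ["linear", "poly", "rbf"])
n_estimators = decode(0.5, (10, 110), param_type=int)
```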

class niaaml.Pipeline(**kwargs)

Bases: object

Classification pipeline defined by optional preprocessing steps and classifier.

Date:

2020

Author:

Luka Pečnik

License:

MIT

Attributes:

__feature_selection_algorithm (Optional[FeatureSelectionAlgorithm]): Feature selection algorithm implementation.

__feature_transform_algorithm (Optional[FeatureTransformAlgorithm]): Feature transform algorithm implementation.

__classifier (Classifier): Classifier implementation.

__selected_features_mask (Iterable[bool]): Mask of selected features during the feature selection process.

__best_stats (OptimizationStats): Statistics of the most successful setup of parameters.

__categorical_features_encoders (Dict[FeatureEncoder]): Instances of FeatureEncoder for all categorical features.

__imputers (Dict[Imputer]): Dictionary of instances of Imputer for all columns that contained missing values during optimization process.

__logger (Logger): Logger instance.

export(file_name)

Exports Pipeline object to a file for later use. Extension is added if not present.

Arguments:

file_name (str): Output file name.

export_text(file_name)

Exports Pipeline object to a user-friendly text file. Extension is added if not present.

Arguments:

file_name (str): Output file name.
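The "extension is added if not present" rule of both export methods can be sketched as follows. `with_extension` is a hypothetical helper, and the `.ppln` and `.txt` extensions are assumptions, since this page does not name the extensions.

```python
import os


def with_extension(file_name, extension):
    # Append the extension unless the file name already ends with it.
    current = os.path.splitext(file_name)[1]
    return file_name if current == extension else file_name + extension


names = [
    with_extension("best_pipeline", ".ppln"),
    with_extension("best_pipeline.ppln", ".ppln"),
    with_extension("best_pipeline", ".txt"),
]
```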

get_classifier()

Get deep copy of the classifier.

Returns:

Classifier: Instance of the Classifier object.

get_feature_selection_algorithm()

Get deep copy of the feature selection algorithm.

Returns:

FeatureSelectionAlgorithm: Instance of the FeatureSelectionAlgorithm object.

get_feature_transform_algorithm()

Get deep copy of the feature transform algorithm.

Returns:

FeatureTransformAlgorithm: Instance of the FeatureTransformAlgorithm object.
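Because the three getters above return deep copies, changing a returned object does not affect the pipeline's own component. The sketch below shows that behaviour with hypothetical stand-in classes (`SketchPipeline`, `SketchClassifier`) and Python's standard `copy.deepcopy`.

```python
import copy


class SketchClassifier:
    def __init__(self):
        self.params = {"n_estimators": 100}


class SketchPipeline:
    """Hypothetical stand-in that mimics the deep-copy getter behaviour."""

    def __init__(self, classifier):
        self._classifier = classifier

    def get_classifier(self):
        # Return an independent copy so callers cannot mutate internal state.
        return copy.deepcopy(self._classifier)


pipeline = SketchPipeline(SketchClassifier())
clf_copy = pipeline.get_classifier()
clf_copy.params["n_estimators"] = 5  # changes the copy only
original_value = pipeline.get_classifier().params["n_estimators"]
```

To change a component of a real pipeline, the corresponding setter would be the place to do it rather than mutating the returned copy.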

get_logger()

Get logger.

Returns:

Logger: Instance of the Logger object.

get_stats()

Get optimization statistics.

Returns:

OptimizationStats: Instance of the OptimizationStats object.

static load(file_name)

Loads Pipeline object from a file.

Arguments:

file_name (str): Input file name.

Returns:

Pipeline: Loaded Pipeline instance.

optimize(x, y, population_size, number_of_evaluations, optimization_algorithm, fitness_function)

Optimize pipeline’s hyperparameters.

Arguments:

x (pandas.core.frame.DataFrame): n samples to classify.

y (pandas.core.series.Series): n classes of the samples in the x array.

population_size (uint): Number of individuals in the optimization process.

number_of_evaluations (uint): Number of maximum evaluations.

optimization_algorithm (str): Name of the optimization algorithm to use.

fitness_function (str): Name of the fitness function to use.

Returns:

float: Best fitness value found in optimization process.

run(x)

Runs the pipeline.

Arguments:

x (pandas.core.frame.DataFrame): n samples to classify.

Returns:

pandas.core.series.Series: n predicted classes of the samples in the x array.

set_categorical_features_encoders(value)

Set categorical features’ encoders.

set_classifier(value)

Set classifier.

set_feature_selection_algorithm(value)

Set feature selection algorithm.

set_feature_transform_algorithm(value)

Set feature transform algorithm.

set_imputers(value)

Set imputers.

set_selected_features_mask(value)

Set selected features mask.
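A selected-features mask is an iterable of booleans with one entry per feature column. The sketch below shows how such a mask filters columns, using plain lists instead of a DataFrame. The column names and values are made up for illustration.

```python
from itertools import compress

columns = ["age", "height", "weight", "income"]
selected_features_mask = [True, False, True, False]  # illustrative values

# Keep only the columns whose mask entry is True.
kept_columns = list(compress(columns, selected_features_mask))

rows = [[31, 170, 65, 4200], [45, 182, 90, 3900]]
reduced_rows = [list(compress(row, selected_features_mask)) for row in rows]
```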

set_stats(value)

Set stats.

to_string()

User-friendly representation of the object.

Returns:

str: User-friendly representation of the object.

to_string_slim()

Slim user-friendly representation of the object.

Returns:

str: Slim user-friendly representation of the object.

class niaaml.PipelineComponent(**kwargs)

Bases: object

Class for implementing pipeline components.

Date:

2020

Author:

Luka Pečnik

License:

MIT

Attributes:

Name (str): Name of the pipeline component.

_params (Dict[str, ParameterDefinition]): Dictionary of the component’s parameters with possible values. Possible parameter values are given as an instance of the ParameterDefinition class.

See Also:
  • niaaml.utilities.ParameterDefinition

Name = None

get_params_dict()

Return parameters definition dictionary.

set_parameters(**kwargs)

Set the parameters/arguments of the pipeline component.

to_string()

User-friendly representation of the object.

Returns:

str: User-friendly representation of the object.
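The `Name` and `_params` layout of a component can be sketched with stand-in classes. `SketchParameterDefinition` and `SketchComponent` are hypothetical and do not inherit from NiaAML. The rule that `set_parameters` rejects undeclared names is an assumption, as is the `to_string` format.

```python
class SketchParameterDefinition:
    """Stand-in for ParameterDefinition: holds the allowed values only."""

    def __init__(self, value, param_type=None):
        self.value = value
        self.param_type = param_type


class SketchComponent:
    """Hypothetical component following the Name/_params layout described above."""

    Name = "Sketch Threshold"

    def __init__(self):
        # Declares which parameters exist and which values they may take.
        self._params = {
            "threshold": SketchParameterDefinition([0.0, 0.1, 0.5], float),
        }
        self.threshold = 0.0

    def get_params_dict(self):
        return self._params

    def set_parameters(self, **kwargs):
        # Assumption: only declared parameters are accepted.
        for key, val in kwargs.items():
            if key not in self._params:
                raise KeyError(f"Unknown parameter: {key}")
            setattr(self, key, val)

    def to_string(self):
        return f"Name: {self.Name}\nthreshold = {self.threshold}"


component = SketchComponent()
component.set_parameters(threshold=0.5)
summary = component.to_string()
```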

class niaaml.PipelineOptimizer(**kwargs)

Bases: object

Optimization task that finds the best classification pipeline according to the given input.

Date:

2020

Author:

Luka Pečnik

License:

MIT

Attributes:

__data (DataReader): Instance of any DataReader implementation.

__feature_selection_algorithms (Optional[Iterable[str]]): Array of names of possible feature selection algorithms.

__feature_transform_algorithms (Optional[Iterable[str]]): Array of names of possible feature transform algorithms.

__classifiers (Iterable[Classifier]): Array of names of possible classifiers.

__categorical_features_encoder (str): Name of the encoder used for categorical features.

__categorical_features_encoders (Dict[FeatureEncoder]): Actual instances of FeatureEncoder for all categorical features.

__imputer (str): Name of the imputer used for features that contain missing values.

__imputers (Dict[Imputer]): Actual instances of Imputer for all features that contain missing values.

__logger (Logger): Logger instance.

get_classifiers()

Get classifiers.

Returns:

Iterable[str]: Classifier names.

get_data()

Get data.

Returns:

DataReader: Instance of DataReader object.

get_feature_selection_algorithms()

Get feature selection algorithms.

Returns:

Iterable[str]: Feature selection algorithm names or None.

get_feature_transform_algorithms()

Get feature transform algorithms.

Returns:

Iterable[str]: Feature transform algorithm names or None.

get_logger()

Get logger.

Returns:

Logger: Logger instance.

run(fitness_name, pipeline_population_size, inner_population_size, number_of_pipeline_evaluations, number_of_inner_evaluations, optimization_algorithm, inner_optimization_algorithm=None)

Run classification pipeline optimization process.

Arguments:

fitness_name (str): Name of the fitness class to use as a function.

pipeline_population_size (uint): Number of pipeline individuals in the optimization process.

inner_population_size (uint): Number of individuals in the hyperparameter optimization process.

number_of_pipeline_evaluations (uint): Number of maximum evaluations.

number_of_inner_evaluations (uint): Number of maximum inner evaluations.

optimization_algorithm (str): Name of the optimization algorithm to use.

inner_optimization_algorithm (Optional[str]): Name of the inner optimization algorithm to use. Defaults to the optimization_algorithm argument.

Returns:

Pipeline: Best pipeline found in the optimization process.
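The arguments above describe two nested searches: an outer one over pipeline layouts and an inner one over hyperparameters. This page does not say how an outer candidate is encoded, so the sketch below assumes one number in [0.0, 1.0] per stage, decoded by equal-width binning, with an extra "skip" choice for the optional stages. The `pick` helper and the component names are placeholders.

```python
def pick(u, names, optional=False):
    # Optional stages get an extra "skip" choice represented by None.
    choices = ([None] + list(names)) if optional else list(names)
    index = min(int(u * len(choices)), len(choices) - 1)
    return choices[index]


feature_selection_algorithms = ["SelectKBest", "VarianceThreshold"]
feature_transform_algorithms = ["Normalizer", "StandardScaler"]
classifiers = ["AdaBoost", "RandomForest", "LinearSVC"]

candidate = [0.10, 0.90, 0.50]  # one outer-population individual (illustrative)
layout = (
    pick(candidate[0], feature_selection_algorithms, optional=True),
    pick(candidate[1], feature_transform_algorithms, optional=True),
    pick(candidate[2], classifiers),
)
```

Under these assumptions the candidate decodes to no feature selection, the second transform, and the second classifier. The inner search would then tune that layout's hyperparameters.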

run_v1(fitness_name, population_size, number_of_evaluations, optimization_algorithm)

Run classification pipeline optimization process according to the original NiaAML paper.

Reference:

Fister, Iztok, Milan Zorman, and Dušan Fister. “Continuous Optimizers for Automatic Design and Evaluation of Classification Pipelines.” Frontier Applications of Nature Inspired Computation. Springer, Singapore, 2020. 281-301.

Arguments:

fitness_name (str): Name of the fitness class to use as a function.

population_size (uint): Number of individuals in the optimization process.

number_of_evaluations (uint): Number of maximum evaluations.

optimization_algorithm (str): Name of the optimization algorithm to use.

Returns:

Pipeline: Best pipeline found in the optimization process.

niaaml.get_bin_index(value, number_of_bins)

Gets index of value’s bin. Value must be between 0.0 and 1.0.

Arguments:

value (float): Value to put into bin.

number_of_bins (uint): Number of bins on the interval [0.0, 1.0].

Returns:

uint: Calculated index.
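A minimal sketch of equal-width binning on [0.0, 1.0], assuming that a value of exactly 1.0 belongs to the last bin. `bin_index_sketch` is a hypothetical re-implementation for illustration and may differ from `get_bin_index` in edge cases or error handling.

```python
def bin_index_sketch(value, number_of_bins):
    """Illustrative equal-width binning on [0.0, 1.0]; not NiaAML's source."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("value must be between 0.0 and 1.0")
    # Each bin has width 1 / number_of_bins; value == 1.0 is folded into the last bin.
    return min(int(value * number_of_bins), number_of_bins - 1)


indices = [bin_index_sketch(v, 4) for v in (0.0, 0.25, 0.6, 1.0)]
```

With four bins, the values 0.0, 0.25, 0.6, and 1.0 fall into bins 0, 1, 2, and 3 respectively.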