`niaaml`

class niaaml.Factory(**kwargs)

Bases: object

Base class with string mappings to entities.

Date:: 2020
Author:: Luka Pečnik
License:: MIT
Attributes:: _entities (Dict[str, any]): Dictionary that maps strings to entities of any type.

get_name_to_classname_mapping()

Get a dictionary that maps user-friendly names to class names.

Returns:: dict: Dictionary that maps user-friendly names to class names.

get_result(name)

Get the resulting entity.

Arguments:: name (str): String that represents the entity.
Returns:: any: Entity according to the given name.
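
The string-to-entity lookup described above can be sketched with a small stand-in class. This is not NiaAML's implementation: `SketchFactory`, the two placeholder classes, and the error raised for unknown names are assumptions made for illustration, and this reference does not say whether `get_result` returns a class or an instance.

```python
class AdaBoostSketch:  # hypothetical placeholder entity
    pass


class BaggingSketch:  # hypothetical placeholder entity
    pass


class SketchFactory:
    """Hypothetical stand-in for a string-to-entity factory."""

    def __init__(self):
        # User-friendly names mapped to entities (classes here, by assumption).
        self._entities = {"AdaBoost": AdaBoostSketch, "Bagging": BaggingSketch}

    def get_result(self, name):
        # Assumption: unknown names raise an error; the real class may differ.
        if name not in self._entities:
            raise TypeError(f"Passed entity is not defined: {name}")
        return self._entities[name]()

    def get_name_to_classname_mapping(self):
        # User-friendly name -> name of the class behind it.
        return {name: entity.__name__ for name, entity in self._entities.items()}


factory = SketchFactory()
classifier = factory.get_result("AdaBoost")
mapping = factory.get_name_to_classname_mapping()
```

Keeping the mapping in one dictionary means callers only ever pass strings, which is what lets the optimizer methods further down accept algorithm names as plain `str` arguments.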

class niaaml.Logger(verbose=False, output_file=None, **kwargs)

Bases: object

Class for logging throughout the framework.

Date:: 2020
Author:: Luka Pečnik
License:: MIT

log_optimization_error(text): Log optimization error message.

log_pipeline(text): Log pipeline info message.

log_progress(text): Log progress message.

class niaaml.MinMax(min, max)

Bases: object

Class for ParameterDefinition’s value property.

Date:: 2020
Author:: Luka Pečnik
License:: MIT
Attributes::
    min (float): Minimum number (inclusive).
    max (float): Maximum number (exclusive).
See Also:: niaaml.utilities.ParameterDefinition

class niaaml.OptimizationStats(predicted, expected, **kwargs)

Bases: object

Class that holds pipeline optimization result’s statistics. Includes accuracy, precision, Cohen’s kappa and F1-score.

Date:: 2020
Author:: Luka Pečnik
License:: MIT
Attributes::
    _accuracy (float): Calculated accuracy.
    _precision (float): Calculated precision.
    _cohen_kappa (float): Calculated Cohen’s kappa.
    _f1_score (float): Calculated F1-score.

to_string()

User-friendly representation of the object.

Returns:: str: User-friendly representation of the object.
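
To make two of the statistics listed above concrete, here is a minimal pure-Python sketch of accuracy and Cohen's kappa computed from `predicted` and `expected` label sequences. The helper names `accuracy` and `cohen_kappa` are hypothetical, and NiaAML itself may compute these values differently (for example through a library routine); the formulas are the standard definitions.

```python
from collections import Counter


def accuracy(predicted, expected):
    """Fraction of positions where the predicted label equals the expected one."""
    matches = sum(p == e for p, e in zip(predicted, expected))
    return matches / len(expected)


def cohen_kappa(predicted, expected):
    """Cohen's kappa: agreement corrected for agreement expected by chance."""
    n = len(expected)
    observed = accuracy(predicted, expected)
    predicted_counts = Counter(predicted)
    expected_counts = Counter(expected)
    # Chance agreement: product of the marginal label frequencies, summed over labels.
    chance = sum(
        predicted_counts[label] * expected_counts[label] for label in expected_counts
    ) / (n * n)
    # Note: undefined (division by zero) when chance agreement is exactly 1.
    return (observed - chance) / (1.0 - chance)


predicted = ["a", "a", "b", "b"]
expected = ["a", "b", "b", "b"]
acc = accuracy(predicted, expected)
kappa = cohen_kappa(predicted, expected)
```

For this toy input three of four labels match, so accuracy is 0.75; chance agreement is 0.5, which gives a kappa of 0.5.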

class niaaml.ParameterDefinition(value, param_type=None)

Bases: object

Class for PipelineComponent parameters definition.

Date:: 2020
Author:: Luka Pečnik
License:: MIT
Attributes::
    value (any): Array of possible parameter values or instance of MinMax class.
    param_type (numpy.dtype): Selection output data type.
See Also::
    niaaml.pipeline_component.PipelineComponent
    niaaml.utilities.MinMax
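
The two forms a parameter definition's `value` can take (an array of choices, or a MinMax range) suggest how a single number produced by an optimizer could be turned into a concrete parameter. The sketch below is an assumption about that idea, not NiaAML's code: `MinMaxSketch`, `decode`, and the convention that the optimizer emits a number `gene` in [0.0, 1.0) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MinMaxSketch:
    """Hypothetical stand-in for a numeric range."""
    min: float  # inclusive lower bound
    max: float  # exclusive upper bound


def decode(gene, value):
    """Map gene in [0.0, 1.0) onto a range or onto a list of choices."""
    if isinstance(value, MinMaxSketch):
        # Linear scaling; gene < 1.0 keeps the result strictly below max.
        return value.min + gene * (value.max - value.min)
    # Equal-width bins over the list of possible values.
    index = min(int(gene * len(value)), len(value) - 1)
    return value[index]


learning_rate = decode(0.5, MinMaxSketch(10.0, 20.0))
kernel = decode(0.99, ["linear", "rbf", "poly"])
```

A cast to `param_type` would naturally follow the decoding step when the target parameter has to be, say, an integer; that step is left out here because this reference does not describe how it is applied.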

class niaaml.Pipeline(**kwargs)

Bases: object

Classification pipeline defined by optional preprocessing steps and classifier.

Date:: 2020
Author:: Luka Pečnik
License:: MIT
Attributes::
    __feature_selection_algorithm (Optional[FeatureSelectionAlgorithm]): Feature selection algorithm implementation.
    __feature_transform_algorithm (Optional[FeatureTransformAlgorithm]): Feature transform algorithm implementation.
    __classifier (Classifier): Classifier implementation.
    __selected_features_mask (Iterable[bool]): Mask of selected features during the feature selection process.
    __best_stats (OptimizationStats): Statistics of the most successful setup of parameters.
    __categorical_features_encoders (Dict[FeatureEncoder]): Instances of FeatureEncoder for all categorical features.
    __imputers (Dict[Imputer]): Dictionary of instances of Imputer for all columns that contained missing values during optimization process.
    __logger (Logger): Logger instance.

export(file_name)

Exports Pipeline object to a file for later use. Extension is added if not present.

Arguments:: file_name (str): Output file name.

export_text(file_name)

Exports Pipeline object to a user-friendly text file. Extension is added if not present.

Arguments:: file_name (str): Output file name.

get_classifier()

Get deep copy of the classifier.

Returns:: Classifier: Instance of the Classifier object.

get_feature_selection_algorithm()

Get deep copy of the feature selection algorithm.

Returns:: FeatureSelectionAlgorithm: Instance of the FeatureSelectionAlgorithm object.

get_feature_transform_algorithm()

Get deep copy of the feature transform algorithm.

Returns:: FeatureTransformAlgorithm: Instance of the FeatureTransformAlgorithm object.

get_logger()

Get logger.

Returns:: Logger: Instance of the Logger object.

get_stats()

Get optimization statistics.

Returns:: OptimizationStats: Instance of the OptimizationStats object.

static load(file_name)

Loads Pipeline object from a file.

Arguments:: file_name (str): Name of the file to load the Pipeline from.
Returns:: Pipeline: Loaded Pipeline instance.
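
The export and load behavior described above (write an object to a file, append the extension if it is missing, read it back later) can be sketched as follows. The sketch assumes `pickle` serialization and a `.ppln` extension; neither is stated in this reference, and the helper names `ensure_extension`, `export_object`, and `load_object` are hypothetical.

```python
import os
import pickle
import tempfile


def ensure_extension(file_name, extension):
    """Append extension unless file_name already ends with it."""
    return file_name if file_name.endswith(extension) else file_name + extension


def export_object(obj, file_name, extension=".ppln"):
    """Serialize obj to file_name (extension assumed) and return the final path."""
    path = ensure_extension(file_name, extension)
    with open(path, "wb") as handle:
        pickle.dump(obj, handle)
    return path


def load_object(file_name):
    """Read back an object written by export_object."""
    with open(file_name, "rb") as handle:
        return pickle.load(handle)


with tempfile.TemporaryDirectory() as tmp:
    # A plain dict stands in for a Pipeline instance.
    path = export_object({"classifier": "AdaBoost"}, os.path.join(tmp, "pipeline"))
    restored = load_object(path)
```

As with any pickle-style format, a file should only be loaded if it comes from a trusted source, since unpickling can execute arbitrary code.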

optimize(x, y, population_size, number_of_evaluations, optimization_algorithm, fitness_function)

Optimize pipeline’s hyperparameters.

Arguments::
    x (pandas.core.frame.DataFrame): n samples to classify.
    y (pandas.core.series.Series): n classes of the samples in the x array.
    population_size (uint): Number of individuals in the optimization process.
    number_of_evaluations (uint): Maximum number of evaluations.
    optimization_algorithm (str): Name of the optimization algorithm to use.
    fitness_function (str): Name of the fitness function to use.
Returns:: float: Best fitness value found in optimization process.

run(x)

Runs the pipeline.

Arguments:: x (pandas.core.frame.DataFrame): n samples to classify.
Returns:: pandas.core.series.Series: n predicted classes of the samples in the x array.

set_categorical_features_encoders(value): Set categorical features’ encoders.

set_classifier(value): Set classifier.

set_feature_selection_algorithm(value): Set feature selection algorithm.

set_feature_transform_algorithm(value): Set feature transform algorithm.

set_imputers(value): Set imputers.

set_selected_features_mask(value): Set selected features mask.

set_stats(value): Set stats.

to_string()

User-friendly representation of the object.

Returns:: str: User-friendly representation of the object.

to_string_slim()

Slim user-friendly representation of the object.

Returns:: str: Slim user-friendly representation of the object.

class niaaml.PipelineComponent(**kwargs)

Bases: object

Class for implementing pipeline components.

Date:: 2020
Author:: Luka Pečnik
License:: MIT
Attributes::
    Name (str): Name of the pipeline component.
    _params (Dict[str, ParameterDefinition]): Dictionary of the component’s parameters with possible values. Possible parameter values are given as an instance of the ParameterDefinition class.
See Also:: niaaml.utilities.ParameterDefinition

Name = None

get_params_dict(): Return parameters definition dictionary.

set_parameters(**kwargs): Set the parameters/arguments of the pipeline component.

to_string()

User-friendly representation of the object.

Returns:: str: User-friendly representation of the object.

class niaaml.PipelineOptimizer(**kwargs)

Bases: object

Optimization task that finds the best classification pipeline according to the given input.

Date:: 2020
Author:: Luka Pečnik
License:: MIT
Attributes::
    __data (DataReader): Instance of any DataReader implementation.
    __feature_selection_algorithms (Optional[Iterable[str]]): Array of names of possible feature selection algorithms.
    __feature_transform_algorithms (Optional[Iterable[str]]): Array of names of possible feature transform algorithms.
    __classifiers (Iterable[str]): Array of names of possible classifiers.
    __categorical_features_encoder (str): Name of the encoder used for categorical features.
    __categorical_features_encoders (Dict[FeatureEncoder]): Actual instances of FeatureEncoder for all categorical features.
    __imputer (str): Name of the imputer used for features that contain missing values.
    __imputers (Dict[Imputer]): Actual instances of Imputer for all features that contain missing values.
    __logger (Logger): Logger instance.

get_classifiers()

Get classifiers.

Returns:: Iterable[str]: Classifier names.

get_data()

Get data.

Returns:: DataReader: Instance of DataReader object.

get_feature_selection_algorithms()

Get feature selection algorithms.

Returns:: Iterable[str]: Feature selection algorithm names or None.

get_feature_transform_algorithms()

Get feature transform algorithms.

Returns:: Iterable[str]: Feature transform algorithm names or None.

get_logger()

Get logger.

Returns:: Logger: Logger instance.

run(fitness_name, pipeline_population_size, inner_population_size, number_of_pipeline_evaluations, number_of_inner_evaluations, optimization_algorithm, inner_optimization_algorithm=None)

Run classification pipeline optimization process.

Arguments::
    fitness_name (str): Name of the fitness class to use as a function.
    pipeline_population_size (uint): Number of pipeline individuals in the optimization process.
    inner_population_size (uint): Number of individuals in the hyperparameter optimization process.
    number_of_pipeline_evaluations (uint): Maximum number of pipeline evaluations.
    number_of_inner_evaluations (uint): Maximum number of inner evaluations.
    optimization_algorithm (str): Name of the optimization algorithm to use.
    inner_optimization_algorithm (Optional[str]): Name of the inner optimization algorithm to use. Defaults to the optimization_algorithm argument.
Returns:: Pipeline: Best pipeline found in the optimization process.

run_v1(fitness_name, population_size, number_of_evaluations, optimization_algorithm)

Run classification pipeline optimization process according to the original NiaAML paper.

Reference:: Fister, Iztok, Milan Zorman, and Dušan Fister. “Continuous Optimizers for Automatic Design and Evaluation of Classification Pipelines.” Frontier Applications of Nature Inspired Computation. Springer, Singapore, 2020. 281-301.
Arguments::
    fitness_name (str): Name of the fitness class to use as a function.
    population_size (uint): Number of individuals in the optimization process.
    number_of_evaluations (uint): Maximum number of evaluations.
    optimization_algorithm (str): Name of the optimization algorithm to use.
Returns:: Pipeline: Best pipeline found in the optimization process.

niaaml.get_bin_index(value, number_of_bins)

Gets index of value’s bin. Value must be between 0.0 and 1.0.

Arguments::
    value (float): Value to put into bin.
    number_of_bins (uint): Number of bins on the interval [0.0, 1.0].
Returns:: uint: Calculated index.
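
The binning described above can be illustrated with a short re-implementation sketch. It assumes equal-width bins over [0.0, 1.0] and that the value 1.0 belongs to the last bin; the function name `get_bin_index_sketch` is hypothetical and the library's own code may differ in detail.

```python
import math


def get_bin_index_sketch(value, number_of_bins):
    """Return the zero-based index of the equal-width bin containing value."""
    index = math.floor(value * number_of_bins)
    # Clamp so that value == 1.0 lands in the last bin instead of one past it.
    return min(index, number_of_bins - 1)


first = get_bin_index_sketch(0.0, 4)
middle = get_bin_index_sketch(0.5, 4)
last = get_bin_index_sketch(1.0, 4)
```

With four bins, each bin is 0.25 wide, so 0.0 falls into bin 0, 0.5 into bin 2, and 1.0 is clamped into bin 3.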
