niaaml
- class niaaml.Factory(**kwargs)
Bases:
object
Base class with string mappings to entities.
- Date:
2020
- Author:
Luka Pečnik
- License:
MIT
- Attributes:
_entities (Dict[str, any]): Dictionary to map from strings to an instance of anything.
- get_name_to_classname_mapping()
Get dictionary of user-friendly name to class name mapping.
- Returns:
dict: Dictionary of user-friendly name to class name mapping.
- get_result(name)
Get the resulting entity.
- Arguments:
name (str): String that represents the entity.
- Returns:
any: Entity according to the given name.
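The name-to-entity lookup described above can be sketched as follows. This is a minimal stand-in, not the library's implementation: `FactorySketch` and `DummyScaler` are hypothetical names, and the sketch assumes the dictionary stores classes that are instantiated on lookup.

```python
class DummyScaler:
    """Hypothetical entity used only for this illustration."""


class FactorySketch:
    """Hypothetical stand-in for a factory with string mappings to entities."""

    def __init__(self, entities):
        # Maps user-friendly names to classes (an assumption of this sketch).
        self._entities = dict(entities)

    def get_result(self, name):
        # Look up the class registered under the name and instantiate it.
        if name not in self._entities:
            raise KeyError(f"No entity registered under the name {name!r}")
        return self._entities[name]()

    def get_name_to_classname_mapping(self):
        # User-friendly name -> class name, as strings.
        return {name: cls.__name__ for name, cls in self._entities.items()}


factory = FactorySketch({"Dummy Scaler": DummyScaler})
entity = factory.get_result("Dummy Scaler")
mapping = factory.get_name_to_classname_mapping()
```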
- class niaaml.Logger(verbose=False, output_file=None, **kwargs)
Bases:
object
Class for logging throughout the framework.
- Date:
2020
- Author:
Luka Pečnik
- License:
MIT
- log_optimization_error(text)
Log optimization error message.
- log_pipeline(text)
Log pipeline info message.
- log_progress(text)
Log progress message.
- class niaaml.MinMax(min, max)
Bases:
object
Class for ParameterDefinition’s value property.
- Date:
2020
- Author:
Luka Pečnik
- License:
MIT
- Attributes:
min (float): Minimum number (inclusive).
max (float): Maximum number (exclusive).
- See Also:
niaaml.utilities.ParameterDefinition
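To illustrate the inclusive minimum and exclusive maximum, the sketch below maps a value from [0.0, 1.0) onto such a range. `MinMaxSketch` and `scale_to_range` are hypothetical names; how the library itself consumes a MinMax instance is not described in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MinMaxSketch:
    """Hypothetical stand-in holding a half-open numeric range."""

    min: float  # inclusive lower bound
    max: float  # exclusive upper bound


def scale_to_range(unit_value, bounds):
    """Linearly map unit_value from [0.0, 1.0) onto the range in bounds."""
    if not 0.0 <= unit_value < 1.0:
        raise ValueError("unit_value must satisfy 0.0 <= unit_value < 1.0")
    # 0.0 maps to bounds.min; values approaching 1.0 approach bounds.max.
    return bounds.min + unit_value * (bounds.max - bounds.min)


bounds = MinMaxSketch(min=10.0, max=20.0)
lowest = scale_to_range(0.0, bounds)
middle = scale_to_range(0.5, bounds)
```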
- class niaaml.OptimizationStats(predicted, expected, **kwargs)
Bases:
object
Class that holds pipeline optimization result’s statistics. Includes accuracy, precision, Cohen’s kappa and F1-score.
- Date:
2020
- Author:
Luka Pečnik
- License:
MIT
- Attributes:
_accuracy (float): Calculated accuracy.
_precision (float): Calculated precision.
_cohen_kappa (float): Calculated Cohen’s kappa.
_f1_score (float): Calculated F1-score.
- to_string()
User friendly representation of the object.
- Returns:
str: User friendly representation of the object.
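Two of the statistics listed above can be computed from predicted and expected labels with the standard textbook definitions, as in the sketch below. The function names are hypothetical and this is not the library's code. Precision and F1-score are left out because the averaging strategy for multi-class data is not stated in this section.

```python
from collections import Counter


def accuracy_sketch(predicted, expected):
    """Fraction of samples whose predicted label equals the expected label."""
    matches = sum(p == e for p, e in zip(predicted, expected))
    return matches / len(expected)


def cohen_kappa_sketch(predicted, expected):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(expected)
    p_o = accuracy_sketch(predicted, expected)  # observed agreement
    predicted_counts = Counter(predicted)
    expected_counts = Counter(expected)
    # Agreement expected by chance, from the marginal label frequencies.
    p_e = sum(predicted_counts[label] * expected_counts[label]
              for label in expected_counts) / (n * n)
    if p_e == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is total
    return (p_o - p_e) / (1.0 - p_e)


expected = ["a", "a", "b", "b"]
predicted = ["a", "b", "b", "b"]
acc = accuracy_sketch(predicted, expected)
kappa = cohen_kappa_sketch(predicted, expected)
```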
- class niaaml.ParameterDefinition(value, param_type=None)
Bases:
object
Class for PipelineComponent parameters definition.
- Date:
2020
- Author:
Luka Pečnik
- License:
MIT
- Attributes:
value (any): Array of possible parameter values or instance of MinMax class.
param_type (numpy.dtype): Selection output data type.
- See Also:
niaaml.pipeline_component.PipelineComponent
niaaml.utilities.MinMax
- class niaaml.Pipeline(**kwargs)
Bases:
object
Classification pipeline defined by optional preprocessing steps and classifier.
- Date:
2020
- Author:
Luka Pečnik
- License:
MIT
- Attributes:
__feature_selection_algorithm (Optional[FeatureSelectionAlgorithm]): Feature selection algorithm implementation.
__feature_transform_algorithm (Optional[FeatureTransformAlgorithm]): Feature transform algorithm implementation.
__classifier (Classifier): Classifier implementation.
__selected_features_mask (Iterable[bool]): Mask of selected features during the feature selection process.
__best_stats (OptimizationStats): Statistics of the most successful setup of parameters.
__categorical_features_encoders (Dict[FeatureEncoder]): Instances of FeatureEncoder for all categorical features.
__imputers (Dict[Imputer]): Dictionary of instances of Imputer for all columns that contained missing values during optimization process.
__logger (Logger): Logger instance.
- export(file_name)
Exports Pipeline object to a file for later use. Extension is added if not present.
- Arguments:
file_name (str): Output file name.
- export_text(file_name)
Exports Pipeline object to a user-friendly text file. Extension is added if not present.
- Arguments:
file_name (str): Output file name.
- get_classifier()
Get deep copy of the classifier.
- Returns:
Classifier: Instance of the Classifier object.
- get_feature_selection_algorithm()
Get deep copy of the feature selection algorithm.
- Returns:
FeatureSelectionAlgorithm: Instance of the FeatureSelectionAlgorithm object.
- get_feature_transform_algorithm()
Get deep copy of the feature transform algorithm.
- Returns:
FeatureTransformAlgorithm: Instance of the FeatureTransformAlgorithm object.
- get_logger()
Get logger.
- Returns:
Logger: Instance of the Logger object.
- get_stats()
Get optimization statistics.
- Returns:
OptimizationStats: Instance of the OptimizationStats object.
- static load(file_name)
Loads Pipeline object from a file.
- Arguments:
file_name (str): Input file name.
- Returns:
Pipeline: Loaded Pipeline instance.
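The export and load round trip, including the rule that the extension is added if not present, can be sketched with the standard library. Everything here is an assumption for illustration: the `.pkl` extension, the use of `pickle`, and the helper names are hypothetical and may differ from what the library does.

```python
import os
import pickle
import tempfile

EXTENSION = ".pkl"  # hypothetical extension, chosen only for this sketch


def export_object(obj, file_name):
    """Serialize obj to file_name, appending EXTENSION if it is missing."""
    if os.path.splitext(file_name)[1] != EXTENSION:
        file_name = file_name + EXTENSION
    with open(file_name, "wb") as handle:
        pickle.dump(obj, handle)
    return file_name


def load_object(file_name):
    """Deserialize an object previously written by export_object."""
    with open(file_name, "rb") as handle:
        return pickle.load(handle)


# A plain dictionary stands in for a pipeline object.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = export_object({"classifier": "ExampleClassifier"},
                         os.path.join(tmp_dir, "pipeline"))
    restored = load_object(path)
```

Unpickling can execute arbitrary code, so a loader built this way should only be given files from a trusted source.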
- optimize(x, y, population_size, number_of_evaluations, optimization_algorithm, fitness_function)
Optimize pipeline’s hyperparameters.
- Arguments:
x (pandas.core.frame.DataFrame): n samples to classify.
y (pandas.core.series.Series): n classes of the samples in the x array.
population_size (uint): Number of individuals in the optimization process.
number_of_evaluations (uint): Number of maximum evaluations.
optimization_algorithm (str): Name of the optimization algorithm to use.
fitness_function (str): Name of the fitness function to use.
- Returns:
float: Best fitness value found in optimization process.
- run(x)
Runs the pipeline.
- Arguments:
x (pandas.core.frame.DataFrame): n samples to classify.
- Returns:
pandas.core.series.Series: n predicted classes of the samples in the x array.
- set_categorical_features_encoders(value)
Set categorical features’ encoders.
- set_classifier(value)
Set classifier.
- set_feature_selection_algorithm(value)
Set feature selection algorithm.
- set_feature_transform_algorithm(value)
Set feature transform algorithm.
- set_imputers(value)
Set imputers.
- set_selected_features_mask(value)
Set selected features mask.
- set_stats(value)
Set stats.
- to_string()
User friendly representation of the object.
- Returns:
str: User friendly representation of the object.
- to_string_slim()
Slim user friendly representation of the object.
- Returns:
str: Slim user friendly representation of the object.
- class niaaml.PipelineComponent(**kwargs)
Bases:
object
Class for implementing pipeline components.
- Date:
2020
- Author:
Luka Pečnik
- License:
MIT
- Attributes:
Name (str): Name of the pipeline component.
_params (Dict[str, ParameterDefinition]): Dictionary of the component’s parameters with possible values. Possible parameter values are given as an instance of the ParameterDefinition class.
- See Also:
niaaml.utilities.ParameterDefinition
- Name = None
- get_params_dict()
Return parameters definition dictionary.
- set_parameters(**kwargs)
Set the parameters/arguments of the pipeline component.
- to_string()
User friendly representation of the object.
- Returns:
str: User friendly representation of the object.
- class niaaml.PipelineOptimizer(**kwargs)
Bases:
object
Optimization task that finds the best classification pipeline according to the given input.
- Date:
2020
- Author:
Luka Pečnik
- License:
MIT
- Attributes:
__data (DataReader): Instance of any DataReader implementation.
__feature_selection_algorithms (Optional[Iterable[str]]): Array of names of possible feature selection algorithms.
__feature_transform_algorithms (Optional[Iterable[str]]): Array of names of possible feature transform algorithms.
__classifiers (Iterable[Classifier]): Array of names of possible classifiers.
__categorical_features_encoder (str): Name of the encoder used for categorical features.
__categorical_features_encoders (Dict[FeatureEncoder]): Actual instances of FeatureEncoder for all categorical features.
__imputer (str): Name of the imputer used for features that contain missing values.
__imputers (Dict[Imputer]): Actual instances of Imputer for all features that contain missing values.
__logger (Logger): Logger instance.
- get_classifiers()
Get classifiers.
- Returns:
Iterable[str]: Classifier names.
- get_data()
Get data.
- Returns:
DataReader: Instance of DataReader object.
- get_feature_selection_algorithms()
Get feature selection algorithms.
- Returns:
Iterable[str]: Feature selection algorithm names or None.
- get_feature_transform_algorithms()
Get feature transform algorithms.
- Returns:
Iterable[str]: Feature transform algorithm names or None.
- get_logger()
Get logger.
- Returns:
Logger: Logger instance.
- run(fitness_name, pipeline_population_size, inner_population_size, number_of_pipeline_evaluations, number_of_inner_evaluations, optimization_algorithm, inner_optimization_algorithm=None)
Run classification pipeline optimization process.
- Arguments:
fitness_name (str): Name of the fitness class to use as a function.
pipeline_population_size (uint): Number of pipeline individuals in the optimization process.
inner_population_size (uint): Number of individuals in the hyperparameter optimization process.
number_of_pipeline_evaluations (uint): Number of maximum evaluations.
number_of_inner_evaluations (uint): Number of maximum inner evaluations.
optimization_algorithm (str): Name of the optimization algorithm to use.
inner_optimization_algorithm (Optional[str]): Name of the inner optimization algorithm to use. Defaults to the optimization_algorithm argument.
- Returns:
Pipeline: Best pipeline found in the optimization process.
- run_v1(fitness_name, population_size, number_of_evaluations, optimization_algorithm)
Run classification pipeline optimization process according to the original NiaAML paper.
- Reference:
Fister, Iztok, Milan Zorman, and Dušan Fister. “Continuous Optimizers for Automatic Design and Evaluation of Classification Pipelines.” Frontier Applications of Nature Inspired Computation. Springer, Singapore, 2020. 281-301.
- Arguments:
fitness_name (str): Name of the fitness class to use as a function.
population_size (uint): Number of individuals in the optimization process.
number_of_evaluations (uint): Number of maximum evaluations.
optimization_algorithm (str): Name of the optimization algorithm to use.
- Returns:
Pipeline: Best pipeline found in the optimization process.
- niaaml.get_bin_index(value, number_of_bins)
Gets index of value’s bin. Value must be between 0.0 and 1.0.
- Arguments:
value (float): Value to put into bin.
number_of_bins (uint): Number of bins on the interval [0.0, 1.0].
- Returns:
uint: Calculated index.
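One way to realize the binning described above is to split [0.0, 1.0] into equally wide bins and place the upper endpoint 1.0 in the last bin. The sketch below makes exactly those two assumptions; `bin_index_sketch` is a hypothetical name and the library's own rounding or edge-case handling may differ.

```python
import math


def bin_index_sketch(value, number_of_bins):
    """Return the zero-based index of the equal-width bin containing value."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("value must be between 0.0 and 1.0")
    if number_of_bins < 1:
        raise ValueError("number_of_bins must be at least 1")
    # Each bin has width 1 / number_of_bins, so scaling and flooring
    # yields the bin index for every value below 1.0.
    index = math.floor(value * number_of_bins)
    # value == 1.0 would land one past the end; clamp it into the last bin.
    return min(index, number_of_bins - 1)


indices = [bin_index_sketch(v, 4) for v in (0.0, 0.26, 0.5, 1.0)]
```

A typical use for such a function is turning a continuous value in [0.0, 1.0] into a choice among a fixed number of discrete options.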