Class Art2aFloatClustering
java.lang.Object
de.unijena.cheminf.clustering.art2a.clustering.Art2aFloatClustering
- All Implemented Interfaces:
IArt2aClustering
The class implements an Art-2A algorithm in single machine precision for fast,
stable unsupervised clustering for open categorical problems. The class is primarily intended for the
clustering of fingerprints.
LITERATURE SOURCE:
Constructor Summary
Art2aFloatClustering(float[][] aDataMatrix, int aMaximumNumberOfEpochs, float aVigilanceParameter, float aRequiredSimilarity, float aLearningParameter)
- Constructor.
Method Summary
IArt2aClusteringResult getClusterResult(boolean anIsClusteringResultExported, int aSeedValue)
- Starts an Art-2A clustering algorithm.
int[] getRandomizeVectorIndices()
- Since the Art-2A algorithm randomly selects any input vector, the input vectors must first be randomized.
void initializeMatrices()
- Initialise the cluster matrices.
Constructor Details
Art2aFloatClustering
public Art2aFloatClustering(float[][] aDataMatrix, int aMaximumNumberOfEpochs, float aVigilanceParameter, float aRequiredSimilarity, float aLearningParameter) throws IllegalArgumentException, NullPointerException
Constructor. The data matrix with the input vectors/fingerprints is checked for correctness. Each row of the matrix corresponds to an input vector/fingerprint. All input vectors must have the same length, and no vector may have a component smaller than 0. If a vector has components greater than 1, it is scaled so that all of its components lie between 0 and 1.
WARNING: If the data matrix consists only of null vectors, no clustering is possible, because null vectors do not contain any information that can be used for similarity evaluation.
- Parameters:
aDataMatrix
- matrix that contains all input vectors for clustering.
aMaximumNumberOfEpochs
- maximum number of epochs that the system may use for convergence.
aVigilanceParameter
- parameter that influences the number of clusters.
aRequiredSimilarity
- parameter indicating the minimum similarity between the current cluster vectors and the previous cluster vectors.
aLearningParameter
- parameter that defines how strongly the old class vector is retained when the system adapts it to the new sample vector.
- Throws:
IllegalArgumentException
- is thrown if the given arguments are invalid.
NullPointerException
- is thrown if aDataMatrix is null.
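The input checks described for the constructor can be illustrated with a minimal sketch. The class InputMatrixSketch and the method checkAndScale are hypothetical names and are not part of the documented library. The scaling rule is an assumption: the documentation only states that vectors with components greater than 1 are scaled into the range 0 to 1, so this sketch divides such a vector by its own largest component. Rejecting NaN and infinite components is also an assumption.

```java
/** Hypothetical helper for illustration; not part of the documented library. */
final class InputMatrixSketch {

    /**
     * Checks the data matrix as described above and returns a scaled copy.
     * Assumption: a row with a component greater than 1 is divided by its
     * own largest component. The library may use a different scaling rule.
     */
    static float[][] checkAndScale(float[][] aDataMatrix) {
        if (aDataMatrix == null) {
            throw new NullPointerException("aDataMatrix is null.");
        }
        if (aDataMatrix.length == 0 || aDataMatrix[0] == null) {
            throw new IllegalArgumentException("aDataMatrix is empty.");
        }
        int tmpLength = aDataMatrix[0].length;
        float[][] tmpScaled = new float[aDataMatrix.length][];
        for (int i = 0; i < aDataMatrix.length; i++) {
            float[] tmpRow = aDataMatrix[i];
            // All input vectors must have the same length.
            if (tmpRow == null || tmpRow.length != tmpLength) {
                throw new IllegalArgumentException("Row " + i + " has a different length.");
            }
            float tmpMax = 0.0f;
            for (float tmpValue : tmpRow) {
                // Rejects negative components; NaN and infinity are rejected as well (assumption).
                if (!(tmpValue >= 0.0f) || Float.isInfinite(tmpValue)) {
                    throw new IllegalArgumentException("Row " + i + " has an invalid component.");
                }
                tmpMax = Math.max(tmpMax, tmpValue);
            }
            float[] tmpCopy = tmpRow.clone();
            if (tmpMax > 1.0f) {
                // Scale the row so that its largest component becomes 1.
                for (int j = 0; j < tmpCopy.length; j++) {
                    tmpCopy[j] /= tmpMax;
                }
            }
            tmpScaled[i] = tmpCopy;
        }
        return tmpScaled;
    }
}
```

Under these assumptions, a row such as {0, 2, 4} becomes {0, 0.5, 1}, while a row whose components already lie between 0 and 1 is copied unchanged.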
Method Details
initializeMatrices
public void initializeMatrices()
Initialise the cluster matrices.
- Specified by:
initializeMatrices in interface IArt2aClustering
getRandomizeVectorIndices
public int[] getRandomizeVectorIndices()
Since the Art-2A algorithm randomly selects any input vector, the input vectors/fingerprints must first be randomized so that all of them can be clustered by random selection. The Fisher-Yates method is used to randomize the inputs.
- Specified by:
getRandomizeVectorIndices in interface IArt2aClustering
- Returns:
- an array with vector indices in a random order
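For readers unfamiliar with the Fisher-Yates method, the following is a minimal sketch of a seeded index shuffle. The class ShuffleSketch and the method randomizedIndices are hypothetical names. Unlike the documented method, which takes no arguments, the sketch receives the number of vectors and the seed explicitly so that it is self-contained. The use of java.util.Random is an assumption about the random source.

```java
import java.util.Random;

/** Hypothetical helper for illustration; not part of the documented library. */
final class ShuffleSketch {

    /**
     * Returns the indices 0 .. aNumberOfVectors - 1 in a random order,
     * produced by a Fisher-Yates shuffle with a seeded generator.
     */
    static int[] randomizedIndices(int aNumberOfVectors, int aSeedValue) {
        int[] tmpIndices = new int[aNumberOfVectors];
        for (int i = 0; i < aNumberOfVectors; i++) {
            tmpIndices[i] = i;
        }
        Random tmpRandom = new Random(aSeedValue);
        // Walk from the last position down and swap each element with a
        // randomly chosen element at or before its position.
        for (int i = aNumberOfVectors - 1; i > 0; i--) {
            int j = tmpRandom.nextInt(i + 1);
            int tmpSwap = tmpIndices[i];
            tmpIndices[i] = tmpIndices[j];
            tmpIndices[j] = tmpSwap;
        }
        return tmpIndices;
    }
}
```

Because the generator is seeded, the same seed value always yields the same order, which keeps a clustering run reproducible.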
getClusterResult
public IArt2aClusteringResult getClusterResult(boolean anIsClusteringResultExported, int aSeedValue) throws ConvergenceFailedException
Starts an Art-2A clustering algorithm. The clustering process begins by randomly selecting an input vector/fingerprint from the data matrix. After normalization, this first input vector is assigned to the first cluster. All subsequent input vectors also undergo certain normalization steps. If an input vector is sufficiently similar to an existing cluster, it is assigned to that cluster. Otherwise, a new cluster is formed and the input vector is added to it. Null vectors are not clustered.
- Specified by:
getClusterResult in interface IArt2aClustering
- Parameters:
anIsClusteringResultExported
- if true, all information about the clustering is exported to two text files. The first file is a detailed log of the clustering process and its intermediate results; the second file is a rough overview of the final result.
aSeedValue
- user-defined seed value used to randomize the input vectors.
- Returns:
- IArt2aClusteringResult
- Throws:
ConvergenceFailedException
- is thrown when convergence fails.
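The assignment logic described for getClusterResult can be sketched in a strongly simplified form. This is not the library's implementation. The class Art2aSketch and its methods are hypothetical, and the sketch makes several assumptions: normalization is reduced to scaling to unit length, similarity is the scalar product with the cluster vector, a vector joins the best cluster if that similarity is at least the vigilance value, and the cluster vector is updated as a weighted mix of the old cluster vector and the input. Contrast enhancement, multiple epochs, the convergence check and the export are omitted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical, simplified illustration; not the library's implementation. */
final class Art2aSketch {

    /** Scales a vector to unit length; returns null for a null vector. */
    static float[] toUnitLength(float[] aVector) {
        double tmpSum = 0.0;
        for (float tmpValue : aVector) {
            tmpSum += (double) tmpValue * tmpValue;
        }
        if (tmpSum == 0.0) {
            return null;
        }
        float tmpNorm = (float) Math.sqrt(tmpSum);
        float[] tmpResult = new float[aVector.length];
        for (int i = 0; i < aVector.length; i++) {
            tmpResult[i] = aVector[i] / tmpNorm;
        }
        return tmpResult;
    }

    /**
     * Performs a single pass over the inputs in the given order and returns
     * the cluster index of each input, or -1 for null vectors.
     */
    static int[] assignOnce(float[][] aDataMatrix, int[] anOrder, float aVigilance, float aLearning) {
        List<float[]> tmpClusters = new ArrayList<>();
        int[] tmpAssignment = new int[aDataMatrix.length];
        Arrays.fill(tmpAssignment, -1);
        for (int tmpIndex : anOrder) {
            float[] tmpInput = toUnitLength(aDataMatrix[tmpIndex]);
            if (tmpInput == null) {
                continue; // null vectors are not clustered
            }
            // Find the most similar existing cluster (scalar product).
            int tmpWinner = -1;
            float tmpBest = -1.0f;
            for (int c = 0; c < tmpClusters.size(); c++) {
                float tmpDot = 0.0f;
                float[] tmpCluster = tmpClusters.get(c);
                for (int i = 0; i < tmpInput.length; i++) {
                    tmpDot += tmpInput[i] * tmpCluster[i];
                }
                if (tmpDot > tmpBest) {
                    tmpBest = tmpDot;
                    tmpWinner = c;
                }
            }
            if (tmpWinner >= 0 && tmpBest >= aVigilance) {
                // Sufficient similarity: mix old cluster vector and input (assumed rule).
                float[] tmpOld = tmpClusters.get(tmpWinner);
                float[] tmpMixed = new float[tmpOld.length];
                for (int i = 0; i < tmpOld.length; i++) {
                    tmpMixed[i] = aLearning * tmpInput[i] + (1.0f - aLearning) * tmpOld[i];
                }
                float[] tmpUpdated = toUnitLength(tmpMixed);
                if (tmpUpdated != null) {
                    tmpClusters.set(tmpWinner, tmpUpdated);
                }
                tmpAssignment[tmpIndex] = tmpWinner;
            } else {
                // No sufficiently similar cluster: the input founds a new one.
                tmpClusters.add(tmpInput);
                tmpAssignment[tmpIndex] = tmpClusters.size() - 1;
            }
        }
        return tmpAssignment;
    }
}
```

In this sketch a higher vigilance value makes it harder for a vector to join an existing cluster, so more clusters are formed, which matches the documented role of aVigilanceParameter as a parameter that influences the number of clusters.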