Class Art2aFloatClustering

java.lang.Object
de.unijena.cheminf.clustering.art2a.clustering.Art2aFloatClustering
All Implemented Interfaces:
IArt2aClustering

public class Art2aFloatClustering extends Object implements IArt2aClustering
This class implements the Art-2A algorithm in single-precision (float) arithmetic for fast, stable, unsupervised clustering of open categorical problems. It is primarily intended for the clustering of fingerprints.
  • Constructor Details

    • Art2aFloatClustering

      public Art2aFloatClustering(float[][] aDataMatrix, int aMaximumNumberOfEpochs, float aVigilanceParameter, float aRequiredSimilarity, float aLearningParameter) throws IllegalArgumentException, NullPointerException
      Constructor. The data matrix with the input vectors/fingerprints is checked for correctness. Each row of the matrix corresponds to an input vector/fingerprint. The vectors must not have components smaller than 0. All input vectors must have the same length. If there are components greater than 1, these input vectors are scaled so that all vector components are between 0 and 1.
      WARNING: If the data matrix consists only of null vectors (vectors whose components are all 0), no clustering is possible, because such vectors contain no information that can be used for similarity evaluation.
      Parameters:
      aDataMatrix - matrix containing all input vectors for clustering.
      aMaximumNumberOfEpochs - maximum number of epochs that the system may use for convergence.
      aVigilanceParameter - parameter to influence the number of clusters.
      aRequiredSimilarity - parameter indicating the minimum similarity between the current cluster vectors and the previous cluster vectors.
      aLearningParameter - parameter defining how strongly the old class vector is retained when the system adapts it to the new sample vector.
      Throws:
      IllegalArgumentException - is thrown if the given arguments are invalid.
      NullPointerException - is thrown if aDataMatrix is null.
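The checks described for the constructor can be sketched as follows. This is a minimal illustration, not the library's implementation: the class name `DataMatrixCheckSketch`, the method `checkAndScale`, and the scaling rule (dividing a row by its largest component when that component exceeds 1) are assumptions made for this example.

```java
public class DataMatrixCheckSketch {

    /**
     * Validates a data matrix and returns a scaled copy.
     * Assumption: a row containing a component greater than 1 is divided by
     * its own largest component. The library's actual scaling may differ.
     */
    static float[][] checkAndScale(float[][] dataMatrix) {
        if (dataMatrix == null) {
            throw new NullPointerException("dataMatrix is null.");
        }
        if (dataMatrix.length == 0) {
            throw new IllegalArgumentException("dataMatrix has no rows.");
        }
        int expectedLength = dataMatrix[0].length;
        float[][] result = new float[dataMatrix.length][];
        for (int i = 0; i < dataMatrix.length; i++) {
            float[] row = dataMatrix[i];
            if (row.length != expectedLength) {
                throw new IllegalArgumentException("Row " + i + " has a different length.");
            }
            float max = 0.0f;
            for (float component : row) {
                if (component < 0.0f) {
                    throw new IllegalArgumentException("Row " + i + " has a negative component.");
                }
                max = Math.max(max, component);
            }
            float[] scaled = row.clone();
            if (max > 1.0f) {
                // Scale the row so that all components lie between 0 and 1.
                for (int j = 0; j < scaled.length; j++) {
                    scaled[j] = row[j] / max;
                }
            }
            result[i] = scaled;
        }
        return result;
    }

    public static void main(String[] args) {
        float[][] scaled = checkAndScale(new float[][] {{0f, 2f, 4f}, {1f, 0f, 0.5f}});
        System.out.println(java.util.Arrays.deepToString(scaled));
    }
}
```

In this sketch, rows that already lie between 0 and 1 are copied unchanged, so binary fingerprints pass through as they are.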
  • Method Details

    • initializeMatrices

      public void initializeMatrices()
      Initialise the cluster matrices.
      Specified by:
      initializeMatrices in interface IArt2aClustering
    • getRandomizeVectorIndices

      public int[] getRandomizeVectorIndices()
      Since the Art-2A algorithm selects input vectors at random, the order of the input vectors/fingerprints must first be randomized. The Fisher-Yates method is used to shuffle the vector indices.
      Specified by:
      getRandomizeVectorIndices in interface IArt2aClustering
      Returns:
      an array with vector indices in a random order
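A Fisher-Yates shuffle of vector indices might look like the sketch below. The class name `ShuffleSketch`, the method `randomizedIndices`, and the use of `java.util.Random` with an explicit seed are assumptions made for illustration; the documentation above does not state how the library sources its random numbers.

```java
import java.util.Random;

public class ShuffleSketch {

    /**
     * Returns the indices 0..numberOfVectors-1 in a random order,
     * shuffled with the Fisher-Yates method.
     * Assumption: the random source is a seeded java.util.Random.
     */
    static int[] randomizedIndices(int numberOfVectors, long seed) {
        int[] indices = new int[numberOfVectors];
        for (int i = 0; i < numberOfVectors; i++) {
            indices[i] = i;
        }
        Random random = new Random(seed);
        // Walk from the last position down, swapping each element
        // with a randomly chosen element at or before it.
        for (int i = numberOfVectors - 1; i > 0; i--) {
            int j = random.nextInt(i + 1);
            int tmp = indices[i];
            indices[i] = indices[j];
            indices[j] = tmp;
        }
        return indices;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(randomizedIndices(10, 1L)));
    }
}
```

With a fixed seed the order is reproducible, which makes repeated clustering runs comparable.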
    • getClusterResult

      public IArt2aClusteringResult getClusterResult(boolean anIsClusteringResultExported, int aSeedValue) throws ConvergenceFailedException
      Starts the Art-2A clustering. The process begins by randomly selecting an input vector/fingerprint from the data matrix. After normalization, this first input vector is assigned to the first cluster. Each subsequent input vector also undergoes normalization steps. If it is sufficiently similar to an existing cluster, it is assigned to that cluster; otherwise, a new cluster is formed and the input vector is added to it. Null vectors are not clustered.
      Specified by:
      getClusterResult in interface IArt2aClustering
      Parameters:
      anIsClusteringResultExported - if true, all information about the clustering is exported to two text files. The first file is a detailed log of the clustering process and its intermediate results; the second is a rough overview of the final result.
      aSeedValue - user-defined seed value to randomize input vectors.
      Returns:
      IArt2aClusteringResult
      Throws:
      ConvergenceFailedException - is thrown if convergence fails.
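The assignment logic described for getClusterResult can be illustrated with a heavily simplified, single-pass sketch. It is not the library's implementation: it omits epochs, the convergence check, input randomization, and any contrast enhancement. The class `AssignmentSketch`, its methods, the use of the dot product of unit vectors as the similarity measure, and the update rule `learning * x + (1 - learning) * w` are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class AssignmentSketch {

    /** Scales a vector to unit length; a null vector is returned as all zeros. */
    static float[] normalize(float[] vector) {
        double sumOfSquares = 0.0;
        for (float c : vector) {
            sumOfSquares += c * c;
        }
        float[] result = new float[vector.length];
        if (sumOfSquares == 0.0) {
            return result;
        }
        double norm = Math.sqrt(sumOfSquares);
        for (int i = 0; i < vector.length; i++) {
            result[i] = (float) (vector[i] / norm);
        }
        return result;
    }

    static float dot(float[] a, float[] b) {
        float sum = 0.0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    /**
     * Assigns each row to a cluster in one pass (hypothetical simplification).
     * Returns one cluster index per row, or -1 for null vectors.
     */
    static int[] assign(float[][] data, float vigilance, float learning) {
        List<float[]> clusters = new ArrayList<>();
        int[] labels = new int[data.length];
        for (int i = 0; i < data.length; i++) {
            float[] x = normalize(data[i]);
            if (dot(x, x) == 0.0f) {
                labels[i] = -1; // null vectors are not clustered
                continue;
            }
            int winner = -1;
            float best = -1.0f;
            for (int k = 0; k < clusters.size(); k++) {
                float similarity = dot(x, clusters.get(k));
                if (similarity > best) {
                    best = similarity;
                    winner = k;
                }
            }
            if (winner < 0 || best < vigilance) {
                // Not similar enough to any existing cluster: open a new one.
                clusters.add(x);
                labels[i] = clusters.size() - 1;
            } else {
                // Assumed update rule: blend the input into the old cluster vector.
                float[] w = clusters.get(winner);
                float[] updated = new float[w.length];
                for (int j = 0; j < w.length; j++) {
                    updated[j] = learning * x[j] + (1.0f - learning) * w[j];
                }
                clusters.set(winner, normalize(updated));
                labels[i] = winner;
            }
        }
        return labels;
    }

    public static void main(String[] args) {
        float[][] data = {{1f, 0f}, {0.9f, 0.1f}, {0f, 1f}, {0f, 0f}};
        System.out.println(java.util.Arrays.toString(assign(data, 0.9f, 0.1f)));
    }
}
```

In this sketch the first two rows fall into the same cluster, the third opens a new one, and the null vector is left unassigned.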