brainSimulator’s API

Created on Thu Apr 28 15:53:15 2016. Last update: 9 Aug, 2017.

@author: Francisco J. Martinez-Murcia <fjesusmartinez@ugr.es>

Copyright (C) 2017 Francisco Jesús Martínez Murcia and SiPBA Research Group

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

class brainSimulator

class brainSimulator.BrainSimulator(method='kde', algorithm='PCA', N=100, n_comp=-1, regularize=False, verbose=False)
createNewBrains(N, kernel, components=None)

Generates new samples in the eigenbrain space and projects back to the image space for a given kernel and a specified number of components.

Parameters:
  • N (integer) – Number of samples to draw from that class
  • kernel (KDEestimator, MVNormalEstimator or GaussianEstimator) – kernel or list of kernels to generate new samples
  • components (int) – Number of components to be used in the reconstruction of the images.
Returns:

simStack - a stack or numpy.ndarray containing N vectorized images in rows.
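The projection back to image space can be sketched as follows. This is a minimal illustration, not the library's code: it assumes the loadings are stored one component per row, that only the mean was removed before decomposition, and that the new scores have already been drawn from a kernel. The name project_back and all variable names are hypothetical.

```python
import numpy as np

def project_back(new_scores, coeff, mean, components=None):
    """Map samples from component space back to image space.

    new_scores: (N, n_comp) scores drawn from a fitted density model.
    coeff:      (n_comp, n_voxels) loadings, one component per row (assumed layout).
    mean:       (n_voxels,) mean image removed before the decomposition.
    """
    k = coeff.shape[0] if components is None else components
    # Use only the first k components in the reconstruction.
    return new_scores[:, :k] @ coeff[:k] + mean

rng = np.random.default_rng(0)
coeff = np.linalg.qr(rng.normal(size=(50, 5)))[0].T  # 5 orthonormal toy "eigenbrains"
mean = rng.normal(size=50)
new_scores = rng.normal(size=(10, 5))                 # stand-in for kernel samples
sim_stack = project_back(new_scores, coeff, mean, components=3)
```

Here sim_stack holds 10 vectorized images in rows, reconstructed from the first 3 components only.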

decompose(stack, labels)

Applies PCA or ICA decomposition of the dataset.

Parameters:
  • stack (numpy.ndarray) – stack of vectorized images comprising the whole database to be decomposed
  • labels (list or numpy.ndarray) – labels of each subject in stack
Returns:

  • SCORE - A matrix of component scores
  • COEFF - The matrix of component loadings.
  • MEAN - If standardized, the mean vector of all samples.
  • VAR - If standardized, the variance of all samples.

estimateDensity(X)

Returns an estimator of the PDF of the current data.

Parameters: X (numpy.ndarray) – the data from which the different kernels are fitted.
Returns: the trained kernel estimated for X

fit(stack, labels)

Fits the model so that samples can be drawn afterwards. It applies the functions self.decompose and self.model.

Parameters:
  • stack (numpy.ndarray) – stack of vectorized images comprising the whole database to be decomposed
  • labels (list or numpy.ndarray) – labels of each subject in stack
generateDataset(stack=None, labels=None, N=100, classes=None, components=None)

Fits the model and generates a new set of N elements for each class specified in “classes”.

Parameters:
  • stack (numpy.ndarray) – the stack from which the model will be created
  • labels (numpy.ndarray) – a vector containing the labels of the stacked dataset
  • N (int or list) – the number of elements to be generated per class. An int generates the same N for every class; a list must have the same length as classes and gives the number of subjects to be generated for each class respectively.
  • classes (list) – the classes to be generated, e.g. [0, 2] or ['AD', 'CTL'].
  • components (int) – the number of components used in the synthesis. This parameter only takes effect if it is smaller than the n_comp specified when creating and fitting the BrainSimulator object.
Returns:

  • labels - numpy.ndarray vector with labels for stack
  • stack - a stack or numpy.ndarray containing all synthetic images (N per class) in rows.
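The two accepted forms of N can be illustrated with a small helper. This is a sketch of the documented semantics only; expand_counts is a hypothetical name and is not part of brainSimulator.

```python
def expand_counts(N, classes):
    """Return one sample count per class, following the documented N semantics."""
    if isinstance(N, int):
        return [N] * len(classes)  # the same N for every class
    counts = list(N)
    if len(counts) != len(classes):
        raise ValueError("N must be an int or have one entry per class")
    return counts

# 20 synthetic subjects for 'AD' and 50 for 'CTL'
per_class = dict(zip(["AD", "CTL"], expand_counts([20, 50], ["AD", "CTL"])))
```

With an int, for example expand_counts(100, [0, 2]), both classes would receive 100 synthetic subjects.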

model(labels)

Models the per-class distribution of scores and sets the kernels. Uses the internally stored SCORE matrix, once the decomposition has been applied.

Parameters: labels (list or numpy.ndarray) – labels of each subject in stack
Returns:
  • kernels - a multivariate kernel or list of kernels, depending on the model.
  • uniqLabels - unique labels used to create a standard object.
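
Per-class modelling can be sketched as grouping the rows of the score matrix by label and fitting one density per group. The example below assumes the simplest case, an independent Gaussian per component; model_per_class is a hypothetical name and the (mean, std) tuples stand in for the library's kernel objects.

```python
import numpy as np

def model_per_class(score, labels):
    """Fit an independent Gaussian to each component, separately per class."""
    labels = np.asarray(labels)
    uniq_labels = np.unique(labels)
    kernels = []
    for lab in uniq_labels:
        class_scores = score[labels == lab]  # rows belonging to this class
        kernels.append((class_scores.mean(axis=0),
                        class_scores.std(axis=0, ddof=1)))
    return kernels, uniq_labels

score = np.array([[0.0, 1.0], [2.0, 3.0], [10.0, 10.0], [12.0, 14.0]])
kernels, uniq_labels = model_per_class(score, ["AD", "AD", "CTL", "CTL"])
```

The position of each label in uniq_labels gives the index of its kernel in the list.
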
sample(N, clas=0, n_comp=None)

Standard method that draws samples from the model.

Parameters:
  • N (integer) – number of samples to be generated for the given class.
  • clas (integer) – class (according to self.uniqLabels) of the images to be generated.
  • n_comp (int) – Number of components to be used in the reconstruction of the images.
Returns:

  • labels - numpy.ndarray vector with N labels of clas
  • stack - a stack or numpy.ndarray containing N vectorized images of class clas in rows.
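A typical fit-then-sample workflow might look like the sketch below. It relies only on the fit and sample signatures documented above and assumes that sample returns the pair (labels, stack) in the order listed; draw_synthetic is a hypothetical wrapper, and the import path in the comment is an assumption based on the class name shown in this page.

```python
def draw_synthetic(simulator, stack, labels, clas=0, N=10):
    """Fit a simulator on (stack, labels), then draw N images of one class.

    `simulator` is expected to be a BrainSimulator instance; only the
    documented fit() and sample() methods are used.
    """
    simulator.fit(stack, labels)
    # Assumed return order: labels first, then the stack of images.
    new_labels, new_stack = simulator.sample(N, clas=clas)
    return new_labels, new_stack

# Intended use (assumes the package is installed and importable as shown):
#   from brainSimulator import BrainSimulator
#   sim = BrainSimulator(method='kde', algorithm='PCA')
#   new_labels, new_stack = draw_synthetic(sim, stack, labels, clas=1, N=50)
```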

auxiliary classes

These auxiliary classes define the PDF models that are applied in the analysis and synthesis of brain images. All of them provide the methods .fit and .sample. For MVN and Gaussian these are a simple interface to their scipy counterparts, while the KDE uses automatic bandwidth estimation and defines additional auxiliary functions. See the original paper for a further discussion of these models.

class brainSimulator.MVNormalEstimator(mean=0.0, cov=1.0)

This class provides an interface for generating random numbers according to a given multivariate normal parametrization, estimated from the data. Works only with Python 3.4+ (due to numpy matrix multiplication).
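
The fit/sample pattern for a multivariate normal model can be sketched as below. MVNormalSketch is a hypothetical stand-in, not the library class: the method signatures are assumptions, and it uses numpy's random generator rather than the scipy counterpart.

```python
import numpy as np

class MVNormalSketch:
    """Hypothetical stand-in illustrating a fit/sample multivariate normal model."""

    def __init__(self, mean=0.0, cov=1.0):
        self.mean = mean
        self.cov = cov

    def fit(self, X):
        # Estimate the mean vector and full covariance matrix from the rows of X.
        self.mean = X.mean(axis=0)
        self.cov = np.cov(X, rowvar=False)
        return self

    def sample(self, n_samples, seed=None):
        rng = np.random.default_rng(seed)
        return rng.multivariate_normal(self.mean, self.cov, size=n_samples)

X = np.random.default_rng(1).normal(size=(200, 3))  # 200 subjects, 3 component scores
draws = MVNormalSketch().fit(X).sample(5, seed=2)
```

Unlike a per-component model, the full covariance matrix preserves correlations between component scores.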

class brainSimulator.GaussianEstimator(mean=0.0, var=1.0)

This class provides an interface for generating random numbers according to a per-component Gaussian parametrization, estimated from the data.

class brainSimulator.KDEestimator(bandwidth=1.0)

An interface for generating random numbers according to a given Kernel Density Estimation (KDE) parametrization based on the data.
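
Drawing random numbers from a KDE can be sketched as follows, assuming a 1D Gaussian kernel with a fixed bandwidth (the kernel type is an assumption, and the bandwidth is supplied by hand here rather than estimated). The function sample_gaussian_kde is a hypothetical name, not part of the library.

```python
import numpy as np

def sample_gaussian_kde(data, bandwidth, n_samples, seed=None):
    """Draw samples from a 1D Gaussian KDE built on `data`.

    Sampling a Gaussian KDE amounts to picking a training point at random
    and adding zero-mean Gaussian noise whose standard deviation is the
    bandwidth.
    """
    rng = np.random.default_rng(seed)
    centers = rng.choice(data, size=n_samples, replace=True)
    return centers + rng.normal(scale=bandwidth, size=n_samples)

data = np.array([-2.0, -1.9, 2.0, 2.1])  # toy bimodal scores
draws = sample_gaussian_kde(data, bandwidth=0.1, n_samples=1000, seed=0)
```

A small bandwidth keeps the draws close to the observed points, so the two modes of the toy data are preserved; a large one would blur them together.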

botev_bandwidth(data)

Implementation of the KDE bandwidth selection method outlined in:

    Z. I. Botev, J. F. Grotowski, and D. P. Kroese. Kernel density estimation via diffusion. The Annals of Statistics, 38(5):2916-2957, 2010.

Based on the implementation of Daniel B. Smith, PhD. The object is a callable returning the bandwidth for a 1D kernel.

Forked from the package PyQT_fit.

Parameters: data (numpy.ndarray) – 1D array containing the data to model with a 1D KDE.
Returns: Optimal bandwidth according to the data.

auxiliary functions

brainSimulator.applyPCA(X, regularize=True, n_comp=-1)

This function applies PCA decomposition to a matrix containing all subjects to be modeled.

Parameters:
  • X (numpy.ndarray) – The bidimensional array containing one image per row (conveniently vectorized)
  • regularize (bool) – Whether or not to regularize (standardize) X. default=True.
  • n_comp (int) – Number of components to extract. If not specified, it will compute all available components except one.
Returns:

  • Spca (numpy.ndarray): Array with the PCA decomposition of X.
  • Components (numpy.ndarray): Array with the eigenvalues of the PCA decomposition of X.
  • Mean (numpy.ndarray): Vector with per-column average value.
  • Variance (numpy.ndarray): Vector with per-column variance value.
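
A PCA decomposition with these inputs and outputs can be sketched with an SVD. This is an illustration under stated assumptions, not the library's implementation: pca_sketch is a hypothetical name, "all available components except one" is interpreted as the number of subjects minus one, and the second return value holds the component (loading) vectors, one per row.

```python
import numpy as np

def pca_sketch(X, regularize=True, n_comp=-1):
    """PCA via SVD; returns (scores, components, mean, variance)."""
    mean = X.mean(axis=0)
    var = X.var(axis=0)
    Xc = X - mean
    if regularize:
        # Standardize each column; guard against zero-variance columns.
        Xc = Xc / np.sqrt(np.where(var > 0, var, 1.0))
    if n_comp == -1:
        n_comp = X.shape[0] - 1          # assumed meaning of "all except one"
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_comp] * S[:n_comp]  # projection of each subject
    components = Vt[:n_comp]             # one component per row
    return scores, components, mean, var

X = np.random.default_rng(0).normal(size=(6, 40))  # 6 subjects, 40 voxels
scores, components, mean, var = pca_sketch(X)
```

Because centering removes one degree of freedom, 6 subjects yield at most 5 informative components, so the original stack can be recovered from scores, components, mean and var.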

brainSimulator.applyICA(X, regularize=True, n_comp=-1)

This function applies ICA decomposition to a matrix containing all subjects to be modeled.

Parameters:
  • X (numpy.ndarray) – The bidimensional array containing one image per row (conveniently vectorized)
  • regularize (bool) – Whether or not to regularize (standardize) X. default=True.
  • n_comp (int) – Number of components to extract. If not specified, it will compute all available components except one.
Returns:

  • Spca (numpy.ndarray): Array with the ICA decomposition of X.
  • Components (numpy.ndarray): Array with the eigenvalues of the ICA decomposition of X.
  • Mean (numpy.ndarray): Vector with per-column average value.
  • Variance (numpy.ndarray): Vector with per-column variance value.