BenchNIRS API

Data loading

benchnirs.load_dataset(dataset, path, bandpass=None, order=4, tddr=False, baseline=(None, 0), roi_sides=False, downsample=10)

Load and filter one of the open access datasets.

Parameters:
  • dataset (string) – Dataset to load. 'herff_2014_nb' for n-back from Herff et al., 2014 (epoch interval: -5 to 44 seconds). 'shin_2018_nb' for n-back from Shin et al., 2018 (epoch interval: -2 to 40 seconds). 'shin_2018_wg' for word generation from Shin et al., 2018 (epoch interval: -2 to 10 seconds). 'shin_2016_ma' for mental arithmetic from Shin et al., 2016 (epoch interval: -2 to 10 seconds). 'bak_2019_me' for motor execution from Bak et al., 2019 (epoch interval: -2 to 10 seconds).

  • path (string) – Path of the directory of the dataset selected with the dataset parameter.

  • bandpass (list of floats | None) – Cutoff frequencies of the bandpass Butterworth filter (in Hz). Defaults to None for no filtering.

  • order (integer) – Order of the bandpass Butterworth filter.

  • tddr (boolean) – Whether to apply temporal derivative distribution repair.

  • baseline (None or tuple of length 2) – The time interval used for baseline correction (in seconds). If None, no correction is applied. If a tuple (a, b), the interval runs from a to b. If a is None, the beginning of the data is used; if b is None, b is set to the end of the interval. If (None, None), the whole time interval is used. Correction is applied by computing the mean of the baseline period and subtracting it from the data. The baseline (a, b) includes both endpoints, i.e. all timepoints t such that a <= t <= b.

  • roi_sides (boolean) – Whether to average channels by hemisphere in the task related regions of interest.

  • downsample (float | None) – Downsample data to the specified frequency (in Hz). Defaults to 10. Ignored if None or higher than the dataset’s original sampling frequency.

Returns:

epochs – MNE epochs data with associated labels. Subject IDs are contained in the metadata property.

Return type:

MNE Epochs object
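
The baseline rule described for the baseline parameter can be sketched with NumPy alone. This is a minimal illustration, not the library's implementation: the helper name baseline_correct and the toy arrays are hypothetical, and the data layout (n_epochs, n_channels, n_times) is an assumption.

```python
import numpy as np

def baseline_correct(data, times, baseline=(None, 0)):
    """Subtract the mean of the baseline window from each channel.

    data: array of shape (n_epochs, n_channels, n_times)
    times: array of shape (n_times,), in seconds
    baseline: None or (a, b) tuple, both endpoints inclusive
    """
    if baseline is None:
        return data  # no correction requested
    a, b = baseline
    a = times[0] if a is None else a    # None -> beginning of the data
    b = times[-1] if b is None else b   # None -> end of the interval
    mask = (times >= a) & (times <= b)  # a <= t <= b
    return data - data[..., mask].mean(axis=-1, keepdims=True)

# Toy epoch: 1 epoch, 1 channel, sampled at 1 Hz from -2 s to 2 s
times = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
data = np.array([[[1.0, 2.0, 3.0, 4.0, 5.0]]])
corrected = baseline_correct(data, times, baseline=(None, 0))
```

With the default (None, 0), the window covers the samples at -2, -1 and 0 seconds, so their mean (2.0) is subtracted from every time point of the channel.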

benchnirs.load_homer(path, tmin, sfreq)

Load yTrials data from the Homer block average function.

Parameters:
  • path (string) – Path of the directory containing the yTrials files with .mat file extensions.

  • tmin (float) – Start time before the trial onset in seconds (negative or zero). Should be 0 if the trigger onset is on the first time point of the trial data.

  • sfreq (float) – Sampling frequency in Hz.

Returns:

epochs – MNE epochs data with associated labels. Subject IDs are contained in the metadata property.

Return type:

MNE Epochs object

Visualisation

benchnirs.epochs_viz(mne_epochs, reject_criteria=None)

Process and visualize epochs. Processing includes baseline cropping and bad epoch removal.

Parameters:
  • mne_epochs (MNE Epochs object) – MNE epochs with associated labels.

  • reject_criteria (list of floats | None) – List of the 2 peak-to-peak rejection thresholds for HbO and HbR channels respectively in uM. Defaults to None for no rejection.
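
As an illustration of what a pair of peak-to-peak thresholds means, the sketch below flags epochs in which any channel exceeds the threshold for its type. It assumes an array layout of (n_epochs, n_channels, n_times) in uM and a boolean is_hbo mask; the helper name and the drop-if-any-channel-exceeds rule are assumptions for this example, not a description of the library's internals.

```python
import numpy as np

def keep_by_peak_to_peak(data, is_hbo, reject_criteria):
    """Return a boolean mask of the epochs to keep.

    data: array of shape (n_epochs, n_channels, n_times), in uM
    is_hbo: boolean array of shape (n_channels,), True for HbO, False for HbR
    reject_criteria: [hbo_threshold, hbr_threshold], in uM
    """
    hbo_max, hbr_max = reject_criteria
    # Peak-to-peak amplitude of each channel in each epoch
    ptp = data.max(axis=-1) - data.min(axis=-1)      # (n_epochs, n_channels)
    # Per-channel threshold, chosen by channel type
    thresholds = np.where(is_hbo, hbo_max, hbr_max)  # (n_channels,)
    # Keep an epoch only if every channel stays within its threshold
    return (ptp <= thresholds).all(axis=1)

# Toy data: 2 epochs, 2 channels (HbO then HbR), 3 time points
data = np.array([
    [[0.0, 1.0, 2.0], [0.0, -0.5, 0.0]],   # HbO ptp 2.0, HbR ptp 0.5
    [[0.0, 10.0, 0.0], [0.0, 0.1, 0.0]],   # HbO ptp 10.0, HbR ptp 0.1
])
is_hbo = np.array([True, False])
keep = keep_by_peak_to_peak(data, is_hbo, reject_criteria=[5.0, 1.0])
```

Here the second epoch is flagged because its HbO channel has a peak-to-peak amplitude of 10 uM, above the 5 uM threshold.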

Data processing

benchnirs.process_epochs(mne_epochs, tmin=0, tmax=None, tslide=None, sort=False, reject_criteria=None)

Perform processing on epochs including baseline cropping, bad epoch removal, label extraction and unit conversion.

Parameters:
  • mne_epochs (MNE Epochs object) – MNE epochs of filtered data with associated labels. Subject IDs are contained in the metadata property.

  • tmin (float | None) – Start time of selection in seconds. Defaults to 0 to crop the baseline.

  • tmax (float | None) – End time of selection in seconds. Defaults to None to keep the initial end time.

  • tslide (float | None) – Size of the sliding window in seconds. Will crop the epochs if tmax is not a multiple of tslide. Defaults to None for no window sliding.

  • sort (boolean) – Whether to sort channels by type (all HbO, all HbR). Defaults to False for no sorting.

  • reject_criteria (list of floats | None) – List of the 2 peak-to-peak rejection thresholds for HbO and HbR channels respectively in uM. Defaults to None for no rejection.

Returns:

  • nirs (array of shape (n_samples, n_channels, n_times)) – Processed NIRS data in uM.

  • labels (array of integer) – List of labels, starting at 0 and with ranks in the same order as the event codes. For example if the MNE event codes are [5, 2, 5, 1], the labels will be [2, 1, 2, 0]. Please note that 999 is reserved for unlabelled samples and will be unchanged.

  • groups (array of integer) – List of groups, starting at 0 and with ranks in the same order as the original subject IDs. For example if the subject IDs are [5, 2, 5, 1], the groups will be [2, 1, 2, 0].
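
The rank mapping described for labels and groups can be reproduced with np.unique. This is a sketch of the mapping only, with a hypothetical helper name; it is not taken from the library's source.

```python
import numpy as np

def rank_labels(event_codes, unlabelled=999):
    """Map event codes to ranks starting at 0, leaving `unlabelled` as-is."""
    codes = np.asarray(event_codes)
    labels = codes.copy()
    labelled = codes != unlabelled
    # np.unique sorts the distinct codes; return_inverse gives, for each
    # element, the index of its code in that sorted list, i.e. its rank
    _, ranks = np.unique(codes[labelled], return_inverse=True)
    labels[labelled] = ranks
    return labels

labels = rank_labels([5, 2, 5, 1])          # example from the text above
mixed = rank_labels([5, 999, 2, 1])         # 999 is left unchanged
# Subject IDs follow the same rule, without a reserved value
_, groups = np.unique([5, 2, 5, 1], return_inverse=True)
```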

benchnirs.extract_features(nirs, feature_list)

Perform feature extraction on NIRS data.

Parameters:
  • nirs (array of shape (n_samples, n_channels, n_times)) – Processed NIRS data.

  • feature_list (list of strings) – List of features to extract. The list can include 'mean' for the mean along the time axis, 'std' for the standard deviation along the time axis, 'slope' for the slope of the linear regression along the time axis, 'skew' for the skewness along the time axis, 'kurt' for the kurtosis along the time axis, 'ttp' for the time to peak (requires channels to have been sorted beforehand: all HbO, all HbR) and 'peak' for the value of the peak (max value for HbO and min value for HbR; requires channels to have been sorted beforehand: all HbO, all HbR).

Returns:

nirs_features – Features extracted from NIRS data.

Return type:

array of shape (n_samples, n_channels, n_features)
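
To make the output shape concrete, here is a NumPy sketch covering three of the listed features ('mean', 'std' and 'slope'). The helper name is hypothetical, the slope is expressed per sample index rather than per second, and the population standard deviation is used; the library may differ on both points.

```python
import numpy as np

def basic_features(nirs, feature_list):
    """Compute per-channel features along the time axis.

    nirs: array of shape (n_samples, n_channels, n_times)
    returns: array of shape (n_samples, n_channels, n_features)
    """
    t = np.arange(nirs.shape[-1], dtype=float)
    t_centered = t - t.mean()
    features = []
    for name in feature_list:
        if name == 'mean':
            features.append(nirs.mean(axis=-1))
        elif name == 'std':
            features.append(nirs.std(axis=-1))
        elif name == 'slope':
            # Least-squares slope of the signal against the sample index:
            # sum((t - t_mean) * y) / sum((t - t_mean) ** 2)
            features.append((nirs * t_centered).sum(axis=-1)
                            / (t_centered ** 2).sum())
        else:
            raise ValueError(f"feature not covered by this sketch: {name}")
    # Stack features on a new last axis, in the order they were requested
    return np.stack(features, axis=-1)

# Toy data: 2 samples, 3 channels, 4 time points, each channel = [1, 3, 5, 7]
nirs = np.tile(np.array([1.0, 3.0, 5.0, 7.0]), (2, 3, 1))
nirs_features = basic_features(nirs, ['mean', 'std', 'slope'])
```

The features appear on the last axis in the order given in feature_list, so nirs_features[..., 0] holds the means and nirs_features[..., 2] the slopes.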

Machine learning

benchnirs.machine_learn(model, nirs, labels, groups, normalize=None, random_state=None, output_folder='./outputs')

Perform nested k-fold cross-validation for standard machine learning models producing metrics and confusion matrices. The models include linear discriminant analysis (LDA), support vector classifier (SVC) with grid search for the regularization parameter (inner cross-validation), and k-nearest neighbors (kNN) with grid search for the number of neighbors (inner cross-validation).

Parameters:
  • model (string) – Standard machine learning model to use. Either 'lda' for a linear discriminant analysis, 'svc' for a linear support vector classifier or 'knn' for a k-nearest neighbors classifier.

  • nirs (array of shape (n_samples, n_channels, n_times)) – Processed NIRS data.

  • labels (array of integers) – List of labels matching the NIRS data.

  • groups (array of integers | None) – List of subject IDs matching the NIRS data to perform a group k-fold cross-validation. If None, performs a stratified k-fold cross-validation instead.

  • normalize (tuple of integers | None) – Axes on which to normalize data before feeding to the model with min-max scaling based on the train set for each iteration of the outer cross-validation. For example (0, 2) to normalize across samples and time. Defaults to None for no normalization.

  • random_state (integer | None) – Controls the shuffling applied to data. Pass an integer for reproducible output across multiple function calls. Defaults to None for not setting the seed.

  • output_folder (string) – Path to the directory into which the figures will be saved. Defaults to './outputs'.

Returns:

  • accuracies (array of floats) – List of accuracies on the test sets (one for each iteration of the outer cross-validation).

  • all_hps (list of floats | list of None) – List of hyperparameters selected by the inner cross-validation (one for each iteration of the outer cross-validation): regularization parameters for the SVC, or a list of None for the LDA, which has no hyperparameter search.

  • additional_metrics (list of tuples) – List of tuples of metrics composed of (precision, recall, F1 score, support) on the outer cross-validation (one tuple for each iteration of the outer cross-validation). This uses the precision_recall_fscore_support function from scikit-learn with y_true and y_pred being the true labels and the predicted labels on the specific iteration of the outer cross-validation.
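
The normalize parameter describes min-max scaling fitted on the train set over chosen axes. The sketch below shows one way such scaling can work with axes (0, 2), so that each channel gets its own minimum and maximum. The helper names and the target range [0, 1] are assumptions for this example.

```python
import numpy as np

def min_max_fit(train, axes):
    """Compute min and max over `axes` on the train set only."""
    lo = train.min(axis=axes, keepdims=True)
    hi = train.max(axis=axes, keepdims=True)
    return lo, hi

def min_max_apply(x, lo, hi):
    """Scale x with statistics fitted elsewhere (here: the train set)."""
    scale = np.where(hi > lo, hi - lo, 1.0)  # guard against constant channels
    return (x - lo) / scale

# Toy data of shape (n_samples, n_channels, n_times)
train = np.array([[[0.0, 2.0], [10.0, 20.0]],
                  [[4.0, 1.0], [30.0, 15.0]]])
test = np.array([[[2.0, 6.0], [20.0, 10.0]]])

lo, hi = min_max_fit(train, axes=(0, 2))   # one (min, max) pair per channel
train_scaled = min_max_apply(train, lo, hi)
test_scaled = min_max_apply(test, lo, hi)  # may fall outside [0, 1]
```

Because the statistics come from the train set only, test values can land outside [0, 1] (here 6.0 on the first channel maps to 1.5), which is expected and avoids leaking test information into the scaling.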

benchnirs.deep_learn(model_class, nirs, labels, groups, normalize=None, batch_sizes=[4, 8, 16, 32, 64], lrs=[1e-05, 0.0001, 0.001, 0.01, 0.1], max_epochs=100, min_epochs=1, random_state=None, output_folder='./outputs')

Perform nested k-fold cross-validation for a deep learning model. Produces a loss graph, an accuracy graph and a confusion matrix. Early stopping is performed with a validation set of 20 % after the hyperparameters have been selected. The number of classes is deduced from the number of unique labels.

Parameters:
  • model_class (string | PyTorch nn.Module class) – The PyTorch model class to use. If a string, can be either 'ann', 'cnn' or 'lstm'. If a PyTorch nn.Module class, the __init__() method must accept the number of classes as a parameter, and this needs to be the number of output neurons.

  • nirs (array of shape (n_samples, n_channels, n_times)) – Processed NIRS data.

  • labels (array of integers) – List of labels matching the NIRS data.

  • groups (array of integers | None) – List of subject IDs matching the NIRS data to perform a group k-fold cross-validation. If None, performs a stratified k-fold cross-validation instead.

  • normalize (tuple of integers | None) – Axes on which to normalize data before feeding to the model with min-max scaling based on the train set for each iteration of the outer cross-validation. For example (0, 2) to normalize across samples and time. Defaults to None for no normalization.

  • batch_sizes (list of integers) – List of batch sizes to test for hyperparameter selection.

  • lrs (list of floats) – List of learning rates to test for hyperparameter selection.

  • max_epochs (integer) – Maximum number of training epochs possible. Defaults to 100.

  • min_epochs (integer) – Minimum number of training epochs before early stopping. Defaults to 1.

  • random_state (integer | None) – Controls the shuffling applied to data and random model initialization. Pass an integer for reproducible output across multiple function calls. Defaults to None for not setting the seed.

  • output_folder (string) – Path to the directory into which the figures will be saved. Defaults to './outputs'.

Returns:

  • accuracies (array of floats) – List of accuracies on the test sets (one for each iteration of the outer cross-validation).

  • all_hps (list of tuples) – List of best hyperparameters (one tuple for each iteration of the outer cross-validation). Each tuple will be (batch size, learning rate).

  • additional_metrics (list of tuples) – List of tuples of metrics composed of (precision, recall, F1 score, support) on the outer cross-validation (one tuple for each iteration of the outer cross-validation). This uses the precision_recall_fscore_support function from scikit-learn with y_true and y_pred being the true labels and the predicted labels on the specific iteration of the outer cross-validation.
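
To show how the default batch_sizes and lrs relate to the (batch size, learning rate) tuples in all_hps, the sketch below enumerates the candidate pairs and picks the best one under a scoring function. It assumes an exhaustive grid search, which the text above does not state explicitly; select_best and toy_score are hypothetical stand-ins for the inner cross-validation.

```python
from itertools import product

# Default candidate values from the deep_learn signature
batch_sizes = [4, 8, 16, 32, 64]
lrs = [1e-05, 0.0001, 0.001, 0.01, 0.1]

# Every (batch size, learning rate) pair, in the same order as all_hps tuples
candidates = list(product(batch_sizes, lrs))

def select_best(candidates, score_fn):
    """Return the candidate with the highest score."""
    return max(candidates, key=score_fn)

def toy_score(hp):
    """Hypothetical score standing in for a validation accuracy."""
    batch_size, lr = hp
    return -abs(batch_size - 16) - abs(lr - 0.001)

best = select_best(candidates, toy_score)
```

With the defaults, the grid contains 25 pairs; in this toy setup the selected pair is the one the scoring function was built to favour.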

benchnirs.train_final(model_class, nirs, labels, batch_size, lr, n_epochs, normalize=None, random_state=None, output_folder='./')

Train a final neural network classifier on the whole data with the selected hyperparameters. The trained neural network checkpoint is saved in the output folder. The number of classes is deduced from the number of unique labels.

Parameters:
  • model_class (PyTorch nn.Module class) – The PyTorch model class to use. The __init__() method must accept the number of classes as a parameter, and this needs to be the number of output neurons.

  • nirs (array of shape (n_samples, n_channels, n_times)) – Processed NIRS data.

  • labels (array of integers) – List of labels matching the NIRS data.

  • batch_size (integer) – Number of samples per batch.

  • lr (float) – Learning rate for the optimizer.

  • n_epochs (integer) – Number of training epochs (number of passes over the whole dataset).

  • normalize (tuple of integers | None) – Axes on which to normalize data before feeding to the model with min-max scaling. For example (0, 2) to normalize across samples and time. Defaults to None for no normalization.

  • random_state (integer | None) – Controls the shuffling applied to data and random model initialization. Pass an integer for reproducible output across multiple function calls. Defaults to None for not setting the seed.

  • output_folder (string) – Path to the directory into which the checkpoint and training graph will be saved. Defaults to the current directory.

Returns:

clf – The trained PyTorch neural network.

Return type:

PyTorch nn.Module