A Collection of Brain Network Datasets
This paper presents a comprehensive, high-quality collection of functional human brain network data for potential research at the intersection of neuroscience, machine learning, and graph analytics.
Anatomical and functional MRI images of the brain have been used to understand the functional connectivity of the human brain and are particularly important in identifying underlying neurodegenerative conditions such as Alzheimer's, Parkinson's, and autism. Recently, the study of the brain in the form of brain networks using machine learning and graph analytics has become increasingly popular, especially for predicting the early onset of these conditions. A brain network, represented as a graph, retains richer structural and positional information that traditional examination methods are unable to capture. However, the lack of brain network data transformed from functional MRI images prevents researchers from conducting data-driven explorations. One of the main difficulties lies in the complicated domain-specific preprocessing steps and the extensive computation required to convert MRI images into brain networks. We bridge this gap by collecting a large number of available MRI images from existing studies, working with domain experts to make sensible design choices, and preprocessing the MRI images to produce a collection of brain network datasets. The datasets originate from 6 different sources, cover 4 neurodegenerative conditions, and comprise a total of 2,688 subjects.
Due to the data protocol, we are unable to release the ADNI dataset here. The data will be released via the ADNI external data submissions within their data system.
We test our graph datasets on 5 machine learning models commonly used in neuroscience and on a recent graph-based analysis model to validate the data quality and to provide domain baselines. To lower the barrier to entry and promote research in this interdisciplinary field, we release our complete preprocessing details, code, and brain network data: https://github.com/brainnetuoa/data_driven_network_neuroscience.
To stay informed about updates to the datasets, please provide us with your email address:
https://forms.gle/KGAajR6LEysXWKvKA
Updated on 10/09/2024:
Please note that we have identified 14 subjects in the prodromal group of the PPMI (Parkinson's Progression Markers Initiative) dataset whose time-series images include only 10 time slots. The invalid subjects are:
sub-prodromal103857
sub-prodromal120622
sub-prodromal146573
sub-prodromal40737
sub-prodromal52874
sub-prodromal55560
sub-prodromal56680
sub-prodromal58027
sub-prodromal58680
sub-prodromal59390
sub-prodromal59483
sub-prodromal59503
sub-prodromal71658
sub-prodromal75422
We have removed the invalid images and updated the dataset by including both the parcellated images (ppmi_v2.zip) and the preprocessed images (Ppmi_Preprocessed_v2.z*).
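For anyone still working from a copy of the PPMI data downloaded before this update, the following is a minimal sketch of how the 14 listed subjects might be excluded locally. It is not part of the released code. The helper names (`is_valid_subject`, `drop_invalid`) and the example file names are hypothetical, and the sketch assumes that each subject ID appears either as a folder name or as the prefix of a file name; the actual layout inside the archives may differ.

```python
from pathlib import PurePath

# Subject IDs listed above as invalid (PPMI prodromal group, 10 time slots only).
INVALID_PPMI_SUBJECTS = frozenset({
    "sub-prodromal103857", "sub-prodromal120622", "sub-prodromal146573",
    "sub-prodromal40737", "sub-prodromal52874", "sub-prodromal55560",
    "sub-prodromal56680", "sub-prodromal58027", "sub-prodromal58680",
    "sub-prodromal59390", "sub-prodromal59483", "sub-prodromal59503",
    "sub-prodromal71658", "sub-prodromal75422",
})


def is_valid_subject(path: str) -> bool:
    """Return False if any component of `path` belongs to an invalid subject.

    Assumption: the subject ID is a whole folder name, or the prefix of a
    file name followed by a non-digit character such as "_" or ".".
    """
    for part in PurePath(path).parts:
        for sid in INVALID_PPMI_SUBJECTS:
            if part == sid:
                return False
            # The digit check stops a short ID from matching a longer one.
            if part.startswith(sid) and not part[len(sid)].isdigit():
                return False
    return True


def drop_invalid(paths):
    """Keep only the paths that do not belong to an invalid subject."""
    return [p for p in paths if is_valid_subject(p)]


# Hypothetical file names, for illustration only.
files = [
    "ppmi/sub-prodromal40737_timeseries.txt",
    "ppmi/sub-exampleA_timeseries.txt",
]
print(drop_invalid(files))  # ['ppmi/sub-exampleA_timeseries.txt']
```

Matching on the listed IDs rather than on the length of each time series keeps the filter independent of the file format; a length-based check would need to load every file first.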