The University of Auckland

CANDID-II Dataset

Version 2 2025-06-27, 00:40
Version 1 2024-03-18, 04:32
dataset
posted on 2025-06-27, 00:40, authored by Sijing Feng


A dataset of 53,054 anonymized adult chest X-rays in 1024 x 1024 pixel DICOM format, with corresponding anonymized free-text reports, from Dunedin Hospital, New Zealand, acquired between 2010 and 2020. The radiology reports, written by FRANZCR radiologists, were manually annotated for 46 common radiological findings mapped to the Unified Medical Language System (UMLS) and RadLex ontologies. Each annotation is multi-class and takes one of four labels: positive, uncertain, negative, or not mentioned. Image filenames contain a patient index, which enables analyses that group images by patient, as well as an anonymized acquisition date that preserves the temporal relationship between images. The dataset can be used to train and test deep learning algorithms for adult chest X-rays.
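
Because the filenames carry a patient index and an anonymized acquisition date, a patient-level train/test split is possible, so that images from one patient never land on both sides. The following is only a minimal sketch of that idea. The filename layout `<patient index>_<YYYYMMDD>.dcm` and the helper names `group_by_patient` and `split_by_patient` are assumptions made for illustration; the actual CANDID-II naming scheme is not described on this page.

```python
import random
import re
from collections import defaultdict

# ASSUMPTION: hypothetical filename layout "<patient index>_<YYYYMMDD>.dcm".
# The real CANDID-II naming scheme may differ; adjust the pattern accordingly.
FILENAME_PATTERN = re.compile(r"^(?P<patient>\d+)_(?P<date>\d{8})\.dcm$")


def group_by_patient(filenames):
    """Map each patient index to that patient's files in chronological order."""
    groups = defaultdict(list)
    for name in filenames:
        match = FILENAME_PATTERN.match(name)
        if match is None:
            raise ValueError(f"unexpected filename: {name}")
        groups[match.group("patient")].append((match.group("date"), name))
    # YYYYMMDD strings sort chronologically, so a plain sort preserves time order.
    return {patient: [name for _, name in sorted(items)]
            for patient, items in groups.items()}


def split_by_patient(filenames, test_fraction=0.2, seed=0):
    """Split files into train/test lists so that no patient appears in both."""
    groups = group_by_patient(filenames)
    patients = sorted(groups)
    random.Random(seed).shuffle(patients)  # seeded for reproducibility
    n_test = max(1, round(len(patients) * test_fraction))
    test_patients = set(patients[:n_test])
    train = [n for p in patients if p not in test_patients for n in groups[p]]
    test = [n for p in patients if p in test_patients for n in groups[p]]
    return train, test
```

Splitting on the patient index rather than on individual images avoids leaking near-duplicate studies of the same person between training and evaluation.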

Unfortunately, since February 2024 the New Zealand government has been changing the data governance of datasets used for AI development, and this affects how external users can access the CANDID-II dataset. The CANDID-II dataset is therefore not currently available to users outside Health New Zealand. This page will be updated should access by external users be reopened.
