The reconstruction dataset (~425 GB training set and ~42 GB test set; ~133 GB and ~13 GB compressed, respectively) consists of photoncube/image pairs. The training set covers 50 unique simulated scenes, and the test set covers another 5 scenes, whose ground truths are not public. There is no separate validation set, but you can set aside part of the training set for this purpose as needed. Each photoncube consists of 1024 bitplanes, and the associated ground-truth reconstruction corresponds to the last bitplane. You can download portions of this dataset directly using the links at the bottom of this page, or use the aws-cli to download the entire dataset as described in the next section. Finally, a sample of the dataset, containing one photoncube/image pair per scene, is available for download here.
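Since no validation split is provided, one option is to hold out whole scenes rather than individual pairs, so that frames from the same scene do not land on both sides of the split. The sketch below assumes, hypothetically, that the extracted training set has one subdirectory per scene; the actual layout may differ, and `pick_val_scenes` is an illustrative name, not part of any dataset tooling.

```shell
# Hypothetical helper: print every N-th scene directory (in sorted order)
# found directly under TRAIN_DIR, giving a deterministic hold-out set.
# Assumes one subdirectory per scene; adjust to the real layout.
pick_val_scenes() {
    train_dir=$1
    every_n=$2
    find "$train_dir" -mindepth 1 -maxdepth 1 -type d \
        | sort \
        | awk -v n="$every_n" 'NR % n == 0'
}

# Example (path is an assumption): with 50 scene directories,
# taking every 10th scene yields 5 validation scenes.
# pick_val_scenes "$DOWNLOAD_DIR/train" 10 > val_scenes.txt
```

Because the selection depends only on the sorted directory names, rerunning it gives the same hold-out scenes each time.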
The full dataset is hosted on a publicly accessible S3 bucket and compressed into ~8.5 GB chunks. To download it, you'll need the aws-cli, plus an LZMA-capable extraction utility such as the 7-Zip CLI to extract each zip file into its respective directory. Once both are installed, you can list the dataset's components using:
$ aws s3 ls --summarize --human-readable --recursive s3://public-datasets/challenges/reconstruction --endpoint-url=https://web.s3.wisc.edu --no-sign-request
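Before running the commands in this section, it may be worth confirming that both tools are installed. The following sketch relies on the POSIX `command -v` builtin; `missing_tools` is a hypothetical helper name, and `7z` is assumed to be the binary name of your 7-Zip install (some packages ship it as `7za` or `7zz` instead).

```shell
# Hypothetical helper: print the name of each given tool that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Report anything that still needs installing (left unquoted on purpose
# so the names are joined onto one line).
missing=$(missing_tools aws 7z)
if [ -n "$missing" ]; then
    echo "Please install before continuing:" $missing >&2
fi
```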
To download all the dataset archives to a predefined $DOWNLOAD_DIR in one shot, you can use the following command:
$ aws s3 sync s3://public-datasets/challenges/reconstruction "$DOWNLOAD_DIR" --endpoint-url=https://web.s3.wisc.edu --no-sign-request
To download only the test set, append --exclude="*" --include="test*.zip" to the sync command above.
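Put together, the test-only download can be sketched as follows. The default `./reconstruction` target and the `RUN` switch are assumptions added for illustration, and `--endpoint-url` is the AWS CLI's documented name for the endpoint option. The script only prints the command unless `RUN=1` is set.

```shell
# Assumed default target directory; override by exporting DOWNLOAD_DIR.
DOWNLOAD_DIR=${DOWNLOAD_DIR:-./reconstruction}

# Assemble the sync command with the test-only filters appended.
# Later filters take precedence: exclude everything, then re-include test archives.
set -- aws s3 sync s3://public-datasets/challenges/reconstruction "$DOWNLOAD_DIR" \
    --endpoint-url=https://web.s3.wisc.edu --no-sign-request \
    --exclude="*" --include="test*.zip"

# Print the command for review; set RUN=1 to actually start the download.
echo "$*"
if [ "${RUN:-0}" = "1" ]; then
    "$@"
fi
```

Quoting the filter patterns matters here: unquoted, the shell could expand `*` against local file names before the aws-cli ever sees it.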
You then need to extract each zip into its respective directory, which can be done with the following bash command:
$ find "$DOWNLOAD_DIR" -type f -name "*.zip" -exec sh -c '7z x "$1" -o"$(dirname "$1")" && rm -f "$1"' sh {} \;
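Extraction expands the archives to roughly 467 GB in total (training plus test set, per the sizes quoted at the top of this page), so it may be worth checking free space first. This is a minimal sketch assuming a POSIX `df`; `has_free_gib` is a hypothetical helper, and treating the quoted sizes as GiB is an approximation.

```shell
# Hypothetical helper: succeed if the filesystem holding DIR has at least
# NEED_GIB gibibytes available.
has_free_gib() {
    dir=$1
    need_gib=$2
    # df -P gives one line per filesystem; -k reports 1024-byte blocks.
    # Column 4 of the second line is the available space.
    avail_kib=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kib" -ge $((need_gib * 1024 * 1024)) ]
}

# Example: warn if the current directory's filesystem has less than 467 GiB free.
if ! has_free_gib . 467; then
    echo "Warning: less than 467 GiB free; extraction may not fit." >&2
fi
```

Replace `.` with your download directory to check the filesystem the dataset will actually live on.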
You can also directly download the dataset chunks through HTTPS using the following links, although this might be slower:
Test Set (5 files)
Training Set (50 files)
For more single-photon datasets, including real captures, see here.