Jan 20, 2021 05:01 PM | Robert Alfredson - Duke University
Increasing CUDA implementation for CMTK
Greetings,
I am thinking about attempting a project to contribute to the CMTK code base, which would be to implement a CUDA-enabled version of the programs/libraries used in image registration (namely, registration.cxx, warp.cxx, and maybe reformatx.cxx, along with their dependencies). To be clear, I am inexperienced in the domain of CUDA programming, but because there are several examples of existing programs in the code base that have been ported in this way, it seemed at least possible to envision.
Before starting down such a path, I just wanted to ask if anyone had tried this yet, and/or if there is any fundamental reason not to do so. Part of my curiosity about this is sparked by noticing that at least part of the registration code (libs\GPU\cmtkImagePairAffineRegistrationFunctionalDevice_kernels.cu and related files) is already refactored, in support of the code for finding symmetry planes.
If I undertook this, I would do my best to align with the existing code conventions, so that if my attempt happened to succeed, it would hopefully be useful to others. Thank you very much.
Sincerely,
Robert
Jan 21, 2021 12:01 AM | Greg Jefferis
RE: Increasing CUDA implementation for CMTK
Dear Robert,
Torsten made the original CUDA implementations quite some time ago. At the time he judged that a CUDA implementation of the main registration tools was not a worthwhile endeavour. I think this may have been related to the amount of RAM typically required by images. Times may have changed with higher-spec GPUs.
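To make the memory concern concrete, here is a rough back-of-the-envelope sketch in C++. The helper names (`volumeBytes`, `fitsOnDevice`) are hypothetical and not part of CMTK, and both the volume dimensions and the number of volumes held on the device at once are illustrative assumptions rather than measured CMTK figures.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a CMTK function): bytes needed to hold one
// dense 3D image volume of nx * ny * nz voxels.
inline std::uint64_t volumeBytes(std::uint64_t nx, std::uint64_t ny,
                                 std::uint64_t nz, std::uint64_t bytesPerVoxel)
{
  assert(bytesPerVoxel > 0);
  return nx * ny * nz * bytesPerVoxel;
}

// Hypothetical helper: does a working set of numVolumes equally sized
// volumes fit into a device memory budget of deviceBytes?
// Assumption: the working set is just a whole number of image-sized
// buffers (e.g. reference, floating, and one scratch buffer); a real
// implementation would also need room for transformation parameters,
// histograms, and so on.
inline bool fitsOnDevice(std::uint64_t bytesPerVolume, unsigned numVolumes,
                         std::uint64_t deviceBytes)
{
  return bytesPerVolume * numVolumes <= deviceBytes;
}
```

Under these assumptions, a 1024 x 1024 x 512 volume of 4-byte floats takes 2 GiB, so three such buffers (6 GiB) would not fit on a 4 GiB card but would fit on an 8 GiB one. On a real device, the free amount can be queried at run time with `cudaMemGetInfo` from the CUDA runtime rather than hard-coded.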
If you were to try to implement something, I might recommend focussing either on reformat, since this might be simpler and is perhaps more commonly used interactively, or on warp, which is normally the big time sink. It's a long time since I've done any code profiling to examine the bottlenecks there. I wouldn't bother with registration unless it was a simpler starting point for working on warp.
While we would certainly welcome code input, if you want to speed things up, likely the most effective thing you can do is tweak your parameters. There is a good example of this in the recent paper from John Bogovic and Stephan Saalfeld:
https://journals.plos.org/plosone/article/figure?id=10.1371/journal.pone.0236495.g006
All the best,
Greg.