Mar 10, 2011  08:03 PM | Matthew Brett
RE: AAAS: Your Paper MUST include Data and Code
Originally posted by Pierre Bellec:
Well, I'd say that a lot of people will agree on the principles of reproducible research and will clearly see the benefits, but I am surprised no one has yet mentioned how challenging this is in practice for neuroimaging. First, there are huge issues with anonymization, especially for clinical data. Processing a dataset is one thing; releasing it publicly is another (faces in T1 scans have to be blurred, for example). Then you need to securely host tons of datasets online. In the same vein, coding an in-house algorithm is one thing; releasing it publicly is again completely different (you need to document it!). For new algorithms, the production environment can be very hard to reproduce. Moreover, the analysis itself can be computationally challenging (I use supercomputers all the time, which the vast majority of the neuroimaging community does not). Not to mention that a lot of research groups do not fully automate their data processing flow. My point is that it is not enough to say "let's go reproducible/public/open source/....". We also need an infrastructure to do so. I bet that in the next couple of years we will have websites that allow us to share datasets publicly and, with only a couple of clicks, request processing on supercomputers with well-tested and maintained analysis pipelines. There are many current efforts in that direction (e.g. http://www.cbrain.mcgill.ca/). But at this stage this is science fiction, as far as I know.

It's science fiction for the same reason that electric cars took a long time to arrive - no one agreed there was a real need.

If everyone regarded it as standard practice to provide data, we would have long ago agreed how to anonymize it.

Code that is not ready to release is probably broken. I know this because I write a lot of code. Almost every piece of my own code that I've gone back to, cleaned up, and tested was broken in some way. The question is: is that important? I'd argue that it is, and that without code that has been properly cared for, the results are deeply suspect.

I don't think this will get solved by some large framework - but I might be wrong.  Long ago, when I had much worse tools than are available now, I did this, for example:

http://phiwave.sourceforge.net/fiac/inde...

It took some time, but not much. The paper wasn't at all important. My point is only that moving towards reproducible research is part of ordinary good practice, and that the reason it isn't happening very fast is that there isn't yet deep agreement that this is so.
