Mar 10, 2011 08:03 PM | Matthew Brett
RE: AAAS: Your Paper MUST include Data and Code
Originally posted by Pierre Bellec:
Well I'd say that a lot of people will agree on
the principles of reproducible research and will clearly see the
benefits, but I am surprised no one has mentioned yet how
challenging this is in practice for neuroimaging. First there are
huge issues with anonymization, especially for clinical data.
Processing a dataset is one thing, releasing it publicly is another
(faces in T1 scans have to be blurred, for example). Then you need
to securely host tons of datasets online. In the same vein, coding
an in-house algorithm is one thing, releasing it publicly is again
completely different (you need to document!). For new algorithms,
the production environment can be very hard to reproduce. Moreover,
the analysis itself can be computationally challenging (I use
supercomputers all the time, which the vast majority of the
neuroimaging community does not). Not to mention that a lot of
research groups do not fully automate their data processing flow.
My point is that it is not enough to say "let's go
reproducible/public/open source/....". We also need an
infrastructure to do so. I bet that in the next couple of years we
will have websites that let us share datasets publicly and, with
only a couple of clicks, request processing on supercomputers
with well-tested and maintained analysis pipelines.
There are many current efforts in that direction
(e.g. http://www.cbrain.mcgill.ca/). But at this stage this
is science fiction as far as I know.
It's science fiction for the same reason that electric cars took a long time to arrive - no one agreed there was a real need.
If everyone regarded it as standard practice to provide data, we would have long ago agreed how to anonymize it.
Code that is not ready to release is probably broken. I know this because I write a lot of code. Almost every piece of my own code that I've gone back to, cleaned up and tested was broken in some way. The question is, is that important? I'd argue that it is, and that, without code that has been properly cared for, the results are deeply suspect.
I don't think this will get solved by some large framework - but I might be wrong. Long ago, when I had much worse tools than are available now, I did this, for example:
http://phiwave.sourceforge.net/fiac/inde...
It took some time, but not much. The paper wasn't at all important. I just mean that moving towards reproducible research is just part of ordinary good practice, and that the reason that isn't happening very fast is that there isn't yet deep agreement that this is so.