NDAR Cloud Computation Capability
Posted By: David Kennedy - Jul 18, 2013
Tool/Resource: NIMH Data Archive / National Database for Autism Research

Rich datasets (e.g., FASTQ and brain imaging) are stored and protected in object-based storage (Amazon S3), enabling parallel data download through NDAR's Download Manager. NDAR now supports the creation of MySQL databases in the Amazon Cloud, which NDAR will host for 15 days, or longer as needed. These databases, called miNDARs (miniature NDARs), contain a table for each data structure in a package. For files, read-only access is granted to NDAR's S3 objects, and references to those objects are provided within the tables that have associated files (e.g., image03 and genomics_sample03). By providing these databases, NDAR envisions real-time computation against rich datasets that can be initiated without the need to move the objects.

A new data structure category, evaluated data, has also been created. Tables for these structures will be created in each miNDAR, so that computational pipelines can write analyzed data back to the miNDAR database and NDAR can make that data available, when appropriate, to the general research community. With this approach, NDAR is moving from a "store once, download many" model to an architecture in which computation is moved to the data and performed in place.
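The workflow described above (read object references from a miNDAR table, compute against the objects where they sit, write results back to an evaluated-data table) might look roughly like the sketch below. It is an illustration under stated assumptions, not NDAR's actual schema or tooling. An in-memory SQLite database stands in for the hosted MySQL miNDAR so the example runs anywhere. The column names `subjectkey` and `image_file`, the table `eval_demo`, the bucket name, and the `fake_metric` function are all hypothetical. A real pipeline would connect with a MySQL client and read each object through an S3 client using the read-only credentials NDAR provides.

```python
import sqlite3
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split an 's3://bucket/key' reference into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def build_demo_mindar():
    """Create a stand-in miNDAR (SQLite here; the real one is MySQL)."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical, heavily reduced version of an image03 table.
    conn.execute("CREATE TABLE image03 (subjectkey TEXT, image_file TEXT)")
    # Hypothetical evaluated-data table that a pipeline writes back to.
    conn.execute(
        "CREATE TABLE eval_demo (subjectkey TEXT, object_key TEXT, metric REAL)"
    )
    conn.executemany(
        "INSERT INTO image03 VALUES (?, ?)",
        [
            ("SUBJ_A", "s3://example-bucket/sub_a/scan.nii.gz"),
            ("SUBJ_B", "s3://example-bucket/sub_b/scan.nii.gz"),
        ],
    )
    conn.commit()
    return conn


def analyze_in_place(conn, compute):
    """Run compute(bucket, key) per referenced object and store the result."""
    rows = conn.execute(
        "SELECT subjectkey, image_file FROM image03 ORDER BY subjectkey"
    ).fetchall()
    for subjectkey, uri in rows:
        bucket, key = split_s3_uri(uri)
        metric = compute(bucket, key)  # the object itself is never copied here
        conn.execute(
            "INSERT INTO eval_demo VALUES (?, ?, ?)", (subjectkey, key, metric)
        )
    conn.commit()
    return len(rows)


def fake_metric(bucket, key):
    """Placeholder analysis; a real one would read the object from S3."""
    return float(len(key))


if __name__ == "__main__":
    db = build_demo_mindar()
    count = analyze_in_place(db, fake_metric)
    print(f"wrote {count} evaluated rows")
```

Passing the analysis in as the `compute` callable keeps the database logic separate from whatever actually reads the object, so the placeholder can be swapped for a function that reads from S3 without touching the query or write-back code.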