Dec 19, 2019 02:12 PM | Christian Haselgrove
Duplicate data in ABIDE
ABIDE UM_1 subjects 0050279 and 0050286 have the same data.
The compressed files (mprage.nii.gz and rest.nii.gz) each differ
between the two subjects, but the unzipped data files are identical.
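For anyone who wants to run this kind of check themselves, here is a minimal sketch (not necessarily how the comparison above was done). It relies on the fact that a gzip header stores a timestamp, so two .gz files can differ byte for byte while holding the same payload. The helper names and the stand-in data are made up for illustration.

```python
import gzip
import hashlib
import io


def payload_digest(gz_bytes: bytes) -> str:
    """SHA-256 hex digest of the decompressed content of a gzip stream."""
    return hashlib.sha256(gzip.decompress(gz_bytes)).hexdigest()


def make_gz(data: bytes, mtime: int) -> bytes:
    """Compress data with a chosen header timestamp (demo helper only)."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb", mtime=mtime) as fh:
        fh.write(data)
    return buf.getvalue()


# Stand-in bytes, not real image data.
data = b"stand-in for image bytes" * 1000
gz_a = make_gz(data, mtime=1)
gz_b = make_gz(data, mtime=2)

# The compressed streams differ because of the header timestamp ...
print(gz_a == gz_b)
# ... but the decompressed payloads hash to the same value.
print(payload_digest(gz_a) == payload_digest(gz_b))
```

For real files, passing the result of `open(path, "rb").read()` for each subject's mprage.nii.gz to `payload_digest` and comparing the two digests would do the same check.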
Christian
Dec 19, 2019 03:12 PM | Yaroslav Halchenko
RE: Duplicate data in ABIDE
FWIW confirming -- the files are not bit-identical (their sizes differ
by 1 byte), but otherwise the data seem to be the same:
```
$> git clone https://github.com/ReproNim/openneurolab...
$> cd openneurolab-metasearch-dataset
$> ls -ld ./abide_initiative/sub-50279/ses-1/T1_rep-0.mgz ./abide_initiative/sub-50286/ses-1/T1_rep-0.mgz
lrwxrwxrwx 1 yoh yoh 137 Dec 19 10:42 ./abide_initiative/sub-50279/ses-1/T1_rep-0.mgz -> ../../../.git/annex/objects/jf/1K/MD5E-s2305710--77704a0c155603bc634b8b391f7736a5.mgz/MD5E-s2305710--77704a0c155603bc634b8b391f7736a5.mgz
lrwxrwxrwx 1 yoh yoh 137 Dec 19 10:42 ./abide_initiative/sub-50286/ses-1/T1_rep-0.mgz -> ../../../.git/annex/objects/xQ/6P/MD5E-s2305711--f331d20a1f5b7c2ceaad6c0eccfe5f92.mgz/MD5E-s2305711--f331d20a1f5b7c2ceaad6c0eccfe5f92.mgz
$> datalad get ./abide_initiative/sub-50279/ses-1/T1_rep-0.mgz ./abide_initiative/sub-50286/ses-1/T1_rep-0.mgz
get(ok): abide_initiative/sub-50286/ses-1/T1_rep-0.mgz (file) [from web...]
get(ok): abide_initiative/sub-50279/ses-1/T1_rep-0.mgz (file) [from web...]
action summary:
get (ok: 2)
$> nib-diff ./abide_initiative/sub-50279/ses-1/T1_rep-0.mgz ./abide_initiative/sub-50286/ses-1/T1_rep-0.mgz
These files are identical.
```
In my case the files come from the fcp-indi S3 bucket:
```
$> git annex whereis ./abide_initiative/sub-50279/ses-1/T1_rep-0.mgz ./abide_initiative/sub-50286/ses-1/T1_rep-0.mgz
whereis abide_initiative/sub-50279/ses-1/T1_rep-0.mgz (3 copies)
00000000-0000-0000-0000-000000000001 -- web
9ed025be-5276-4e8a-a1fc-d82a04514147 -- yoh@smaug:/mnt/btrfs/datasets/datalad/crawl/labs/openneurolab/metasearch
d9a685f3-b192-412c-b798-1a12957284ea -- yoh@lena:/tmp/openneurolab-metasearch-dataset [here]
web: https://s3.amazonaws.com/fcp-indi/data/P...
ok
whereis abide_initiative/sub-50286/ses-1/T1_rep-0.mgz (3 copies)
00000000-0000-0000-0000-000000000001 -- web
9ed025be-5276-4e8a-a1fc-d82a04514147 -- yoh@smaug:/mnt/btrfs/datasets/datalad/crawl/labs/openneurolab/metasearch
d9a685f3-b192-412c-b798-1a12957284ea -- yoh@lena:/tmp/openneurolab-metasearch-dataset [here]
web: https://s3.amazonaws.com/fcp-indi/data/P...
ok
```