Oct 19, 2012 07:10 PM | David Boas
Shared NIRS Data Format - SNIRF
Dear NIRS Community,
Over the past several months, we have collected input from several
people on establishing a shared data format for NIRS experimental data.
A draft specification has been created for the proposed "Shared NIR
Format", a.k.a. SNIRF. This specification can be viewed at
https://docs.google.com/document/d/1EKEMrB6CxmEGnzI4zi7MugHq318HRaR3M2i_vzRIPFU/edit
Our goal is for SNIRF to aid in the sharing and analysis of NIRS data
broadly, initially with a particular emphasis on fNIRS data of brain
activation. This goal will be met as hardware manufacturers support SNIRF for saving NIRS data, and analysis packages are able to load SNIRF compliant files
for analysis.
This is still a draft specification. Our goal is to finalize this
specification by Feb 2013.
Please view the specification at the google doc URL indicated above and
use this forum to comment on any particular issues.
This specification has already benefited from the input of Ted
Huppert, Hamid Dehghani, Takusige Katura, Jong Chul Ye, and Sungho Tak, as well as supporting approval from several others. We
believe that we have incorporated, or at least responded to, all of their
comments.
One issue that remains to be resolved is how to handle calculated or
derived data types. The specification presently supports several raw
data types. It is desirable to add a data type for concentration
results. An issue we are struggling with is that every channel of data,
i.e. column of "d", has a corresponding descriptor in the "ml"
structure. The "ml" structure indexes the source, detector, and data
type for the corresponding data channel. It also indexes the wavelength.
For concentration, there is no wavelength. Thus, it seems that if we
have a data type for concentration, then the corresponding
"ml(n).WavelengthIndex" field would be ignored. In addition, the
"ml(n).DataTypeIndex" could be used to reference what chromophore is
stored in the data. The list of chromophores could be provided by
SD.Chromophores, which could be a string array with possible entries of
"HbO", "HbR", "H2O", "aa3", etc.
We are looking forward to more input from the community on establishing
SNIRF.
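The chromophore-indexing idea raised above can be sketched in a few lines of Python. This is only an illustration under assumptions: the dictionary keys mirror the ml(n) and SD field names used in this post, but the numeric code for a concentration data type (CONCENTRATION below) is a made-up placeholder, since the draft has not assigned one. Indices are kept 1-based, as in the MATLAB-style notation of the spec.

```python
# Hypothetical code for a "concentration" data type; the draft spec has not
# assigned a value, so 99 is only a placeholder for illustration.
CONCENTRATION = 99

# Proposed SD.Chromophores string array.
SD = {"Chromophores": ["HbO", "HbR", "H2O", "aa3"]}

# Two concentration channels between source 1 and detector 1.
# WavelengthIndex is set to None because it would be ignored for this type.
ml = [
    {"SourceIndex": 1, "DetectorIndex": 1, "WavelengthIndex": None,
     "DataType": CONCENTRATION, "DataTypeIndex": 1},
    {"SourceIndex": 1, "DetectorIndex": 1, "WavelengthIndex": None,
     "DataType": CONCENTRATION, "DataTypeIndex": 2},
]


def chromophore_of(entry, sd):
    """Return the chromophore name for a concentration channel, else None."""
    if entry["DataType"] != CONCENTRATION:
        return None
    # DataTypeIndex is 1-based (MATLAB convention); Python lists are 0-based.
    return sd["Chromophores"][entry["DataTypeIndex"] - 1]


labels = [chromophore_of(entry, SD) for entry in ml]
print(labels)
```

With this layout a reader would look up the chromophore through DataTypeIndex and never consult WavelengthIndex for concentration channels.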
Sincerely,
Blaise Frederick
David Boas
Oct 22, 2012 04:10 PM | Mathieu Coursolle
RE: Shared NIRS Data Format - SNIRF
Hi,
Thanks for sharing this draft specification. Here are a few small (late) comments I'd like to share on the latest version.
Our new NIRS system should be released in a few months, and we made it very modular. The system can be made from any combination of up to 4 "blocks", each having the potential of having a different number of sources, detectors and wavelengths (as well as different wavelength values).
That being said, I just wanted to make sure that the specification doesn't enforce the usage of all wavelengths available on the system for all measurements (I may be wrong, but I do believe it was the case for the Matlab based .nirs file format).
In the description of the "ml" structure, there is a note specifying that "source indices generally refer to the optode naming (probe positions)". I assume that it is the same for detectors?
In the optional variables, the "qform" variable specifies a 4x4 matrix to align the NIRS coordinate system to other geometries. Would it be useful to have an additional variable that describes that geometry (e.g. MNI, Talairach, anatomical, etc.)? I am thinking of something that may be similar to the NIfTI file format.
Also, are there any constraints in terms of endianness of the file, and/or type encoding of the different variables?
Thank you,
Mathieu Coursolle
Rogue Research Inc.
Oct 22, 2012 06:10 PM | Blaise Frederick
RE: Shared NIRS Data Format - SNIRF
Originally posted by Mathieu Coursolle:
Our new NIRS system should be released in a few months, and we made it very modular. The system can be made from any combination of up to 4 "blocks", each having the potential of having a different number of sources, detectors and wavelengths (as well as different wavelength values).
That being said, I just wanted to make sure that the specification doesn't enforce the usage of all wavelengths available on the system for all measurements (I may be wrong, but I do believe it was the case for the Matlab based .nirs file format).
The intention here was to have a complete description of all the data that is included in the file. What data you choose to put in is up to you. The ml variable has to exist for every data vector in the file, and specifies information about that vector of data (it refers to a source position, a detector position, and a wavelength), but it does not have to be a complete set of possible source-detector pairs (it is not required that you record source-detector data for pairs that are too far apart to have useful information, or for lasers that are off, etc.). The only requirement is that if it's in the file, it needs to be completely described.
Originally posted by Mathieu Coursolle:
In the description of the "ml" structure, there is a note specifying that "source indices generally refer to the optode naming (probe positions)". I assume that it is the same for detectors?
Yes. We'll revise the description to say that.
Originally posted by Mathieu Coursolle:
In the optional variables, the "qform" variable specified a 4x4 matrix to align the NIRS coordinate system to other geometries. Would it be useful to have an additional variable that describes that geometry (ex: MNI, Talairach, anatomical, etc)? I am thinking of something that may be similar to the NIfTI file format.
That sounds like a good idea; we'll have to think about what geometries would make the most sense ("scanner anatomic" is probably not relevant, but "MNI", "Talairach", maybe "10-20" might be good choices; we'd have to decide how the last would be implemented).
Originally posted by Mathieu Coursolle:
Also, are there any constraints in terms of endianness of the file, and/or type encoding of the different variables?
The underlying file is an HDF5 file, which will take care of the endianness.
There are some places where we have specified a data type, but I think in the absence of a clear description, we should probably assume "double". David, what do you think?
Oct 25, 2012 01:10 PM | Alessandro Torricelli
RE: Shared NIRS Data Format - SNIRF
Hi, I have one general comment and a specific comment for time domain data.
1) General: It is not clear to me the exact meaning of the variable <number of channels>.
In my understanding channels means optodes.
Then it is not clear to me how data from different wavelengths are stored in the variable d.
Suppose we have a 1 channel CW system operating at 2 wavelengths.
The dimensions of the variable d (actual raw data) for a specific experiment with 100 time points are <number of time points> x <number of channels>, that is <100> x <1>.
The corresponding variable ml (measurement list) is an array structure that has the size <1> with fields
ml(1).SourceIndex =1
ml(1).DetectorIndex =1
ml(1).WavelengthIndex =1
ml(1).DataType =1
So, where are the data for the second wavelength??
Maybe you are giving to channels a different meaning. Maybe channels does not specifically refer to optodes, but more generally to acquired data? So <number of channels> = <number of optodes> x <number of wavelengths>??
In this case, in the simple example of 1 channel and 2 wavelengths, the dimensions of the variable d are <100>x<2>, where d(:,1) is the first wavelength and d(:,2) the second wavelength?
Providing a sample of data could help to better understand the proposed format.
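As a hedged illustration of the second reading (one column of d per acquired measurement rather than per optode pair), the sketch below builds a measurement list for one source-detector pair at two wavelengths. The helper name build_ml and the use of plain Python lists are assumptions made for illustration; the field names follow the ml example above, and the index values are kept 1-based as in the spec's notation.

```python
def build_ml(n_sources, n_detectors, n_wavelengths, data_type=1):
    """List one ml entry per (source, detector, wavelength) combination."""
    ml = []
    for wl in range(1, n_wavelengths + 1):
        for src in range(1, n_sources + 1):
            for det in range(1, n_detectors + 1):
                ml.append({
                    "SourceIndex": src,
                    "DetectorIndex": det,
                    "WavelengthIndex": wl,
                    "DataType": data_type,  # 1 = CW, as in the example above
                })
    return ml


n_time_points = 100
ml = build_ml(n_sources=1, n_detectors=1, n_wavelengths=2)

# d holds one column per ml entry: <time points> x <channels>.
d = [[0.0] * len(ml) for _ in range(n_time_points)]

print(len(d), "x", len(d[0]))  # prints: 100 x 2
```

Under this reading, column k of d is described by entry k of ml, so the second wavelength simply becomes a second column with its own descriptor.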
2) Time domain fNIRS: How to treat time domain data?
In time domain fNIRS systems based on the TCSPC technique the raw data are the distributions of time of flight (DTOFs) at two or more wavelengths. Microscopic time resolution is typically 10ps, and 512 or 1024 channels are acquired, therefore 5 or 10 ns are recorded. It is probably unreasonable to store all this data in the standard format, therefore preprocessing should be done.
By preprocessing the DTOF we can provide the intensity at selected time-gates. To enhance the contribution from deep layers (brain cortex) and reject the disturbing effect of superficial layers (scalp, skull), late and early time-gates are needed. Therefore the minimum number of time-gates is 2. Since the choice of the early and late time-gates may depend on the specific experiment, we store more than 2 gates, typically 10 time-gates with width of 400ps and variable delays (from 0 to 3.2ns in steps of 400ps). An 11th time gate corresponding to total number of photons (i.e. sum of photons in all time-gates, a pseudo CW measurement) is sometimes stored or calculated.
By a different preprocessing of the DTOF, we can provide the moments (1st, 2nd, and 3rd, corresponding to number of photons, mean time of flight and variance).
Suppose we have a 1 channel time domain system operating at 2 wavelengths and suppose we want to store 2 time gates (early and late).
The dimensions of the variable d (actual raw data) for a specific experiment with 100 time points are <100> x <4>, where d(:,1) is the first wavelength-first gate, d(:,2) the second wavelength-first gate, d(:,3) the first wavelength-second gate, and d(:,4) the second wavelength-second gate.
If we want to store 11 time gates (i.e. 10 + CW), then the dimensions of the variable d (actual raw data) for a specific experiment with 100 time points are <100> x <22>. Maybe the data referring to the pseudoCW time gate can be recorded before all other time gates (it seems to me more elegant and efficient: when using moments the 1st data is the pseudoCW as well).
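In general the dimensions of the variable d (actual raw data) are <number of time points> x <number of channels>, where
<number of channels> = <number of optodes> x <number of wavelengths> x <number of time gates>,
or
<number of channels> = <number of optodes> x <number of wavelengths> x <number of moments>.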
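The column counts in the examples above (<100> x <4> for 2 gates, <100> x <22> for 11 gates) can be checked with a short Python sketch. It assumes that the number of columns is the product of optode pairs, wavelengths, and time gates, and that wavelength varies fastest within each gate; the function name td_columns is hypothetical.

```python
def td_columns(n_optodes, n_wavelengths, n_gates):
    """Return (optode, wavelength, gate) labels, one per column of d.

    Wavelength varies fastest, then gate, then optode pair.
    All indices are 1-based, following the notation of the spec.
    """
    return [
        (optode, wavelength, gate)
        for optode in range(1, n_optodes + 1)
        for gate in range(1, n_gates + 1)
        for wavelength in range(1, n_wavelengths + 1)
    ]


two_gates = td_columns(n_optodes=1, n_wavelengths=2, n_gates=2)
eleven_gates = td_columns(n_optodes=1, n_wavelengths=2, n_gates=11)

print(len(two_gates), len(eleven_gates))  # prints: 4 22
```

The same function covers the moments case if the gate count is replaced by the number of moments stored.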
That's all.
Looking forward to seeing you in London.
Alessandro Torricelli
Politecnico di Milano
Italy
Oct 25, 2012 01:10 PM | Alessandro Torricelli
RE: Shared NIRS Data Format - SNIRF
There are some characters missing in my previous comment.
See attached file for corrected comments.
Alessandro
Oct 26, 2012 11:10 AM | Blaise Frederick
RE: Shared NIRS Data Format - SNIRF
Originally posted by Alessandro Torricelli:
Suppose we have a 1 channel CW system operating at 2 wavelengths.
The dimensions of the variable d (actual raw data) for a specific experiment with 100 time points are <number of time points> x <number of channels>, that is <100> x <1>.
Actually, in this case d will be <100> x <2>, one timecourse for each wavelength.
Originally posted by Alessandro Torricelli:
The corresponding variable ml (measurement list) is an array structure that has the size <1> with fields
ml(1).SourceIndex =1
ml(1).DetectorIndex =1
ml(1).WavelengthIndex =1
ml(1).DataType =1
So, where are the data for the second wavelength??
Maybe you are giving to channels a different meaning. Maybe channels does not specifically refer to optodes, but more generally to acquired data? So <number of channels> = <number of optodes> x <number of wavelengths>??
In this case, in the simple example of 1 channel and 2 wavelengths, the dimensions of the variable d are <100>x<2>, where d(:,1) is the first wavelength and d(:,2) the second wavelength?
Yes, this is the way it is stored.
Originally posted by Alessandro Torricelli:
Providing a sample of data could help to better understand the proposed format.
Agreed. We will give some typical example cases when we fill out the standard a bit - that's generally much clearer. We just wanted to get the formal standard set first.
Originally posted by Alessandro Torricelli:
2) Time domain fNIRS: How to treat time domain data?
In time domain fNIRS systems based on the TCSPC technique the raw data are the distributions of time of flight (DTOFs) at two or more wavelengths. Microscopic time resolution is typically 10ps, and 512 or 1024 channels are acquired, therefore 5 or 10 ns are recorded. It is probably unreasonable to store all this data in the standard format, therefore preprocessing should be done.
By preprocessing the DTOF we can provide the intensity at selected time-gates. To enhance the contribution from deep layers (brain cortex) and reject the disturbing effect of superficial layers (scalp, skull), late and early time-gates are needed. Therefore the minimum number of time-gates is 2. Since the choice of the early and late time-gates may depend on the specific experiment, we store more than 2 gates, typically 10 time-gates with width of 400ps and variable delays (from 0 to 3.2ns in steps of 400ps). An 11th time gate corresponding to total number of photons (i.e. sum of photons in all time-gates, a pseudo CW measurement) is sometimes stored or calculated.
By a different preprocessing of the DTOF, we can provide the moments (1st, 2nd, and 3rd, corresponding to number of photons, mean time of flight and variance).
Suppose we have a 1 channel time domain system operating at 2 wavelengths and suppose we want to store 2 time gates (early and late).
The dimensions of the variable d (actual raw data) for a specific experiment with 100 time points are <100> x <4>, where d(:,1) is the first wavelength-first gate, d(:,2) the second wavelength-first gate, d(:,3) the first wavelength-second gate, and d(:,4) the second wavelength-second gate.
If we want to store 11 time gates (i.e. 10 + CW), then the dimensions of the variable d (actual raw data) for a specific experiment with 100 time points are <100> x <22>. Maybe the data referring to the pseudoCW time gate can be recorded before all other time gates (it seems to me more elegant and efficient: when using moments the 1st data is the pseudoCW as well).
In general the dimensions of the variable d (actual raw data) are <number of time points> x <number of channels>, where
<number of channels> = <number of optodes> x <number of wavelengths> x <number of time gates>,
or
<number of channels> = <number of optodes> x <number of wavelengths> x <number of moments>.
I'm going to have to defer to David on this - I don't do TOF measurements, so I'm really not the best person to comment on this, but what you said sounds reasonable, and I think that was what we were planning. We haven't yet settled on how to handle derived data (such as pseudo CW, or even oxy- and deoxy-hemoglobin concentration), as that gets complicated rather quickly, so we wanted some input from others before wading into that. Also, if I understand your second case, this also seems to get into handling multiple timebases in the same file, which we were thinking would require the extended format.
Originally posted by Alessandro Torricelli:
That's all.
Looking forward to seeing you in London.
And you! See you this evening.
Blaise
Nov 5, 2012 03:11 PM | Alex Cristia
RE: Shared NIRS Data Format - SNIRF
Hello,
Thanks for getting this discussion started, it will be great to have a standard format. I've a couple clarification questions and comments:
- "The time variable. This provides the acquisition time of the measurement relative to the time origin." I'm guessing this is preferable over sample number because (1) there's no need to look at sampling frequency before comparing to studies; (2) some systems may actually store time, if the sampling frequency is variable/there are errors. Is that so?
- You could add a field SD.Origin which can specify the 10-20 electrode used as reference, since most fNIRS neurocog users will have one. This simple addition would make it much easier to incorporate localization in an eventual meta-analysis.
- Would original source power (e.g., 0.6mW) be stored in metadata? Some systems, particularly those which use multiple distances, can specify different powers for each source.
- It would be useful to have a forum with suggested fields for metadata. I won't put them here, but there are several variables that appear to affect data quality in infant research, so it would be useful if we could keep track of them. If you're interested in starting to collect suggestions, let me know and I'll email you my list.
Alex Cristia
--
__________________________
Scientific staff member MPI-Nijmegen
sites.google.com/site/acrsta
Wundtlaan 1
6525 XD, Nijmegen
Netherlands
__________________________
Nov 5, 2012 04:11 PM | David Boas
RE: Shared NIRS Data Format - SNIRF
I respond to 3 points from the message copied below:
1) The description for ml has been updated in the spec on Google Docs to indicate that detector indices also generally refer to optode naming and not necessarily the physical detector numbers on the hardware.
2) Regarding adding an additional variable describing the geometry associated with qform, I hope that Mathieu Coursolle, Blaise, and others can resolve this.
3) Regarding data types, I agree that unless otherwise specified it will be double.
Originally posted by Blaise Frederick:
Originally posted by Mathieu Coursolle:
In the optional variables, the "qform" variable specified a 4x4 matrix to align the NIRS coordinate system to other geometries. Would it be useful to have an additionnal variable that describes that geometry (ex: MNI, Talairach, anatomical, etc) ? I am thinking of something that may be similar to the NIfTI file format.
That sounds like a good idea; we'll have to think about what geometries would make the most sense ("scanner anatomic" is probably not relevant, but "MNI", "Talairach", maybe "10-20" might be good choices- we'd have to decide how the last would be implemented).
Also, are there any constraints in terms of endianness of the file, and/or type encoding of the different variables?
The underlying file is an HDF5 file, which will take care of the
endianness.
There are some places where we have specified a data type, but I think in the absence of a clear description, we should probably assume "double". David, what do you think?
...
In the description of the "ml" structure, there
is a note specifying that "source indices generally refer to the
optode naming (probe positions)". I assume that it is the same for
detectors?
Yes. We'll revise the description to say that.
Nov 5, 2012 05:11 PM | David Boas
RE: Shared NIRS Data Format - SNIRF
1) We'll have to provide sample data soon in the spec.
2) Your understanding of how different time gates and/or moments will be handled is correct. If you have time gates, they would be specified as ml(n).DataType = 3, raw time domain time gate. You can add moments as well, as ml(n).DataType = 4, raw time domain moments. These data types must index additional information for each channel of data with ml(n).DataTypeIndex, which refers to the following fields of the SD structure in the spec:
For gates
SD.TimeDelay
SD.TimeDelayWidth
For moments
SD.MomentOrder
Note that the gates require 2 additional pieces of information: the time delay and the time delay (or gate) width. Both are indexed by the same ml(n).DataTypeIndex.
There are three different ways of representing CW data derived from the time domain data.
CW data could be represented as ml(n).DataType=3 using SD.TimeDelay=0 and SD.TimeDelayWidth=10e-9. This is a very long gate that is effectively CW.
CW data could be indicated as the 0th moment as ml(n).DataType=4 with SD.MomentOrder=0.
Or, CW data could be indicated as normal CW data with ml(n).DataType=1.
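The indexing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionaries stand in for the "ml" and "SD" structures, the gate values are made up, and the 1-based DataTypeIndex convention is assumed from the MATLAB-style notation rather than taken from the spec.

```python
# Stand-in for the SD fields named in the post (values are illustrative only).
SD = {
    "TimeDelay": [0.0, 2.0e-9],             # gate start times in seconds (assumed)
    "TimeDelayWidth": [400e-12, 400e-12],   # gate widths in seconds (assumed)
    "MomentOrder": [0, 1, 2],
}

# DataType codes quoted in the post.
RAW_CW, TD_GATE, TD_MOMENT = 1, 3, 4


def describe(ml_entry, sd):
    """Return a readable description of one measurement-list entry."""
    data_type = ml_entry["DataType"]
    if data_type == RAW_CW:
        return "CW"
    i = ml_entry["DataTypeIndex"] - 1       # assumed 1-based index into SD
    if data_type == TD_GATE:
        # Delay and width are looked up with the same index.
        return "gate delay=%g s width=%g s" % (
            sd["TimeDelay"][i], sd["TimeDelayWidth"][i])
    if data_type == TD_MOMENT:
        return "moment order=%d" % sd["MomentOrder"][i]
    raise ValueError("unsupported DataType: %r" % data_type)


ml = [
    {"DataType": TD_GATE, "DataTypeIndex": 2},    # late gate
    {"DataType": TD_MOMENT, "DataTypeIndex": 1},  # 0th moment, effectively CW
    {"DataType": RAW_CW, "DataTypeIndex": 0},     # plain CW, index unused here
]

for entry in ml:
    print(describe(entry, SD))
```

The second and third entries show two of the three equivalent ways of flagging CW data; the third way would be a gate entry whose width is long enough to cover the whole DTOF.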
Originally posted by Blaise Frederick:
Originally posted by Alessandro Torricelli:
Providing a sample of data could help to better understand the proposed format.
Agreed. We will give some typical example cases when we fill out the standard a bit - that's generally much clearer. We just wanted to get the formal standard set first.
2) Time domain fNIRS: How to treat time domain data?
In time domain fNIRS systems based on the TCSPC technique the raw data are the distributions of time of flight (DTOFs) at two or more wavelengths. Microscopic time resolution is typically 10ps, and 512 or 1024 channels are acquired, therefore 5 or 10 ns are recorded. It is probably unreasonable to store all this data in the standard format, therefore preprocessing should be done.
By preprocessing the DTOF we can provide the intensity at selected time-gates. To enhance the contribution from deep layers (brain cortex) and reject the disturbing effect of superficial layers (scalp, skull), late and early time-gates are needed. Therefore the minimum number of time-gates is 2. Since the choice of the early and late time-gates may depend on the specific experiment, we store more than 2 gates, typically 10 time-gates with width of 400ps and variable delays (from 0 to 3.2ns in steps of 400ps). An 11th time gate corresponding to total number of photons (i.e. sum of photons in all time-gates, a pseudo CW measurement) is sometimes stored or calculated.
By a different preprocessing of the DTOF, we can provide the moments (1st, 2nd, and 3rd, corresponding to number of photons, mean time of flight and variance).
Suppose we have a 1 channel time domain system operating at 2 wavelengths and suppose we want to store 2 time gates (early and late).
The dimensions of the variable d (actual raw data) for a specific experiment with 100 time points are <100> x <4>, where d(:,1) is the first wavelength-first gate, d(:,2) the second wavelength-first gate, d(:,3) the first wavelength-second gate, and d(:,4) the second wavelength-second gate.
If we want to store 11 time gates (i.e. 10 + CW), then the dimensions of the variable d (actual raw data) for a specific experiment with 100 time points are <100> x <22>. Maybe the data referring to the pseudoCW time gate can be recorded before all other time gates (it seems to me more elegant and efficient: when using moments the 1st data is the pseudoCW as well).
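In general the dimensions of the variable d (actual raw data) are <number of time points> x <number of columns>, where
<number of columns> = <number of channels> x <number of wavelengths> x <number of time gates>,
or
<number of columns> = <number of channels> x <number of wavelengths> x <number of moments>.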
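As a rough illustration of the column bookkeeping, the Python sketch below enumerates the columns of d for a given number of channels, wavelengths, and time gates. The function name and the ordering (wavelength varying fastest, then gate, then channel) are assumptions made for this example; the spec may settle on a different ordering.

```python
from itertools import product


def column_labels(n_channels, n_wavelengths, n_gates):
    """List (channel, gate, wavelength) for each column of d, 1-based.

    Assumed ordering: wavelength varies fastest, then gate, then channel.
    """
    return list(product(range(1, n_channels + 1),
                        range(1, n_gates + 1),
                        range(1, n_wavelengths + 1)))


# 1 channel, 2 wavelengths, 2 gates (early and late): 4 columns.
labels = column_labels(1, 2, 2)
for column, (channel, gate, wavelength) in enumerate(labels, start=1):
    print("d(:,%d): channel %d, gate %d, wavelength %d"
          % (column, channel, gate, wavelength))

# 10 gates plus the pseudo-CW gate at 2 wavelengths: 22 columns.
n_columns_11_gates = len(column_labels(1, 2, 11))
```

The same helper covers moments if the gate count is replaced by the number of moments stored.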
...
Hi, I have one general comment and a specific comment for time domain data.
1) General: It is not clear to me the exact meaning of the variable .
...
Nov 16, 2012 04:11 PM | Mathieu Coursolle
RE: Shared NIRS Data Format - SNIRF
Hello,
On Mac OS X and iOS, file types are defined by a concept known as UTIs (uniform type identifiers). Basically, they are a collection of metadata, as key-value pairs, that describe the file format.
We think it would be valuable for the community to agree upon such a definition for the .snirf format.
This describes the motivation of UTIs over say just file extensions:
<http://developer.apple.com/library/mac/#documentation/FileManagement/Conceptual/understanding_utis/understand_utis_intro/understand_utis_intro.html>
and this describes the metadata that can be provided:
<http://developer.apple.com/library/mac/documentation/FileManagement/Conceptual/understanding_utis/understand_utis_declare/understand_utis_declare.html#//apple_ref/doc/uid/TP40001319-CH204-SW1>
We need to define some fields:
UTTypeDescription (ex: SNIRF)
UTTypeIdentifier (ex: com.some.identifier)
UTTypeReferenceURL (ex: http://www.some.stable.URL.over.time.com)
UTTypeTagSpecification (ex: snirf)
Notes:
- the most important is the choice of UTTypeIdentifier
- will that web URL be stable over time?
- is there a MIME type for this file format? It could be added in UTTypeTagSpecification if so.
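To make the fields concrete, here is a minimal sketch of what such a declaration could look like, generated with Python's plistlib. The identifier and URL are the placeholders from the list above, not agreed values, and the "public.data" conformance is an assumption added for illustration.

```python
import plistlib

# Sketch of an exported type declaration as it might appear in an Info.plist.
uti_declaration = {
    "UTExportedTypeDeclarations": [
        {
            "UTTypeIdentifier": "com.some.identifier",   # placeholder, to be chosen
            "UTTypeDescription": "SNIRF",
            "UTTypeReferenceURL": "http://www.some.stable.URL.over.time.com",  # placeholder
            "UTTypeConformsTo": ["public.data"],         # assumption for this sketch
            "UTTypeTagSpecification": {
                "public.filename-extension": ["snirf"],
                # A "public.mime-type" entry could be added here if one is defined.
            },
        }
    ]
}

# Serialize to the XML property-list format.
xml_bytes = plistlib.dumps(uti_declaration)
print(xml_bytes.decode("utf-8"))
```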
Thanks,
Mathieu
Nov 16, 2012 08:11 PM | David Boas
RE: Shared NIRS Data Format - SNIRF
Alex,
Thanks for your comments.
1) Time is more general and more informative than storing the data by sample number. When analyzing the data, we need to know the time of each sample. The time variable provides this information and allows for variable timing in the acquisition within a channel and between channels.
2) Your point about referencing optodes to 10-20 coordinates is a good one. This is a topic that needs further consideration. There are already several approaches for handling this that are separate from NIRS time series acquisition. I hope that people more experienced on this topic can share their thoughts.
3) Source power would be stored in meta data.
4) Please share your thoughts on meta data using this forum. Thanks
Originally posted by Alex Cristia:
Hello,
Thanks for getting this discussion started, it will be great to have a standard format. I've a couple clarification questions and comments:
- "The time variable. This provides the acquisition time of the measurement relative to the time origin." I'm guessing this is preferable over sample number because (1) there's no need to look at sampling frequency before comparing to studies; (2) some systems may actually store time, if the sampling frequency is variable/there are errors. Is that so?
- You could add a field SD.Origin which can specify the 10-20 electrode used as reference, since most fNIRS neurocog users will have one. This simple addition would make it much easier to incorporate localization in an eventual meta-analysis.
- Would original source power (e.g., 0.6mW) be stored in metadata? Some systems, particularly those which use multiple distances, can specify different powers for each source.
- It would be useful to have a forum with suggested fields for metadata. I won't put them here, but there are several variables that appear to affect data quality in infant research, so it would be useful if we could keep track of them. If you're interested in starting to collect suggestions, let me know and I'll email you my list.
Alex Cristia
--
__________________________
Scientific staff member MPI-Nijmegen
sites.google.com/site/acrsta
Wundtlaan 1
6525 XD, Nijmegen
Netherlands
__________________________
Nov 20, 2012 03:11 PM | David Boas
RE: Shared NIRS Data Format - SNIRF
We are getting close to finalizing the SNIRF spec.
After more discussion with Ted Huppert, we figured out a good compromise to combine the "standard" and "extended" SNIRF file formats. The extended format used a unique time vector for every channel of data. But this can place an unnecessary bookkeeping burden on the software developer if all channels have a common time vector. The specification has been modified to describe how we can achieve both at the same time without having a "standard" and "extended" version of SNIRF.
Basically, we created blocks of data in the variable "data(idx)" with subfields "data(idx).d", "data(idx).t", and "data(idx).ml". In this way, if the data channels all have a common time vector, then there could simply be a single block of data, i.e. only one element in the data array. If, for example, there are 10 channels of data each with a unique time vector, then there will be 10 blocks of data, i.e. 10 elements in the data array.
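The block idea can be sketched in Python as follows. This is an assumption-laden illustration, not the spec: dictionaries stand in for the elements of the data array, the function name is hypothetical, and the only field of "ml" shown is WavelengthIndex.

```python
def build_data_blocks(channels):
    """Group channels that share an identical time vector into one block.

    Each channel is a dict with keys "ml" (descriptor), "t" (time vector),
    and "d" (one sample per time point). Each returned block has "t", a list
    of "ml" descriptors, and "d" stored as time points x channels.
    """
    blocks = []
    for ch in channels:
        for block in blocks:
            if block["t"] == ch["t"]:
                block["ml"].append(ch["ml"])
                # Append this channel as a new column of the block's d.
                for row, value in zip(block["d"], ch["d"]):
                    row.append(value)
                break
        else:
            # No block with this time vector yet: start a new one.
            blocks.append({
                "t": list(ch["t"]),
                "ml": [ch["ml"]],
                "d": [[value] for value in ch["d"]],
            })
    return blocks


common_t = [0.0, 0.1, 0.2]
channels = [
    {"ml": {"WavelengthIndex": 1}, "t": common_t, "d": [1.0, 2.0, 3.0]},
    {"ml": {"WavelengthIndex": 2}, "t": common_t, "d": [10.0, 20.0, 30.0]},
    {"ml": {"WavelengthIndex": 1}, "t": [0.0, 0.25], "d": [5.0, 6.0]},
]
blocks = build_data_blocks(channels)
print(len(blocks))  # two blocks: one shared time vector, one unique
```

With a single common time vector this collapses to one block, which is the "standard" case; with a unique time vector per channel it yields one block per channel, which is the former "extended" case.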
I believe that this is an important simplification that developers will appreciate in the future.
This had a few residual impacts. First, the one column option for the sim(n).Data has been removed. This leaves the 3 column option which in fact is much more general. It was appropriate to drop the one column option. Second, there is now no need for the optional toffset list.
Finally, there are a few remaining things from the discussions that need to be addressed. I welcome any input or help on these points.
1) Provide examples of different data files.
2) Further discuss meta data.
3) Forum discussion by Mathieu Coursolle and Blaise Frederick:
In the optional variables, the "qform" variable specified a 4x4 matrix to align the NIRS coordinate system to other geometries. Would it be useful to have an additional variable that describes that geometry (ex: MNI, Talairach, anatomical, etc.)? I am thinking of something that may be similar to the NIfTI file format.
That sounds like a good idea; we'll have to think about what geometries would make the most sense ("scanner anatomic" is probably not relevant, but "MNI", "Talairach", maybe "10-20" might be good choices; we'd have to decide how the last would be implemented).
4) Forum Discussion by Alex Cristia
- You could add a field SD.Origin which can specify the 10-20 electrode used as reference, since most fNIRS neurocog users will have one. This simple addition would make it much easier to incorporate localization in an eventual meta-analysis.
Nov 20, 2012 05:11 PM | Alex Cristia
RE: Shared NIRS Data Format - SNIRF
On metadata for infant studies, this is the list we've compiled based on input from 3 labs + literature review trends:
Generally agreed/coded/reported:
- broad infant population type: "standard" (for lack of a better word; this is fullterm, no neurological, language, health problems, etc etc etc); otherwise specify. From 76 articles or theses published in English until June 2012, reporting results from 3557 infants, only 13% of infants were not "standard", so it's unclear how many and which categories we'd like to create. Perhaps it's simpler for now to just write his/her characteristics out.
- state: awake asleep other (ideally, we'd keep more specific track of state, but this is not often done/practical)
- circumference (cm)
- ear-to-ear-vertex (cm)
- ear-to-ear-inion (cm)
- nasion-inion (cm)
Other variables suggested:
Sex: male female (cf. arguments in EEG of sex differences in infancy)
Studies on lateralization, handedness: mother, father, child (left, right, ambi)
For newborns and premature infants: Gestational age at birth (days); Birth (VB, C); Anesthetic (no, name)
Hair: 1-3 quantity, 1-3 thickness, 1-3 darkness (potential problem: cross-lab differences in coding when seeing the same exact picture)
Skin: (no agreement on how to code this, none of the labs really keeping track of this)
Nov 20, 2012 07:11 PM | Mathieu Coursolle
RE: Shared NIRS Data Format - SNIRF
Regarding the qform variable, could it be paired with a qform_type or qform_code variable?
Minimally, I would add support for:
- MNI-ICBM (http://www.bic.mni.mcgill.ca/ServicesAtl...)
- Talairach
- Unknown (Arbitrary coordinates).
It could be defined as an integer:
(ex: 0 = Unknown, 1 = MNI, 2 = Talairach).
NIfTI provides a similar concept with its qform_code (http://nifti.nimh.nih.gov/nifti-1/docume...).
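As a sketch of how a reader might handle such a code, assuming the example numbering above (which is only a proposal, and not the NIfTI numbering):

```python
from enum import IntEnum


class QformCode(IntEnum):
    """Proposed coordinate-system codes (example numbering from this post)."""
    UNKNOWN = 0
    MNI = 1
    TALAIRACH = 2


def parse_qform_code(value):
    """Map a stored integer to a QformCode, treating unrecognized values as UNKNOWN."""
    try:
        return QformCode(value)
    except ValueError:
        return QformCode.UNKNOWN


print(parse_qform_code(1).name)
print(parse_qform_code(7).name)
```

Falling back to UNKNOWN for unrecognized values is a design choice for this sketch; a stricter reader could reject the file instead.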
Thanks,
Mathieu
Apr 2, 2013 04:04 PM | Mathieu Coursolle
RE: Shared NIRS Data Format - SNIRF
Hi,
I'd like to bring up another comment on the proposed snirf format (if not too late...).
In the description of the "ml" structure, there is a note specifying that "source indices generally refer to the optode naming (probe positions)".
What if the optode naming is different from the source/detector indices? It is usually the case with our system.
I guess a potential solution would be to add optional fields to SD? We could then have something like "SrcLabels" and "DetLabels" fields, which could indicate the name/label of the optode to be used.
I think this would avoid a lot of confusion when reviewing the data. As an example, the first optode used could be S4 on the device, not S1.
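A minimal Python sketch of the idea, assuming the proposed "SrcLabels" and "DetLabels" fields are lists ordered by source/detector index and that the indices in "ml" are 1-based (the label values and the helper name are made up for illustration):

```python
# Stand-in for the SD structure with the proposed optional label fields.
SD = {
    "SrcLabels": ["S4", "S7"],   # label of source index 1, 2 (illustrative)
    "DetLabels": ["D2", "D5"],   # label of detector index 1, 2 (illustrative)
}


def channel_name(source_index, detector_index, sd):
    """Build a display name from 1-based source and detector indices."""
    return "%s-%s" % (sd["SrcLabels"][source_index - 1],
                      sd["DetLabels"][detector_index - 1])


print(channel_name(1, 2, SD))
```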
Thank you,
Mathieu
Apr 15, 2013 07:04 PM | Mathieu Coursolle
RE: Shared NIRS Data Format - SNIRF
Hi,
Another comment/suggestion on the file format...
Our system has 'standard' detectors and 'proximity' detectors. The proximity detectors are meant to be used to record the surface physiology with short-separation measurements.
The current definition of 'ml' allows a measurement to be defined by its source/wavelength index and its detector index.
How could we specify here whether the detector index refers to the ith detector or the ith proximity detector? Add a 'detectorType'?
If I could bring up another issue...
Our system also uses a modular concept, where the full system is the combination of multiple smaller 8-channel modules.
This could also apply to systems that could be combined (e.g. using 2 systems to add more measurements).
What if I want to pair a detector from one module/system with the detector of another module/system?
In its most complex form, I would see 'ml' as
- detector module index
- detector type
- detector index
- source module index
- source index
- wavelength index
I know this would add complexity, but I am trying to make sure that no information is lost if this data format is used for export.
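To make the proposal concrete, here is a sketch of such an extended entry as a Python dataclass. The class and field names simply mirror the list above and are hypothetical; none of them exist in the current draft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtendedMeasurement:
    """One 'ml' entry in its most complex proposed form (hypothetical names)."""
    detector_module_index: int
    detector_type: str          # e.g. "standard" or "proximity"
    detector_index: int
    source_module_index: int
    source_index: int
    wavelength_index: int

    def crosses_modules(self):
        """True when the source and detector belong to different modules."""
        return self.detector_module_index != self.source_module_index


entry = ExtendedMeasurement(
    detector_module_index=2, detector_type="proximity", detector_index=1,
    source_module_index=1, source_index=3, wavelength_index=1,
)
print(entry.crosses_modules())
```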
Thank you,
Mathieu
Jul 31, 2013 08:07 PM | David Boas
RE: Shared NIRS Data Format - SNIRF
This is a good point, and it is easily resolved by adding optional fields to SD called SrcLabels and DetLabels.
I have added this to the spec at
https://docs.google.com/document/d/1EKEM...
One concern I have is that this is a string array which might cause some annoyances unless we specify the length of the string rather than having it be open ended. Is it okay to set the length of the string as a max of 10 characters?
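For illustration, a fixed-width encoding along these lines could look like the Python sketch below. The space padding, the ASCII restriction, and the function names are assumptions for this example; only the 10-character maximum comes from the question above.

```python
MAX_LABEL_LENGTH = 10  # proposed maximum label length


def encode_label(label, length=MAX_LABEL_LENGTH):
    """Encode a label as exactly `length` ASCII bytes, padded with spaces."""
    raw = label.encode("ascii")
    if len(raw) > length:
        raise ValueError("label %r exceeds %d characters" % (label, length))
    return raw.ljust(length, b" ")


def decode_label(raw):
    """Recover the label by stripping the trailing space padding."""
    return raw.decode("ascii").rstrip(" ")


stored = encode_label("S4")
print(len(stored), repr(decode_label(stored)))
```

One consequence of space padding is that labels cannot themselves end in a space; padding with a null byte would avoid that at the cost of less readable raw data.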
If anyone sees a major issue with the addition of these optional fields or anything else, please let me know. Note that this SNIRF spec is not yet finalized, but it would be good to finalize it soon.
Thanks,
David
Originally posted by Mathieu Coursolle:
Hi,
I'd like to bring up another comment on the proposed snirf format (if not too late...).
In the description of the "ml" structure, there is a note specifying that "source indices generally refer to the optode naming (probe positions)".
What if the optode naming is different from the source/detector indices? It is usually the case with our system.
I guess a potential solution would be to add optional fields to SD? We could then have something like "SrcLabels" and "DetLabels" fields, which could indicate the name/label of the optode to be used.
I think this would avoid a lot of confusion when reviewing the data. As an example, the first optode used could be S4 on the device, not S1.
Thank you,
Mathieu
Jul 31, 2013 08:07 PM | David Boas
RE: Shared NIRS Data Format - SNIRF
Could detectorType, sourceType, detectorModule and sourceModule be added as optional fields in SD?
But then we need to introduce the list of possible source types and detector types. At present you only provide the example of a short-separation source or detector. But why is that any different from a regular measurement? For analysis purposes, in Homer2, we identify short separation measurements based on the source-detector separation, which we calculate from the optode positions, rather than relying on a prior specification of what is a short separation measurement. If you feel strongly about adding detType or srcType, perhaps we can discuss it further.
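The distance-based approach can be sketched as follows. This is not Homer2 code: the function name, the 15 mm threshold, and the assumption that positions are 3D coordinates in millimeters are all made up for this illustration.

```python
import math


def short_separation_pairs(src_pos, det_pos, pairs, threshold_mm=15.0):
    """Return the (source, detector) pairs closer than the threshold.

    src_pos and det_pos are lists of (x, y, z) positions in mm; pairs use
    1-based source and detector indices. The threshold is an assumed value.
    """
    return [
        (s, d) for s, d in pairs
        if math.dist(src_pos[s - 1], det_pos[d - 1]) < threshold_mm
    ]


src_pos = [(0.0, 0.0, 0.0)]
det_pos = [(8.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
pairs = [(1, 1), (1, 2)]
short_pairs = short_separation_pairs(src_pos, det_pos, pairs)
print(short_pairs)
```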
As for the module information, it seems that it could be conveyed in the SrcLabels and DetLabels fields that we have added to SD as optional fields. I guess that srcType and detType could also be encoded in SrcLabels and DetLabels. That way the encoding specification is left to the individual and doesn't need to be standardized, since it is not clear that it is of general utility. How does that sound?
David Boas
Originally posted by Mathieu Coursolle:
Hi,
Another comment/suggestion on the file format...
Our system has 'standard' detectors and 'proximity' detectors. The proximity detectors are meant to be used to record the surface physiology with short-separation measurements.
The current definition of 'ml' allows a measurement to be defined using its source/wavelength index and its detector index.
How could we specify here whether the detector index refers to the ith detector or the ith proximity detector? Add a 'detectorType'?
If I could bring up another issue...
Our system also uses a modular concept, where the full system is the combination of multiple smaller 8-channel modules.
This could also apply to systems that can be combined (e.g. using 2 systems to add more measurements).
What if I want to pair a detector from a module/system to the detector of another module/system?
In its most complex form, I would see 'ml' as
- detector module index
- detector type
- detector index
- source module index
- source index
- wavelength index
I know this would add complexity, but I am trying to make sure that no information is lost if this data format is used for export.
Thank you,
Mathieu
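To make the proposed 'ml' layout concrete, here is a minimal sketch of one entry as a Python data structure. The class name, field names, and example values are hypothetical and are not part of the draft specification. The module and type fields are optional so that simpler systems can omit them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MeasEntry:
    """One 'ml' entry with the extra fields proposed above (all hypothetical)."""
    source_index: int
    detector_index: int
    wavelength_index: int
    source_module_index: Optional[int] = None
    detector_module_index: Optional[int] = None
    detector_type: Optional[str] = None  # e.g. "standard" or "proximity"


ml = [
    # Standard detector 1 and proximity detector 1 share the same index.
    MeasEntry(1, 1, 1, source_module_index=1,
              detector_module_index=1, detector_type="standard"),
    MeasEntry(1, 1, 1, source_module_index=1,
              detector_module_index=1, detector_type="proximity"),
    # A source on module 1 paired with a detector on module 2.
    MeasEntry(1, 3, 2, source_module_index=1,
              detector_module_index=2, detector_type="standard"),
]


def detector_key(entry):
    """A detector is only unique once module and type are included."""
    return (entry.detector_module_index, entry.detector_type,
            entry.detector_index)


unique_detectors = {detector_key(e) for e in ml}
```

The first two entries show the ambiguity raised above: without a detector type, both would point to "detector 1".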
Aug 1, 2013 09:08 PM | David Boas
RE: Shared NIRS Data Format - SNIRF
I received some good input from Randall Barbour by email, which he
gave me the okay to share here.
Hi David:
Some general thoughts. Typically standards for data sharing serve two principal purposes: a basis for cross-platform data review, and to facilitate the ability to reproduce experimental conditions. While perhaps already considered, particulars of the source-detector configuration would benefit from platform-dependent information that additionally identifies source-detector power/amplification details. For instance, all of our imagers support the ability to vary source amplitude as well as detector gain. Also adjustable is the timing of the source illumination sequence. For instance, large area investigations can be accomplished using either a single-point time multiplexing scheme or simultaneous multipoint, time multiplexing. In the limit, there are a large number of ways illumination-detection can be accomplished even for a fixed sensor layout. The way we deal with this is to have the user define the source-detector configuration that pertains to each sub-array of the overall sensing geometry. Clearly, there would be a need to access this information, without which the data would not be interpretable.
Still another consideration, looking forward, concerns elements that pertain to data quality assurance. For instance, we routinely examine our gain setting values prior to collecting data as a basis for determining the fidelity of optode contact. It might be helpful to have the ability to include a metric that supports such measures as an objective basis for appreciating variations in head-gear setup.
I hope you find this helpful. Please get back to me if you need any additional clarifications.
Best,
Randall
My response:
Dear Randall,
Thanks for the input. Your points are clear. We should modify the spec to allow for information about source power and detector gain. I believe that the spec already handles variable illumination sequences. I'd be happy to review this with you to make sure I am not missing something.
So, the question is how best to modify the spec to handle source power and detector gain. Unlike the optode positions, these are not necessarily fixed parameters, since the power and gain can vary depending on the specific source-detector pair during temporal multiplexing. As such, it seems best to put this information into the measurement list 'ml', as a new subfield data(idx).ml(chIdx).XXX. Every subfield in 'ml' is currently an index into the SD structure. So, we could either add
ml(chIdx).srcPower and ml(chIdx).detGain
or
ml(chIdx).srcPowerIdx and ml(chIdx).detGainIdx
where these indices would refer to new fields
SD.SrcPower and SD.DetGain
Thanks,
David
So, I am now about to incorporate this into the spec. I am leaning towards using
ml(chIdx).srcPower and ml(chIdx).detGain
as this will be much more efficient than the latter option. Indexing srcPower and detGain could become quite annoying, since both power and gain can easily take on a continuum of values, whereas all the fields currently indexed in the SD structure very clearly only take on discrete values.
I will make this addition to the ml structure for now, but encourage comments in support or against it.
Thanks,
David
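As a rough illustration of the two options discussed above, here is a sketch in Python using dictionaries in place of the MATLAB-style structs. The channel values are invented, no units are implied because the thread does not specify any, and the helper names build_lookup and resolve are hypothetical.

```python
# Hypothetical per-channel settings; values are invented for illustration.
channels = [
    {"src": 1, "det": 1, "srcPower": 0.80, "detGain": 10.0},
    {"src": 1, "det": 2, "srcPower": 0.80, "detGain": 12.5},
    {"src": 2, "det": 2, "srcPower": 0.95, "detGain": 12.5},
]

# Option 1: store the values directly in each 'ml' entry.
ml_direct = [dict(ch) for ch in channels]


# Option 2: store 1-based indices into lookup tables kept in SD.
def build_lookup(values):
    """Return a sorted table of distinct values and a value -> index map."""
    table = sorted(set(values))
    index_of = {v: i + 1 for i, v in enumerate(table)}
    return table, index_of


sd_src_power, power_idx = build_lookup(ch["srcPower"] for ch in channels)
sd_det_gain, gain_idx = build_lookup(ch["detGain"] for ch in channels)

ml_indexed = [
    {"src": ch["src"], "det": ch["det"],
     "srcPowerIdx": power_idx[ch["srcPower"]],
     "detGainIdx": gain_idx[ch["detGain"]]}
    for ch in channels
]


def resolve(entry):
    """Recover (srcPower, detGain) from an indexed entry."""
    return (sd_src_power[entry["srcPowerIdx"] - 1],
            sd_det_gain[entry["detGainIdx"] - 1])
```

Both options carry the same information. The indexed form needs an extra table per quantity, and with continuously varying power or gain each table could grow to nearly one value per channel, which is the inconvenience noted above.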