24 Aug 2018 |
emdupre | Hi everyone ! | 15:43:34 |
emdupre | mainakjas and I just had a conversation on the shuttle over about the distinction between versioning datasets and derivatives as it relates to annotations. The tricky piece is that, intuitively, I would have said annotations are not derivatives as they don't involve transformations of the data | 15:46:08 |
emdupre | But effectively what we came to agree is that derivatives have to exist in relation to the dataset as it was originally released. Otherwise, there's an almost infinite space of information you could additionally extract without transforming the data in any way (e.g., running Neuroscout with different parameter sets, having different humans annotate the data) | 15:47:21 |
emdupre | I know that teonbrooks agrees with my gut-reaction (which I'm slowly getting over !), but I'm wondering if this line of reasoning is convincing to anyone else ? | 15:48:09 |
@teonbrooks:matrix.org | hi emdupre | 15:51:27 |
Chris Gorgolewski | +1 for annotations as derivatives | 15:52:01 |
@teonbrooks:matrix.org | I'm +0 on whether they are derivatives. I definitely think that they need to exist | 15:52:53 |
@teonbrooks:matrix.org | and separate from the events | 15:52:59 |
Chris Gorgolewski | yeah that's what I meant | 15:53:19 |
Chris Gorgolewski | let's not split hairs on definitions | 15:53:30 |
@teonbrooks:matrix.org | ;) | 15:53:38 |
@teonbrooks:matrix.org | but then the question lies as to where the file lives, does it live at the top level with the raw files or down in the derivatives folder | 15:54:04 |
@teonbrooks:matrix.org | that is the open question i have. i don't think they are informative at the derivatives folder level because i feel that without the original file to act on, it doesn't give useful information to act on | 15:55:03 |
Chris Gorgolewski | considering the concept of independent derived datasets that could be shared and published, I think it should be allowed | 15:55:23 |
dorahermes | should there be an optional _annot.json that states who/which algorithm did the annotation? | 15:56:11 |
dorahermes | and which data it is meant for? | 15:56:25 |
Chris Gorgolewski | I think it's worthwhile thinking of metadata fields for annotations | 15:56:48 |
Chris Gorgolewski | but in a pragmatic way - what things would applications filter on | 15:57:10 |
emdupre | In terms of needing the annotations to act on the data, I think that still would need to be a derivative, unfortunately (though I completely see your point, Teon !) | 15:57:54 |
@teonbrooks:matrix.org | i think the metadata file should have the raw file that it is applied to, like dorahermes | 15:57:57 |
@teonbrooks:matrix.org | just mentioned | 15:58:08 |
emdupre | effectively it's just not information that the investigator released, and it's subsequently extracted from the data. and yes your pipeline / process needs it, but there could theoretically be others that don't, so saying it's raw data is tricky | 15:59:01 |
@teonbrooks:matrix.org | so if you wanted to share, say, the original file but not the intermediate files (say epochs or evoked etc.), you would share the data at the raw file level and the derivatives folder with just the _annotations.tsv and its corresponding json | 16:02:03 |
emdupre | so that's a bigger discussion @filo had brought up about the relationship between BIDS Raw and BIDS Derivatives and whether or not derivatives could sit next to raw files | 16:04:45 |
emdupre | Good place to continue that discussion is in the BEP003 chat ! | 16:05:24 |
@teonbrooks:matrix.org | 👍 | 16:10:11 |
emdupre | But WDYT about that as a working definition for derivatives vs raw in the annotation space ? | 16:11:19 |
@teonbrooks:matrix.org | sorry but what would be the working definition you mention? | 16:14:30 |
emdupre |
But effectively what we came to agree is that derivatives have to exist in relation to the dataset as it was originally released. Otherwise, there's an almost infinite space of information you could additionally extract without transforming the data in any way (e.g., running Neuroscout with different parameter sets, having different humans annotate the data)
| 16:14:50 |
@teonbrooks:matrix.org | oh yeah, i agree with that | 16:15:18 |
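
The layout discussed at 15:56-16:02 (an `_annotations.tsv` in the derivatives folder, plus a JSON sidecar stating who or which algorithm annotated and which raw file the annotations apply to) could look roughly like the sketch below. This is only an illustration: the `annotations` pipeline folder, the `onset`/`duration`/`label` columns, the `Annotator` and `Source` fields, and the function name are all assumptions made for the example, not names agreed in this chat or taken from the BIDS specification.

```python
import csv
import json
import tempfile
from pathlib import Path


def write_annotation_derivative(root, raw_relpath, annotator, rows):
    """Write an annotations TSV and its JSON sidecar under root/derivatives.

    All folder, column, and field names are hypothetical illustrations.
    """
    raw_relpath = Path(raw_relpath)
    # Mirror the raw file's subfolder inside a (hypothetical) pipeline folder.
    out_dir = Path(root) / "derivatives" / "annotations" / raw_relpath.parent
    out_dir.mkdir(parents=True, exist_ok=True)

    # Swap the raw file's final suffix (e.g. "_meg") for "_annotations".
    stem = raw_relpath.stem.rsplit("_", 1)[0]
    tsv_path = out_dir / f"{stem}_annotations.tsv"
    json_path = out_dir / f"{stem}_annotations.json"

    with tsv_path.open("w", newline="") as f:
        writer = csv.DictWriter(
            f,
            fieldnames=["onset", "duration", "label"],
            delimiter="\t",
            lineterminator="\n",
        )
        writer.writeheader()
        writer.writerows(rows)

    # Sidecar: who or what annotated, and which raw file this applies to.
    sidecar = {"Annotator": annotator, "Source": raw_relpath.as_posix()}
    json_path.write_text(json.dumps(sidecar, indent=2))
    return tsv_path, json_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        tsv, sidecar = write_annotation_derivative(
            root,
            "sub-01/meg/sub-01_task-rest_meg.fif",
            annotator="human:rater01",
            rows=[{"onset": 1.5, "duration": 0.5, "label": "blink"}],
        )
        print(tsv.relative_to(root))
        print(sidecar.read_text())
```

Keeping the raw file's relative path in the sidecar is one way to address the concern that annotations are not informative without the file they act on; a second annotator or parameter set would simply get its own pipeline folder next to this one.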