Package: dirdf
Extracts Metadata from Directory and File Names
Metadata in file names
> dir("examples/dataset_1/")
[1] "2013-06-26_BRAFWTNEG_Plasmid-Cellline-100_A01.csv"
[2] "2013-06-26_BRAFWTNEG_Plasmid-Cellline-100_A02.csv"
[3] "2014-02-26_BRAFWTNEG_FFPEDNA-CRC-1-41_D08.csv"
[4] "2014-03-05_BRAFWTNEG_FFPEDNA-CRC-REPEAT_H03.csv"
[5] "2016-04-01_BRAFWTNEG_FFPEDNA-CRC-1-41_E12.csv"
> library("dirdf")
> dirdf("examples/dataset_1/", template="date_assay_experiment_well.ext")
date assay experiment well ext pathname
1 2013-06-26 BRAFWTNEG Plasmid-Cellline-100 A01 csv 2013-06-26_BRAFWTNEG_Plasmid-Cellline-100_A01.csv
2 2013-06-26 BRAFWTNEG Plasmid-Cellline-100 A02 csv 2013-06-26_BRAFWTNEG_Plasmid-Cellline-100_A02.csv
3 2014-02-26 BRAFWTNEG FFPEDNA-CRC-1-41 D08 csv 2014-02-26_BRAFWTNEG_FFPEDNA-CRC-1-41_D08.csv
4 2014-03-05 BRAFWTNEG FFPEDNA-CRC-REPEAT H03 csv 2014-03-05_BRAFWTNEG_FFPEDNA-CRC-REPEAT_H03.csv
5 2016-04-01 BRAFWTNEG FFPEDNA-CRC-1-41 E12 csv 2016-04-01_BRAFWTNEG_FFPEDNA-CRC-1-41_E12.csv
Inconsistent file names
> dir("examples/dataset_2/")
[1] "2011-12-16_OTHER_FFPEDNA-CRC-1-41_D08.csv"
[2] "2013-06-26_OTHER_Plasmid-Cellline-100_B02.csv"
[3] "2014-03-05_OTHER_FFPEDNA-CRC-REPEAT_platefile.csv"
[4] "2014-07-06_OTHER_Plasmid-Cellline-100_B01.csv"
[5] "2016-01-11_OTHER_FFPEDNA-CRC-2-41.csv"
> dirdf("examples/dataset_2/", template="date_assay_experiment_well.ext")
Error in dirdf_parse(pathnames, colnames = colnames, regexp = regexp, :
Unexpected path(s) found:
2016-01-11_OTHER_FFPEDNA-CRC-2-41.csv
> dirdf("examples/dataset_2/", template="date_assay_experiment_well?.ext")
date assay experiment well ext pathname
1 2011-12-16 OTHER FFPEDNA-CRC-1-41 D08 csv 2011-12-16_OTHER_FFPEDNA-CRC-1-41_D08.csv
2 2013-06-26 OTHER Plasmid-Cellline-100 B02 csv 2013-06-26_OTHER_Plasmid-Cellline-100_B02.csv
3 2014-03-05 OTHER FFPEDNA-CRC-REPEAT H03 csv 2014-03-05_OTHER_FFPEDNA-CRC-REPEAT_H03.csv
4 2014-07-06 OTHER Plasmid-Cellline-100 B01 csv 2014-07-06_OTHER_Plasmid-Cellline-100_B01.csv
5 2016-01-11 OTHER FFPEDNA-CRC-2-41 <NA> csv 2016-01-11_OTHER_FFPEDNA-CRC-2-41.csv
Metadata in directory and path names
> dir("examples/", recursive=TRUE)
[1] "LabA,2016/2013-06-26_BRAFWTNEG_Plasmid-Cellline-100_A01.csv"
[2] "LabA,2016/2013-06-26_BRAFWTNEG_Plasmid-Cellline-100_A02.csv"
[3] "LabA,2016/2014-02-26_BRAFWTNEG_FFPEDNA-CRC-1-41_D08.csv"
[4] "LabA,2016/2014-03-05_BRAFWTNEG_FFPEDNA-CRC-REPEAT_H03.csv"
[5] "LabA,2016/2016-04-01_BRAFWTNEG_FFPEDNA-CRC-1-41_E12.csv"
[6] "LabB,2015/2011-12-16_OTHER_FFPEDNA-CRC-1-41_D08.csv"
[7] "LabB,2015/2013-06-26_OTHER_Plasmid-Cellline-100_B02.csv"
[8] "LabB,2015/2014-03-05_OTHER_FFPEDNA-CRC-REPEAT_H03.csv"
[9] "LabB,2015/2014-07-06_OTHER_Plasmid-Cellline-100_B01.csv"
[10] "LabB,2015/2016-01-11_OTHER_FFPEDNA-CRC-2-41.csv"
> dirdf("examples/", template="lab,year/date_assay_experiment_well?.ext")
lab year date assay experiment well ext pathname
1 LabA 2016 2013-06-26 BRAFWTNEG Plasmid-Cellline-100 A01 csv LabA,2016/2013-06-26_BRAFWTNEG_Plasmid-Cellline-100_A01.csv
2 LabA 2016 2013-06-26 BRAFWTNEG Plasmid-Cellline-100 A02 csv LabA,2016/2013-06-26_BRAFWTNEG_Plasmid-Cellline-100_A02.csv
3 LabA 2016 2014-02-26 BRAFWTNEG FFPEDNA-CRC-1-41 D08 csv LabA,2016/2014-02-26_BRAFWTNEG_FFPEDNA-CRC-1-41_D08.csv
4 LabA 2016 2014-03-05 BRAFWTNEG FFPEDNA-CRC-REPEAT H03 csv LabA,2016/2014-03-05_BRAFWTNEG_FFPEDNA-CRC-REPEAT_H03.csv
5 LabA 2016 2016-04-01 BRAFWTNEG FFPEDNA-CRC-1-41 E12 csv LabA,2016/2016-04-01_BRAFWTNEG_FFPEDNA-CRC-1-41_E12.csv
6 LabB 2015 2011-12-16 OTHER FFPEDNA-CRC-1-41 D08 csv LabB,2015/2011-12-16_OTHER_FFPEDNA-CRC-1-41_D08.csv
7 LabB 2015 2013-06-26 OTHER Plasmid-Cellline-100 B02 csv LabB,2015/2013-06-26_OTHER_Plasmid-Cellline-100_B02.csv
8 LabB 2015 2014-03-05 OTHER FFPEDNA-CRC-REPEAT H03 csv LabB,2015/2014-03-05_OTHER_FFPEDNA-CRC-REPEAT_H03.csv
9 LabB 2015 2014-07-06 OTHER Plasmid-Cellline-100 B01 csv LabB,2015/2014-07-06_OTHER_Plasmid-Cellline-100_B01.csv
10 LabB 2015 2016-01-11 OTHER FFPEDNA-CRC-2-41 <NA> csv LabB,2015/2016-01-11_OTHER_FFPEDNA-CRC-2-41.csv