Package: dirdf

Extracts Metadata from Directory and File Names

Metadata in file names

> dir("examples/dataset_1/")
[1] "2013-06-26_BRAFWTNEG_Plasmid-Cellline-100_A01.csv"
[2] "2013-06-26_BRAFWTNEG_Plasmid-Cellline-100_A02.csv"
[3] "2014-02-26_BRAFWTNEG_FFPEDNA-CRC-1-41_D08.csv"
[4] "2014-03-05_BRAFWTNEG_FFPEDNA-CRC-REPEAT_H03.csv"
[5] "2016-04-01_BRAFWTNEG_FFPEDNA-CRC-1-41_E12.csv"

> library("dirdf")
> dirdf("examples/dataset_1/", template="date_assay_experiment_well.ext")
        date     assay           experiment well ext                                          pathname
1 2013-06-26 BRAFWTNEG Plasmid-Cellline-100  A01 csv 2013-06-26_BRAFWTNEG_Plasmid-Cellline-100_A01.csv
2 2013-06-26 BRAFWTNEG Plasmid-Cellline-100  A02 csv 2013-06-26_BRAFWTNEG_Plasmid-Cellline-100_A02.csv
3 2014-02-26 BRAFWTNEG     FFPEDNA-CRC-1-41  D08 csv     2014-02-26_BRAFWTNEG_FFPEDNA-CRC-1-41_D08.csv
4 2014-03-05 BRAFWTNEG   FFPEDNA-CRC-REPEAT  H03 csv   2014-03-05_BRAFWTNEG_FFPEDNA-CRC-REPEAT_H03.csv
5 2016-04-01 BRAFWTNEG     FFPEDNA-CRC-1-41  E12 csv     2016-04-01_BRAFWTNEG_FFPEDNA-CRC-1-41_E12.csv
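
For this naming scheme, the effect of the template can be approximated in base R. The following is a minimal sketch under stated assumptions, not dirdf's implementation: `parse_names()` is a hypothetical helper, and it assumes that fields are separated by `_`, that the extension follows the last `.`, and that no field contains either separator.

```r
# Hypothetical helper (not part of dirdf): parse names shaped like
# "date_assay_experiment_well.ext" into a data frame using base R only.
parse_names <- function(filenames) {
  # Four "_"-separated fields followed by a "."-separated extension
  pattern <- "^([^_]+)_([^_]+)_([^_]+)_([^_.]+)[.]([^.]+)$"
  m <- regmatches(filenames, regexec(pattern, filenames))

  # Names that do not match the pattern yield an empty match
  bad <- lengths(m) == 0
  if (any(bad)) {
    stop("Unexpected file name(s): ", paste(filenames[bad], collapse = ", "))
  }

  # Drop the full match (first element) and keep the captured fields
  fields <- do.call(rbind, lapply(m, function(x) x[-1]))
  df <- as.data.frame(fields, stringsAsFactors = FALSE)
  names(df) <- c("date", "assay", "experiment", "well", "ext")
  df$pathname <- filenames
  df
}

parse_names(c("2013-06-26_BRAFWTNEG_Plasmid-Cellline-100_A01.csv",
              "2014-02-26_BRAFWTNEG_FFPEDNA-CRC-1-41_D08.csv"))
```

Like the strict template above, this sketch stops with an error when a file name has too few or too many fields.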

Inconsistent file names

> dir("examples/dataset_2/")
[1] "2011-12-16_OTHER_FFPEDNA-CRC-1-41_D08.csv"
[2] "2013-06-26_OTHER_Plasmid-Cellline-100_B02.csv"
[3] "2014-03-05_OTHER_FFPEDNA-CRC-REPEAT_H03.csv"
[4] "2014-07-06_OTHER_Plasmid-Cellline-100_B01.csv"
[5] "2016-01-11_OTHER_FFPEDNA-CRC-2-41.csv"

> dirdf("examples/dataset_2/", template="date_assay_experiment_well.ext")
Error in dirdf_parse(pathnames, colnames = colnames, regexp = regexp,  :
  Unexpected path(s) found:
2016-01-11_OTHER_FFPEDNA-CRC-2-41.csv

> dirdf("examples/dataset_2/", template="date_assay_experiment_well?.ext")
        date assay           experiment well ext                                      pathname
1 2011-12-16 OTHER     FFPEDNA-CRC-1-41  D08 csv     2011-12-16_OTHER_FFPEDNA-CRC-1-41_D08.csv
2 2013-06-26 OTHER Plasmid-Cellline-100  B02 csv 2013-06-26_OTHER_Plasmid-Cellline-100_B02.csv
3 2014-03-05 OTHER   FFPEDNA-CRC-REPEAT  H03 csv   2014-03-05_OTHER_FFPEDNA-CRC-REPEAT_H03.csv
4 2014-07-06 OTHER Plasmid-Cellline-100  B01 csv 2014-07-06_OTHER_Plasmid-Cellline-100_B01.csv
5 2016-01-11 OTHER     FFPEDNA-CRC-2-41 <NA> csv         2016-01-11_OTHER_FFPEDNA-CRC-2-41.csv
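
As the output shows, the trailing `?` makes the `well` field optional, and a missing value is reported as `<NA>`. The same idea can be sketched in base R. This is an illustration under assumptions rather than how dirdf works internally: `parse_optional_well()` is a hypothetical helper, and it assumes that only the last field may be absent.

```r
# Hypothetical helper (not part of dirdf): like "date_assay_experiment_well?.ext",
# where the last "_"-separated field may be missing.
parse_optional_well <- function(filenames) {
  ext  <- tools::file_ext(filenames)            # text after the last "."
  stem <- tools::file_path_sans_ext(filenames)  # name without the extension
  parts <- strsplit(stem, "_", fixed = TRUE)

  # Accept three fields (well missing) or four fields (well present)
  n <- lengths(parts)
  bad <- n < 3 | n > 4
  if (any(bad)) {
    stop("Unexpected file name(s): ", paste(filenames[bad], collapse = ", "))
  }

  # Pad three-field names with NA so every row has four fields
  fields <- t(vapply(parts, function(p) c(p, NA_character_)[1:4], character(4)))
  df <- as.data.frame(fields, stringsAsFactors = FALSE)
  names(df) <- c("date", "assay", "experiment", "well")
  df$ext <- ext
  df$pathname <- filenames
  df
}

parse_optional_well(c("2011-12-16_OTHER_FFPEDNA-CRC-1-41_D08.csv",
                      "2016-01-11_OTHER_FFPEDNA-CRC-2-41.csv"))
```

Padding with `NA_character_` keeps the column a character vector, so rows with and without a well share one data frame.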

Metadata in directory and path names

> dir("examples/", recursive=TRUE)
 [1] "LabA,2016/2013-06-26_BRAFWTNEG_Plasmid-Cellline-100_A01.csv"
 [2] "LabA,2016/2013-06-26_BRAFWTNEG_Plasmid-Cellline-100_A02.csv"
 [3] "LabA,2016/2014-02-26_BRAFWTNEG_FFPEDNA-CRC-1-41_D08.csv"
 [4] "LabA,2016/2014-03-05_BRAFWTNEG_FFPEDNA-CRC-REPEAT_H03.csv"
 [5] "LabA,2016/2016-04-01_BRAFWTNEG_FFPEDNA-CRC-1-41_E12.csv"
 [6] "LabB,2015/2011-12-16_OTHER_FFPEDNA-CRC-1-41_D08.csv"
 [7] "LabB,2015/2013-06-26_OTHER_Plasmid-Cellline-100_B02.csv"
 [8] "LabB,2015/2014-03-05_OTHER_FFPEDNA-CRC-REPEAT_H03.csv"
 [9] "LabB,2015/2014-07-06_OTHER_Plasmid-Cellline-100_B01.csv"
[10] "LabB,2015/2016-01-11_OTHER_FFPEDNA-CRC-2-41.csv"

> dirdf("examples/", template="lab,year/date_assay_experiment_well?.ext")
    lab year       date     assay           experiment well ext                                                    pathname
1  LabA 2016 2013-06-26 BRAFWTNEG Plasmid-Cellline-100  A01 csv LabA,2016/2013-06-26_BRAFWTNEG_Plasmid-Cellline-100_A01.csv
2  LabA 2016 2013-06-26 BRAFWTNEG Plasmid-Cellline-100  A02 csv LabA,2016/2013-06-26_BRAFWTNEG_Plasmid-Cellline-100_A02.csv
3  LabA 2016 2014-02-26 BRAFWTNEG     FFPEDNA-CRC-1-41  D08 csv     LabA,2016/2014-02-26_BRAFWTNEG_FFPEDNA-CRC-1-41_D08.csv
4  LabA 2016 2014-03-05 BRAFWTNEG   FFPEDNA-CRC-REPEAT  H03 csv   LabA,2016/2014-03-05_BRAFWTNEG_FFPEDNA-CRC-REPEAT_H03.csv
5  LabA 2016 2016-04-01 BRAFWTNEG     FFPEDNA-CRC-1-41  E12 csv     LabA,2016/2016-04-01_BRAFWTNEG_FFPEDNA-CRC-1-41_E12.csv
6  LabB 2015 2011-12-16     OTHER     FFPEDNA-CRC-1-41  D08 csv         LabB,2015/2011-12-16_OTHER_FFPEDNA-CRC-1-41_D08.csv
7  LabB 2015 2013-06-26     OTHER Plasmid-Cellline-100  B02 csv     LabB,2015/2013-06-26_OTHER_Plasmid-Cellline-100_B02.csv
8  LabB 2015 2014-03-05     OTHER   FFPEDNA-CRC-REPEAT  H03 csv       LabB,2015/2014-03-05_OTHER_FFPEDNA-CRC-REPEAT_H03.csv
9  LabB 2015 2014-07-06     OTHER Plasmid-Cellline-100  B01 csv     LabB,2015/2014-07-06_OTHER_Plasmid-Cellline-100_B01.csv
10 LabB 2015 2016-01-11     OTHER     FFPEDNA-CRC-2-41 <NA> csv    LabB,2015/2016-01-11_OTHER_FFPEDNA-CRC-2-41.csv