PREPARE THE FOLDER STRUCTURE FOR THE RUN
To run the statistics analysis you need 6 pipeline runs (CB4/CC4/CF*/CS*/Cb*), each containing folders F1 to F7 : 3 replicates of your 'condition' and 3 replicates of your 'control'. Place all 6 folders under the same TOP FOLDER . You can move them there with the mv command.
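A minimal sketch of the move step, run in a throwaway sandbox so that no real data is touched. The folder names are the ones used in the example on this page. The variables `sandbox` and `top` are hypothetical names introduced for illustration only; in a real run you would use your own pipeline output folders and your own top folder.

```shell
#!/bin/sh
# Sketch only: works in a temporary sandbox, not on real pipeline output.
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1

# Stand-ins for six finished pipeline run folders (example names).
mkdir Analysis_INSERT_EB5 Analysis_INSERT_EB6 Analysis_INSERT_EB7 \
      Analysis_WT_Sp5 Analysis_WT_Sp6 Analysis_WT_Sp15

# Create the top folder and move all six run folders under it.
top="$sandbox/Promoter_capture"
mkdir "$top"
mv Analysis_INSERT_EB5 Analysis_INSERT_EB6 Analysis_INSERT_EB7 \
   Analysis_WT_Sp5 Analysis_WT_Sp6 Analysis_WT_Sp15 "$top"/

# List what ended up under the top folder.
ls "$top"
```

mv only renames within the same file system, so it is fast even for large run folders; across file systems it copies and then deletes.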
All the output data is in the same folder, located in /t1-data/usr/....../Promoter_capture

Promoter_capture
|
| CONDITION-samples (3 replicates) - this condition is called INSERT
|
|-- Analysis_INSERT_EB5
|   `-- F6_greenGraphs_combined_..._CC4
|       |-INSERT_EB5_Mitof.gff
|       |-INSERT_EB5_SOX2.gff
|       `-INSERT_EB5_nanog.gff
|
|-- Analysis_INSERT_EB6
|   `-- F6_greenGraphs_combined_..._CC4
|-- Analysis_INSERT_EB7
|   `-- F6_greenGraphs_combined_..._CC4
|
| CONTROL-samples (3 replicates) - this control is called WT
|
|-- Analysis_WT_Sp5
|   `-- F6_greenGraphs_combined_..._CC4
|-- Analysis_WT_Sp6
|   `-- F6_greenGraphs_combined_..._CC4
`-- Analysis_WT_Sp15
    `-- F6_greenGraphs_combined_..._CC4
The condition samples (above) have the word INSERT in the folder name, and the control samples have the word WT in the folder name.
You can use any names you want for your condition and control folders, as long as 3 folders carry the condition name and 3 folders carry the control name.
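The naming rule above can be checked with a short sketch like the following. It counts the folders whose name contains the condition word and the control word. The sandbox folder, the function `count_dirs` and the variable names are hypothetical and only serve the illustration; INSERT and WT are the example words from this page.

```shell
#!/bin/sh
# Sketch only: a sandbox top folder filled with the example folder names.
top=$(mktemp -d)
for name in Analysis_INSERT_EB5 Analysis_INSERT_EB6 Analysis_INSERT_EB7 \
            Analysis_WT_Sp5 Analysis_WT_Sp6 Analysis_WT_Sp15; do
    mkdir "$top/$name"
done

condition=INSERT   # word shared by the three condition folders
control=WT         # word shared by the three control folders

# Count the directories directly under $1 whose name contains the word $2.
count_dirs() {
    n=0
    for d in "$1"/*"$2"*/; do
        [ -d "$d" ] && n=$((n + 1))
    done
    echo "$n"
}

n_condition=$(count_dirs "$top" "$condition")
n_control=$(count_dirs "$top" "$control")

echo "$condition folders: $n_condition"
echo "$control folders: $n_control"
```

Both counts should be 3 before you start the statistics run.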
If your folder structure does not look like this, change the folder names :
How to change folder names ?
Go to your top folder (folder /t1-data/usr/....../Promoter_capture in the above example), and run the tester command :
/t1-data/data/hugheslab/jelenatools/CC/statistics/oligolister.sh
You should see each of your OLIGOs once for each of your 6 samples ! Also check the file sizes : a file may be empty (size 0).
Example output :

Analysis_INSERT_EB5
File size  OLIGOfile name
539K  COMBINED_CC4_nprl3.gff
346K  COMBINED_CC4_mpg.gff
Analysis_INSERT_EB6
File size  OLIGOfile name
846K  COMBINED_CC4_nprl3.gff
607K  COMBINED_CC4_mpg.gff
... et cetera ...
Analysis_WT_Sp16
File size  OLIGOfile name
482K  COMBINED_CC4_nprl3.gff
334K  COMBINED_CC4_mpg.gff
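Besides reading the size column by eye, empty oligo files can be searched for directly. The sketch below is an independent check with the standard find command, not part of oligolister.sh. The sandbox and the subfolder name F6_placeholder are hypothetical stand-ins for a real run folder.

```shell
#!/bin/sh
# Sketch only: one sandbox sample folder with one normal and one empty file.
top=$(mktemp -d)
sample="$top/Analysis_INSERT_EB5/F6_placeholder"   # hypothetical folder name
mkdir -p "$sample"
printf 'dummy line\n' > "$sample/COMBINED_CC4_nprl3.gff"
: > "$sample/COMBINED_CC4_mpg.gff"                 # deliberately empty

# Find every .gff file under the top folder that has size 0.
empty=$(find "$top" -type f -name '*.gff' -size 0)

if [ -n "$empty" ]; then
    echo "Empty oligo files :"
    echo "$empty"
else
    echo "No empty oligo files found"
fi
```

In a real run you would point find at your own top folder instead of the sandbox.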
If you have VERY MANY folders in there (more than the 6 samples for this statistics run), you can add to the command which "folder series" you want to list :
/t1-data/data/hugheslab/jelenatools/CC/statistics/oligolister.sh INSERT
/t1-data/data/hugheslab/jelenatools/CC/statistics/oligolister.sh WT
/t1-data/data/hugheslab/jelenatools/CC/statistics/statisticsRunner.sh \
  --ccversion CC4 \
  --genome mm9 \
  --name outputFolder_statisticsAnalysis \
  --pf /public/username/WT_INSERT_analysis \
  --condition INSERT \
  --control WT \
  --path /t1-data/usr/....../Promoter_capture \
  --folders Analysis_INSERT_EB5,Analysis_INSERT_EB6,Analysis_INSERT_EB7,Analysis_WT_Sp5,Analysis_WT_Sp6,Analysis_WT_Sp15 \
  --oligos nprl3,mpg
( --ccversion can be any of CB4/CC4/CF*/CS*/Cb* )
Tips and helpers to build your run command are below (after the run instructions) !
You don't need to include all oligos : if you have many, list only the ones you actually WANT to analyse.
The folders don't need to be in any particular order - just check that you listed all 6 of them.
The public folder does not need to exist (it is generated during the run).
The output folder (--name) is generated in the run folder.
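Before starting the run, the --folders value can be sanity-checked with a sketch like this one. It splits the comma-separated list, counts the entries and reports any folder that does not exist under the top folder. It is an independent pre-check, not something statisticsRunner.sh is known to do; the sandbox and all variable names are hypothetical.

```shell
#!/bin/sh
# Sketch only: a sandbox top folder filled with the example folder names.
top=$(mktemp -d)
for name in Analysis_INSERT_EB5 Analysis_INSERT_EB6 Analysis_INSERT_EB7 \
            Analysis_WT_Sp5 Analysis_WT_Sp6 Analysis_WT_Sp15; do
    mkdir "$top/$name"
done

# The value you intend to pass to --folders (any order is fine).
folders=Analysis_WT_Sp6,Analysis_INSERT_EB5,Analysis_INSERT_EB7,Analysis_WT_Sp5,Analysis_INSERT_EB6,Analysis_WT_Sp15

count=0
missing=0
old_ifs=$IFS
IFS=,                      # split the list on commas
for f in $folders; do
    count=$((count + 1))
    if [ ! -d "$top/$f" ]; then
        echo "Missing folder : $f"
        missing=$((missing + 1))
    fi
done
IFS=$old_ifs               # restore normal word splitting

echo "Listed $count folders, $missing missing"
```

A run is ready to start when 6 folders are listed and none is missing.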
Below are some helper commands to build the --folders, --oligos and --path parameters :
/t1-data/data/hugheslab/jelenatools/CC/statistics/oligolistGenerator.sh
Example output :
--oligos mpg,nprl3
You can add to the command which "folder series" you want to list :
/t1-data/data/hugheslab/jelenatools/CC/statistics/oligolistGenerator.sh INSERT
NOTE !! If you combined your globins, the combined names will not show in the above list ! To use combined globins, add their names to the list by hand : --oligos mpg,nprl3,HbaCombined,HbbCombined
/t1-data/data/hugheslab/jelenatools/CC/statistics/folderlistGenerator.sh
Example output :
--folders Analysis_INSERT_EB5,Analysis_INSERT_EB6,Analysis_INSERT_EB7,Analysis_WT_Sp5,Analysis_WT_Sp6,Analysis_WT_Sp15
You can add to the command which 2 "folder series" you want to list :
/t1-data/data/hugheslab/jelenatools/CC/statistics/folderlistGenerator.sh INSERT WT
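If the helper script is not available, a comparable --folders string can be put together with standard tools. The sketch below is an assumption about what such a list could look like, not the implementation of folderlistGenerator.sh. The sandbox and variable names are hypothetical.

```shell
#!/bin/sh
# Sketch only: a sandbox top folder filled with the example folder names.
top=$(mktemp -d)
for name in Analysis_INSERT_EB5 Analysis_INSERT_EB6 Analysis_INSERT_EB7 \
            Analysis_WT_Sp5 Analysis_WT_Sp6 Analysis_WT_Sp15; do
    mkdir "$top/$name"
done
cd "$top" || exit 1

# Match directories for both folder series (the trailing / restricts the
# match to directories), strip the trailing slash, then join with commas.
list=$(printf '%s\n' *INSERT*/ *WT*/ | sed 's:/$::' | paste -sd, -)

echo "--folders $list"
```

The order of the names depends on how the shell sorts the matches, which does not matter for the run.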
For the --path parameter, run this in your top folder :
pwd