Which CC version should I choose? - the descriptive details
More technical details, including a version guide :
versionsdetails
----------------------------------------
Recommended way to run all analyses :
run CS5 with default settings.
Reproduce previous results (for direct comparison to earlier runs) :
To run CC3 "as before" - run CS5 with --strandSpecificDuplicates --CCversion CS3
To run CC4 "as before" - run CS5 with --strandSpecificDuplicates --CCversion CS4
Note! - this also reproduces all the bugs of the CB4a/CB3a codes.
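The flag combinations above can be written down as a small lookup, for example when generating run commands for many samples. This is only a sketch: the names LEGACY_FLAGS and cs5_flags are hypothetical and not part of the pipeline, and only the flag strings themselves are taken from this guide.

```python
# Hypothetical helper: LEGACY_FLAGS and cs5_flags are illustrative names,
# not part of the pipeline. Only the flag strings come from this guide.
LEGACY_FLAGS = {
    "CC3": ["--strandSpecificDuplicates", "--CCversion", "CS3"],
    "CC4": ["--strandSpecificDuplicates", "--CCversion", "CS4"],
}


def cs5_flags(target="CS5"):
    """Return the extra flags to pass to CS5 for the requested behaviour."""
    if target == "CS5":
        return []  # recommended: default settings, no extra flags
    if target not in LEGACY_FLAGS:
        raise ValueError("unknown target: " + target)
    return list(LEGACY_FLAGS[target])


print(" ".join(cs5_flags("CC3")))  # --strandSpecificDuplicates --CCversion CS3
print(" ".join(cs5_flags("CC4")))  # --strandSpecificDuplicates --CCversion CS4
```

The returned list is meant to be appended to whatever command you normally use to start CS5.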
-----------------------------------------
Here are the versions in detail :
CURRENT
CM5 : PARALLEL RUNS, RAINBOW VISUALISATIONS : current development version (fancy, potentially unstable)
CB5 : current development version (fancy, potentially unstable)
CS5 : current stable version
BUG-FIXED
CF5 : major bug-fix release for the CB3a and CB4a pipes (released Nov2017)
OUTDATED
CB3a, CB4a : the same as CC3 and CC4, but still getting bug fixes.
CC3, CC4 : outdated, no longer updated. Contain more bugs than CB3a and CB4a.
Details below.
~~~~~~~~~~~~~~~~~~~~~~~
CB3a, CB4a : the same as CC3 and CC4, but still getting bug fixes.
The main difference to CC3 and CC4 is that these include the "configuration scripts",
which means that EXTERNAL users can set these pipes up.
Otherwise these are exactly the same codes as CC3 and CC4.
~~~~~~~~~~~~~~~~~~~~~~~
CF5 : the major cumulative bug fixes of the CB pipes (released Nov2017)
CF5 : just like CB3a and CB4a, but the two are combined into a single pipe,
and you can request the different run modes with --CCversion CF3 and --CCversion CF4
The default run mode is --CCversion CF5, which is the better choice in any case. If you want to generate "CC3" or "CC4" compatible data, to compare to earlier data sets, use the run modes above instead.
See below (*) for details on this.
The CF5 version also contains all the cumulative bug fixes which CB3a and CB4a needed at the time (Nov2017),
the most important being the removal of strandedness from the duplicate filtering.
When comparing CF5 and CB4a/CB3a plots, we expect the CF5 signal to be lower and less noisy throughout.
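To make "removing strandedness from the duplicate filtering" concrete, here is a toy illustration. It is not the pipeline's actual code: it assumes, purely for illustration, that a read is described by (chromosome, start, end, strand) and that duplicates are reads sharing the same key. The function name dedupe and the example reads are made up.

```python
# Toy illustration only, NOT the pipeline's implementation.
# Assumption: a read is (chromosome, start, end, strand), and two reads
# are duplicates when their keys are identical.
def dedupe(reads, strand_specific):
    """Keep the first read seen for each duplicate key."""
    seen = set()
    kept = []
    for chrom, start, end, strand in reads:
        if strand_specific:
            key = (chrom, start, end, strand)  # strand is part of the key
        else:
            key = (chrom, start, end)          # strand is ignored
        if key not in seen:
            seen.add(key)
            kept.append((chrom, start, end, strand))
    return kept


reads = [
    ("chr1", 100, 250, "+"),
    ("chr1", 100, 250, "-"),  # same coordinates, opposite strand
    ("chr1", 100, 250, "+"),  # exact repeat of the first read
    ("chr2", 500, 640, "+"),
]

print(len(dedupe(reads, strand_specific=True)))   # 3
print(len(dedupe(reads, strand_specific=False)))  # 2
```

Under this assumption, ignoring the strand collapses more reads into duplicates, which fits the expectation above that the CF5 signal is lower than in CB3a/CB4a.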
~~~~~~~~~~~~~~~~~~~~~~~
CS5 : is like CF5, but also gets new features.
( CF5 only gets bug fixes, no new features).
CS5 is also fully backwards compatible: it provides the flag --strandSpecificDuplicates to re-introduce the bug of the CB3a and CB4a pipes, so that old samples need not be re-run when a direct comparison is needed.
New features : bowtie2 support, and "wobbly ends support" (to give more flexibility to duplicate filtering).
How to run samples in CS5 so that they are readily comparable with earlier CC3 and CC4 runs - see (**) below.
~~~~~~~~~~~~~~~~~~~~~~~
CB5 : is like CS5, but Very Actively developed.
All new features come to CB5 first, and once they are thoroughly tested they are moved over to CS5.
So - I would normally use CS5, as that is stable and working.
CB5 can be pretty shaky and horrible at times, as things are Really Happening in there.
~~~~~~~~~~~~~~~~~~~~~~~
CM5 : is like CB5, but parallel. Very Actively developed.
CM5 is your pipe of choice if you want to run more than 30 oligos (more feasible visualisations) or have multilane data (the input fastq files no longer need to be concatenated).
CM5 is fresh - so it can still be a bit shaky.
All new development comes to CM5 first (such as tiled-capture support etc),
and, time permitting, seeps down the chain to CB5 and finally to CS5.
CM5 will take over as "the pipeline we will exclusively use"
as soon as it becomes stable - this is expected to happen around March 2019.
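The rule of thumb from this guide (CS5 normally, CM5 for more than 30 oligos or multilane data) can be sketched as a tiny decision helper. The function name choose_pipeline and its parameters are hypothetical, and the rule only reflects the recommendations as written here, while CM5 is still maturing.

```python
# Hypothetical helper: choose_pipeline is an illustrative name, and the
# rule encodes only the recommendations stated in this guide.
def choose_pipeline(n_oligos, multilane=False):
    """Suggest a pipeline version for a run."""
    if n_oligos > 30 or multilane:
        return "CM5"  # parallel pipe: many oligos or multilane input
    return "CS5"      # stable default for everything else


print(choose_pipeline(12))                  # CS5
print(choose_pipeline(45))                  # CM5
print(choose_pipeline(12, multilane=True))  # CM5
```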
~~~~~~~~~~~~~~~~~~~~~~~
In addition to the above, Jelena has 2 more pipeline versions for her own private use :
CB5aDev (where CB5 is being built),
and CM5aDev (where the parallel CM5 is being built).
------------------------------------------
(*) Differences in run modes CF3 CF4 CF5
CF5 introduced a "third way to filter non-flashed reads" :
Because : CB3a is too lenient with duplicates (of non-flashable reads) and CB4a is too strict - CB5 is "in between" (just Goldilocks).
------------------------------------------
(**) Differences in run modes CS3 CS4 CS5
CF5 introduced a "third way to filter non-flashed reads" :
Because : CB3a is too lenient with duplicates (of non-flashable reads) and CB4a is too strict - CB5 is "in between" (just Goldilocks).
In CS5 these run modes also include a bug fix.
To reproduce the analysis "exactly as it would be" in the CC4 and CC3 run modes (to compare to earlier samples),
you need to add a flag to "bring back the bug", like this :
--strandSpecificDuplicates --CCversion CS3
--strandSpecificDuplicates --CCversion CS4
--------------------------------------------