#########################################################

FAQ : This is all very nice, but I would like to have more p-values here. How do I get them ?

#########################################################

   The CCseqBasic pipeline provides only the following :


   1) visualisation of non-normalised, filtered (curated), interacting fragments
      over each restriction enzyme fragment "bin". 

   2) various visualisations and reports of "how we got there"

   3) counts of cis and trans fragments (how many interactions per capture site)


----------------------------------------------------

   The pipeline thus does not :


   4) NORMALISE THE INTERACTIONS (reads per 100 000 is a good way to normalise)
      This can be done, for example, as described here :

      http://userweb.molbiol.ox.ac.uk/public/telenius/captureManual/Normalising_the_CaptureC_data_without_a_pipeline.pdf
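
      As a rough illustration of the "reads per 100 000" idea in point 4, the sketch below
      rescales the count column of one capture site's gff file so that the counts sum to 100 000.
      It is not taken from the linked PDF : the function name normalise_per_100k, the assumption
      that the raw count sits in column 6 of a tab-separated file, and the choice of total
      (the sum of that column over the whole file) are all assumptions made for this example.

```shell
#!/usr/bin/env bash
# Sketch only: scale the count column of a gff file to "reads per 100 000".
# Assumptions (not from the CCseqBasic documentation): tab-separated input,
# raw count in column 6, total = sum of column 6 over the whole file,
# and no header or comment lines in the file.
set -euo pipefail

normalise_per_100k() {
    local gff="$1"
    # The same file is given twice: the first pass sums the counts,
    # the second pass rescales each line and prints it.
    awk -F'\t' -v OFS='\t' '
        NR == FNR { total += $6; next }
        total > 0 { $6 = sprintf("%.3f", $6 * 100000 / total); print }
    ' "$gff" "$gff"
}

# Tiny made-up example (two fragments with raw counts 30 and 10).
demo=$(mktemp)
printf 'chr6\tCC\tfrag\t100\t200\t30\n'  > "$demo"
printf 'chr6\tCC\tfrag\t201\t300\t10\n' >> "$demo"
normalise_per_100k "$demo"
rm -f "$demo"
```

      If you prefer to normalise to the cis counts only (or to any other total), replace the
      first pass with the total of your choice.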


   5) COMPARE TWO DIFFERENT CELL TYPES (etc) - to find "changes" in interaction profiles.
      This can be done, for example, as described here :

      http://userweb.molbiol.ox.ac.uk/public/telenius/captureManual/statistics.html

      Other toolkits for this exist as well, and are warmly recommended (the above is by no means the only or "the best" way to get p-values in these situations).
      One possibility is fourCseq. Some thoughts on fourCseq are at the bottom of this page :
      http://userweb.molbiol.ox.ac.uk/public/telenius/captureManual/statistics.html


   6) CALL SIGNIFICANT INTERACTIONS (over a baseline)

      Several toolkits have been published to do just this for chromosome conformation capture data.
      Which of these, if any, suits Capture-C data best is not clear.
      We are currently collaborating with mathematicians to build something more solid than what we have at the moment.

      However, you can certainly give these tools a shot :
      peakC, fourCseq, r3Cseq, chicago.

      Some thoughts on all these tools are at the bottom of this page :
      http://userweb.molbiol.ox.ac.uk/public/telenius/captureManual/statistics.html


      peakC
            - easy to use
            - supports Capture-C data really well : the gff files of CCseqBasic (in the F6 folder) can easily be converted to peakC input files like this :
                If the cis chromosome is chr6 :
                grep '^chr6\s' thisismyfile.gff | cut -f 5,6 > thisismyfile.wig
            - has source code that is easy to modify
            - analyses one capture site per run (needs to be run once for each oligo in your design, separately for each capture site's gff file)
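
      Because peakC needs one input file per capture site, the one-line conversion above may be
      worth wrapping in a loop. The sketch below is one possible way to do that. The function
      names and the two-column table (gff file, then its cis chromosome, tab-separated) are
      hypothetical helpers invented for this example, not part of CCseqBasic or peakC.
      It uses awk with an exact match on column 1 instead of grep.

```shell
#!/usr/bin/env bash
# Sketch only: convert many CCseqBasic gff files to two-column peakC input.
# The helper names and the table format are assumptions made for this example.
set -euo pipefail

# gff_to_peakc_wig GFF CIS_CHR
# Prints columns 5 and 6 of the lines whose first column equals CIS_CHR.
gff_to_peakc_wig() {
    local gff="$1" cis_chr="$2"
    awk -F'\t' -v OFS='\t' -v chr="$cis_chr" '$1 == chr { print $5, $6 }' "$gff"
}

# convert_all TABLE
# TABLE is a hypothetical tab-separated file, one capture site per line:
#   path/to/capture.gff <TAB> cis chromosome
# For each line, writes path/to/capture.wig next to the gff file.
convert_all() {
    local table="$1" gff cis_chr
    while IFS=$'\t' read -r gff cis_chr; do
        [ -n "$gff" ] || continue          # skip empty lines
        gff_to_peakc_wig "$gff" "$cis_chr" > "${gff%.gff}.wig"
    done < "$table"
}

# Example call (file name is made up):
#   convert_all my_capture_sites.txt
```

      Each resulting wig file can then be given to peakC in its own run.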

      fourCseq
            - easy to use (but the z-score parameter needs to be lowered to match the typically very low background read levels of Capture-C)
            - Capture-C data goes in as bam files (files containing only the reporter fragments) - use the "per oligo" bam files in the F6 folder.
            - Capture-C data needs to be "forced" to go in by setting "distance from nearest cut site" to ~ 200 bases (fourCseq natively pairs with the popular Hi-C analyser HiCUP rather than with CCseqBasic)
            - a lot of good fragments tend to be lost in this analysis
            - the output visualisations are really nice
            - analyses one capture site per run (needs to be run once for each oligo in your design, separately for each capture site's bam file)
            - some notes on how to run this (written before BLAT filtering became an integral part of the pipeline) :
              http://userweb.molbiol.ox.ac.uk/public/telenius/CAPTUREC_DATA/paperRevisionHtml_240815/index

      r3Cseq
            - easy to use
            - Capture-C data goes in as bam files (files containing only the reporter fragments) - use the "per oligo" bam files in the F6 folder.
            - analyses one capture site per run (needs to be run once for each oligo in your design, separately for each capture site's bam file)
            - some notes on how to run this (written before BLAT filtering became an integral part of the pipeline) :
              http://userweb.molbiol.ox.ac.uk/public/telenius/CAPTUREC_DATA/paperRevisionHtml_240815/index

      chicago
            - a bit cumbersome to use, but can be worked out
            - developed for "massively parallel" capture experiments, containing more than 1 000 capture sites
            - has been successfully used to analyse data from 30 capture sites upwards (however, the developer of the code recommends it only for 1 000 capture sites and up)
            - Capture-C data goes in as bam files (files containing only the reporter fragments) - use the "per oligo" bam files in the F6 folder.
            - analyses all of your capture sites in a single run
            - tries to be clever in normalising all the capture sites to each other, to enable comparing the "relative strengths" of each of the capture profiles
            - some notes on how to run this are in Jelena's raw notes :
              /home/molhaem2/telenius/WorkingDiaries/working_diary33.txt
              /home/molhaem2/telenius/WorkingDiaries/working_diary34.txt
              /home/molhaem2/telenius/WorkingDiaries/working_diary35.txt



Page updated by Jelena 27Sep2017