WELCOME

Section: Welcome to the CCB (7)
Last updated: Wed 10 Jan 16:53:42 GMT 2024
 

DEPRECATION NOTICE

This CCB documentation has been superseded by our new website: https://lumin.imm.ox.ac.uk  

INTRODUCTION

Welcome to the WIMM Centre for Computational Biology (CCB). If you find an error in any of these pages, or if information that you feel would be useful is missing, please contact us via the address provided below. Our documentation is important to us and we treat errors and omissions as problems to be fixed.

If you're logged into one of our JADE cluster login nodes, note that you can read this information at any time using the man(1) command, e.g.:

man welcome
 

POLICIES COVERING USE OF CCB SYSTEMS

In using our systems, you are agreeing to our tsandcs(7) and acceptable-use(7). If you wish to withdraw your agreement at any time, please inform us so that we can remove your access for you.  

APPLYING FOR ACCESS

To find out how to apply for a CCB account or to get storage space, please see accounts(7).  

DIFFERENCES BETWEEN JADE AND CENTOS 7

If you're an existing user there are some important differences between JADE and CentOS 7. You can read more about these in differences(7).  

LOGGING IN

- If you just need to use R and will rarely log in at all, point your browser at https://rstudio.molbiol.ox.ac.uk . The server has 128 CPU cores and 2TB memory.

- If you plan to use a Jupyter notebook, please read https://datashare.molbiol.ox.ac.uk/public/files/Jupyter.pdf before spending time working out how to set up custom sessions. This uses the same physical hardware as R-Studio.

- For direct SSH logins, there are two primary login servers for the CCB cluster: login1.molbiol.ox.ac.uk and login2.molbiol.ox.ac.uk. They can be used interchangeably, and you can connect with any SSH client using a command such as:

ssh -l your_user_name login1.molbiol.ox.ac.uk

Both tmux(1) and screen(1) are installed for session persistence between logins. Please be aware that the CCB login nodes are provided to allow file management, light development work and submission of jobs to the cluster. As such, all user logins are limited to a maximum of 2 CPU cores and 4GB memory. If you attempt to use more than 2 cores, nothing bad will happen (your work will just be no faster). If you attempt to use more than 4GB of memory, however, your processes will be automatically terminated. In addition, to guard against the system slowing to a crawl if too many people are logged in at the same time, watchdog programs are installed as a last line of defence. See cpuwatch(7), memwatch(7) and acceptable-use(7) for further details. If you think you need more than this, see slurm-basics(7) for information on interactive Slurm sessions.

- There is no graphical login available. If you'd like to learn more about the Linux command line, please consider attending an OBDS training course as described at https://www.imm.ox.ac.uk/research/units-and-centres/mrc-wimm-centre-for-computational-biology/training/oxford-biomedical-data-science-training-programme . If you want to use programs such as Jupyter which provide graphical interfaces, please consider using our pre-provided services described above; for advanced users, you can additionally use SSH tunnels with port forwarding as described at ssh-tunnel(7). To view files installed locally, note that the -X option to SSH is enabled for X forwarding, and that the image viewer feh(1) and PDF viewer xpdf(1) are both installed.
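If you log in over SSH regularly, a host alias in the SSH client configuration on your own machine can save some typing. The following is a minimal sketch rather than an official CCB recommendation: the alias "ccb" and the helper name add_ccb_host are assumptions chosen for illustration, and your_user_name is a placeholder. ForwardX11 yes is the configuration-file equivalent of the -X option mentioned above.

```shell
# Append a host alias for a CCB login node to an SSH client config file.
# Hypothetical helper: the name add_ccb_host and the alias "ccb" are
# illustrative only and are not provided by CCB.
#   $1 = your CCB username
#   $2 = config file to append to (defaults to ~/.ssh/config)
add_ccb_host() {
    ccb_user=$1
    ccb_config=${2:-"$HOME/.ssh/config"}

    # Leave the file untouched if the alias is already defined in it.
    if [ -f "$ccb_config" ] && grep -q '^Host ccb$' "$ccb_config"; then
        echo "Host ccb is already defined in $ccb_config" >&2
    else
        cat >> "$ccb_config" <<EOF

Host ccb
    HostName login1.molbiol.ox.ac.uk
    User $ccb_user
    ForwardX11 yes
EOF
    fi
}
```

After running add_ccb_host your_user_name on your own machine (not on the cluster), ssh ccb should behave like the full command shown above. Once logged in, tmux new-session -A -s main attaches to a tmux session named "main" if one exists and creates it otherwise; the session name is arbitrary. tmux sessions live on the node where they were started, so reconnect to the same login node to find them again.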

 

CHANGING YOUR PASSWORD

When your account was created you were provided with a randomly generated 16-character password. If you wish to keep this, you're more than welcome. If, however, you'd prefer to change it to something more memorable, please log in using SSH as described above and type the command:

$ passwd

Please note that:

- what you type won't appear on the screen as you type it

- your password must be at least 16 characters

- it must not be a password used anywhere else

- you must never, ever share it

For more information on choosing secure passwords, please see the University's Information Security website at https://www.infosec.ox.ac.uk/create-strong-passwords  

CUSTOMISING YOUR .BASHRC

Whether you're an experienced UNIX/Linux user or are only just at the beginning of your journey, sooner or later you will find someone advising you to change the contents of your .bashrc - the master configuration file for your shell.

If you're confident that you know exactly what you're doing you are, of course, free to modify this in any way you wish. At the same time, please be aware that a significant percentage of issues that we're asked to help with turn out to be the result of misconfiguration of a .bashrc and that you can even block your own logins.

In particular, we recommend that you do not add any of the following to your .bashrc:

- Any commands which load modules

- Any commands which load Python venvs

- Any commands which load Conda environments

- Any commands which alter your LD_LIBRARY_PATH

If you find yourself in the situation where you're considering doing any of these, please consider contacting us for help before doing so. We may be able to suggest a better alternative and save you hours (or even days) of wasted time trying to figure out why something has broken.  
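If you'd like to check whether your existing .bashrc already contains any of the items listed above, a quick search can help. This is a rough sketch only: check_bashrc is a hypothetical helper (not a CCB tool), and the patterns are our assumptions about how such commands usually look, so it may miss some cases or flag harmless lines.

```shell
# List lines in a shell startup file that match the patterns discouraged above.
# Hypothetical helper: check_bashrc is not a CCB-provided tool.
#   $1 = file to scan (defaults to ~/.bashrc)
# Exit status is 0 if at least one matching line was printed, non-zero otherwise.
check_bashrc() {
    rc_file=${1:-"$HOME/.bashrc"}
    # Patterns: module loads, Conda activation, Python venv activation
    # scripts, and any mention of LD_LIBRARY_PATH.
    # The second grep drops lines that are entirely commented out.
    grep -nE 'module[[:space:]]+load|conda[[:space:]]+activate|bin/activate|LD_LIBRARY_PATH' "$rc_file" \
        | grep -vE '^[0-9]+:[[:space:]]*#'
}
```

Running check_bashrc with no arguments scans ~/.bashrc and prints each matching line with its line number. If it prints anything, that is a good moment to contact us before changing things further.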

DATA STORAGE AND SHARING

When your account is set up you are provided with a /home/ directory which has a quota limit of 10GB. Your /home/ directory is intended for small but important files: configuration and settings, scripts, documents, source code, etc. It is completely private to you.

If you need to store large files such as NGS data, or to share data with other users on the cluster, you can request collaboration projects, which will be set up for you in /project/. Please note that the quota for a project covers all of the data in it and is not applied on a per-user basis. Collaboration projects are a secure, convenient way to share data with specific groups of people: you pick the logical set of data and define the people it needs to be shared with, then we create the project for you and update the list of people as time goes on so that it's always correct. For more information, see project(7). To join an existing project, please ask the PI to contact us via help@imm.ox.ac.uk and ask for your username to be added.

You can additionally share data with other people over the internet using our public datashare service. For further details, see datashare(7).

If you are handling data that needs especially high confidentiality (e.g. human patient data) or where there are specific security requirements imposed by the data source or funder (e.g. UK Biobank data), please get in touch with us before uploading it.

FILE SYSTEM LAYOUT

For further information on the layout of the CCB filesystem, see hier(7).  

CHECKING YOUR QUOTAS

To find information concerning your quotas, please see getquota(1).
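getquota(1) is the authoritative tool for this. If you simply want a rough idea of how much space a directory tree occupies, the standard du(1) command can provide one. The sketch below is an approximation under stated assumptions: quota_usage is a hypothetical helper, 1GB is taken to mean 1024 * 1024 KB, and du may count space differently from the real quota system.

```shell
# Report roughly how much of a quota a directory tree currently uses.
# Hypothetical helper: quota_usage is not a CCB tool, and du may count
# space differently from the real quota system.
#   $1 = directory to measure (defaults to $HOME)
#   $2 = quota in GB (defaults to 10, the /home/ quota stated above)
quota_usage() {
    target=${1:-"$HOME"}
    quota_gb=${2:-10}
    # du -sk prints the total size in 1024-byte blocks, then a tab and the path.
    used_kb=$(du -sk "$target" 2>/dev/null | cut -f1)
    # Assumption: 1 GB is treated as 1024 * 1024 KB.
    quota_kb=$((quota_gb * 1024 * 1024))
    percent=$((used_kb * 100 / quota_kb))
    echo "$target: ${used_kb} KB used of ${quota_kb} KB (${percent}%)"
}
```

Called with no arguments, quota_usage measures your home directory against the 10GB figure given above. The second argument lets you supply a different quota, for example the one reported for a project.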

TRANSFERRING FILES TO AND FROM THE SERVER

We recommend you use SCP on the command line or the graphical program FileZilla. If you use FileZilla, then from the main interface enter the following options into the Quickconnect bar:

Host: login1.molbiol.ox.ac.uk

Username: [your CCB username]

Password: [your CCB password]

Port: 22

Then click the Quickconnect button to establish the connection. The files in your home directory on login1 will be displayed in the lower right panel of FileZilla. These can be dragged to the left panel in order to copy files from the cluster to your local machine. Similarly, dragging files from the left panel to the right panel will copy them from your local machine to a directory on the cluster.  
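For the command-line route, the scp invocations typically look like the sketch below. The file name results.tsv, the directory name analysis and the run wrapper are assumptions made for illustration, and your_user_name is a placeholder. The wrapper prints each command instead of executing it unless DRY_RUN=0 is set, so you can try the example safely first.

```shell
# Placeholder: replace your_user_name with your CCB username.
ccb_user="your_user_name"
ccb_host="login1.molbiol.ox.ac.uk"

# With DRY_RUN=1 (the default here) commands are printed instead of run.
# Set DRY_RUN=0 to execute them for real.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Upload one file to your home directory on the cluster.
# An empty or relative remote path is resolved against your home directory.
run scp results.tsv "$ccb_user@$ccb_host:"

# Download a whole directory from the cluster (-r copies recursively).
run scp -r "$ccb_user@$ccb_host:analysis" ./analysis
```

As with FileZilla, scp connects on port 22 by default and will prompt for your CCB password.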

DATABASES ON THE CLUSTER

iGenomes is a collection of reference sequences and annotation files for commonly analysed organisms. The files were originally generated by Illumina. The files have been downloaded from Ensembl, NCBI, or UCSC, and chromosome names have been changed to be simple and consistent with their download source.

On the CCB servers these files can be found at /databank/igenomes/. More information can be found in igenomes(7).

SOFTWARE ON THE CLUSTER

A huge amount of time has been invested in providing access to hundreds of bioinformatics, Python and R packages "out-of-the-box". These are provided using the module system, which makes it quick and easy to swap between the different software packages and versions that you require.

We recommend taking the time to read modules(7), python-cbrg(7) and R-cbrg(7) before attempting to run any programs or install software which appears to be "missing".

JADE also has the Singularity container platform (also known as Apptainer) installed, and it is available without loading any module. For example: singularity exec your-Singularity-container.sif ... . It is also compatible with Docker container images, for example: singularity run docker://ubuntu:latest

RUNNING JOBS

The CCB cluster uses the Slurm job manager to allow you to run dozens (or even hundreds) of programs at the same time. The use of Slurm is too large a topic to cover in this introduction, so please take a look at slurm-basics(7) for more details.  
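To give a flavour of what a Slurm submission looks like, here is a minimal sketch. It uses only generic Slurm options and is not specific to JADE: the file name hello.sh, the job name and the resource values are arbitrary assumptions, and slurm-basics(7) remains the place to look for the settings that actually apply on our cluster.

```shell
# Write a minimal Slurm batch script to hello.sh (hypothetical file name).
# The resource values below are arbitrary examples, not CCB defaults.
cat > hello.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=hello
#SBATCH --cpus-per-task=1
#SBATCH --mem=1G
#SBATCH --time=00:05:00
#SBATCH --output=hello-%j.out

# The job itself: report which node ran it.
echo "Hello from $(hostname)"
EOF

# Submit the script if sbatch is available (i.e. on a cluster login node).
if command -v sbatch >/dev/null 2>&1; then
    sbatch hello.sh
else
    echo "sbatch not found; run this on a cluster login node" >&2
fi
```

In the --output pattern, %j is replaced by the job ID, so each run writes to its own file. Once a job is submitted, squeue -u "$USER" lists your queued and running jobs.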

GETTING HELP

For the answers to commonly asked questions, please see faq(7). Otherwise, you can email the CCB team at help@imm.ox.ac.uk. Using this address ensures that your email is logged, assigned a tracking number and sent to the whole core team, which means the appropriate person or people will be able to pick it up. Whilst it might be tempting to mail an individual member of the team, the unfortunate reality is that your mail is likely to be lost in the large volume of existing traffic and take longer to answer than a mail to the help desk.

We also recommend that you send a single question or request per email, and that you always send new requests as new emails and not as replies to old ones. Doing this makes it both more likely that your quick questions are answered quickly, and less likely that your requests are overlooked because we think that we've already done them.  

STAYING UP TO DATE

We maintain a running record of updates to JADE at changelog(7). We also publicise issues and limitations of JADE at known-issues(7).  

FURTHER READING

We recommend that you take some time to read the additional pages listed at the bottom of this page, especially the faq(7). Investing some time now can save many wasted hours trying to figure out how to do something later!  

COPYRIGHT

This text is copyright University of Oxford and MRC and may not be reproduced or redistributed without permission.  

AUTHOR

Duncan Tooke <duncan.tooke@imm.ox.ac.uk>  

SEE ALSO

acceptable-use(7), accounts(7), changelog(7), cpuwatch(7), datashare(7), differences(7), faq(7), hier(7), getquota(1), igenomes(7), known-issues(7), memwatch(7), miseq(7), modules(7), project(7), python-cbrg(7), R-cbrg(7), ssh-tunnel(7), slurm-basics(7), tsandcs(7)


 
