WELCOME

Section: Welcome to your CCB account (7)

INTRODUCTION

Welcome to your MRC WIMM Centre for Computational Biology (CCB) account. We recommend that you familiarise yourself with this information. With it, you should be up-to-speed on using your account as quickly as possible.

If you find an error in any of these pages or there's information which isn't provided that you feel would be useful, please contact us via the address provided below. Our documentation is important to us and we treat errors and omissions as problems to be fixed.  

OVERVIEW

Your CCB account allows you to log in to a high-performance computing cluster and includes access to:

- Over 200 bioinformatics programs

- Modern Python versions with hundreds of packages pre-installed

- Modern R versions with over 1000 libraries preinstalled

- A high performance compute cluster

- All major bioinformatics databases

- CCB introductory bioinformatics training course

 

ACCESSING YOUR ACCOUNT

There are three primary login servers for the CCB cluster:

- cbrglogin1.molbiol.ox.ac.uk is a regular memory node which should only be used for logins and to run jobs on the queue. It is often very busy.

- cbrglogin2.molbiol.ox.ac.uk is a second regular memory node which should also only be used for logins and to run jobs on the queue. It is, however, generally less heavily used.

- cbrglogin3.molbiol.ox.ac.uk is a larger memory node suitable for running interactive programs such as R-Studio.

There are two different ways to connect to the CCB servers. For a full graphical login you can use RDP; this remote desktop software allows you to connect to a server and run graphical applications, giving a familiar ‘desktop view’ with pull-down menus, desktop icons and file browser windows. For fast command line access, you can also connect using any SSH client; both tmux(1) and screen(1) are installed for session persistence between logins. If you don't need to run a graphical application such as R-Studio, we recommend taking some time to get comfortable using plain SSH. It's quicker over slow internet connections, avoids keyboard mapping issues and works more reliably.
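
As a minimal sketch of the SSH route described above, the two helper functions below wrap the commands involved: opening a session on a login node, and attaching to a persistent tmux session once logged in. The function names ccb_login and ccb_session, the username jbloggs and the choice of cbrglogin2 as the default node are assumptions made for illustration; only the hostnames come from this page.

```shell
# Hypothetical helper, for your local machine: open an SSH session on a
# CCB login node.
# $1 = your CCB username, $2 = node name (assumed default: cbrglogin2).
ccb_login() {
    ssh "$1@${2:-cbrglogin2}.molbiol.ox.ac.uk"
}

# Hypothetical helper, for use on the login node: attach to the named tmux
# session if it already exists, otherwise create it (the -A option).
# $1 = session name (assumed default: main).
ccb_session() {
    tmux new-session -A -s "${1:-main}"
}
```

For example, after pasting the functions into a terminal, `ccb_login jbloggs cbrglogin3` would connect to the larger memory node, and `ccb_session analysis` would then create or re-attach to a session called analysis. Detaching from tmux with Ctrl-b d leaves the session running for your next login.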

Please note that CPU and memory resources on the login nodes are not unlimited and that there are watchdog scripts installed to ensure that a single user account cannot monopolise them or cause the system to become unstable. See cpuwatch(7) and memwatch(7) for further details.  

CHANGING YOUR PASSWORD

When your account was created you were provided with a randomly generated 16-character password. If you wish to keep this, you're more than welcome to. If, however, you'd prefer to change it to something more memorable, please type the command:

$ passwd

Please note that:

- what you type won't appear on the screen as you type it

- your password must be at least 16 characters

- it must not be a password used anywhere else

- you must never, ever share it

For more information on choosing secure passwords, please see the University's Information Security website at https://www.infosec.ox.ac.uk/create-strong-passwords  

CUSTOMISING YOUR .BASHRC

Whether you're an experienced UNIX/Linux user or are only just at the beginning of your journey, sooner or later you will find someone advising you to change the contents of your .bashrc - the master configuration file for your shell.

If you're confident that you know exactly what you're doing, you are, of course, free to modify this in any way you wish. However, please be aware that a significant percentage of the issues we're asked to help with turn out to be caused by a misconfigured .bashrc, and that you can even block your own logins.

In particular, we recommend that you do not add any of the following to your .bashrc:

- Any commands which load modules

- Any commands which load Python venvs

- Any commands which load Conda environments

- Any commands which alter your LD_LIBRARY_PATH

If you find yourself in the situation where you're considering doing any of these, please consider contacting us for help before doing so. We may be able to suggest a better alternative and save you hours (or even days) of wasted time trying to figure out why something has broken.  
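
One way to check whether an existing .bashrc already contains any of the items listed above is a simple text search. The sketch below is a rough illustration only: the function name check_bashrc and the search patterns are assumptions, the match is purely textual (commented-out lines are reported too), and a clean result does not guarantee that the file is free of problems.

```shell
# Hypothetical helper: report lines in a shell startup file that look like
# module loads, Conda or venv activations, or LD_LIBRARY_PATH changes.
# $1 = file to check (default: ~/.bashrc).
check_bashrc() {
    local file="${1:-$HOME/.bashrc}"
    if [ ! -f "$file" ]; then
        echo "$file does not exist"
        return 1
    fi
    # -n prints line numbers, -E enables extended regular expressions.
    if grep -nE 'module +load|conda +activate|(source|\.) +[^ ]*bin/activate|LD_LIBRARY_PATH' "$file"; then
        echo "Review the lines above in $file"
    else
        echo "No matching lines found in $file"
    fi
}
```

Running `check_bashrc` with no arguments inspects your own ~/.bashrc and prints each matching line prefixed with its line number.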

DATA STORAGE AND SHARING

When your account is set up you are provided with a /home/ directory which has a quota limit of 20GB; your /home/ directory is intended for small but important files – configuration and settings, scripts, documents, source code etc. Your /home/ directory is completely private to you.

If you need to store large files such as NGS data, or to share data with other users on the cluster, you can request collaboration projects, which will be set up for you in /project/. Here you have a larger 1TB quota limit which can be expanded on request, though additional charges will apply at a future date (further details will be announced in advance of any costs). Please note that the quota is for all data in all projects and not on a per-project basis.

Collaboration projects are a secure, convenient way for data to be shared with specific groups of people. You pick the logical set of data and define the people it needs to be shared with; we then create the project for you and update the list of people as time goes on so that it's always correct. For more information, see project(7). To join an existing project, please ask the PI to contact us via genmail@molbiol.ox.ac.uk and ask for your username to be added.

You can additionally share data with other people over the internet using our public datashare service. For further details, see datashare(7).

If you are handling data that requires an especially high level of confidentiality (e.g. human patient data) or where there are specific security requirements imposed by the data source or funder (e.g. UK Biobank data), please get in touch with us before uploading it.  

FILE SYSTEM LAYOUT

For further information on the layout of the CCB filesystem, see hier(7).  

DATA BACKUPS

The entire contents of your /home/ directory are backed up daily - you don't need to do anything. However, no other data is backed up by default. To ensure that your valuable research is safe, we recommend that you read backup(7) and contact us for help when you're ready to get going.  

CHECKING YOUR QUOTAS

To find information concerning your quotas, please see getquota(1).  
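
As a rough cross-check that does not rely on getquota(1), the standard du(1) command can estimate how much of the 20GB /home/ quota a directory is using. This is only a sketch: the function name home_usage_percent is hypothetical, and it assumes the quota is counted in units of 1024^3 bytes, which this page does not state. getquota(1) remains the authoritative source.

```shell
# Assumption: the 20GB /home/ quota means 20 * 1024^3 bytes, expressed
# here in kilobytes to match the output of du -k.
HOME_QUOTA_KB=$((20 * 1024 * 1024))

# Hypothetical helper: print the percentage of the quota used by a
# directory, rounded down to a whole number.
# $1 = directory to measure (default: your home directory).
home_usage_percent() {
    local used_kb
    # du -sk prints "<kilobytes><TAB><path>"; keep only the first field.
    used_kb="$(du -sk "${1:-$HOME}" | cut -f1)"
    echo $((used_kb * 100 / HOME_QUOTA_KB))
}
```

For example, `echo "Home directory is at $(home_usage_percent)% of quota"` prints a single summary line. On a large home directory du may take some time to finish.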

TRANSFERRING FILES TO AND FROM THE SERVER

We recommend you use the program FileZilla. From the main interface, enter the following options into the Quickconnect bar:

Host: cbrglogin1.molbiol.ox.ac.uk

Username: [your CCB username]

Password: [your CCB password]

Port: 22

Then click the Quickconnect button to establish the connection. The files in your home directory on cbrglogin1 will be displayed in the lower right panel of FileZilla. These can be dragged to the left panel in order to copy files from the cluster to your local machine. Similarly, dragging files from the left panel to the right panel will copy them from your local machine to a directory on the cluster.

FileZilla is also installed on the login nodes, and can be used from within an RDP session. This can be handy for transferring files between the CCB server and any other remote file server (for example, to transfer fastq files from a sequencing facility).  
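
For users who prefer the command line, the same host and port given above can in principle be used with scp(1), which runs over SSH. The sketch below assumes that command-line transfers to the login node are permitted; the function names ccb_upload and ccb_download, the username jbloggs and the example paths are hypothetical.

```shell
# Host and port as given in the FileZilla settings above.
CCB_HOST="cbrglogin1.molbiol.ox.ac.uk"

# Hypothetical helper: copy a local file to a directory on the cluster.
# $1 = CCB username, $2 = local file, $3 = destination directory.
ccb_upload() {
    scp -P 22 "$2" "$1@${CCB_HOST}:$3"
}

# Hypothetical helper: copy a file from the cluster into the current
# local directory.
# $1 = CCB username, $2 = path of the file on the cluster.
ccb_download() {
    scp -P 22 "$1@${CCB_HOST}:$2" .
}
```

For example, `ccb_upload jbloggs reads.fastq.gz /project/example/` would copy a local file into a (hypothetical) project directory, prompting for your CCB password.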

DATABASES ON THE CLUSTER

iGenomes is a collection of reference sequences and annotation files for commonly analysed organisms. The files were originally generated by Illumina from data downloaded from Ensembl, NCBI, or UCSC, with chromosome names changed to be simple and consistent with their download source.

On the CCB servers these files can be found at /databank/igenomes/. More information can be found in igenomes(7).  
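
A quick way to see what is available is to list the directory named above. This is a minimal sketch: the variable name IGENOMES_ROOT is an assumption, and the layout beneath the top level is not described on this page, so see igenomes(7) for the details.

```shell
# Path taken from this page; it only exists on the CCB servers.
IGENOMES_ROOT="/databank/igenomes"

if [ -d "$IGENOMES_ROOT" ]; then
    # Show the top-level entries of the collection.
    ls "$IGENOMES_ROOT"
else
    echo "$IGENOMES_ROOT not found; are you logged in to a CCB server?"
fi
```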

SOFTWARE ON THE CLUSTER

A huge amount of time has been invested in providing access to hundreds of bioinformatics, Python and R packages "out-of-the-box". These are provided using the module system, which makes it quick and easy to swap between the different software packages and versions that you require.

We recommend taking the time to read modules(7), python-cbrg(7) and R-cbrg(7) before attempting to run any programs or install software which appears to be "missing".  

RUNNING JOBS

The CCB cluster uses the Slurm job manager to allow you to run dozens (or even hundreds) of programs at the same time. The use of Slurm is too large a topic to cover in this introduction, so please take a look at slurm-basics(7) for more details and consider attending one of our training courses.  
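
To give a flavour of what a Slurm job looks like, the sketch below is a minimal batch script using standard sbatch options. The job name, resource values and output filename are illustrative assumptions, and the CCB cluster may require further options (such as a partition) that are not described on this page; slurm-basics(7) remains the reference.

```shell
#!/bin/bash
# Lines starting with #SBATCH are read by sbatch as job options.
# Name shown in the queue (hypothetical):
#SBATCH --job-name=hello
# One CPU core and 1 GB of memory (illustrative values):
#SBATCH --cpus-per-task=1
#SBATCH --mem=1G
# Wall-clock limit of five minutes (hh:mm:ss):
#SBATCH --time=00:05:00
# Output file; %j is replaced by the job ID:
#SBATCH --output=hello_%j.out

# SLURM_JOB_ID is set by Slurm inside a job; "none" is printed if the
# script is run by hand instead.
message="Job ${SLURM_JOB_ID:-none} running on $(uname -n)"
echo "$message"
```

Saved as hello.sh (a hypothetical filename), the script would typically be submitted with `sbatch hello.sh`, and `squeue -u "$USER"` would list your queued and running jobs.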

TRAINING

We provide a curated collection of self-study videos under the University's LinkedIn Learning subscription, which you can access at https://www.linkedin.com/learning/collections/6894987611124154368?u=76177458 using your SSO login. We recommend completing the course Learning Linux Command Line if you are new to Linux. The remainder of the courses are provided so that you can develop additional skills if you should need them.

The CCB runs introductory short courses which cover the Unix command line, programming in R, genomics workflows (ChIP-seq, RNAseq) and Slurm. For more details on the available topics and to book your place, please contact courses@molbiol.ox.ac.uk  

GETTING HELP

For the answers to commonly asked questions, please see faq(7). Otherwise, you can email the CCB team using the email address genmail@molbiol.ox.ac.uk. Using this address ensures your email is logged and assigned a tracking number, and will go to all the core team, which means the appropriate person or people will be able to pick it up. Whilst it might be tempting to mail an individual member of the team, the unfortunate reality is that your mail is likely to be lost in the large volume of existing traffic and take longer to answer than a mail to the help desk.

We also recommend that you send a single question or request per email, and that you always send new requests as new emails and not as replies to old ones. Doing this makes it both more likely that your quick questions are answered quickly, and less likely that your requests are overlooked because we think that we've already done them.  

FURTHER READING

We recommend that you take some time to read the additional man(1) pages listed at the bottom of this page, especially the faq(7). Investing some time now can save many wasted hours trying to figure out how to do something later!  

COPYRIGHT

This text is copyright University of Oxford and MRC and may not be reproduced or redistributed without permission.  

AUTHOR

Duncan Tooke <duncan.tooke@imm.ox.ac.uk>  

SEE ALSO

backup(7), cpuwatch(7), datashare(7), faq(7), hier(7), getquota(1), igenomes(7), man(1), memwatch(7), miseq(7), modules(7), project(7), python-cbrg(7), R-cbrg(7), slurm-basics(7)


 
