Genetics and Genomics Analysis Platform

Introduction slides


  • Goals of GenAP Platform:
    • Make bioinformatics analyses more accessible to non-bioinformaticians
    • Reduce data processing bottlenecks
  • GenAP leverages Compute Canada HPC infrastructure

Common problems tackled by GenAP

  1. Avoid multiple copies and repeated installations of standard databases (e.g. hg19) and tools (e.g. BWA)
  2. Simplify the deployment, maintenance and sharing of bioinformatics pipelines on Compute Canada resources
  3. Facilitate data sharing (e.g. to avoid having to send hard drives)
  4. Simplify access to Compute Canada resources for life scientists via standard web applications (e.g. UCSC Genome Browser, Galaxy)

GenAP Components

  1. Infrastructure and User Portal
    Built around VMs, with UCSC Genome Browser, Galaxy, etc.
  2. CVMFS (CernVM File System) for code distribution
    To facilitate code maintenance and distribution across Calcul Québec nodes.
  3. Genetics and Genomics Pipelines and Tools Framework
    Including the installation of common bioinformatics tools and datasets.

GenAP Service: Galaxy

  • Runs jobs on Compute Canada HPC
    • Uses the user's own storage and compute-time allocations
  • Offers over 400 tools
  • Offers support to add more tools on request
  • Instantiates a VM for each GenAP project

GenAP Service: Data hub

  • Open web space to:
    • Publish documents for use by online tools
    • Share files with collaborators
  • Example usage: UCSC Genome Browser track hub
    • Tracks to be displayed as annotations in the browser
    • Text files with instructions on how to organize tracks
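
The track hub layout mentioned above can be sketched in Python. This is a minimal illustration and not part of GenAP itself: it writes the three text files that the UCSC track hub format expects (hub.txt, genomes.txt, and a per-genome trackDb.txt). The function name `write_track_hub`, the hub name, the e-mail address, and the track file names are all hypothetical placeholders. The data hub URL under which the files would be published is not shown, because it depends on the project.

```python
from pathlib import Path
import tempfile


def write_track_hub(root, hub_name, email, genome, tracks):
    """Write hub.txt, genomes.txt and <genome>/trackDb.txt under `root`.

    `tracks` is a list of dicts with the keys: name, url, type, label.
    Returns the path to hub.txt.
    """
    root = Path(root)
    (root / genome).mkdir(parents=True, exist_ok=True)

    # hub.txt: describes the hub and points at the genomes file.
    hub_txt = (
        f"hub {hub_name}\n"
        f"shortLabel {hub_name}\n"
        f"longLabel {hub_name} track hub\n"
        "genomesFile genomes.txt\n"
        f"email {email}\n"
    )
    (root / "hub.txt").write_text(hub_txt)

    # genomes.txt: maps an assembly (e.g. hg19) to its trackDb file.
    genomes_txt = f"genome {genome}\ntrackDb {genome}/trackDb.txt\n"
    (root / "genomes.txt").write_text(genomes_txt)

    # trackDb.txt: one stanza per track, separated by blank lines.
    stanzas = [
        f"track {t['name']}\n"
        f"bigDataUrl {t['url']}\n"
        f"shortLabel {t['label']}\n"
        f"longLabel {t['label']}\n"
        f"type {t['type']}\n"
        for t in tracks
    ]
    (root / genome / "trackDb.txt").write_text("\n".join(stanzas))
    return root / "hub.txt"


# Hypothetical usage: one bigWig track for hg19, written to a temp directory.
example_dir = tempfile.mkdtemp()
example_hub = write_track_hub(
    example_dir,
    hub_name="example_hub",
    email="user@example.org",
    genome="hg19",
    tracks=[{"name": "sample1", "url": "sample1.bw",
             "type": "bigWig", "label": "Sample 1"}],
)
```

Once the directory is copied to the data hub, the public URL of hub.txt is what gets pasted into the UCSC Genome Browser's track hub form. Relative `bigDataUrl` values are resolved against the location of trackDb.txt, so in this sketch sample1.bw would sit in the hg19 directory.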

GenAP Portal

  • Purpose: Managing GenAP services and resources
  • Authenticates against the CCDB
    • Once you have a Compute Canada account, you can log in to the Portal
  • A basic setup is created automatically when the Project Owner (PI) first logs in to the Portal
    • A single project with default storage allocation
    • All sponsored CC users are automatically members
  • All that remains to use a service is to instantiate it

GenAP Portal

  • More advanced setups can be configured with GenAP projects
    • Assign specific CCDB users, even from outside the sponsored users group
    • Modify/split CC storage allocation per project
    • Assign project administrators

Step 1 - Log in with CC account

Step 2 - Welcome screen

Step 3 - Get to project list and click Edit

Step 4 - Get Applications list

Step 5 - Create a new application (e.g. Galaxy)

Application gets instantiated, and virtual machine starts

Project page gives other information, such as members