Data Management

With big data comes big responsibility.

Data Repositories

Scholars Portal Dataverse

Scholars Portal Dataverse is a research data repository that emphasizes discoverability and long-term preservation of data while ensuring academic credit. For more information, see the section on Scholars Portal Dataverse in the UBC Brain Circuits Data Management white paper.

The Brain Circuits Cluster has its own dataverse, within which certain labs have set up their own dataverses.

Attention

The maximum file size for upload on Scholars Portal Dataverse is 2.5 GB.

If a dataverse for your lab already exists, ask your PI for access.

If your lab is interested in creating a dataverse, contact Jeffrey LeDue through email (jledue@mail.ubc.ca).

Federated Research Data Repository (FRDR)

The Federated Research Data Repository (FRDR) is a collaboration between Portage, Compute Canada (CC), and the Canadian Association of Research Libraries (CARL) which provides a system that combines curated large-scale data preservation with data discovery across Canadian research data repositories. Refer to the FRDR section in the UBC Brain Circuit’s Data Management white paper for more information.

Datasets in FRDR are searchable by anyone, however FRDR is currently in Limited Production and therefore only accepts data deposits from a select number of research groups. The cluster is pleased to be one of four Special Storage Groups and the first in UBC to deposit a dataset into FRDR.

PIs can deposit datasets into the cluster’s FRDR collection by becoming a depositor. Contact Jeffrey LeDue (jledue@mail.ubc.ca) to request access.

Attention

FRDR automaticallly archives data submissions less than 4 TB.

Version Control

Simply put, version control is a way of tracking changes to a file. It can seem like a lot of work on top of the big helping of work you likely already have so why add it to your plate?

Version control is the solution to many problems which are unfortunately familiar:

  • File recovery: computer crash, accidental deletion, corrupted file - in case of emergency, break the version control glass.
  • Versioning: Make experimental changes without relying on the undo button to save you when things goes awry.
  • Backup: Lost your copy of the file? Working from another computer? No problem.
  • Attribution: Credit is given where credit is due. A record of user contributions simplifies collaboration.
  • Changelog: Have you ever looked at previous work and wondered, Why did I do that?? Version control keeps a record of the changes you’ve made and why you’ve made them so your past self doesn’t confuse your future self.

Note

A remote repository is hosted on a server (online) while a local repository exists on a computer (offline). When referring to the same repository, the local repo is usually a clone of the remote which typically lives on GitHub.

Git

Git is a popular open-source distributed version control system.

GitHub hosts Git repositories. Check out the cluster’s GitHub Team repository. Each lab can be added as a Team. If interested, please contact Jeffrey LeDue (jledue@mail.ubc.ca).

Tip

Learning Git can be a tough task. Here are some resources to help you get started: