Scientific Computing Fundamentals for CAMH Researchers

A five-day, self-paced series of workshops on scientific computing fundamentals taught with ♥ by CAMH researcher-nerds (we’re part of the CAMH Scientific Computing Working Group).

Navona Calarco
Research Analyst
Kimel Family Translational Imaging-Genetics Research Lab
Erin W. Dickie
Research Analyst
Kimel Family Translational Imaging-Genetics Research Lab
Dan Felsky
Graduate Student
Kimel Family Translational Imaging-Genetics Research Lab
Ricardo Harripaul
Graduate Student
Vincent Lab
Natalia Potapova
Research Methods Specialist
CAMH IT
Yuliya Nikolova
Postdoc Fellow
Sibille Lab
Jon Pipitone
Research Methods Specialist
Kimel Family Translational Imaging-Genetics Research Lab
David Rotenberg
Manager, Scientific Computing
CAMH IT
Ishraq Siddiqui
Graduate Student
Virtual Reality Lab
Umakajan Umakanthan
Research Analyst
Psychiatric Neurogenetics Lab
Joseph Viviano
Research Methods Specialist
Kimel Family Translational Imaging-Genetics Research Lab
Why

Because …

… research is becoming more computational and you’ve probably never been formally trained in general computing skills.

That’s a problem.

Software is your experimental apparatus.

Just like cleaning test tubes and pipetting, computing is a basic skill you need to be competent with.

These workshops focus on the fundamental computing skills you’ll need to get your study data organized, do repeatable/reproducible analysis, and use the CAMH compute cluster (SCC) to save time.

You should attend if you are doing any sort of scientific computing work, or work that involves repetition that could be automated. If you have questions, send us an email.

Workshops

Monday, November 23

Computer Organization

10am-12pm, RS 2022

Everything you’ve always wanted to know about computers but were too afraid to ask!

This is an informal, friendly Q&A session for any and all novice questions about computers and computing.

You’ll learn:

  • What the different parts of a computer are
  • The difference between software and hardware
  • What a server is and how it’s different from a desktop or laptop

Prerequisites: None

You’ll need: Just you.


Intro to Linux (1 of 2)

1pm-3pm, RS 2022

Computers aren’t scary (yet) and knowing how to use them will make doing science better/easier/quicker.

You’ll learn:

  • About Unix/Linux
  • What a terminal/shell is
  • Managing (making, moving, editing) files and folders
  • Remote access (SSH/FTP)
  • Where to find help online/etc.
  • Some super useful Linux commands

Prerequisites: None

You’ll need: A laptop. If you’re using Windows, install BASH.



Spreadsheets

3pm-5pm, RS 2022

Everyone knows how to use a spreadsheet, but most people use them terribly inefficiently.

You’ll learn:

  • Navigating a spreadsheet
  • Autofilling sequences, and advanced cut and paste
  • Avoiding “copy and paste” by using cell references
  • What a “formula” is
  • Important formulas to make auto-updating cells

Prerequisites: Basic familiarity with spreadsheets (Excel/OpenOffice)

You’ll need: A laptop with Excel or OpenOffice installed.



Tuesday, November 24

Intro to Linux (2 of 2)

10am-12pm, RS 2022

Doing work on lots of data by hand is boring and error prone. Learn how to write shell scripts to automate your work.

You’ll learn:

  • Running commands on many files (globbing, looping, if statements)
  • Reading and writing to files & sorting/filtering data in files
  • Writing scripts, chaining tools together (pipes, redirection)

Prerequisites: Familiarity with Linux (e.g. cd, ls, mv, cp)

You’ll need: A laptop. If you’re using Windows, install BASH and MobaXterm.



Introduction to R (1 of 2)

1pm-3pm, RS 2022

R is a free, featureful and sometimes magical language for doing statistical analysis. This workshop won’t cover much stats (see Part II for that).

You’ll learn:

  • Loading libraries/installing packages
  • Basic syntax, commands, and scripting practices
  • Reading/sorting/merging/filtering tabular data (e.g. CSV files)
  • Basic summary statistics

Prerequisites: Familiarity with another programming language

You’ll need: A laptop with R and RStudio installed.



Databases

3pm-5pm, RS 2022

TBA




Wednesday, November 25

Version Control

10am-12pm, CS 734

Git/GitHub are ugly but necessary tools to help you (and others) manage changes to scripts and data, and publish your code.

You’ll learn:

  • Put documents/code in a Git repository
  • Push and pull code changes to GitHub
  • Merge in changes from other users
  • Manage code changes on branches, send pull requests

Prerequisites: Preferably some version control experience.

You’ll need: A laptop with Git installed.

REDCap

1pm-3pm, RS 2015

REDCap is a web application for building and managing surveys and databases, and requires little programming knowledge.

You’ll learn:

  • How to design a study in REDCap
  • Sending surveys to participants
  • Setting up basic branching and display logic
  • Exporting data and generating reports
  • Using internal applications to review data quality

Prerequisites: None

You’ll need: (Optional) your own laptop.



Introduction to Python

3pm-5pm, RS 2015

This short module is designed to give students who are already familiar with programming a basic introduction to the Python language.

You’ll learn:

  • Data types (e.g. lists and dictionaries)
  • String manipulation
  • Reading and writing to files
  • Modules and packages
  • Basic plotting with matplotlib

Prerequisites: Familiarity with another programming language (e.g. MATLAB, R).

You’ll need: A laptop with a Python distribution installed.
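
For a flavour of the first few topics above, here is a minimal sketch. It is not the workshop’s actual material: the function names, the file name, and the sample data are all made up for illustration. It touches lists and dictionaries, string manipulation, and reading and writing a CSV file using only the standard library.

```python
import csv
import tempfile
from pathlib import Path


def clean_id(raw):
    """String manipulation: tidy up a (hypothetical) participant ID."""
    return raw.strip().upper().replace("_", "-")


def summarize_scores(rows):
    """Lists and dictionaries: mean score per group from (group, score) pairs."""
    by_group = {}  # group name -> list of scores
    for group, score in rows:
        by_group.setdefault(group, []).append(score)
    return {group: sum(scores) / len(scores) for group, scores in by_group.items()}


def round_trip(path, rows):
    """Reading and writing files: save rows as CSV, then load them back."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["group", "score"])  # header row
        writer.writerows(rows)
    with open(path, newline="") as f:
        return [(r["group"], float(r["score"])) for r in csv.DictReader(f)]


if __name__ == "__main__":
    # Made-up sample data, written to a throwaway temporary folder.
    sample = [("control", 1.0), ("patient", 3.0), ("control", 2.0)]
    with tempfile.TemporaryDirectory() as folder:
        loaded = round_trip(Path(folder) / "scores.csv", sample)
    print(clean_id(" sub_01 "))
    print(summarize_scores(loaded))
```

If most of this already looks familiar from MATLAB or R, you’re at the right level for the workshop.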




Thursday, November 26

MRI with Python

10am-12pm, RS 2015

Python is useful for working with MRI data. It’s flexible. And it’s free!

You’ll learn:

  • Use the interactive Python terminal
  • Import your NIfTI data
  • Interact with it and visualize it
  • Use common command-line Python tools to analyze your NIfTI data

Prerequisites: Familiarity with Python.

You’ll need: A laptop with Python installed.

MATLAB

1pm-3pm, RS 2015

MATLAB is a scripting and programming language paired with an interactive environment that focuses on manipulation, analysis, and visualization of numerical data.

You’ll learn:

  • Using the graphical MATLAB environment
  • Basic syntax and operations
  • Manipulating vectors, matrices, and tabular data
  • Writing scripts and functions
  • Basics of built-in commands for plotting and statistics

Prerequisites: Familiarity with another programming or scripting language.

You’ll need: A laptop with MATLAB installed.




Friday, November 27

Introduction to R (2 of 2)

10am-12pm, RS 2022

You know the basic functionality of R, but the hardest part is getting your data in order.

In this workshop you’ll learn some fundamental ways of organizing and manipulating datasets, as well as modelling/visualising your data.

You’ll learn:

  • Melting/casting dataframes
  • Transforming your data
  • Linear modelling
  • Plotting and visualization with ggplot2

Prerequisites: Basic familiarity with R

You’ll need: A laptop with R and RStudio installed.




Introduction to the SCC

1pm-3pm, RS 2022

The Specialized Computing Cluster (SCC) is CAMH’s own supercomputer.

You can use it to speed up your processing-intensive analyses by distributing your work across many computers at once.

You’ll learn:

  • How the SCC is organized
  • Suitable tasks for the SCC
  • How to run many tasks in parallel

Prerequisites: Familiarity with the Unix shell

You’ll need: Computers provided. If bringing a Windows laptop, install BASH and MobaXterm.




Scientific Computing

3pm-5pm, RS 2022

This workshop introduces you to automating your processing and working on the Specialized Computing Cluster (SCC).

You’ll learn:

  • Advanced bash loops – automating processing
  • Parallelization using GNU Parallel
  • Using the Debugging and Interactive Nodes
  • I/O
  • Using the RAMDISK

Prerequisites: Familiarity with BASH scripting.

You’ll need: A laptop. If it’s a Windows computer, install BASH and MobaXterm.




Register

Workshop Schedule and Registration

Preparation

Install the following bits of software before you come to the workshops. For more detailed installation instructions go here.

Send us an email if you have trouble.

Linux/Shell (Windows users only)

R

Python

MATLAB

Version Control with Git

Optionally:

More Help

Want more help/instruction? There are lots of things going on at CAMH and U of T that you can get involved in:

  • Weekly Office Hours - CS 163, Tue 11am-12pm, Thu 2-3pm

    All staff, students and trainees welcome to drop in. This is an informal space to ask your knowledgeable colleagues about any computing issue, programming/scripting question, or analysis design.

  • Peers Teaching Peers - CS 163, Tue 12-1pm. Lunch provided!

    Weekly presentations by students and staff. The purpose of PTP is simply to get others thinking about how they may implement a method they are not familiar with and/or to build a working relationship with someone to help you with that method.

  • University of Toronto Scientific Coders

    Meets weekly to run workshops and co-working sessions.

  • SciNet Events and Courses

    SciNet is U of T’s supercomputing organization. Don’t let that intimidate you though; they run amazing introductory courses on Linux, R, Python, and HPC.

  • SickKids: Bioinformatics Tools & Tricks: Hands-On Tutorials for Biologists