Scientific Computing Fundamentals for CAMH Researchers

October 4th - 13th, 2016

A six-day, self-paced series of workshops on scientific computing fundamentals taught with ♥ by CAMH researcher-nerds
(an initiative of the CAMH Scientific Computing Working Group)



Brice Aminou
IT Programmer/Analyst
Epigenetics Lab
Nikhil Bhagwat
Graduate Student
Kimel Lab
Marzena Boczulak
Research Analyst
Slaight Centre
Navona Calarco
Research Analyst
Kimel Lab | VR Lab
Qing Chang
Research Methods Specialist
Research IT
Jessica D'Arcey
Research Analyst
Slaight Centre
Susana Da Silva
Graduate Student
VR Lab
Erin W. Dickie
Scientist
Kimel Lab
Leon French
Scientist
Bioinformatics
Daniel Groot
Research Analyst
Epigenetics Lab
Ricardo Harripaul
Graduate Student
Vincent Lab
Colin Hawco
Scientist
Kimel Lab
Steve Hawley
Research Methods Specialist
Slaight Centre
Janelle Hinds
Research Analyst
Slaight Centre
Grace Jacobs
Graduate Student
Kimel Lab
Sophie Lafaille
Research Coordinator
MRI Centre
Kyle Lago
Research Analyst
Geriatric Mental Health
Yuliya Nikolova
Postdoctoral Fellow
Sibille Lab
Dawson Overton
Research Methods Specialist
Kimel Lab
Natalia Potapova
Research Methods Specialist
CAMH IT
David Rotenberg
Manager, Scientific Computing
CAMH IT
Sarah Saperia
Research Analyst
VR Lab
Marcos Sanches
Statistician
Research IT
Dawn Smith
Systems Analyst
Kimel Lab
Laura Stefanik
Graduate Student
Kimel Lab
Andy Wang
Research Methods Specialist
Research IT
Tom Wright
Research Methods Specialist
Kimel Lab
Why

Because …

… research is becoming more computational and you’ve probably never been formally trained in general computing skills.

That’s a problem.

Software is your experimental apparatus. Just like cleaning test tubes and pipetting, computing is a basic skill you need to be competent with.

These workshops will focus on some computing skills fundamentals you’ll need for getting your study data organized, doing repeatable/reproducible analysis, and making use of existing CAMH computing resources to save time.

You should attend if you are doing any sort of scientific computing work, or work that involves repetition that could be automated.

This time around, we’re using a shared dataset - spanning demographic, cognitive, imaging, and genomic data - across all workshops so we can get started faster and increase attendees’ exposure to all the great work going on around the hospital.

If you have questions, send us an email at scwg@camh.ca.

Workshops

Tuesday, October 4th

Ask us Anything about Computers!

9-10am, RS 2062

Instructor: Ricardo
Helper: Sophie

Description: Everything you’ve always wanted to know about computers but were too afraid to ask!

This is an informal, friendly Q&A session for any and all novice questions about computers and computing.

You’ll learn:

  • What’s inside a computer (we’ll take one apart for you)
  • What a hard drive is and how it’s different from RAM
  • What the differences are between Mac, Windows, and Linux
  • Whatever else you’re interested in…

Prerequisites: None

You’ll need: Just you.


Programming Logic

10am-12pm, RS 2062

Instructor: Janelle
Helper: Brice

Description: Come here if you’d like to learn about programming, but have little to no prior experience. This will be an interactive workshop where you will play a game to learn new concepts.

You’ll learn:

  • What is programming, anyway?
  • Essential programming concepts, such as sequencing, loops, conditionals, functions, variables, and datatypes

Prerequisites: None

You’ll need: A laptop. We’ll be working with Python, but you don’t need it installed.



Introduction to Linux and the Shell

1-3pm, CS 845 (note the CS location)

Instructor: Dawn
Helper: Daniel

Description: Computers aren’t scary (yet) and knowing how to use them will make doing science better/easier/quicker.

You’ll learn:

  • About Unix/Linux
  • What a terminal/shell is
  • Managing (making, moving, editing) files and folders
  • Remote access (SSH/FTP)
  • Where to find help online/etc.
  • Some super useful Linux commands

Prerequisites: None

You’ll need: A laptop. If you’re using Windows, install BASH and mobaXterm.



Introduction to Python

3-5pm, CS 845 (note the CS location)

Instructor: Dawson
Helpers: Dawn & Nikhil

Description: This basic introduction is designed to make students familiar with programming in Python.

You’ll learn:

  • Data types (e.g. lists and dictionaries)
  • String manipulation
  • Reading and writing to files
  • Modules and packages
  • Basic plotting with matplotlib

Prerequisites: Familiarity with another programming language.

You’ll need: A laptop with Python installed.




Wednesday, October 5th

Spreadsheets

9:00am-11:00am, RS 2062

Instructors: Yuliya & Navona
Helper: Marzena

Description: Everyone knows what a spreadsheet is, but most people use them terribly inefficiently.

You’ll learn:

  • How to set up a spreadsheet for research logs and data
  • Autofilling in sequences
  • Avoiding “copy and paste” by using cell references
  • What a “formula” is
  • How to auto-update cells
  • Helpful data cleaning tools

Prerequisites: Basic familiarity with spreadsheets (Excel/OpenOffice)

You’ll need: A laptop with Excel or OpenOffice installed.

SPSS

11am-1pm, RS 2020

Note: This workshop is also offered on Thursday Oct 13th, 9-11am

Instructor: Marcos
Helper: Sarah

Description: SPSS is useful for everything from simple data exploration to advanced statistics. It is widely used at CAMH and is a tool very much worth knowing.

You’ll learn:

  • What is SPSS
  • Types of data and how to enter them into SPSS
  • Data manipulation – Creating new variables
  • Data manipulation – Sorting
  • Basic data analysis – Means, Frequencies and Crosstabs

Prerequisites: Some familiarity with data in general; no previous knowledge of SPSS is required.

You’ll need: Just you - computer and SPSS will be provided in lab. BUT if you have a laptop with SPSS, please bring it!



Introduction to REDCap

1-3pm, RS 2020

Note: This workshop is also offered on Wednesday Oct 12th, 9-11am

Instructor: Navona
Helper: Kyle

Description: REDCap is a web application for building and managing surveys and databases. It requires little programming knowledge, and it’s increasingly used at CAMH for multiple purposes.

You’ll learn:

  • How to design a study in REDCap
  • Sending surveys to participants
  • Setting up basic branching and display logic
  • Exporting data and generating reports
  • Using internal applications to review data quality

Prerequisites: None

You’ll need: Your own laptop (optional) and REDCap account (optional).




Web Development (HTML & CSS)

3-5pm, RS 2062

Instructor: Brice
Helper: Dawn

Description: We access websites in our everyday life for all types of reasons. You might even want to program your own. Come and learn what the foundation of a website is, so you can do it yourself!

You’ll learn:

  • Creating a static web page with HTML tags
  • Integrating CSS into your HTML page
  • Making a web page more appealing with CSS
  • Working with “bootstrap”

What you’ll need: A laptop with Notepad++ or CoffeeCup installed.



Thursday, October 6th

Introduction to R

9:00-11:00am, RS 2062

Note: This workshop is also offered on Wednesday October 12th, 1:00pm-3:00pm

Instructor: Erin
Helpers: Yuliya, David, Susana, Laura, Grace

Description: R is a free, featureful and sometimes magical language for doing statistical analysis. This workshop will introduce you to R and the RStudio environment.

You’ll learn:

  • Reading your data into an R dataframe
  • Sorting/merging/filtering your data tables
  • Data Cleaning
  • Getting summary statistics

Prerequisites: Some familiarity with another programming language

You’ll need: A laptop with R and RStudio installed, as well as the packages dplyr, rms, and ggplot2.



MATLAB

11:00am-1:00pm, RS 2062

Instructor: Colin
Helper: Dawn

Description: MATLAB is a scripting and programming language paired with an interactive environment that focuses on manipulation, analysis, and visualization of numerical data.

You’ll learn:

  • Using the graphical MATLAB environment and workspace
  • MATLAB variables (double, string, cell, struct)
  • Basic syntax and operations (for loops, while, if, switch, try)
  • How to call a function
  • How to write scripts and functions

Prerequisites: Familiarity with another programming or scripting language.

You’ll need: A laptop with MATLAB installed.




MS Access

1:00pm-3:00pm, RS T321

Instructor: Tom
Helper: Jessica

Description: Many of us use data stored in Access databases but do not know how these databases are built or how to improve them.

You’ll learn:

  • Basic architecture: table, form, query, report, and why you need them.
  • How to create/modify tables. Tips and tricks.
  • How to create/modify forms. Form Wizard.
  • How to extract data from the database. Queries vs reports. Query Wizard, Report Wizard.
  • How to make something happen when you click on a button.
  • Using parameters to filter data.
  • Linking tables/forms (example: multiple appointments for one person)

Prerequisites: Familiarity with spreadsheets.

What you’ll need: A laptop with MS Access 2010 or access to remote.camh.ca (optional)




SQL

3-5pm, RS T321

Instructors: Natalia & Ricardo
Helpers: Dawn & Brice

Description: SQL is a language crucial to databases (and underlies software like Access and REDCap!). Come and get a glimpse of what SQL is and can do for you.

You’ll learn:

  • How to build a database
  • Entering data into a database
  • Constructing queries for selecting, filtering, and joining tables
  • Calculating basic statistics
  • Basic normalization principles
  • Basic SQL syntax

What you’ll need: A laptop



Tuesday, October 11th

Automating in Linux

9-11am, RS 2062

Instructor: Nikhil
Helper: Dawn

Description: Doing work on lots of data by hand is boring and error prone. Learn how to use the shell to automate your work.

You’ll learn:

  • Running commands on many files (globbing, looping, if statements)
  • Reading and writing to files & sorting/filtering data in files
  • Writing scripts, chaining tools together (pipes, redirection)

Prerequisites: Familiarity with the Shell (e.g. cd, ls, mv, cp)

You’ll need: A laptop. If you’re using Windows, install BASH and mobaXterm.



Introduction to the SCC

11am-1pm, RS 2062

Instructors: Andy & David
Helper: Ricardo

Description: The Specialized Computing Cluster (SCC) is CAMH’s own supercomputer.

You can use CAMH’s SCC to speed up your processing-intensive analyses by distributing your work across many computers at once.

You’ll learn:

  • How the SCC is organized into different types of nodes
  • How to load software using the SCC’s module system
  • How to get your data to and from the SCC using rsync
  • How to submit jobs on the SCC queue

Prerequisites: Familiarity with the Unix shell (e.g. ‘Introduction to Linux and the Shell’ and ‘Automating in Linux’ workshops)

You’ll need: An account on the SCC. We’ll provide computers, but if you’re bringing a Windows laptop, install BASH and mobaXterm.




Advanced SCC (Scientific Computing)

1-3pm, RS 2062

Instructors: Andy & David
Helper: Ricardo

Description: The purpose of this course is to expose students to automating processes and working on the Specialized Computing Cluster.

You’ll learn:

  • How to run many samples in parallel using GNU Parallel
  • Debugging your jobs and Interactive Nodes
  • Best practices for using the cluster efficiently

Prerequisites: Familiarity with BASH scripting (e.g. ‘Introduction to Linux and the Shell’ and ‘Automating in Linux’ workshops) and familiarity with the SCC (e.g. ‘Introduction to the SCC’).

You’ll need: An account on the SCC, and a laptop. If it’s a Windows computer, install BASH and mobaXterm.




Managing Code, Experiments, & Data (with git’s help)

3:00-5:00pm, RS 2015

Instructors: Erin & Ricardo
Helpers: Dawson & Qing

Description: Research can get messy quickly, but there are some tried and true ways of organizing your experiments that work.

You’ll learn:

  • File and directory naming conventions
  • Version control with GitHub and GitLab (now at CAMH!)
  • Backup guidelines
  • Creating scripts and metadata to reproduce your work

Prerequisites: Familiarity with the Shell (e.g. cd, ls, mv, cp)

You’ll need: A laptop. If you’re using Windows, install BASH and mobaXterm.



Wednesday, October 12th

Introduction to REDCap

9-11am, RS 2020

Note: This workshop is also offered on October 5th, 1-3pm

Instructors: Steve & Kyle

Description: REDCap is a web application for building and managing surveys and databases. It requires little programming knowledge, and it’s increasingly used at CAMH for multiple purposes.

You’ll learn:

  • How to design a study in REDCap
  • Sending surveys to participants
  • Setting up basic branching and display logic
  • Exporting data and generating reports
  • Using internal applications to review data quality

Prerequisites: None

You’ll need: Your own laptop (optional) and REDCap account (optional).




Advanced REDCap

11am-1pm, RS 2062

Instructors: Natalia & Steve
Helper: Kyle

Description: You’ve built a project in REDCap, and now would like to learn more about this software.

You’ll learn:

  • HTML formatting and built-in templates
  • Examples of branching logic and calculating fields
  • Examples of piping
  • Public survey link vs. individual links
  • Data Quality module
  • Randomization Module
  • REDCap Mobile
  • And any questions you may have

Prerequisites: Some familiarity with REDCap.

You’ll need: Your own laptop (optional) and REDCap account (optional).


Introduction to R

1:00-3:00pm, RS 4100

Note: This workshop is also offered on Thursday October 6th, 9-11am.

Instructors: Erin & Yuliya
Helpers: David, Susana, Laura, Grace

Description: R is a free, featureful and sometimes magical language for doing statistical analysis. This workshop will introduce you to R and the RStudio environment.

You’ll learn:

  • Reading your data into an R dataframe
  • Sorting/merging/filtering your data tables
  • Getting summary statistics

Prerequisites: Familiarity with another programming language

You’ll need: A laptop with R and RStudio installed.

Exploring Data with R

3-5pm, RS 2062

Instructors: Erin & Yuliya
Helpers: David & Navona

Description: You know the basic functionality of R, but the hardest part is getting your data in order. In this workshop you’ll learn some fundamental ways of organizing and manipulating datasets, as well as visualising your data.

You’ll learn:

  • Reorganizing/reshaping your dataset with tidyr and dplyr
  • Quickly creating stats tables
  • Plotting and visualization with ggplot2

Prerequisites: Basic familiarity with R (i.e. Introduction to R)

You’ll need: A laptop with R and RStudio installed.




Thursday, October 13th

SPSS

9-11am, RS 2020

Note: This workshop is also offered on Wednesday October 5th, 11am-1pm

Instructor: Marcos
Helper: Natalia

Description: SPSS is useful for everything from simple data exploration to advanced statistics. It is widely used at CAMH and is a tool very much worth knowing.

You’ll learn:

  • What is SPSS
  • Types of data and how to enter them into SPSS
  • Data manipulation – Creating new variables
  • Data manipulation – Sorting
  • Basic data analysis – Means, Frequencies and Crosstabs

Prerequisites: Some familiarity with data in general; no previous knowledge of SPSS is required.

You’ll need: Just you - computer and SPSS will be provided in lab, BUT if you have a laptop with SPSS, please bring it!



Statistics with R

11am-1pm, RS T321

Instructors: Erin & Yuliya
Helpers: Laura & Navona

Description: You know the basic functionality of R, now let’s actually run some statistical tests.

You’ll learn:

  • Transforming variables to normality
  • Linear modeling and regression
  • Building tables and figures to publish your results

Prerequisites: Some knowledge of statistics. Basic familiarity with R (i.e. Introduction to R)

You’ll need: A laptop with R and RStudio installed, plus the R packages dplyr, tidyr, cars and rms.


Writing Papers with RStudio

1:00-3:00pm, RS T321

Instructors: Erin & Leon
Helpers: Ricardo & Grace

Description: Keeping your figures and data tables in sync with your paper text, and sharing it for editing can be a pain. There is another way.

You’ll learn:

  • Markdown, the plain-text format for writing documents
  • Creating figures, tables, and stats that update automatically
  • Keeping all things organized

Prerequisites: Some familiarity with R and RStudio. Bonus points for familiarity with Git.

You’ll need: A laptop with R and RStudio installed.


Creating a Scientific Poster Using Photoshop

3-5pm, RS T321

Instructor: Steve
Helper: TBA

Description: You’re probably used to creating posters in MS PowerPoint, which is all well and good. If you’d like to take your posters to the next level, however, you’ll need something with a little more power under the hood. With its powerful image and text-editing capabilities, and fine control over layers and groups, Photoshop is an excellent option!

You’ll learn:

  • Setting up and laying out your document
  • Importing images and keeping your layers organized
  • Aligning and re-sizing images
  • Adding text and shape layers
  • Masking vs cropping
  • Ethical file management, and use of layer effects and filters

Prerequisites: A basic familiarity with Photoshop may be useful, but is not required.

You’ll need: Your own laptop with Photoshop installed.




Register

Workshop Schedule and Registration

The SCWG workshop series is FREE for all CAMH students, staff, and trainees (please sign up with a CAMH email address, so that we know who you are!).

All others are welcome to join us for $20 per course. Questions? Email us at scwg@camh.ca

Preparation

Install the following bits of software before you come to the workshops. For more detailed installation instructions go here.

Send us an email if you are having trouble.

Linux/Shell (Windows users only)

R

Python

MATLAB

Version Control with Git

Optionally:

REDCap

MS Access

Spreadsheets

SCC

Photoshop

  • A free 30-day trial can be downloaded from Adobe, and a full license can be bought for (a very reasonable) $9.99 USD a month. Note that while the basic principles of this course could be applied to free alternatives such as GIMP, direct translation between programs is difficult as there are many minor differences (which add up) between the programs.
More Help

Want more help/instruction? There are lots of things going on at CAMH and U of T that you can get involved in:

  • Weekly Office Hours - CS 163, Tuesday 1-2pm

    All staff, students and trainees are welcome to drop in. This is an informal space to ask your knowledgeable colleagues about any computing issue, programming/scripting question, or analysis design.

  • Peers Teaching Peers - CS 163 Tue 12-1pm, Lunch provided!

    Weekly presentations by students and staff. The purpose of PTP is simply to get others thinking about how they may implement a method they are not familiar with and/or to build a working relationship with someone to help you with that method.

  • University of Toronto Scientific Coders

    Meets weekly to run workshops and co-working sessions.

  • SciNet Events and Courses

    SciNet is U of T’s supercomputing organization. Don’t let that intimidate you though; they run amazing introductory courses on Linux, R, Python, and HPC.

  • SickKids: Bioinformatics Tools & Tricks: Hands-On Tutorials for Biologists