Scientific Computing Fundamentals for CAMH Researchers

Are you looking for our October 2016 series of workshops? Click here

A six-day, self-paced series of workshops on scientific computing fundamentals taught with ♥ by CAMH researcher-nerds (we’re part of the CAMH Scientific Computing Working Group).

Brice Aminou
IT Programmer/Analyst
Epigenetics Lab
Nikhil Bhagwat
Graduate Student
Kimel Lab
Navona Calarco
Research Analyst
Kimel Lab | VR Lab
Erin W. Dickie
Research Analyst
Kimel Lab
Ricardo Harripaul
Graduate Student
Vincent Lab
Colin Hawco
Scientist
Kimel Lab
Steve Hawley
Research Methods Specialist
Slaight Centre
Nuwan Hettige
Graduate Student
De Luca Lab
Janelle Hinds
Research Analyst
Slaight Centre
Melissa Levesque
Postdoctoral Fellow
Kimel Lab
Yuliya Nikolova
Postdoctoral Fellow
Sibille Lab
Jon Pipitone
Research Methods Specialist
Kimel Lab
Natalia Potapova
Research Methods Specialist
CAMH IT
David Rotenberg
Manager, Scientific Computing
CAMH IT
Marcos Sanches
Statistician
Research IT
Laura Stefanik
Graduate Student
Kimel Lab
Joseph Viviano
Research Methods Specialist
Kimel Lab
Andy Wang
Research Methods Specialist
Research IT
Why

Because …

… research is becoming more computational and you’ve probably never been formally trained in general computing skills.

That’s a problem.

Software is your experimental apparatus.

Just like cleaning test tubes and pipetting, computing is a basic skill you need to be competent with.

These workshops will focus on some computing skills fundamentals you’ll need for getting your study data organized, doing repeatable/reproducible analysis, and making use of existing CAMH computing resources to save time.

You should attend if you are doing any sort of scientific computing work, or work that involves repetition that could be automated. If you have questions, send us an email.

Workshops

Tuesday, May 17th

Ask us Anything about Computers!

9-10am, RS 2062

Everything you’ve always wanted to know about computers but were too afraid to ask!

This is an informal, friendly Q&A session for any and all novice questions about computers and computing.

You’ll learn:

  • What’s inside a computer (we’ll take one apart for you)
  • What a hard drive is and how it’s different from RAM
  • What the differences are between Mac, Windows, and Linux
  • Whatever else you’re interested in…

Prerequisites: None

You’ll need: Just you.


Introduction to Programming

10am-12pm, RS 2062

Come here if you’d like to learn about programming, but have little to no prior experience. This will be an interactive workshop where you will play a game to learn new concepts.

You’ll learn:

  • What is programming, anyway?
  • Essential programming concepts, such as sequencing, loops, conditionals, functions, variables, and datatypes

Prerequisites: None

You’ll need: A laptop. We’ll be working with Python, but you don’t need it installed.



Introduction to Linux and the Shell

1-3pm, RS 4100

Computers aren’t scary (yet) and knowing how to use them will make doing science better/easier/quicker.

You’ll learn:

  • About Unix/Linux
  • What a terminal/shell is
  • Managing (making, moving, editing) files and folders
  • Remote access (SSH/FTP)
  • Where to find help online/etc.
  • Some super useful Linux commands

Prerequisites: None

You’ll need: A laptop. If you’re using Windows, install BASH and mobaXterm.



Introduction to R

3-5pm, RS 4100

Note: This workshop is also offered on Tuesday May 24th, 10am-12pm

R is a free, featureful and sometimes magical language for doing statistical analysis. This workshop will introduce you to R and the RStudio environment (see Part II for that).

You’ll learn:

  • Reading your data into an R dataframe
  • Sorting/merging/filtering your data tables
  • Getting summary statistics

Prerequisites: Some familiarity with another programming language

You’ll need: A laptop with R and RStudio installed.



Wednesday, May 18th

Managing Code, Experiments, & Data

10am-12pm, RS 2015

Research can get messy quickly, but there are some tried and true ways of organizing your experiments that work.

You’ll learn:

  • File and directory naming conventions
  • Version control and backup
  • Creating scripts and metadata to reproduce your work

Prerequisites: Familiarity with the Shell (e.g. cd, ls, mv, cp)

You’ll need: A laptop. If you’re using Windows, install BASH and mobaXterm.



Automating in Linux

1-3pm, RS 4100

Doing work on lots of data by hand is boring and error prone. Learn how to use the shell to automate your work.

You’ll learn:

  • Running commands on many files (globbing, looping, if statements)
  • Reading and writing to files & sorting/filtering data in files
  • Writing scripts, chaining tools together (pipes, redirection)

Prerequisites: Familiarity with the Shell (e.g. cd, ls, mv, cp)

You’ll need: A laptop. If you’re using Windows, install BASH and mobaXterm.



Organizing your Files with GitHub

3-5pm, RS 2062

Git and GitHub are ugly but necessary tools to help you (and others) work. Git is the ultimate “undo” button for your scripts and data, and it also lets you publish your code online and work with others.

You’ll learn:

  • Track document/code changes with Git and GitHub
  • Merge in changes from other users
  • Discuss issues on GitHub
  • Manage code changes on branches, send pull requests

Prerequisites: Preferably some familiarity with the Shell and version control experience (Git, Subversion, etc.).

You’ll need: Laptop with git and the shell installed.




Thursday, May 19th

Spreadsheets

10am-12pm, RS 2062

Everyone knows how to use a spreadsheet, but most people use them terribly inefficiently.

You’ll learn:

  • Navigating a spreadsheet
  • Autofilling in sequences, and advanced cut and paste
  • Avoiding “copy and paste” by using cell references
  • What a “formula” is
  • Important formulas to make auto-updating cells

Prerequisites: Basic familiarity with spreadsheets (Excel/OpenOffice)

You’ll need: A laptop with Excel or OpenOffice installed.

SPSS

1-3pm, RS 2020 (please note same time as Web Design)

Note: This workshop is also offered on Friday May 27th, 1-3pm

SPSS is useful for everything from simple data exploration to advanced statistics. It is widely used at CAMH and is a tool worth knowing.

You’ll learn:

  • What is SPSS
  • Types of data and how to enter them into SPSS
  • Data manipulation – Creating new variables
  • Data manipulation – Sorting
  • Basic data analysis – Means, Frequencies and Crosstabs

Prerequisites: Some familiarity with data in general, no previous knowledge of SPSS is required

You’ll need: Just you - computer and SPSS will be provided in lab. BUT if you have a laptop with SPSS, please bring it!



Web Design

1-3pm, CS 845 (please note same time as SPSS)

You’ll learn:

  • Creating a static web page with HTML tags
  • Integrating CSS into your HTML page
  • Making a web page more appealing with CSS
  • Working with Bootstrap

What you’ll need: A laptop with Notepad++ or CoffeeCup installed.



MS Access

3pm-5pm, RS 2020

Many of us use data stored in Access databases but do not know how these databases are built or how to improve them.

You’ll learn:

  • Basic architecture: tables, forms, queries, and reports, and why you need them.
  • How to create/modify tables. Tips and tricks.
  • How to create/modify forms. Form Wizard.
  • How to extract data from the database. Queries vs reports. Query Wizard, Report Wizard.
  • How to make something happen when you click on a button.
  • Using parameters to filter data.
  • Linking tables/forms (example: multiple appointments for one person)

Prerequisites: Familiarity with spreadsheets.

What you’ll need: A laptop with MS Access 2010 or access to remote.camh.ca (optional)




Tuesday, May 24th

Introduction to R

10am-12pm, RS 2062

Note: This workshop is also offered on Tuesday May 17th, 3-5pm.

R is a free, featureful and sometimes magical language for doing statistical analysis. This workshop will introduce you to R and the RStudio environment (see Part II for that).

You’ll learn:

  • Reading your data into an R dataframe
  • Sorting/merging/filtering your data tables
  • Getting summary statistics

Prerequisites: Familiarity with another programming language

You’ll need: A laptop with R and RStudio installed.

Advanced R

1-3pm, RS 2062

You know the basic functionality of R, but the hardest part is getting your data in order. In this workshop you’ll learn some fundamental ways of organizing and manipulating datasets, as well as modelling/visualising your data.

You’ll learn:

  • Melting/casting to reorganize your dataset
  • Transforming your variables
  • Linear modelling
  • Plotting and visualization with ggplot2

Prerequisites: Basic familiarity with R (i.e. Introduction to R)

You’ll need: A laptop with R and RStudio installed.




Introduction to REDCap

3-5pm, RS 2062

REDCap is a web application for building and managing surveys and databases, and requires little programming knowledge.

You’ll learn:

  • How to design a study in REDCap
  • Sending surveys to participants
  • Setting up basic branching and display logic
  • Exporting data and generating reports
  • Using internal applications to review data quality

Prerequisites: None

You’ll need: Your own laptop (optional) and REDCap account (optional).




Wednesday, May 25th

Introduction to Python

9-11am, CS 734 (note CS location)

This short module is designed to give students familiar with programming a basic introduction to the Python language.

You’ll learn:

  • Data types (e.g. lists and dictionaries)
  • String manipulation
  • Reading and writing to files
  • Modules and packages
  • Basic plotting with matplotlib

Prerequisites: Familiarity with another programming language.

You’ll need: A laptop with Python installed.
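
For a rough taste of the topics listed above, here is a minimal sketch that touches on dictionaries and lists, string manipulation, and reading and writing files. It is not taken from the workshop materials: the participant IDs, the reaction times, the file name summary.csv, and the helper functions mean, summarize, and write_and_read are all made up for illustration.

```python
import tempfile
from pathlib import Path

# Hypothetical example data: participant IDs mapped to reaction times (ms).
reaction_times = {"sub-01": [512, 498, 530], "sub-02": [601, 577, 590]}


def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def summarize(data):
    """Build one comma-separated line per participant: ID,mean."""
    lines = []
    for subject, times in sorted(data.items()):
        # String manipulation: upper-case the ID, format the mean to 1 decimal.
        lines.append(f"{subject.upper()},{mean(times):.1f}")
    return lines


def write_and_read(lines, path):
    """Write the lines to a text file, then read them back as a list."""
    path.write_text("\n".join(lines) + "\n")
    return path.read_text().splitlines()


# Use a temporary directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    result = write_and_read(summarize(reaction_times), Path(tmp) / "summary.csv")

print(result)
```

Only the Python standard library is used here, so it should run on any recent Python 3 install; plotting with matplotlib is left out because it needs a separate package.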




Introduction to the SCC

11am-12pm, CS 734 (note CS location)

The Specialized Computing Cluster (SCC) is CAMH’s own super-computer.

You can use this to speed up your processing-intensive analyses by distributing your work across many computers at once.

You’ll learn:

  • How the SCC is organized into different types of nodes
  • How to load software using the SCC’s module system
  • How to get your data to and from the SCC using rsync
  • How to submit jobs on the SCC queue

Prerequisites: Familiarity with the Unix shell (e.g. ‘Introduction to Linux and the Shell’ and ‘Automating in Linux’ workshops)

You’ll need: An account on the SCC. We’ll provide computers, but if you’re bringing a Windows laptop, install BASH and mobaXterm.




Advanced SCC (Scientific Computing)

1-3pm, RS 2062

The purpose of this course is to expose students to automating processes and working on the Specialized Computing Cluster.

You’ll learn:

  • How to run many samples in parallel using GNU Parallel
  • Debugging your jobs and using interactive nodes
  • Best practices for using the cluster efficiently

Prerequisites: Familiarity with BASH scripting (e.g. ‘Introduction to Linux and the Shell’ and ‘Automating in Linux’ workshops) and familiarity with the SCC (e.g. ‘Introduction to the SCC’).

You’ll need: An account on the SCC, and a laptop. If it’s a Windows computer, install BASH and mobaXterm.




MATLAB

3-5pm, RS 2062

MATLAB is a scripting and programming language paired with an interactive environment that focuses on manipulation, analysis, and visualization of numerical data.

You’ll learn:

  • Using the graphical MATLAB environment and workspace
  • MATLAB variables (double, string, cell, struct)
  • Basic syntax and operations (for loops, while, if, switch, try)
  • How to call a function
  • How to write scripts and functions

Prerequisites: Familiarity with another programming or scripting language.

You’ll need: A laptop with MATLAB installed.




Thursday, May 26th

Writing Papers with RStudio

9-10:30am, RS 2062

Keeping your figures and data tables in sync with your paper text, and sharing it for editing can be a pain. There is another way.

You’ll learn:

  • Markdown, the plain-text format for writing documents
  • Creating figures, tables, and stats that update automatically
  • Managing your bibliography and citations
  • Creating a paper ready for journal submission

Prerequisites: Some R and RStudio. Bonus points for familiarity with Git.

You’ll need: A laptop with R and RStudio installed.


How to (ethically) use Photoshop for Figures

10:30am-12pm, RS 2062

Using some microscopy image data as an example, we’ll cover all the steps in creating a figure for publication and address some of the ethical considerations when presenting your data.

You’ll learn:

  • Setting up your document
  • Importing images and keeping your layers organized
  • Aligning and re-sizing images
  • Adding text and shape layers
  • Masking vs cropping
  • Ethical file management, and use of layer effects and filters

Prerequisites: A basic familiarity with Photoshop may be useful, but is not required.

You’ll need: Your own laptop with Photoshop installed (optional)




Python for MRI

1-3pm, RS 2062

Python is useful for working with MRI data. It’s flexible. And it’s free!

You’ll learn:

  • Use the interactive Python terminal
  • Import your NIFTI data
  • Interact with it and visualize it
  • Use common command-line Python tools to analyze your NIFTI data

Prerequisites: Familiarity with Python.

You’ll need: A laptop with Python installed.


Advanced REDCap

3-5pm, RS 2062

You’ve built a project in REDCap, and now would like to learn more about this software.

You’ll learn:

  • HTML formatting and built-in templates
  • Examples of branching logic and calculated fields
  • Examples of piping
  • Public survey link vs. individual links
  • Data Quality module
  • Randomization Module
  • REDCap Mobile
  • And any questions you may have

Prerequisites: Some familiarity with REDCap.

You’ll need: Your own laptop (optional) and REDCap account (optional).


SPSS

1-3pm, RS 2020

Note: This workshop is also offered on Thursday May 19th, 1-3pm

SPSS is useful for everything from simple data exploration to advanced statistics. It is widely used at CAMH and is a tool worth knowing.

You’ll learn:

  • What is SPSS
  • Types of data and how to enter them into SPSS
  • Data manipulation – Creating new variables
  • Data manipulation – Sorting
  • Basic data analysis – Means, Frequencies and Crosstabs

Prerequisites: Some familiarity with data in general, no previous knowledge of SPSS is required

You’ll need: Just you - computer and SPSS will be provided in lab, BUT if you have a laptop with SPSS, please bring it!



Register

Workshop Schedule and Registration

Preparation

Install the following bits of software before you come to the workshops. For more detailed installation instructions go here.

Send us an email if you are having trouble.

Linux/Shell (Windows users only)

R

Python

MATLAB

Version Control with Git

Optionally:

REDCap

MS Access

Spreadsheets

SCC

More Help

Want more help/instruction? There are lots of things going on at CAMH and U of T that you can get involved in:

  • Weekly Office Hours - CS 163, Tuesday 1-2pm

    All staff, students and trainees welcome to drop in. This is an informal space to ask your knowledgeable colleagues about any computing issue, programming/scripting question, or analysis design.

  • Peers Teaching Peers - CS 163 Tue 12-1pm, Lunch provided!

    Weekly presentations by students and staff. The purpose of PTP is simply to get others thinking about how they may implement a method they are not familiar with and/or to build a working relationship with someone to help you with that method.

  • University of Toronto Scientific Coders

    Meets weekly to run workshops and co-working sessions.

  • SciNet Events and Courses

    SciNet is U of T’s supercomputing organization. Don’t let that intimidate you though; they run amazing introductory courses on Linux, R, Python, and HPC.

  • SickKids: Bioinformatics Tools & Tricks: Hands-On Tutorials for Biologists