BIOL 6297: Programming for Biologists

Course Goals:

This course introduces computing to students with no prior background and lets them "get their feet wet" with computational analyses. Topics include: how to use a UNIX machine and work from the command line, how to submit and monitor jobs on a cluster, how to compile software, and how to organize and manage large projects.

We will cover a range of tools for computational tasks, including shell scripts, UNIX one-liners, general-purpose scripting languages (Python), and languages for data visualization and statistical analysis (R). The course is intended to give students with no prior programming experience the basic skill set needed to undertake computational projects as part of their research training.

Articles on Programming:

"Learn to Code, Learn to Think" (NPR story)

"For Big-Data Scientists, 'Janitor Work' is Key Hurdle to Insights" (NY Times)

- A great article that explains the necessity of our course: 'data wrangling' is the main hurdle to getting bioinformatics research done.

"It's time to reboot bioinformatics education" (Toddot Blog)


Course Materials:


Schedule of Topics

Lecture 1: Course Overview

Lecture 2: Working from the Command Line (Part 1)

Practice Exercises: Command-Line Exercises

LCTHW: Command-line crash course

Reference: UNIX reference card

Lecture 3: Working from the Command Line (Part 2)

Practice Exercises: Command-Line Exercises, Part 2

Lecture 4: Working with Text Editors

Practice Exercises Using Emacs

Emacs Keystrokes - Explanation

Emacs Reference Card

Lecture 5: Awk and Sed

Practice Exercises Using Awk

Awk One-Liners

Lecture 6: Shell Programming

Bash Practice Exercises

Bash Practice: Answers to Problem 1

Bash Practice: Answers to Problem 2

Lecture 7: Running Jobs on the Cluster

Additional Notes on Xanadu and CACDS (Jerry Ebalunode's Presentation)

Lecture 8: Introduction to Python

Codecademy

Learn Code the Hard Way

Lecture 9: Python (cont'd): Dictionaries, Lists, Writing Functions

Lecture 10: Python (cont'd): Dictionaries, Formatting Text, Writing Classes

Lecture 11: Python (cont'd): Classes, Regular Expressions

Lecture 12: Regular Expressions, R

Instructions for Installing IPython

Lecture 13: Introduction to R

Lecture 14: Plotting in R

Example Plotting Code

Lecture 15: IPython Notebook

Ricardo's Example Notebooks: here and here