Lars Vilhuber, with Flavio Stanchi, Sylverie Herbert

2016/8/15-2016/8/17

**Time**: *9:00 a.m. - 5:00 p.m.* (we will typically end earlier)

**Location**: *Ives 105*

The goal of this class is to showcase high-performance techniques and tools for economics students. The goal is NOT to teach a full course on SAS, Stata, Matlab, R, etc.; there are other classes for that. We will teach just enough of each programming language to highlight additional techniques (for SAS and Matlab, we will teach specialized workshops on each in a bit more depth). This course is designed to open your eyes to the possibilities: it scratches the surface of many topics but mostly does not dive deep into any one of them. Follow-on short courses may address those needs. For specific programming languages, we point to offerings elsewhere on campus, for instance at CISER.

Second-year Ph.D. students in Economics or other social sciences.

- Working knowledge of at least one statistical programming language (R, SAS, Stata, Matlab, Gauss); the specific language is not important.
- Bring your laptop to class!

- Request an account on the Econ Cluster via the account request page
- Fill out the online survey (sent out by email) (results for 2013, 2014, and 2015)

9:00-9:30 Introduction (Lars) with reference to earlier survey results.

9:30-10:15 HP resources at Cornell and elsewhere (Lars)

10:30-12:00 Basics (Lars), The command line (lecture notes)

12:00-12:30 Getting access to ECCO (SSH and NX)

13:30-15:00 Learning to qsub - Hello World example on ECCO (live in class)

15:15-16:45 Introduction to parallel processing (Lars)

9:00-10:30 Basics of version control (Lars); Subversion (Sylverie): lecture notes (long tutorial referenced in class); Git (Flavio): slides, notes, setting up your repositories

11:00-12:00 Subroutines and scalable programming (Lars)

13:00-14:45 Putting it into practice: Trying out parallel processing

15:00-16:30 CHOICES:

- A practical example: Big Data Matlab programming
- Leveraging parallel programming techniques in Matlab
- Explicit parallel programming in SAS (Lars): example

- 9:00-11:00 Going beyond statistical programming languages: compilers, libraries, and virtual machines - setting up an Amazon EC2 server (live) (more in-depth tutorial)
- 11:00-12:00 CHOICES (tentative):
  - Considerations for data management (Lars)
  - Setting up a cluster (experimental, optional)

- 12:00 Workshop ends

- Some programs referenced in the class
- Basics courses for SAS, Stata, R, Matlab at CISER at http://ciser.cornell.edu/beta/workshops/ (new times will be posted soon)
- Computing for Data Analysis Coursera course and the classes on YouTube
- Code and Data
- Assessing time and memory usage in R: a nice brief tutorial *(but with the conclusion to use Matlab…)*
- Learning how to use doParallel in R: doParallel vignette