Syllabus

This class is about scientific and statistical computing. It is intended to give you a strong foundation in the computing skills that are increasingly necessary for practicing statisticians and for scientists generally.

The main topics that we will cover during the class are:

  • The R environment and programming language
    Basics, data structures, control flow, graphics, writing functions,
    simulation, sampling, exploratory data analysis.
  • Data manipulation & Regular Expressions
    Basic input techniques for rectangular and non-rectangular data, text manipulation and regular expressions.
  • Shell tools and programming, and working with other languages
  • Web-related computing, Web services, XML
    Accessing data via the Web: scraping data from HTML pages, HTML forms, REST services, SOAP, parsing XML.
    Creating graphics for the web, e.g., Google Earth, SVG animations, ...
  • Relational database management systems (RDBMS)
    Concepts of databases, the relational model, Structured Query Language (SQL), and accessing databases from R.

We will encounter these topics in the context of exploring real data. Much of the work will involve manipulating and exploring data and making sense of it through summaries and creative graphical displays. We will also use programming to simulate stochastic processes, and we will do some statistical modeling, covering methods that you may not have seen in other classes (e.g., k-nearest neighbors, cross-validation, bootstrapping). We will cover these heuristically rather than with formal theory. In this way, you will learn the computing topics by using them in actual settings.

The primary goals of the class are

See detailed topics for more information


Grading

  • 70%: 4 or 5 homework assignments
  • 20%: Final project
  • 10%: Class participation
    This includes asking and answering questions in class, on the class mailing list, and in office hours, and generally being engaged.

Policies

    1. You may discuss approaches to problems with other students.
    2. You may not copy code from other students.
    3. You may look for hints, code and solutions on the Web, but you must acknowledge them in your write-ups.
    4. Reports:

    Duncan Temple Lang
    <duncan@wald.ucdavis.edu>
    Last modified: Fri Sep 25 07:48:08 PDT 2009