I reserve the right to make changes to the syllabus.

Description and Outline

Statistics 243 is an introduction to statistical computing taught using R. The course will cover both programming concepts and statistical computing concepts.

Programming concepts may include:

Statistical computing topics may include:

The coverage of these topics complements the models/methods discussed in the rest of the statistics graduate curriculum. We will also cover the basics of UNIX/Linux, in particular some basic shell scripting.

Note that I aim to have the course be useful to those who already know a fair amount of R by (1) covering more advanced aspects of R and (2) covering the statistical computing topics extensively.

Informal prerequisites

If you are not a statistics or biostatistics graduate student, please chat with me if you’re not sure whether this course makes sense for you. A background in calculus, linear algebra, probability, and statistics is expected, as well as a basic ability to operate a computer (but not necessarily a UNIX variant).

Furthermore, I’m expecting you will know the basics of R, at the level of the material in the R bootcamp offered by Chris Paciorek Aug. 20-21, 2016. If you don’t have that background, you’ll need to spend time in the first couple of weeks getting up to speed.

Objectives of the Course

By the end of the course, students should be able to:

Primary References

Problem Sets

Problems will sometimes be somewhat open-ended, so those coming in at different levels may explore things with more or less sophistication. I’m also open to you defining your own assignment for a given topic, if you are working on a specific problem. E.g., instead of working on a particular text manipulation problem I assign, you might work with your own text data. Check with me before forging ahead.

We will be less willing to help you if you come to our office hours or post on Piazza at the last minute. Working with computers can be unpredictable, so give yourself plenty of time for the assignments.

Problem Set grading

The grading scheme for problem sets is:

If you turn in a problem set late, I’ll bump you down one level. If you turn it in really late (i.e., after I start grading them), I may bump you down two levels. No credit will be given after solutions are distributed.

Group Project Policy

Collaboration Policy

I encourage you to work together and help each other out, in the context of the following guidelines.

Class Time

My goal is to have classes be an interactive environment. This is both more interesting for all of us (hopefully) and more effective for learning the material. I encourage you to ask questions, and I will pose questions to the class to think about and discuss. To increase time for discussion and assimilation of the material, I may ask that you read material in advance of some classes.

Student backgrounds with computing will vary. For those of you with limited background on a topic, I encourage you to ask questions during class so I know what you find confusing. For those of you with extensive background on a topic (there will invariably be some topics where one of you knows more than I do), I encourage you to pitch in with your perspective. In general, there are many ways to do things on a computer, particularly in a UNIX environment, so it will help everyone (including me) if we hear multiple perspectives/ideas.

Please do not use phones during class and limit laptop use to the material being covered.

Email Policy

Academic Honesty

Please see the last section of this document for more information on the Campus Honor Code, which I expect you to follow.

The student community at UC Berkeley has adopted the following Honor Code: “As a member of the UC Berkeley community, I act with honesty, integrity, and respect for others.” The hope and expectation is that you will adhere to this code.

Collaboration and Independence: Reviewing lecture and reading materials and studying for exams can be enjoyable and enriching things to do with fellow students. This is recommended. However, unless otherwise instructed, homework assignments are to be completed independently and materials submitted as homework should be the result of one’s own independent work.

Cheating: A good lifetime strategy is always to act in such a way that no one would ever imagine that you would even consider cheating. Anyone caught cheating on a quiz or exam in this course will receive a failing grade in the course and will also be reported to the University Center for Student Conduct. In order to guarantee that you are not suspected of cheating, please keep your eyes on your own materials and do not converse with others during the quizzes and exams.

Plagiarism: To copy text or ideas from another source without appropriate reference is plagiarism and will result in a failing grade for your assignment and usually further disciplinary action. For additional information on plagiarism and how to avoid it, see, for example:

Academic Integrity and Ethics: Cheating on exams and plagiarism are two common examples of dishonest, unethical behavior. Honesty and integrity are of great importance in all facets of life. They help to build a sense of self-confidence, and are key to building trust within relationships, whether personal or professional. There is no tolerance for dishonesty in the academic world, for it undermines what we are dedicated to doing – furthering knowledge for the benefit of humanity.