Syllabus
N.B. Syllabus subject to change.
Students in Stat C131A are expected to have read the syllabus in its entirety by the second week of the course.
Course Details
Description
Stat C131A is an upper-division course that follows Data 8 or STAT 20. The course will teach a range of statistical methods that are used to solve data problems, including standard parametric statistical models, multiple linear regression and classification, and possibly Bayesian methods, sampling methods, and more advanced Machine Learning models. We will use the R statistical language, and will obtain hands-on experience in implementing a range of statistical methods on numerous real world datasets.
Communication
Strictly follow these guidelines whenever you need to get in touch:
- Create a public ED post. This will be helpful for so many others!
- If the matter is confidential, create a private ED post (visible to the course staff only).
- If the matter is even more confidential, you can email one specific staff member, but this should be an absolute last resort. We reserve the right to ignore your emails if we believe you should have gone through ED instead. Canvas messages will be ignored.
Lectures
MWF 11am-12noon @ Weill 101
- Lecture attendance is mandatory but you have 5 drops! See below for more details.
PLEASE, DO NOT COME TO CLASS IF YOU THINK YOU MIGHT BE SICK! EVEN A LITTLE!!!
- You'll recover more slowly, get other people sick, and lose lecture time, OH, and group tutoring, which makes the class harder and less useful for yourself and everyone else.
- We are a mask-friendly class.
- If you need more than 5 drops because of extreme circumstances, please get in touch on ED and have documentation ready.
Labs
Lab 101: Tuesday and Thursday, 10am-11am @ Evans 332
Lab 102: Tuesday and Thursday, 4-5pm @ Evans 330
- You should only be signed up for one lab section.
- Lab attendance is optional, but highly encouraged.
Office hours (OH)
We may reschedule OH if needed, even at the last minute. The Google Calendar is the up-to-date source of truth.
For security reasons, Zoom links for OH are posted on bcourses.
Andrea's Office Hours:
- Mondays (Evans 339) 2pm-3pm
- Wednesdays (zoom) 9pm-10pm
- Thursdays (zoom) 3pm-4pm
Van's Office Hours:
- Tuesdays, Thursdays (Evans 446) 11am-12pm
Why and when to come to OH:
- OH is a great opportunity to discuss not only topics directly related to the course, but also anything else that's on your mind.
- We also welcome questions about career trajectories and research opportunities at UC Berkeley and beyond.
- Keep in mind that you do not need to come to office hours with an agenda. Listening in is welcomed and encouraged!
Study groups
We encourage you to work together in groups to solidify your understanding of the course material.
You can use ED Discussion to find group mates.
Our goal is to form the study groups ASAP, so students can begin discussing the first homework assignment.
Waitlist, Concurrent enrollment, and Auditing
Students who wish to audit should get in touch with the staff to get added to bcourses/ED Discussion.
Stat C131A will not be enrolling concurrent enrollment students in Spring 2026. However, concurrent enrollment students are welcome to audit the course.
Waitlisted students should contact staff to get added to bcourses/ED Discussion/Gradescope, attend lecture, and complete assignments. However, a spot in the class is not guaranteed: waitlisted students can be enrolled only if enrolled students drop. Make sure that you do not have conflicts that prevent the system from enrolling you in the class.
Course platforms
bcourses will only be used for announcements and secure course material, like exam solutions, grades, and office hour Zoom links.
All other course materials will be posted on the public course homepage.
Assignments should be submitted via Gradescope.
All other course communication will take place via Ed.
Grades
Grades are calculated as follows:
- Lecture attendance and participation: 5% (5 drops)
- Labs: 15% (2 drops)
- Homework: 20% (1 drop)
- Final project: 20%
- Exams: 40%
- 20%: larger of (final_part_1, 50% * midterm + 50% * final_part_1)
- 20%: final_part_2
The final will be broken down into two parts:
- One part (final_part_1) covers the same material as the midterm. Its score is averaged with your midterm score, unless you did better on final_part_1, in which case only the final_part_1 score counts. This way, if you fail the midterm, you get a fresh chance to redeem yourself on the final.
- The other part (final_part_2) covers the material taught after the midterm.
Drops of lowest scores are automated. You do not need to get in touch about them.
To pass the class (with a C- or P) you need to score at least 66% overall.
Grades will not be curved.
- In other words, there is no limit to the proportion of students with an A, B, etc. You are incentivized to help each other learn. This is not the place for competition.
- However, grade cutoffs may be slightly adjusted at the end of the semester, in case the difficulty of exams or assignments is not calibrated perfectly.
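As a rough illustration of how the weights above combine, here is a minimal R sketch. The function name `overall_score` and all of the scores are made up for this example, and it assumes every component is already on a 0-100 scale with drops applied. It is not the official grading script.

```r
# Hypothetical helper (not the official grading code): combine component
# scores, each on a 0-100 scale with drops already applied, using the
# weights listed in the Grades section.
overall_score <- function(attendance, labs, homework, project,
                          midterm, final_part_1, final_part_2) {
  # Midterm-material block: the larger of final_part_1 alone and the
  # 50/50 average of the midterm and final_part_1.
  exam_block_1 <- max(final_part_1, 0.5 * midterm + 0.5 * final_part_1)

  0.05 * attendance +
    0.15 * labs +
    0.20 * homework +
    0.20 * project +
    0.20 * exam_block_1 +
    0.20 * final_part_2
}

# Made-up scores: a weak midterm (50) followed by a stronger final_part_1 (80).
score <- overall_score(attendance = 100, labs = 100, homework = 90,
                       project = 85, midterm = 50,
                       final_part_1 = 80, final_part_2 = 75)
print(score)        # approximately 86
print(score >= 66)  # TRUE: at or above the passing threshold
```

In this example the midterm score of 50 has no effect on the result, because final_part_1 (80) is larger than the 50/50 average (65).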
Lecture technology policy
Most lectures will consist of an interactive problem-solving session, followed by a hands-on demo or coding session.
Laptops and tablets with attached keyboards are not allowed during the problem-solving session, though you are permitted to use a tablet to take handwritten notes. This is because laptops are extremely distracting for yourself, the people around you, and the instructor.
- [exception] If you need to use technology for accessibility reasons, the previous bullet does not apply to you. Please reach out to the staff, so we know.
- Laptop use is encouraged during the hands-on demo and coding sessions.
Phones are allowed during lecture. It is preferable to use a phone to submit conceptual questions and neighbor discussion answers during lecture.
The course staff reserves the right to reduce your lecture attendance grade for violating the technology policy.
Lecture recordings
Lectures will be recorded automatically.
- The course staff cannot guarantee audio or video quality.
- Lecture recordings are posted on bcourses.
Labs and office hours are not recorded.
The homework assignments may occasionally ask you to watch additional recordings to supplement the lecture material (e.g., if we run out of time covering an essential topic).
Concept checks
We will use in-class concept checks and neighbor discussions to track attendance.
- Concept checks are not graded.
- Concept checks are answered via [this form]
- Submitting a concept check outside of standard lecture time is considered cheating and an honor code violation. We will use your seat number and submission time to validate that your responses were entered during lecture time. We reserve the right to photograph the lecture hall to verify attendance.
Neighbor discussions
In addition to concept checks, there may be one or more neighbor discussions during each lecture.
- Neighbor discussion answers are submitted via [this form]
- Neighbor discussion answers are not graded.
Homework
There are 6 homework assignments planned, though the exact number may change.
- Homework will be a combination of computational exercises and data analysis using the computer, as well as conceptual questions.
- Homework assignments are weighted equally.
HW is generally due every other week.
- Homework assignments will be posted to the course website at least one week before the HW deadline.
- All homework assignments will be submitted via Gradescope and are due by 11:59 pm of the due date.
We will drop your lowest-scoring homework assignment.
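For intuition, equal weighting with one drop can be sketched in R as follows. The helper `homework_score` and the scores are hypothetical, and the sketch assumes that a missing assignment is recorded as 0 before the drop is applied.

```r
# Hypothetical helper: average of equally weighted homework scores after
# dropping the n_drop lowest ones (assumes missing work is recorded as 0).
homework_score <- function(scores, n_drop = 1) {
  stopifnot(length(scores) > n_drop)
  # Sort ascending and keep everything except the n_drop lowest scores.
  kept <- tail(sort(scores), length(scores) - n_drop)
  mean(kept)
}

# Made-up scores for 6 assignments, one of which was not submitted.
hw <- homework_score(c(70, 95, 88, 0, 92, 100))
print(hw)  # 89
```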
Poorly organized assignments will be docked points at the discretion of the grader.
- It is critical to have empathy for the person who will be reviewing your work.
Labs
During lab sessions, a GSI will review conceptual material and help you work through lab coding assignments.
- Lab sections meet twice a week.
We plan to have 12 lab assignments.
- Each assignment will teach you how to perform the analyses shown in class using R.
- Labs are intended to be finished or mostly finished during section.
- HW assignments may build on the exercises covered in lab.
Lab assignments are generally due on Mondays at 11:59 pm and should be submitted via Gradescope.
- Labs are graded on completion, not accuracy. As long as you have made a serious and convincing attempt at finishing them, you'll get full credit.
Late Work
HW/Labs are due at 11:59pm. However, you can still submit work after the deadline until the start of the next C131A lecture, and we'll apply an automated deduction of 15%.
The only exceptions are students with Letters of Accommodations.
Final project
The final project will be due on the last day of reading week, Friday May 8th.
More details on the final project will be provided later in the semester.
Exams
The midterm is scheduled for Wednesday March 18 and will take place during lecture in Weill 101.
The final exam will take place Tuesday May 12, 7-10pm (scheduled by the registrar). Location TBD.
If you cannot attend an exam due to an extenuating circumstance, please contact the course staff on ED ASAP to determine whether your circumstance qualifies for a make-up exam.
Textbooks and resources
Everything you need to know for Stat C131A will be covered in lectures, labs, and assignments.
- It is possible to do very well in Stat C131A without ever referring to an outside textbook or resource.
However, most of the course material is covered by the online textbook developed specifically for C131A.
- You can find the textbook here.
The StatQuest YouTube Channel is an excellent resource.
- StatQuest provides videos on many of the topics we will cover in class. The instructor is very entertaining!
If you would like some additional optional reading, you can try the following books.
- Theory Meets Data by Ani Adhikari. This is the online book for STAT 88 that covers introductory probability at the level of Stat 20.
- R for Data Science, by Garrett Grolemund and Hadley Wickham. This is a free online book that covers the tidyverse set of R packages.
- The Statistical Sleuth: A Course in Methods of Data Analysis by Ramsey and Schafer
- Introductory Statistics with R by Peter Dalgaard
None of these books covers all of the topics we will cover in C131A, nor do they necessarily have the same perspective and focus as this class. But for those students wanting some additional structure or R assistance, these books may be helpful and should be at the right level for this class.
Stat 20 and Data 8 are similar courses, but each covers some subjects that the other does not. While we will cover these topics in class, you may find the following useful background if you are seeing them for the first time (more to follow):
- Computational and Inferential Thinking: The Foundations of Data Science, by Ani Adhikari and John DeNero, chapters 11-13.
This is the online book used by Data 8. These chapters introduce hypothesis testing using only resampling ideas, ideas which are not necessarily covered in Stat 20.
Policy on Large Language Models (LLMs)
LLMs (e.g., ChatGPT) are becoming increasingly important in the workplace.
However, we do not understand to what extent they might take away from the effectiveness of the learning process.
Therefore, you're very welcome to use chatbots to learn and brainstorm, but not to generate solutions to any of the work that you will need to submit.
If you find an especially positive or negative use case of an LLM for any component of the course, please share it with the staff. We are excited to hear what you find!
Code in ED posts
If you include code in your Ed post, please use the code editing fonts:
Standard font is hard to read:
── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ── ✔ ggplot2 3.3.2 ✔ purrr 0.3.4 ✔ tibble 3.0.3 ✔ dplyr 1.0.2 ✔ tidyr 1.1.2 ✔ stringr 1.4.0 ✔ readr 1.3.1 ✔ forcats 0.5.0 ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ── x dplyr::filter() masks stats::filter() x dplyr::lag() masks stats::lag()
# here's my plot code
x <- ggplot(df) + geom_point(aes(x = year, y = count))
Code font is easier to read:
── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
✔ ggplot2 3.3.2 ✔ purrr 0.3.4
✔ tibble 3.0.3 ✔ dplyr 1.0.2
✔ tidyr 1.1.2 ✔ stringr 1.4.0
✔ readr 1.3.1 ✔ forcats 0.5.0
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
x dplyr::filter() masks stats::filter()
x dplyr::lag() masks stats::lag()
# here's my plot code
x <- ggplot(df) + geom_point(aes(x = year, y = count))
Computing environment
The official course materials use the R programming language.
- As in Data 8 and Stat 20, labs and assignments will be distributed via DataHub.
You do not need to know anything about R to take this course.
- We will provide resources for you to learn everything you need to know.
The concepts taught in this course are language-agnostic.
- In other words, everything you learn in this class can be readily implemented using a combination of other tools (e.g., Python, SQL, etc.).
- Note that LLMs are an excellent aid for translating your knowledge across different programming languages and software.
Academic Honesty Policy
Homework and projects must be completed independently, with the following exceptions:
- You may discuss specific issues/questions you have about the homework at a high level, but you must not sit down and do the assignment jointly.
- Giving advice about code or coding tips is also not cheating, but you cannot directly share code with other classmates.
For exams, cheating includes, but is not limited to, using electronic materials in an exam beyond that allowed, copying off another person's exam or quiz, allowing someone to copy off of your exam or quiz, and having someone take an exam or quiz for you.
Requesting, obtaining, and/or using solutions from previous years or from the internet or other sources, if such happen to be available, is considered cheating.
- Any evidence of cheating will result in a score of zero (0) on the entire assignment or examination, and perhaps a failing grade in the class.
- We will always report incidents of cheating to the Office of Student Conduct, which may administer additional punishment.
Accommodations
UC Berkeley is committed to creating a learning environment that meets the needs of its diverse student body including students with disabilities.
- If you anticipate or experience any barriers to learning in this course, please feel welcome to discuss your concerns with Andrea, whether after class, in office hours, via Ed, or via email.
If you already have a Letter of Accommodation, and you want to discuss it, please get in touch with the staff ASAP.
- We can accommodate you more easily if you provide this information early in the semester.
- We cannot guarantee that last-minute requests for accommodation will be provided.
If you have a disability, or think you may have a disability, you can work with the Disabled Students' Program (DSP) to determine any accommodations you may need to have equal access in this course.
- The Disabled Students' Program (DSP) is the campus office responsible for authorizing disability-related academic accommodations, in cooperation with the students themselves and their instructors.
- You can find more information about the DSP application process here.
- Josh is available if you have any questions or concerns about your accommodations.
- In the event of a disagreement, the proper procedure is for you to work with your DSP Specialist, and for your DSP Specialist to work with Josh, toward a resolution.
Accessible DS education for all
In support of our commitment to making Data Science education inviting, engaging, and respectful for people of diverse identities, backgrounds, experiences, and perspectives, I want to relay the following three items from the Data Science Undergraduate Studies (DSUS):
Device Lending options
Students can access device lending options through the Student Technology Equity Program (STEP).
Data Science Student Climate
Data Science Undergraduate Studies faculty and staff are committed to creating a community where every person feels respected, included, and supported. We recognize that incidents may happen, sometimes unintentionally, that run counter to this goal. There are many things we can do to try to improve the climate for students, but we need to understand where the challenges lie. If you experience a remark, or disrespectful treatment, or if you feel you are being ignored, excluded or marginalized in a course or program-related activity, please speak up. Consider talking to your instructor, but you are also welcome to contact Executive Director Christina Teller at cpteller@berkeley.edu or report an incident anonymously through this online form.
Community Standards
Posts on ED must relate to the course and be in alignment with Berkeley's Principles of Community and the Berkeley Campus Code of Student Conduct. We expect all posts to demonstrate appropriate respect, consideration, and compassion for others. Please be friendly and thoughtful; our community draws from a wide spectrum of valuable experiences. Posts that violate these standards will be removed.