Data Analysis for IR 101: R Scripting
Participants are assumed to have experience manipulating spreadsheet data but little or no experience working with R scripts. You will learn how to write scripts in RStudio to describe and automate the same data operations you already know how to do in Excel. Each week you will be introduced to a topic, work through group exercises during class, work through more advanced exercises during a live problem session, and develop a course-long personal project.
1 Learning objectives
- Learn how to use RStudio to develop, test, and execute automated data processes
- Learn the basic commands from R’s tidyverse packages to replicate common spreadsheet operations, including data summaries
- Learn the basics of the R language, such as arrays, strings, and control flow
2 Requirements of the class
- Elapsed time: 6 weeks
- Weekly online synchronous sessions (4 lessons and 4 problem sessions, plus 1 wrap-up)
- Lessons: Mondays, 3-4:30 pm ET. During this session, we will introduce new topics, demonstrate them, and provide time for students to experiment with them.
- Problem sessions: Wednesdays, 3-4 pm ET. During this session, students will work through a problem set, alone or in groups, and ask questions as they arise. No new topics will be introduced.
- Required tasks each week: Work on homework for approximately five hours each week outside of class
- Final project
- Throughout the course, students will be encouraged to think about which project they will choose
- Students will be given one week to work on their project
- The last class is devoted to student discussions of their projects
- Students will turn in their project at the end of the course
- Exit interview: Students will arrange a 30-minute discussion with the professors about R, the course, and their project.
3 Before the first live session
You (or your IT group, if your computer is locked down by your organization) need to follow the instructions on the setup page. The setup consists of the following:
- Install R (>= v4.4.2)
- Install RStudio (>= 2024.12.0) and set your preferences for it
- Install R packages: tidyverse, tidylog
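Once setup is done, a quick check like the one below can confirm that your installation meets the requirements listed above. This is only a sketch: the variable names (required_r, needed, missing_pkgs) are made up for illustration, and the setup page remains the authoritative set of instructions.

```r
# Minimum R version taken from the list above
required_r <- "4.4.2"
r_ok <- getRversion() >= required_r

# Packages the course asks you to install
needed <- c("tidyverse", "tidylog")

# requireNamespace() returns TRUE when a package can be loaded
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- needed[!installed]

if (!r_ok) {
  message("R is older than ", required_r, "; please update it.")
}
if (length(missing_pkgs) > 0) {
  # To install what is missing, run: install.packages(missing_pkgs)
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All required packages are installed.")
}
```

If any packages are reported as missing, running install.packages(missing_pkgs) in the RStudio console installs them (this needs an internet connection).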
4 Schedule for the course
4.1 Week 1: Basics of data manipulation
4.1.1 Lesson: February 3 at 3pm
- Agenda
- We will discuss (and demonstrate where appropriate) each of the following. You will also have the opportunity to try out some of the commands.
- Introduction to R & RStudio
- Comparison of R and Excel
- Demonstrate the process of working with R & RStudio
- Introduction to the basics
- Data import and export: read_csv() and write_csv()
- Data tools: select(), filter(), mutate()
- The pipe operator (|>)
- Basic math operations
- Basic logic (AND, OR, NOT) operations
- Familiarize yourself with RforIR.com
- To-do after class
- Work through each of the lessons (below) before our upcoming problem session this week.
- After working through the lessons, work through this week’s homework (below).
- Resources
- Lessons
- Advanced lesson
- In class
- Homework
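To give a feel for how the Week 1 agenda items fit together, here is a minimal sketch that exports and imports a CSV file, then chains filter(), mutate(), and select() with the pipe operator. The students table and its column names are invented for this illustration and are not course data.

```r
library(dplyr)
library(readr)

# Hypothetical enrollment data, invented for this sketch
students <- tibble(
  id      = c(1, 2, 3, 4),
  level   = c("UG", "UG", "GR", "UG"),
  credits = c(15, 9, 6, 12),
  gpa     = c(3.4, 2.8, 3.9, 3.1)
)

# Export to a temporary CSV file, then import it again
csv_path <- tempfile(fileext = ".csv")
write_csv(students, csv_path)
imported <- read_csv(csv_path, show_col_types = FALSE)

# Pipe the data through a filter (AND), a calculation, and a column selection
full_time_ug <- imported |>
  filter(level == "UG" & credits >= 12) |>
  mutate(quality_points = credits * gpa) |>
  select(id, credits, quality_points)

# OR: graduate students, or anyone with a GPA below 3.0
grad_or_low_gpa <- imported |>
  filter(level == "GR" | gpa < 3.0)

# NOT: everyone who is not an undergraduate
not_ug <- imported |>
  filter(!(level == "UG"))

print(full_time_ug)
```

Each step reads like a spreadsheet action: filter() keeps rows the way an Excel filter does, mutate() adds a calculated column, and select() chooses which columns to keep.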
4.1.2 Problem session: February 5 at 3pm
The content of this session will be determined by student interests and needs. We will not cover any new material; the time is reserved for students to work on the homework assignment. If students have questions about the week’s lessons, we will be glad to address those as well.
4.2 Week 2: Pivot tables
4.2.1 Lesson: February 10 at 3pm
- Agenda
- Grouping and ungrouping
- Summarize and mutate within groups
- Functions: group_by(), summarize(), arrange(), head()
- Calculations: mean(), median(), max(), min()
- Pivoting with spread() and gather()
- To-do after class
- Work through each of the lessons (below) before our upcoming problem session this week.
- After working through the lessons, work through this week’s homework (below).
- Resources
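The Week 2 agenda corresponds roughly to building a pivot table in Excel. The sketch below groups a small table, summarizes it, sorts the result, and then pivots it wide and back to long. The enrollment table is made up for this example, and the column names are assumptions.

```r
library(dplyr)
library(tidyr)

# Hypothetical data, invented for this sketch
enrollment <- tibble(
  college = c("Arts", "Arts", "Arts", "Science", "Science", "Science"),
  term    = c("Fall", "Fall", "Spring", "Fall", "Spring", "Spring"),
  credits = c(12, 15, 9, 16, 12, 14)
)

# Summarize within groups, then sort from highest to lowest mean
by_college_term <- enrollment |>
  group_by(college, term) |>
  summarize(
    students     = n(),
    mean_credits = mean(credits),
    max_credits  = max(credits),
    .groups      = "drop"   # return an ungrouped table
  ) |>
  arrange(desc(mean_credits))

# Look at the top two rows only
top_two <- head(by_college_term, 2)

# Mutate within groups: each row's share of its college's credits
with_share <- enrollment |>
  group_by(college) |>
  mutate(share_of_college = credits / sum(credits)) |>
  ungroup()

# Pivot wide: one row per college, one column per term
wide <- by_college_term |>
  select(college, term, mean_credits) |>
  spread(key = term, value = mean_credits)

# Pivot back to long: one row per college and term
long <- wide |>
  gather(key = "term", value = "mean_credits", Fall, Spring)

print(wide)
```

The tidyr documentation describes spread() and gather() as superseded by pivot_wider() and pivot_longer(); the older functions still work, and the newer pair does the same job with different argument names.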
4.2.2 Problem session: February 12 at 3pm
The content of this session will be determined by student interests and needs. We will not cover any new material; the time is reserved for students to work on the homework assignment. If students have questions about the week’s lessons, we will be glad to address those as well.
4.3 Week 3: Data joins
4.3.1 Lesson: February 24 at 3pm
- Agenda
- Left joins
- Primary & foreign keys
- ER diagrams
- To-do after class
- Work through each of the lessons (below) before our upcoming problem session this week.
- After working through the lessons, work through this week’s homework (below).
- Resources
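A left join plays a role similar to VLOOKUP in Excel: it keeps every row of the left table and looks up matching values in the right table. In the sketch below, dept_id is the primary key of departments and a foreign key in courses. Both tables are invented for this illustration.

```r
library(dplyr)

# Hypothetical lookup table: dept_id is the primary key (one row per value)
departments <- tibble(
  dept_id   = c(10, 20, 30),
  dept_name = c("Biology", "History", "Physics")
)

# Hypothetical detail table: dept_id is a foreign key pointing at departments
courses <- tibble(
  course_id = c("BIO101", "HIS210", "BIO340", "ART100"),
  dept_id   = c(10, 20, 10, 40),
  seats     = c(120, 35, 24, 18)
)

# Check the primary key: no dept_id should appear more than once
duplicate_keys <- departments |>
  count(dept_id) |>
  filter(n > 1)

# Keep every course and attach the department name where one matches
courses_with_dept <- courses |>
  left_join(departments, by = "dept_id")

# Courses whose foreign key has no match get NA for dept_name
unmatched <- courses_with_dept |>
  filter(is.na(dept_name))

print(courses_with_dept)
```

Checking for unmatched rows after a join is a useful habit, because a left join does not warn you when a lookup finds nothing.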
4.3.2 Problem session: February 26 at 3pm
The content of this session will be determined by student interests and needs. We will not cover any new material; the time is reserved for students to work on the homework assignment. If students have questions about the week’s lessons, we will be glad to address those as well.
4.4 Week 4: Putting it all together
4.4.1 Lesson: March 3 at 3pm
- Agenda
- Discuss and demonstrate the basics of the for loop
- Discuss and demonstrate how to put data tables in a workbook (as a communications tool)
- Demonstrate several integrated R scripts that show a range of examples of what can be accomplished with the tools that we have learned about in this class
- To-do after class
- Work through the lesson (below) before our upcoming problem session this week.
- Scan through and attempt to interpret the university example that you have in your possession.
- After working through the lesson, work through this week’s homework (below).
- Resources
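As a small preview of the for loop, the sketch below repeats the same calculation once per term and stores each result. The enrollment table and the variable names are invented for this illustration; the scripts demonstrated in class may be organized differently.

```r
library(dplyr)

# Hypothetical data, invented for this sketch
enrollment <- tibble(
  term      = c("Fall 2023", "Fall 2023", "Spring 2024", "Spring 2024"),
  college   = c("Arts", "Science", "Arts", "Science"),
  headcount = c(100, 150, 90, 140)
)

# One slot per term to hold the results, labeled by term name
terms <- unique(enrollment$term)
totals <- numeric(length(terms))
names(totals) <- terms

# Repeat the same steps for each term in turn
for (current_term in terms) {
  term_rows <- enrollment |>
    filter(term == current_term)
  totals[current_term] <- sum(term_rows$headcount)
  message(current_term, ": ", totals[current_term], " students")
}
```

For a simple total like this, group_by() and summarize() give the same answer in fewer lines. A loop becomes more useful when each pass does something extra, such as writing a separate output for each term.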
4.4.2 Problem session: March 5 at 3pm
The content of this session will be determined by student interests and needs. We will not cover any new material; the time is reserved for students to work on the homework assignment. If students have questions about the week’s lessons, we will be glad to address those as well.
4.5 Week 5: Project development
Students work on Personal/Institutional Project
4.6 Week 6: Wrap-up
4.6.1 Discussion: March 17 at 3pm
Submit & present final project; schedule discussion time with professors
- Turn in R script for the final project
- Present a summary in class
- Schedule a meeting for a course and project wrap-up discussion