Summary so far
AI Acknowledgment: Claude helped me write this recap based on the course content I wrote, and I verified and edited it.
Module 1.1: Introduction to the Course
Key Concepts
- Course setup and environment preparation
- Introduction to reproducible research principles
What You Accomplished
- Installed the {usethis} package for streamlined R workflows
  - We discussed running code from the console vs. from a script
  - We discussed common error messages related to packages and their functions
- Configured Git with your identity using usethis::use_git_config()
- Created a GitHub personal access token for secure authentication
- Set up credentials using gitcreds::gitcreds_set()
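The setup steps above can be collected into one small helper, shown here as a minimal sketch rather than the exact code used in class. The helper name setup_git_for_course() and the example identity are hypothetical, and usethis::create_github_token() is assumed to be how the token page was opened (the recap does not name the function). The calls are wrapped in a function because they change global Git settings and open a browser, so nothing happens until you call it yourself.

```r
# One-time setup sketch (assumes {usethis} and {gitcreds} are installed).
# Hypothetical helper: wraps the three setup steps so that sourcing this
# file has no side effects until the function is called.
setup_git_for_course <- function(name, email) {
  # Record your identity in Git's global configuration
  usethis::use_git_config(user.name = name, user.email = email)
  # Open GitHub in the browser to create a personal access token (PAT)
  usethis::create_github_token()
  # Prompt for the token and save it in your system's credential store
  gitcreds::gitcreds_set()
}

# Example call with a placeholder identity (replace with your own):
# setup_git_for_course("Jane Doe", "jane@example.com")
```

This only needs to run once per computer; Git and the credential store remember the settings afterwards.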
Why This Matters
Proper setup ensures smooth integration between R, Git, and GitHub throughout your research workflow. The {usethis} package automates many common tasks, reducing errors and saving time.
Module 1.2: Introduction to Git and GitHub
Key Concepts
- Forking: Creating your own copy of someone else’s repository on your GitHub account
- Cloning: Downloading a repository from GitHub to your local computer
- Staging: Preparing files for commit (equivalent to git add)
  - Checkbox next to the file in the Git tab in RStudio
- Committing: Saving changes with a descriptive message
  - Commit (checkmark button); you must write a message and then click “Commit”
- Pushing: Uploading changes to GitHub
  - Push (up arrow button); this syncs your local changes with the remote repository
What You Accomplished
- Forked the course repository on GitHub
- Cloned your fork to your local computer using RStudio
- Made your first edit (added your name to README.md)
- Practiced the Git workflow: stage → commit → push
- Viewed diffs to see exactly what changed
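The stage → commit → push cycle can also be run without the RStudio buttons. The sketch below is an illustration, not the course's own code: it assumes Git is installed and on your PATH, works in a throwaway folder under tempdir(), and uses a placeholder identity. The git() helper and the folder name are hypothetical.

```r
# Sketch: stage -> commit in a throwaway repository (assumes Git is installed).
repo <- file.path(tempdir(), "git-workflow-demo")  # hypothetical demo folder
unlink(repo, recursive = TRUE)
dir.create(repo)

# Hypothetical helper: run a git command inside the demo repo and capture
# its output. The -c flags set a placeholder identity for this demo only.
git <- function(...) {
  args <- c("-C", shQuote(repo),
            "-c", "user.name=Demo", "-c", "user.email=demo@example.com",
            ...)
  system2("git", args, stdout = TRUE, stderr = TRUE)
}

git("init")
writeLines("# Demo project", file.path(repo, "README.md"))

git("add", "README.md")                      # stage (the checkbox in RStudio)
git("commit", "-m", shQuote("Add README"))   # commit (the checkmark button)
# git("push") would upload the commit (the up arrow button); it is skipped
# here because this demo repository has no remote.

# One line per commit in the repository's history
history <- git("log", "--oneline")
print(history)
```

Each git() call corresponds to one click in the Git tab, so the buttons and the commands are interchangeable.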
Why This Matters
Version control tracks every change to your code, allowing you to collaborate safely, recover from mistakes, and maintain a complete history of your project’s development.
Module 1.3: R Projects and File Management
Key Concepts
- R Projects: Self-contained working environments that improve reproducibility
- File organization: Structured folder hierarchies for different types of files
- Environment management: Starting fresh each session to avoid hidden dependencies
What You Accomplished
- Created a proper file structure:

    epi590r-in-class/
    ├─ epi590r-in-class.Rproj
    ├─ README.md
    ├─ R/
    │  └─ clean-data-bad.R
    ├─ data/
    │  ├─ raw/
    │  │  └─ nlsy.csv
    │  └─ clean/

- Configured RStudio to start with a clean environment
- Analyzed problematic code patterns in clean-data-bad.R
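The same folder layout can be created from code instead of by hand. This is a minimal sketch using only base R; it builds the folders inside tempdir() so it cannot touch a real project, and the variable names are illustrative.

```r
# Sketch: build the folder skeleton in a temporary location.
# Swap `root` for your own project folder to use it for real.
root <- file.path(tempdir(), "epi590r-in-class")

folders <- c("R", file.path("data", "raw"), file.path("data", "clean"))
for (folder in folders) {
  # recursive = TRUE also creates parent folders such as data/
  dir.create(file.path(root, folder), recursive = TRUE, showWarnings = FALSE)
}

# List the directories that now exist, relative to the project root
created <- list.dirs(root, full.names = FALSE, recursive = TRUE)
print(created)
```

Running it a second time is harmless, because showWarnings = FALSE silences the warning for folders that already exist.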
Why This Matters
Organized projects are easier to navigate, share, and reproduce. Starting with a clean environment each time prevents hidden dependencies that could break your code when run on different computers.
Module 1.4: The {here} Package
Key Concepts
- Relative vs. Absolute Paths: here() creates paths relative to your project root
- Cross-platform Compatibility: Paths work on Windows, Mac, and Linux
- Project Portability: Code runs regardless of where the project folder is located
What You Accomplished
- Installed and learned to use the {here} package
- Compared here::here() vs getwd() behavior
- Examined improved code in clean-data-good.R that uses {here}
- Explored the course dataset (NLSY data)
Code Comparison
Bad (absolute paths):

    setwd("/Users/myname/Documents/project")
    data <- read.csv("data/raw/nlsy.csv")

Good (relative paths built with {here}, which automatically start from the project root):

    data <- read.csv(here::here("data/raw/nlsy.csv"))

Why This Matters
Using {here} makes your code portable and prevents the “works on my machine” problem. Your collaborators can run your code without modification.
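To see the difference between getwd() and here::here() for yourself, a short sketch like the following may help. It assumes the {here} package is installed; the object names are illustrative, and the file check is an addition for this example rather than part of the course script.

```r
# Assumes the {here} package is installed.
# getwd() reports wherever R happens to be working right now;
# here::here() reports the project root that {here} detected.
print(getwd())
print(here::here())

# Two ways to point at the same file. Passing separate components
# avoids hard-coding a path separator.
path_single <- here::here("data/raw/nlsy.csv")
path_parts  <- here::here("data", "raw", "nlsy.csv")

# Check that the file exists before reading, to get a clearer message
if (file.exists(path_parts)) {
  nlsy <- read.csv(path_parts)
} else {
  message("No file at: ", path_parts)
}
```

If you change the working directory inside the project, getwd() changes with it, while here::here() should keep returning the project root.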
Module 1.5: Starting From Scratch
Key Concepts
- Project/Repository Creation Workflow: Local first, then connect to GitHub
- .gitignore: Preventing sensitive or unnecessary files from being tracked
What You Accomplished
- Created a new R project with Git initialization
  - This can be for your final project!
- Created a new GitHub repository that will be linked to this project
- Connected your local repository to a new GitHub repository using the terminal to run Git commands
- Created and configured .gitignore to protect sensitive files
- Set up a proper folder structure for a new project
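Connecting a local repository to GitHub comes down to registering a remote and then pushing. The sketch below runs the Git commands from R via system2() so that it can be tried safely in a throwaway folder; in class the same commands were typed in the terminal. It assumes Git is installed. The git() helper, the demo folder, and the URL are placeholders, and the branch name main is an assumption.

```r
# Sketch: register a GitHub remote for a local repository (assumes Git is installed).
repo <- file.path(tempdir(), "remote-demo")   # hypothetical demo folder
unlink(repo, recursive = TRUE)
dir.create(repo)

# Hypothetical helper: run a git command inside the demo repo
git <- function(...) {
  system2("git", c("-C", shQuote(repo), ...), stdout = TRUE, stderr = TRUE)
}

# Placeholder URL: replace with the address GitHub shows for your new repository
remote_url <- "https://github.com/your-username/your-repo.git"

git("init")
git("remote", "add", "origin", remote_url)    # link the local repo to GitHub
remotes <- git("remote", "-v")                # confirm the link
print(remotes)

# After your initial commit, the first push would be:
# git("push", "-u", "origin", "main")   # assumes your branch is named main
```

Adding the remote does not contact GitHub, so this runs offline; only the push needs your token.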
The Complete Workflow
- Create R Project → New Directory → Enable Git
- Make Initial Commit → Stage → Commit locally
- Create GitHub Repository → New repo on GitHub
- Connect Them → Use terminal commands to push
- Organize Files → Create folder structure
- Protect Secrets → Configure .gitignore
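For the last step, a .gitignore is just a text file with one pattern per line. The sketch below writes one into a temporary folder using base R. The patterns are examples only, and the data/raw/ entry in particular is a hypothetical choice for a project whose raw data must stay private. Inside a real project, usethis::use_git_ignore() is an alternative that appends entries for you.

```r
# Sketch: write a .gitignore into a temporary folder (a stand-in for a project).
project <- file.path(tempdir(), "gitignore-demo")  # hypothetical demo folder
dir.create(project, showWarnings = FALSE)

# Example patterns only; choose entries that fit your own project
ignore_patterns <- c(
  ".Rproj.user",   # RStudio's per-user project state
  ".Rhistory",     # console history
  ".RData",        # saved workspace
  "data/raw/"      # hypothetical: raw data that must not leave your computer
)

ignore_file <- file.path(project, ".gitignore")
writeLines(ignore_patterns, ignore_file)

# Read the file back to confirm what Git will see
print(readLines(ignore_file))
```

Git only ignores files that are not yet tracked, so it is worth setting this up before the first commit that would include them.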
Why This Matters
Starting projects correctly from the beginning saves time and prevents problems later. Proper .gitignore configuration protects sensitive data and keeps repositories clean.
Key Takeaways
Best Practices You’ve Learned
- Always use R Projects for self-contained, portable analysis
- Use relative paths with {here} instead of setwd()
- Commit early and often with descriptive messages
- Organize files systematically with clear folder structures
- Never commit sensitive data - use .gitignore
- Start with a clean environment each session
Common Mistakes to Avoid
- Using absolute file paths that only work on your computer
- Saving and restoring R workspace between sessions
- Forgetting to stage files before committing
- Not writing descriptive commit messages
- Storing sensitive data in version control