Final project

The goal of the project is to conduct a fully reproducible analysis, one that is easy to re-run even if the underlying data changes (say, we have to remove a row of data).

You can use any data you want, as long as it can be shared with the teaching team. Some suggestions were given here.

The final project will be graded out of 11 points, which correspond to the objectives listed below.

You will “turn in” your project by submitting the link to your GitHub repo on Canvas by September 4. If it is a private repository, please invite the teaching team (louisahsmith and TBD) as collaborators so that we can access it.

Specific objectives

  • Create a {gtsummary} table of descriptive statistics about your data (1 pt)
  • Fit a regression and present well-formatted results from the regression (1 pt)
    • The regression doesn’t have to be of any particular scientific interest, and you don’t have to interpret it in any particular way
    • You may use {gtsummary} or {broom} or both
  • Create a figure (1 pt)
    • It doesn’t need to look pretty; feel free to adapt some of the examples from class, which were as simple as hist(data$variable) and as complicated as the forest plot in the {broom} section
    • Feel free to look at some of the ggplot2 material from another course I taught
  • Write and use a function that does something with the data (1 pt)
    • It could be as simple as, for example, a new function to calculate a statistic for different variables, or a function that takes a dataframe and returns a summary table or fits a regression
  • Create and render a Quarto document that includes at least:
    • The table, regression results, and figure, with appropriate captions (1 pt)
    • Inline R code in at least two places: one pulling a statistic from a table (i.e., using gtsummary::inline_text()) and one printing something else (like we did with the mean age in the example) (1 pt)
    • Cross-references to a table and a figure at least once each (1 pt)
    • A brief description of the data, including its source (1 pt)
  • Read in a dataset and save a file (can be data, table, figure, etc.). Use the {here} package every time you refer to file paths (at least twice) (1 pt)
  • Commit and push your work to GitHub as you go (1 pt)
  • In a README file, include any notes necessary for us to easily reproduce your analysis (e.g., “Run script.R and then render document.qmd”) as well as some information about your data (1 pt)
    • We should be able to make a minor change to the underlying data, then re-run the analysis to see how the change affects the results
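
As a rough illustration of the "write and use a function" objective, and of re-running an analysis after a minor change to the data, here is a minimal sketch in base R. The function name summarize_variable is hypothetical, and the built-in mtcars dataset only stands in for your own data; your function can do something entirely different.

```r
# Hypothetical helper (not from the course materials): summarise one numeric
# column of a data frame with its non-missing count, mean, and SD.
summarize_variable <- function(data, variable) {
  values <- data[[variable]]
  stopifnot(is.numeric(values))
  c(
    n = sum(!is.na(values)),
    mean = mean(values, na.rm = TRUE),
    sd = sd(values, na.rm = TRUE)
  )
}

# mtcars ships with R and is used here only as a stand-in dataset.
variables <- c("mpg", "hp", "wt")
summary_full <- sapply(variables, function(v) summarize_variable(mtcars, v))

# Simulate a minor change to the underlying data (drop the first row),
# then re-run exactly the same code to see how the results move.
mtcars_changed <- mtcars[-1, ]
summary_changed <- sapply(variables, function(v) summarize_variable(mtcars_changed, v))

print(round(summary_full, 2))
print(round(summary_changed, 2))
```

Because the summary is produced by a function rather than by copy-pasted code, the only thing that changes between the two runs is the data passed in, which is the kind of reproducibility the project is asking for.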

Rubric

Description | Points
There is no attempt to achieve the objective | 0
There is an obvious attempt to achieve the objective, but an error in the code or something similar prevents it from working | 0.5
The objective is fully achieved | 1