Did you know that you can navigate the posts by swiping left and right?

Simple Biostats Package (SBP)

31 May 2024 . Category . Comments #research #education-research #datawhys #data-science #machine-learning #programming #statistics #service #outreach

As part of our DataWhys project, we have been collaborating with Stan Pounds at St. Jude Children’s Research Hospital.

Stan is involved in multiple aspects of graduate education at St. Jude, and has part of that, he has created a set of R scripts that simplify the use of common R tasks down to a single function call.

For example, grade=describe("Grade",nki70) will, for the Grade variable of the nki70 dataset:

  • Create a plot of the appropriate type (in this case a bar chart with a bar for each level of the categorical variable and height equal to the $n$ of that level)
  • Create a table with $n$ and percents for those variables
  • Output a textual description of the results, i.e. “The categorical variable Grade has 144 observations: 48 Poorly diff (33.33%), 55 Intermediate (38.19%), 41 Well diff (28.47%).”

This is quite an interesting high level interface to R and in several ways, it overlaps/matches some of our project goals for creating a high level blocks programming language.

Over the past 9 months, our team has worked with Stan to migrate these scripts to a CRAN compliant R package. Although the package is not currently under review by CRAN, you can install it from GitHub with the following command:

devtools::install_github("stan-pounds/sbp")

These functions work seemlessly with our JupyterLab Blockly R extension and so create a goal-level blocks language for R - one that can be combined seemlessly with lower level blocks. A potential pedagogical approach for future work would be to transition students from goal-level blocks to a standard-R-level blocks and then finally to code.