
KIDS23 Workshops

17 May 2023 #service #outreach #datawhys #data-science #programming #blocks

We presented two workshops at the Knowledge In Data Science (KIDS23) Symposium at St. Jude Children’s Research Hospital.

Both workshops were based on our DataWhys curriculum.

The first workshop was oriented towards those with limited knowledge of programming. Here is the abstract:

Blocks-based Data Science (Python)

Participants in this workshop will learn and use a blocks-based programming language to solve data science problems in the JupyterLab computational notebook environment. Blocks-based programming allows users unfamiliar with programming to manipulate blocks, which fit together like puzzle pieces, to generate Python code. This removes some of the burden of learning to program (memorizing syntax, fixing syntax errors, etc.) and allows users to focus on solving data science problems. This workshop will cover introductory content, including loading data from files, basic data manipulation, plotting, and descriptive statistics. A free companion online course that takes the same approach to more advanced topics will be offered in the near future.

On GitHub here and here
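
To give a rough sense of the kind of Python the topics in this abstract involve, here is a minimal sketch. It is not taken from the workshop materials: the data, the column names, and the `load_rows` helper are all made up for illustration, and it sticks to the Python standard library so it runs anywhere. Plotting is left out because it needs a third-party library.

```python
import csv
import io
import statistics

# Hypothetical data: an in-memory string stands in for a CSV file on disk.
CSV_TEXT = """name,age,height_cm
Ana,34,165.0
Ben,29,180.5
Cho,41,172.0
Dee,25,158.5
"""


def load_rows(handle):
    """Load CSV rows into a list of dicts, converting the numeric columns."""
    rows = []
    for row in csv.DictReader(handle):
        rows.append(
            {
                "name": row["name"],
                "age": int(row["age"]),
                "height_cm": float(row["height_cm"]),
            }
        )
    return rows


# Loading data (with a real file, pass open("people.csv", newline="") instead).
rows = load_rows(io.StringIO(CSV_TEXT))

# Basic data manipulation: keep only the rows where age is over 30.
over_30 = [r for r in rows if r["age"] > 30]

# Descriptive statistics on a single column.
heights = [r["height_cm"] for r in rows]
summary = {
    "count": len(heights),
    "mean": statistics.mean(heights),
    "median": statistics.median(heights),
    "stdev": statistics.stdev(heights),
}

print([r["name"] for r in over_30])
print(summary)
```

Each step here (load, filter, summarize) is the sort of thing a single block or short chain of blocks would stand for.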

The second workshop was oriented towards those with some knowledge of R who wanted to learn more about data manipulation. Here’s that abstract:

Data Manipulation using dplyr (R)

Participants in this workshop will learn a variety of data manipulation techniques using the popular R package dplyr (and friends), including selecting rows, selecting columns, pivoting, and summarizing data, all inside the JupyterLab computational notebook environment. Participants highly familiar with R will work directly with R code; participants less familiar with R can use a blocks-based programming language that generates R code. Blocks-based programming allows users unfamiliar with programming to manipulate blocks, which fit together like puzzle pieces, removing some of the burden of learning to program (memorizing syntax, fixing syntax errors, etc.) and allowing users to focus on solving data science problems. A free companion online course that takes the same approach to more advanced topics will be offered in the near future.

On GitHub here and here
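
For readers who have not seen these operations before, here is a minimal sketch of the four techniques named in the abstract. It is not the workshop's R code: it uses Python's pandas library as a stand-in for dplyr, the data frame and its column names are invented for illustration, and the comments name the R functions that play roughly the same role.

```python
import pandas as pd

# Hypothetical long-format data: one row per (subject, visit) measurement.
df = pd.DataFrame(
    {
        "subject": ["a", "a", "b", "b"],
        "visit": ["v1", "v2", "v1", "v2"],
        "score": [10, 14, 8, 12],
        "site": ["x", "x", "y", "y"],
    }
)

# Selecting rows: roughly dplyr's filter(score > 9).
high = df[df["score"] > 9]

# Selecting columns: roughly dplyr's select(subject, score).
cols = df[["subject", "score"]]

# Pivoting long to wide: roughly tidyr's pivot_wider(), one of dplyr's "friends".
wide = df.pivot(index="subject", columns="visit", values="score")

# Summarizing: roughly group_by(subject) followed by summarize(mean(score)).
means = df.groupby("subject", as_index=False)["score"].mean()

print(high)
print(cols)
print(wide)
print(means)
```

The correspondence is loose (pandas and dplyr differ in many details), but the shape of each step is the same: start from a table, apply one verb, get a new table.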

Both went well, but the dplyr workshop went particularly well. I think this was because its participants were generally at the same skill level, while the introductory workshop, strangely, had a wide range of skill levels, including people who already knew how to program.