Advocating for Automation:

Adapting Current Tools in Environmental Science through R

Hannah Podzorski

rstudio::conf(2022)

July 27, 2022

Why Automate?

Confused anime dude meme, where he's a programmer confusing a basic task with something that needs automation.

Reducing the Activation Energy

Line plot showing the progress of activation energy, or the energy needed to complete a chemical reaction.

Same plot of activation energy showing that activation energy has been reduced.

Let’s Start with the Pitch

Differences in Workflow

Reactionary Workflow

Four boxes labeled A, B, C, and D.

Four boxes labeled A, B, C, and D with an asterisk next to A.

Four boxes labeled A, B, C, and D with asterisks next to A and B.

Four boxes labeled A, B, C, and D with asterisks next to A, B, and C.

Four boxes labeled A, B, C, and D with asterisks next to A, B, C, and D.

Automated Workflow

Four boxes connected by arrows labeled A, B, C, and D.

Four boxes connected by arrows labeled A, B, C, and D with an asterisk next to A.

Four boxes connected by arrows labeled A, B, C, and D with an asterisk next to A and f(X) between each box.
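
The automated workflow in the diagram can be sketched in R as a chain of small functions, one per arrow, so that a change to A flows through to D on a re-run. This is a minimal illustration only: the function names, the well data, and the limit of 1 are all hypothetical and do not come from the talk.

```r
# Hypothetical steps standing in for the f(X) arrows between boxes A-D.

# f1: A -> B, average the raw results per well
summarize_results <- function(raw) {
  aggregate(conc ~ well, data = raw, FUN = mean)
}

# f2: B -> C, flag wells whose mean exceeds a (made-up) limit
flag_exceedances <- function(summary, limit = 1) {
  summary$exceeds <- summary$conc > limit
  summary
}

# f3: C -> D, turn the flagged table into report lines
format_report <- function(flagged) {
  sprintf("%s: mean %.2f%s", flagged$well, flagged$conc,
          ifelse(flagged$exceeds, " (exceeds limit)", ""))
}

# The whole workflow: only A is edited by hand, everything else is derived.
run_workflow <- function(raw) {
  step_b <- summarize_results(raw)
  step_c <- flag_exceedances(step_b)
  format_report(step_c)
}

# Box A: illustrative raw data
raw <- data.frame(
  well = c("MW-1", "MW-1", "MW-2", "MW-2"),
  conc = c(1.2, 1.8, 0.4, 0.6)
)

report <- run_workflow(raw)
print(report)
```

If a value in `raw` changes (the asterisk next to A), calling `run_workflow(raw)` again regenerates B, C, and D without any manual edits.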

Pros of Automation

  • Reproducibility
  • It can be simple!
  • Saves time
  • Less human interaction means fewer errors

Where to Start?

  • Start small: pick a task that can be automated in about the same time it takes to do the original task.
  • Meet team members where they are.

{openxlsx}

write.csv(data, "data.csv")

openxlsx::write.xlsx(data, "data.xlsx")

 

Example of formatted Excel table.
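
A formatted table like the one pictured can be produced with a few more {openxlsx} calls. The sketch below is one possible approach, not necessarily the one used in the talk: the data frame, sheet name, table style, and output path are placeholders chosen for illustration.

```r
library(openxlsx)

# Placeholder data; any data frame works here.
data <- data.frame(
  well = c("MW-1", "MW-2", "MW-3"),
  benzene_ug_L = c(5.2, 0.8, 12.4)
)

wb <- createWorkbook()
addWorksheet(wb, "Results")

# Write the data as an Excel table with bold, centered headers.
header_style <- createStyle(textDecoration = "bold", halign = "center")
writeDataTable(wb, sheet = "Results", x = data,
               tableStyle = "TableStyleMedium2",
               headerStyle = header_style)

# Size the columns to fit their contents.
setColWidths(wb, sheet = "Results", cols = seq_along(data), widths = "auto")

# Saved to a temporary directory here; point this at a project folder in practice.
out_file <- file.path(tempdir(), "data.xlsx")
saveWorkbook(wb, out_file, overwrite = TRUE)
```

Team members still receive an ordinary .xlsx file, which keeps the change small for anyone who works mainly in Excel.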

{officer}

 

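library(officer)
library(rvg)
library(magrittr)  # provides %>%

# `plot` is assumed to be an existing ggplot object; convert it to an
# editable vector graphic for PowerPoint.
plot <- rvg::dml(ggobj = plot)

# Add the graphic to a new slide at a fixed position (inches).
pptx <- read_pptx() %>%
  add_slide(layout = "Title and Content", master = "Office Theme") %>%
  ph_with(plot, location = ph_location(left = 1.3, top = 0.4, width = 8.75, height = 6.9))

# Write the deck to disk.
print(pptx, target = "./R/Fig-Example.pptx")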

{officer}

Delayed Gratification

ProUCL

  • Statistical Software for Left Censored Environmental Data
    • Calculates Upper Confidence Limits (UCLs)
  • Developed by the U.S. Environmental Protection Agency (EPA)
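
For readers unfamiliar with UCLs, the simplest case can be sketched in a few lines of R: a one-sided 95% UCL of the mean based on Student's t. This is only an illustration of the concept. It assumes every result is a detected value, so it does not handle left-censored (nondetect) data, and it is not a substitute for the methods ProUCL applies. The function name `t_ucl` and the concentrations are made up.

```r
# One-sided Student's t upper confidence limit of the mean.
# Assumes fully detected data; nondetects are NOT handled.
t_ucl <- function(x, conf = 0.95) {
  n <- length(x)
  stopifnot(n >= 2)
  mean(x) + qt(conf, df = n - 1) * sd(x) / sqrt(n)
}

# Made-up concentrations for illustration only.
conc <- c(2.1, 3.4, 1.9, 4.2, 2.8, 3.1)

ucl95 <- t_ucl(conc)
print(ucl95)
```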

 

EPA logo

ProUCL Automation

ProUCL Output

Example of the output file from ProUCL

Was automation the way to go?

Pros

  • Regulators are happy
  • Reduces human error
  • Saves time

Cons

  • Stability of ProUCL
  • Requires special setup

Final Thoughts

  • It’s OK to start small.
  • All skill sets are welcome!
  • The ultimate goal is to free up time for higher-value tasks.

Questions?

Slides and code available at github.com/hannahpodzorski/advocating-for-automation

 

Contact Information:

hpodzorski@gsi-net.com

twitter - @hpodz

github - hannahpodzorski