---
title: "Reproducibility and Session Auditing Workflows"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Reproducibility and Session Auditing Workflows}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, eval = FALSE)
library(devkit)
```

# Introduction

Reproducibility is the cornerstone of robust data science and package development. However, R scripts often introduce hidden side effects, such as modified global options, graphics parameters, or working directories, and many fail when run in a clean environment.

`devkit` provides a suite of auditing tools to detect these problems and make sessions easier to reproduce.

---

# 🕵️ Auditing Script Side Effects

R scripts often modify settings like `options()`, `par()`, or the working directory (`setwd()`). If a script does not restore these settings on exit, it leaves the user's environment in a mutated state.

`audit_script()` monitors a target script for such side effects. It runs the script, compares the session's settings before and after, and then offers an interactive choice to revert any changes.

```r
# Audit a script for environment side-effects
audit_script("scripts/generate_plots.R")
```
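
The before/after comparison described above can be illustrated in base R. The following is a minimal sketch, not `devkit`'s implementation: `capture_state()`, `audit_sketch()`, and `restore_state()` are hypothetical helpers, and the sketch tracks only `options()` and the working directory. It leaves out `par()` because calling `par()` with no open graphics device opens one, which is itself a side effect.

```r
# Hypothetical helper: record the settings this sketch tracks.
capture_state <- function() {
  list(options = options(), wd = getwd())
}

# Hypothetical helper: run a script, then report what it changed.
audit_sketch <- function(path) {
  before <- capture_state()
  source(path, local = new.env(parent = globalenv()))
  after <- capture_state()

  # An option counts as changed if it was added, removed, or given a new value.
  opt_names <- union(names(before$options), names(after$options))
  changed <- Filter(
    function(nm) !identical(before$options[[nm]], after$options[[nm]]),
    opt_names
  )

  list(
    changed_options = changed,
    wd_changed = !identical(before$wd, after$wd),
    before = before
  )
}

# Hypothetical helper: undo the changes reported by audit_sketch().
restore_state <- function(audit) {
  changed <- audit$changed_options
  if (length(changed) > 0) {
    old <- lapply(changed, function(nm) audit$before$options[[nm]])
    names(old) <- changed
    options(old)  # a NULL value removes an option the script added
  }
  setwd(audit$before$wd)
  invisible(NULL)
}

# Demo with a throwaway script that sets a made-up option and never resets it.
demo_script <- tempfile(fileext = ".R")
writeLines("options(devkit.sketch.flag = TRUE)", demo_script)

audit <- audit_sketch(demo_script)
print(audit$changed_options)
restore_state(audit)
```

Keeping the snapshot inside the audit result means the restore step needs no extra bookkeeping, which is one plausible way to support an interactive "revert" prompt.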

---

# ⚠️ Detecting Namespace Masking

Namespace conflicts occur when multiple attached packages export functions with the same name (e.g., `filter()` in both `dplyr` and `stats`). R calls whichever version comes first on the search path, so a change in the order in which packages are attached can silently change a script's behavior.

`detect_masking()` identifies all conflicts between currently attached packages and provides a report of conflicts and resolution paths.

```r
# Detect all namespace masking in the current session
mask_report <- detect_masking()

# Check detected conflicts
print(mask_report$conflicts)
```
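
Conceptually, a masking report lists every name that more than one attached package exports, in search-path order. The sketch below shows that idea in base R. `masking_sketch()` is a hypothetical helper and makes no claim about how `detect_masking()` is implemented or what its report contains.

```r
# Hypothetical helper: find names exported by more than one attached package.
masking_sketch <- function() {
  # Attached packages, in the order R searches them.
  pkgs <- grep("^package:", search(), value = TRUE)
  objs <- lapply(pkgs, function(p) ls(p))

  # One row per (object name, package) pair, still in search-path order.
  long <- data.frame(
    name = unlist(objs, use.names = FALSE),
    package = rep(sub("^package:", "", pkgs), lengths(objs)),
    stringsAsFactors = FALSE
  )

  # Keep only names that appear in at least two packages.
  dup <- unique(long$name[duplicated(long$name)])
  conflicts <- long[long$name %in% dup, ]

  # For each name, the first package listed is the one R finds first.
  split(conflicts$package, conflicts$name)
}

report <- masking_sketch()
print(head(report))
```

Base R offers a similar view through `conflicts(detail = TRUE)`. Whatever tool reports the conflict, a fully qualified call such as `stats::filter()` avoids depending on the attach order.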

---

# 🧪 Clean-Room Simulation

To ensure that your script does not rely on variables or objects defined in your active global environment, you should test it in a vanilla R session.

`simulate_clean_room()` launches a separate, clean R process (using `R --vanilla`) to execute the script and returns the result, verifying that the script is truly self-contained.

```r
# Run the script in an isolated vanilla R session
clean_res <- simulate_clean_room("scripts/model_fitting.R")

print(clean_res$success) # TRUE if the script executed with exit code 0
```
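
For intuition, the same kind of check can be approximated in base R by launching `Rscript --vanilla` with `system2()`. This is a minimal sketch under that assumption, not the package's implementation. `clean_room_sketch()` and the two demo scripts are hypothetical.

```r
# Hypothetical helper: run a script in a fresh vanilla R process.
clean_room_sketch <- function(path) {
  rscript <- file.path(R.home("bin"), "Rscript")
  # system2() warns on a non-zero exit code; the status attribute is read instead.
  output <- suppressWarnings(
    system2(rscript, c("--vanilla", shQuote(path)), stdout = TRUE, stderr = TRUE)
  )
  status <- attr(output, "status")  # only set when the exit code is non-zero
  list(
    success = is.null(status) || status == 0,
    output = as.character(output)
  )
}

# A variable that exists only in the current session.
threshold <- 0.5

# This script silently depends on `threshold` from the global environment.
leaky <- tempfile(fileext = ".R")
writeLines("print(threshold * 2)", leaky)

# This script defines everything it needs.
self_contained <- tempfile(fileext = ".R")
writeLines(c("threshold <- 0.5", "print(threshold * 2)"), self_contained)

leaky_res <- clean_room_sketch(leaky)
clean_res_sketch <- clean_room_sketch(self_contained)
print(c(leaky = leaky_res$success, self_contained = clean_res_sketch$success))
```

The leaky script works when sourced interactively because `threshold` happens to exist, which is exactly the kind of dependency a separate process exposes.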

---

# 📸 Session Snapshots for Portability

If you need to share your code or deploy it to production, you must document the exact versions of the packages attached to your current session.

`export_snapshot()` scans your session for external packages and generates a reproducible installer script. Running this generated script on another machine installs the exact package versions required.

```r
# Export an installer script that pins the attached package versions
export_snapshot(
  filename = "reproduce_env.R",
  include_versions = TRUE
)
```
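
To make the idea of a generated installer concrete, here is a minimal sketch that writes one install call per attached non-base package. It is not the format `export_snapshot()` produces. The helpers `attached_packages()`, `installer_lines()`, and `snapshot_sketch()` are hypothetical, and emitting `remotes::install_version()` calls is an assumption chosen for illustration.

```r
# Hypothetical helper: names of attached packages that are not part of base R.
attached_packages <- function() {
  names(utils::sessionInfo()$otherPkgs)  # NULL when none are attached
}

# Hypothetical helper: one pinned install call per package.
installer_lines <- function(pkgs) {
  vapply(
    pkgs,
    function(p) {
      sprintf(
        'remotes::install_version("%s", version = "%s")',
        p, as.character(utils::packageVersion(p))
      )
    },
    character(1),
    USE.NAMES = FALSE
  )
}

# Hypothetical helper: write the installer script to disk.
snapshot_sketch <- function(filename, pkgs = attached_packages()) {
  header <- "# Installer sketch: pins the versions attached at export time"
  writeLines(c(header, installer_lines(pkgs)), filename)
  invisible(filename)
}

out_file <- tempfile(fileext = ".R")
snapshot_sketch(out_file)
cat(readLines(out_file), sep = "\n")
```

In a session with no extra packages attached, the output contains only the header line. Pinning versions this way relies on the old versions still being available from the repository archive.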

---

# ⏱️ Performance Benchmarking across Git Branches

When refactoring code to improve speed, you should verify and quantify the performance improvement across Git branches.

`benchmark_branches()` runs a specific benchmarking script against multiple Git branches (e.g., `main` vs. a feature branch), automatically switching branches, executing the script, timing it, and restoring your original Git state when finished.

```r
# Compare execution times between main and a feature branch
bench_results <- benchmark_branches(
  script = "scripts/benchmark_heavy_load.R",
  branches = c("main", "feature/optimise-joins"),
  reps = 3
)

# Inspect the timing comparison data frame
print(bench_results)
```
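
The switch, run, time, and restore cycle can be sketched with `system2("git", ...)` and `system.time()`. This is an illustration under stated assumptions, not `devkit`'s implementation: `time_script()`, `run_git()`, and `benchmark_sketch()` are hypothetical helpers, the script is sourced in the current process, and the working tree is assumed to be clean so that `git checkout` succeeds.

```r
# Hypothetical helper: time repeated runs of a script in the current process.
time_script <- function(script, reps = 3) {
  vapply(seq_len(reps), function(i) {
    system.time(
      source(script, local = new.env(parent = globalenv()))
    )[["elapsed"]]
  }, numeric(1))
}

# Hypothetical helper: run a git command and stop on a non-zero exit code.
run_git <- function(...) {
  out <- suppressWarnings(
    system2("git", shQuote(c(...)), stdout = TRUE, stderr = TRUE)
  )
  status <- attr(out, "status")
  if (!is.null(status) && status != 0) {
    stop("git failed: ", paste(out, collapse = "\n"))
  }
  out
}

# Hypothetical helper: time the script on each branch, then restore the
# branch that was checked out at the start (a detached HEAD is not handled).
benchmark_sketch <- function(script, branches, reps = 3) {
  original <- run_git("rev-parse", "--abbrev-ref", "HEAD")
  on.exit(run_git("checkout", original), add = TRUE)

  rows <- lapply(branches, function(branch) {
    run_git("checkout", branch)
    data.frame(
      branch = branch,
      rep = seq_len(reps),
      elapsed = time_script(script, reps)
    )
  })
  do.call(rbind, rows)
}

# The timing helper works on its own, outside any repository.
demo_script <- tempfile(fileext = ".R")
writeLines("invisible(sum(sqrt(seq_len(1e5))))", demo_script)
timings <- time_script(demo_script, reps = 2)
print(timings)
```

Registering the restore step with `on.exit()` means the original branch is checked out again even if the script fails on one of the branches. Inside a repository, `benchmark_sketch()` would return one row per branch and repetition.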