---
title: "Structured Logging"
date: "Updated as of `r as.Date(Sys.time())`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Structured Logging}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(log4r)

# Dummy Plumber request & response.
req <- list(
  REQUEST_METHOD = "POST", PATH_INFO = "/upload",
  QUERY_STRING = "", HTTP_USER_AGENT = "curl/7.58.0",
  REMOTE_ADDR = "124.133.52.161"
)
res <- list(status = 401)

# Dummy CSV-parsing stuff:
filename <- "catpics_01.csv"
entries <- data.frame(x = 1:4124)
start <- Sys.time()
```

Traditionally, messages emitted from R packages or scripts are unstructured messages, like this one from the **shiny** package:

``` text
Listening on http://localhost:8080
```

A richer, more structured representation of this log message might be:

``` text
level=INFO ts=2021-10-21T20:21:01Z message="starting Shiny" host=localhost port=8080 shiny_version=1.6.0 appdir=projects/mycoolapp
```

This second message uses *structured logging*, attaching relevant metadata to a log message as standalone fields.

Structured logs have two advantages:

* They tend to have more standard, predictable content, which can make reading
  them easier (especially if you did not write the message yourself); and

* They are much, much easier for log management and aggregation systems to
  query -- fields can be used to aggregate logs into metrics like "how
  many times has X happened" and "how long does task Y take on average", and
  individual fields can be used to answer questions like "what happened to user
  Z over their last few logins".
  
**log4r** includes support for two of the most popular structured logging
formats:

* `json_log_layout`: Emit log messages as JSON, likely the most widely-used.

* `logfmt_log_layout`: Emit log messages using the more human-friendly [logfmt](https://brandur.org/logfmt).

To use these formats, you can pass additional arguments to the existing logging
functions `info()`, `warn()` and `error()`.

## JSON Logs

The most popular format for structured logging is probably JSON, which you can
configure as follows:

```{r json}
logger <- logger(appenders = console_appender(json_log_layout()))
```

As an example, suppose you are logging unauthorised requests to a [Plumber](https://www.rplumber.io/) API. You might
have a log message with fields like the following:

```{r json-example}
# Here "req" and "res" are slightly fake request & response objects.
info(
  logger, message = "authentication failed",
  method = req$REQUEST_METHOD,
  path = req$PATH_INFO,
  params = sub("^\\?", "", req$QUERY_STRING),
  user_agent = req$HTTP_USER_AGENT,
  remote_addr = req$REMOTE_ADDR,
  status = res$status
)
```

## logfmt Logs

An alternative to JSON is the popular, more human-friendly logfmt style, which
you can configure as follows:

```{r logfmt}
logger <- logger(appenders = console_appender(logfmt_log_layout()))
```

As an example, you might have the following in a script that processes many CSV
files:

```{r logfmt-example}
info(
  logger, message = "processed entries", file = filename,
  entries = nrow(entries),
  elapsed = unclass(difftime(Sys.time(), start, units = "secs"))
)
```