August 26th, 2015

Who Is Jim Hester?

  • lintr - Static Code Analysis / Linting
  • covr - Code Coverage Analysis / Reporting

Lintr Motivation

Existing Alternatives

  • lint - Andrew Redd
    • Stagnated development, Feb 5, 2013
    • Slow performance?
    • Limited set of linters
  • svTools - Philippe Grosjean, Romain Francois
    • Major development 2010, last bugfix release Mar 2014
    • Many features unrelated to linting
    • Linting implementation wraps codetools functions
  • codetools - Luke Tierney
    • Used internally by R CMD check
    • Checks for possible usage errors, not style
  • shinyapps - Kevin Ushey
    • Shiny app website only

Demo

  • R terminal
  • Vim
  • Emacs
  • Sublime Text
  • RStudio
  • Travis-CI

Configuration

  • Some linters have arguments
    • line_length_linter(120)
  • Arguments to lint function
  • Per project configuration file
  • Exclusions
    • Whole files
    • Line
    • Ranges
    • In-source and outside
  • camelCase vs snake_case vs ambiguous.case
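
The in-source exclusions listed above can be sketched in base R. This is an illustration only, not lintr's implementation: `excluded_lines()` is a hypothetical helper, and it assumes a `# nolint` marker at the end of a line plus balanced `# nolint start` / `# nolint end` pairs for ranges.

```r
# Hypothetical helper: returns the line numbers excluded by in-source markers.
excluded_lines <- function(lines) {
  # Single lines that end in a bare "# nolint" marker
  single <- grep("#[[:space:]]*nolint[[:space:]]*$", lines)

  # Range markers; this sketch assumes they are balanced and in order
  starts <- grep("# nolint start", lines, fixed = TRUE)
  ends <- grep("# nolint end", lines, fixed = TRUE)
  stopifnot(length(starts) == length(ends))

  # Expand each start/end pair into the full run of line numbers
  ranges <- unlist(Map(seq, starts, ends))
  sort(unique(c(single, ranges)))
}

lines <- c(
  "x = 1 # nolint",
  "y <- 2",
  "# nolint start",
  "z = 3",
  "# nolint end",
  "w = 4"
)
excluded <- excluded_lines(lines)
```

A linter run would then drop any finding whose line number appears in `excluded`.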

Implementation

  • base::parse()
    • Parse errors -> lints of type "error"
  • utils::getParseData()
sf <- srcfile("bad.R")
p <- try(parse(text=readLines(sf$filename),
               srcfile=sf,
               keep.source = TRUE))
## Error in parse(text = readLines(sf$filename), srcfile = sf, keep.source = TRUE) : 
##   bad.R:11:0: unexpected end of input
## 9:   5}  
## 10: {
##    ^
(pd <- getParseData(sf))
##     line1 col1 line2 col2  id parent                token terminal
## 120     1    1     9    4 120      0         equal_assign    FALSE
## 1       1    1     1    3   1      3               SYMBOL     TRUE
## 3       1    1     1    3   3    120                 expr    FALSE
## 2       1    5     1    5   2    120            EQ_ASSIGN     TRUE
## 118     1    7     9    4 118    120                 expr    FALSE
## 4       1    7     1   14   4    118             FUNCTION     TRUE
## 5       1   15     1   15   5    118                  '('     TRUE
## 6       1   16     1   18   6    118       SYMBOL_FORMALS     TRUE
## 7       1   19     1   19   7    118                  ')'     TRUE
## 115     2    1     9    4 115    118                 expr    FALSE
## 10      2    1     2    1  10    115                  '{'     TRUE
## 22      3    3     3   25  22    115                 expr    FALSE
## 12      3    3     3   14  12     14               SYMBOL     TRUE
## 14      3    3     3   14  14     22                 expr    FALSE
## 13      3   16     3   17  13     22          LEFT_ASSIGN     TRUE
## 21      3   19     3   25  21     22                 expr    FALSE
## 15      3   19     3   21  15     17               SYMBOL     TRUE
## 17      3   19     3   21  17     21                 expr    FALSE
## 16      3   23     3   23  16     21                  '+'     TRUE
## 18      3   25     3   25  18     19            NUM_CONST     TRUE
## 19      3   25     3   25  19     21                 expr    FALSE
## 53      4    3     4   41  53    115                 expr    FALSE
## 25      4    3     4    6  25     27               SYMBOL     TRUE
## 27      4    3     4    6  27     53                 expr    FALSE
## 26      4    8     4    9  26     53          LEFT_ASSIGN     TRUE
## 52      4   11     4   41  52     53                 expr    FALSE
## 28      4   11     4   16  28     30               SYMBOL     TRUE
## 30      4   11     4   16  30     52                 expr    FALSE
## 29      4   18     4   19  29     52          LEFT_ASSIGN     TRUE
## 50      4   21     4   41  50     52                 expr    FALSE
## 31      4   21     4   26  31     33 SYMBOL_FUNCTION_CALL     TRUE
## 33      4   21     4   26  33     50                 expr    FALSE
## 32      4   27     4   27  32     50                  '('     TRUE
## 40      4   28     4   31  40     50                 expr    FALSE
## 34      4   28     4   28  34     35            NUM_CONST     TRUE
## 35      4   28     4   28  35     40                 expr    FALSE
## 36      4   29     4   29  36     40                  ':'     TRUE
## 37      4   30     4   31  37     38            NUM_CONST     TRUE
## 38      4   30     4   31  38     40                 expr    FALSE
## 39      4   32     4   32  39     50                  ','     TRUE
## 43      4   33     4   36  43     50           SYMBOL_SUB     TRUE
## 44      4   38     4   38  44     50               EQ_SUB     TRUE
## 45      4   40     4   40  45     46            NUM_CONST     TRUE
## 46      4   40     4   40  46     50                 expr    FALSE
## 47      4   41     4   41  47     50                  ')'     TRUE
## 66      5    3     5   12  66    115                 expr    FALSE
## 56      5    3     5    6  56     58               SYMBOL     TRUE
## 58      5    3     5    6  58     66                 expr    FALSE
## 57      5    7     5    7  57     66                  '['     TRUE
## 59      5    9     5    9  59     60            NUM_CONST     TRUE
## 60      5    9     5    9  60     66                 expr    FALSE
## 61      5   10     5   10  61     66                  ','     TRUE
## 64      5   12     5   12  64     66                  ']'     TRUE
## 76      6    3     6   13  76    115                 expr    FALSE
## 70      6    3     6    5  70     72               SYMBOL     TRUE
## 72      6    3     6    5  72     76                 expr    FALSE
## 71      6    7     6    8  71     76          LEFT_ASSIGN     TRUE
## 73      6   10     6   13  73     75            STR_CONST     TRUE
## 75      6   10     6   13  75     76                 expr    FALSE
## 89      7    3     7   17  89    115                 expr    FALSE
## 79      7    3     7    7  79     81               SYMBOL     TRUE
## 81      7    3     7    7  81     89                 expr    FALSE
## 80      7    9     7   10  80     89          LEFT_ASSIGN     TRUE
## 88      7   12     7   17  88     89                 expr    FALSE
## 82      7   12     7   14  82     84               SYMBOL     TRUE
## 84      7   12     7   14  84     88                 expr    FALSE
## 83      7   15     7   15  83     88                  '+'     TRUE
## 85      7   17     7   17  85     86            NUM_CONST     TRUE
## 86      7   17     7   17  86     88                 expr    FALSE
## 107     8    3     8   19 107    115                 expr    FALSE
## 92      8    3     8    4  92    107                   IF     TRUE
## 93      8    5     8    5  93    107                  '('     TRUE
## 100     8    6     8   16 100    107                 expr    FALSE
## 94      8    6     8    8  94     96               SYMBOL     TRUE
## 96      8    6     8    8  96    100                 expr    FALSE
## 95      8   10     8   11  95    100                   EQ     TRUE
## 97      8   13     8   16  97     99            STR_CONST     TRUE
## 99      8   13     8   16  99    100                 expr    FALSE
## 98      8   17     8   17  98    107                  ')'     TRUE
## 102     8   19     8   19 102    103            NUM_CONST     TRUE
## 103     8   19     8   19 103    107                 expr    FALSE
## 110     9    3     9    3 110    111            NUM_CONST     TRUE
## 111     9    3     9    3 111    115                 expr    FALSE
## 112     9    4     9    4 112    115                  '}'     TRUE
## 123    10    1    10    1 123      0                  '{'     TRUE
##             text
## 120             
## 1            fun
## 3               
## 2              =
## 118             
## 4       function
## 5              (
## 6            one
## 7              )
## 115             
## 10             {
## 22              
## 12  one.plus.one
## 14              
## 13            <-
## 21              
## 15           oen
## 17              
## 16             +
## 18             1
## 19              
## 53              
## 25          four
## 27              
## 26            <-
## 52              
## 28        newVar
## 30              
## 29            <-
## 50              
## 31        matrix
## 33              
## 32             (
## 40              
## 34             1
## 35              
## 36             :
## 37            10
## 38              
## 39             ,
## 43          nrow
## 44             =
## 45             2
## 46              
## 47             )
## 66              
## 56          four
## 58              
## 57             [
## 59             1
## 60              
## 61             ,
## 64             ]
## 76              
## 70           txt
## 72              
## 71            <-
## 73          'hi'
## 75              
## 89              
## 79         three
## 81              
## 80            <-
## 88              
## 82           two
## 84              
## 83             +
## 85             1
## 86              
## 107             
## 92            if
## 93             (
## 100             
## 94           txt
## 96              
## 95            ==
## 97          'hi'
## 99              
## 98             )
## 102            4
## 103             
## 110            5
## 111             
## 112            }
## 123            {
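
A table like the one above is easier to work with once it is filtered. The sketch below uses only base R and utils; the three-line input is a shortened, syntactically valid stand-in for the first lines of bad.R, not the original file.

```r
# Shortened stand-in for the start of bad.R (assumption: valid code only)
code <- c(
  "fun = function(one) {",
  "  one.plus.one <- oen + 1",
  "}"
)

# keep.source = TRUE is needed so that parse data is recorded
exprs <- parse(text = code, keep.source = TRUE)
pd <- utils::getParseData(exprs)

# Terminal rows are the actual tokens; non-terminal rows are `expr` nodes
terminals <- pd[pd$terminal, c("line1", "col1", "token", "text")]

# Tokens on line 2, in column order
line2 <- terminals[terminals$line1 == 2, ]
line2 <- line2[order(line2$col1), ]
```

`line2$token` and `line2$text` then give the token stream for that line, which is the raw material most token-based linters work from.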

Example linter

assignment_linter <- function(source_file) {
  lapply(ids_with_token(source_file, "EQ_ASSIGN"),
    function(id) {
      parsed <- source_file$parsed_content[id, ]
      Lint(
        filename = source_file$filename,
        line_number = parsed$line1,
        column_number = parsed$col1,
        type = "style",
        message = "Use <-, not =, for assignment.",
        line = source_file$lines[parsed$line1]
        )
    })
}
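
The same idea can be tried without lintr installed. The sketch below is a base R analogue, not lintr's API: `find_eq_assign()` is a hypothetical function that returns a data frame instead of `Lint` objects.

```r
# Hypothetical base R analogue of assignment_linter
find_eq_assign <- function(lines) {
  pd <- utils::getParseData(parse(text = lines, keep.source = TRUE))

  # EQ_ASSIGN is `=` used for assignment; `=` in a call argument is EQ_SUB
  hits <- pd[pd$token == "EQ_ASSIGN", ]

  data.frame(
    line_number = hits$line1,
    column_number = hits$col1,
    message = rep("Use <-, not =, for assignment.", nrow(hits)),
    stringsAsFactors = FALSE
  )
}

lints <- find_eq_assign(c("x = 1", "y <- f(a = 2)"))
```

Only the `=` on the first line is reported; the named argument `a = 2` is a different token and is left alone.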

Example linter

trailing_whitespace_linter <- function(source_file) {
  res <- re_matches(source_file$lines,
    rex(capture(name = "space", some_of(" ", regex("\\t"))), or(newline, end)),
    global = TRUE,
    locations = TRUE)

  lapply(seq_along(source_file$lines), function(itr) {

      mapply(
        FUN = function(start, end) {
          if (is.na(start)) {
            return()
          }
          line_number <- names(source_file$lines)[itr]
          Lint(
            filename = source_file$filename,
            line_number = line_number,
            column_number = start,
            type = "style",
            message = "Trailing whitespace is superfluous.",
            line = source_file$lines[as.character(line_number)],
            ranges = list(c(start, end)),
            linter = "trailing_whitespace_linter"
            )
        },
        start = res[[itr]]$space.start,
        end = res[[itr]]$space.end,
        SIMPLIFY = FALSE
        )
  })
}
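
For comparison, a text-based check like this one does not need the parse data at all. A minimal base R sketch follows, using `regexpr()` instead of rex; `find_trailing_whitespace()` is a hypothetical name.

```r
# Hypothetical base R analogue of trailing_whitespace_linter
find_trailing_whitespace <- function(lines) {
  # One or more spaces or tabs at the end of the line
  m <- regexpr("[ \t]+$", lines)
  start <- as.vector(m)
  len <- attr(m, "match.length")
  hit <- start > 0

  data.frame(
    line_number = which(hit),
    start = start[hit],
    end = start[hit] + len[hit] - 1L,
    stringsAsFactors = FALSE
  )
}

found <- find_trailing_whitespace(c("x <- 1  ", "y <- 2", "z\t"))
```

Each row gives the line and the column range of the offending whitespace, matching the `ranges` argument passed to `Lint()` above.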

Implementation

  • Linting speed is an issue
    • ~20 seconds to lint lintr itself (60 files)
  • Caching
    • Experimental
    • Per expression
    • Cache dependencies still a work in progress
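
Per-expression caching can be sketched as a lookup keyed on the expression text. This is a simplified illustration and not lintr's cache: `make_cached_linter()` and `slow_lint()` are hypothetical, the cache lives only in memory, and cache dependencies are ignored entirely.

```r
# Hypothetical wrapper: lint each distinct expression text only once
make_cached_linter <- function(lint_fun) {
  cache <- new.env(parent = emptyenv())
  function(expr_text) {
    if (!exists(expr_text, envir = cache, inherits = FALSE)) {
      assign(expr_text, lint_fun(expr_text), envir = cache)
    }
    get(expr_text, envir = cache, inherits = FALSE)
  }
}

# Stand-in for an expensive linter; counts how often it really runs
runs <- 0
slow_lint <- function(expr_text) {
  runs <<- runs + 1
  grepl("=", expr_text, fixed = TRUE)
}

cached_lint <- make_cached_linter(slow_lint)
first <- cached_lint("x = 1")
second <- cached_lint("x = 1")   # served from the cache
third <- cached_lint("y <- 2")
```

Unchanged expressions are never re-linted, so editing one function in a file only costs the time to lint that function.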

Future Directions

  • Bioconductor linters
    • Very close to Hadley's style
    • camelCase
    • no spaces in argument lists (a=b)
  • Google Style
  • Improve Performance
    • C/C++ helper utility functions?
  • Automatic reformatting/tidying (formatR)
    • False positives

Lintr

  • Integrated with common editors
  • Works with both packages and scripts
  • Style, syntax and potential usage errors
  • Easy to understand output
  • Configurable
  • Lintr questions?

Covr

What is it?

  • Test Coverage - How much of my code is run by tests?
  • Mid-December 2014
  • Test/Example/Vignette Coverage
    • R Code
    • Compiled C/C++/Fortran Code

Motivation

Existing Alternatives

  • R-coverage by Karl Forner
    • Modify R source and add instrumentation
    • Requires patching and recompiling R source
  • testCoverage by Tom Taverner, Chris Campbell, Suchen Jin
    • Alternate parser
    • Complicated implementation
    • No S4 support
    • Limited output formats
    • Challenging usage instructions

Demo

  • R terminal
  • Shiny application
  • Coveralls.io
  • Codecov.io

Configuration

  • Exclusions (Cheating!)
    • Whole files
    • By Line
    • Ranges
    • In-source annotations and exclusion argument.

Implementation

R’s Abstract Syntax Tree

  • How it works vignette
  • Walk the Abstract Syntax Tree
    • If a call with srcref
      • Add a trace function before call
      • Perform the call

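fun <- function(x, ...) {
  recurse <- function(y) {
    # Apply fun to every element, passing extra arguments through
    lapply(y, fun, ...)
  }

  if (is.atomic(x) || is.name(x)) {
    # Base case: constants and symbols are returned unchanged
    x
  } else if (is.call(x)) {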
    as.call(recurse(x))
  } else if (is.function(x)) {
    formals(x) <- fun(formals(x), ...)
    body(x) <- fun(body(x), ...)
    x
  } else if (is.pairlist(x)) {
    as.pairlist(recurse(x))
  } else if (is.expression(x)) {
    as.expression(recurse(x))
  } else if (is.list(x)) {
    recurse(x)
  } else {
    stop("Unknown language class: ", paste(class(x), collapse = "/"),
      call. = FALSE)
  }
}
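
The same recursive pattern can also read the tree instead of rebuilding it. The sketch below collects the name of every function called in an expression; `call_names()` is a hypothetical helper, and it does not handle calls with missing arguments (such as `x[1, ]`) or function definitions.

```r
# Hypothetical walker: collect the names of all calls in an expression
call_names <- function(x) {
  if (is.atomic(x) || is.name(x)) {
    # Base case: constants and symbols contain no calls
    character()
  } else if (is.call(x)) {
    # Record this call's name (if it has one), then descend into its parts
    head <- if (is.name(x[[1]])) as.character(x[[1]]) else character()
    c(head, unlist(lapply(as.list(x), call_names)))
  } else if (is.pairlist(x) || is.expression(x) || is.list(x)) {
    as.character(unlist(lapply(x, call_names)))
  } else {
    stop("Unknown language class: ", paste(class(x), collapse = "/"),
      call. = FALSE)
  }
}

found <- call_names(quote(y <- f(g(1) + 2)))
```

Here `found` lists `<-`, `f`, `+` and `g`, in the order the walker meets them.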

Modify Calls

  • How to insert a function call without changing the output?
  • Braces evaluate each expression and return the result of the last
identical({ 1 + 2; 3 + 4 }, `{`(1 + 2, 3 + 4))
## [1] TRUE
`{`(count(), as.call(recurse(x)))
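
A minimal sketch of the brace trick applied to a whole function body follows. It is not covr's code: `count()`, `counts` and `instrument()` are hypothetical, only top-level statements are wrapped, and the body is assumed to be a `{` block.

```r
counts <- new.env(parent = emptyenv())

# Hypothetical counter: records one hit for the given key
count <- function(key) {
  previous <- if (exists(key, envir = counts, inherits = FALSE)) {
    get(key, envir = counts, inherits = FALSE)
  } else {
    0
  }
  assign(key, previous + 1, envir = counts)
  invisible(NULL)
}

# Wrap each top-level statement as { count(key); statement }
instrument <- function(f) {
  statements <- as.list(body(f))[-1]  # drop the leading `{`
  wrapped <- lapply(seq_along(statements), function(i) {
    key <- paste0("statement_", i)
    statement <- statements[[i]]
    bquote({
      count(.(key))
      .(statement)
    })
  })
  body(f) <- as.call(c(list(as.name("{")), wrapped))
  f
}

f1 <- function(x) {
  x <- x + 1
  y <- x
  y
}

traced <- instrument(f1)
result <- traced(1)
```

Because each brace block returns its last expression, `traced(1)` gives the same value as `f1(1)`, while `counts` now holds one hit per statement.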

Source References

  • Where in the source does a call come from?
  • srcref
    • options(keep.source = TRUE)
    • srcref attribute attached to each call

f1 <- function(x) {
  x <- x + 1
  y <- x
  y
}

covr:::trace_calls(f1)
## function (x) 
## {
##     if (TRUE) {
##         covr:::count("<text>:2:3:2:12:3:12:2:2")
##         x <- x + 1
##     }
##     if (TRUE) {
##         covr:::count("<text>:3:3:3:8:3:8:3:3")
##         y <- x
##     }
##     if (TRUE) {
##         covr:::count("<text>:4:3:4:3:3:3:4:4")
##         y
##     }
## }
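
The numbers in the keys above come from the srcref attached to each statement. The sketch below inspects those srcrefs directly in base R; it passes `keep.source = TRUE` to `parse()` so that it does not depend on the session option.

```r
src <- "function(x) {\n  x <- x + 1\n  y <- x\n  y\n}"
f1 <- eval(parse(text = src, keep.source = TRUE))

# The body's srcref attribute is a list: one entry for `{`, then one per statement
refs <- attr(body(f1), "srcref")

# Source text of each statement
statement_text <- vapply(refs[-1], function(ref) {
  paste(as.character(ref), collapse = "\n")
}, character(1))

# Position of the first statement: first line, first byte, last line, last byte
first_position <- as.integer(refs[[2]])[1:4]
```

`first_position` corresponds to the leading `2:3:2:12` in the first `covr:::count()` key shown above.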

Replacing functions

  • Replace all references with modified versions.
  • testthat::with_mock()
    • C function replaces function pointer
    • stores original definition (reversible)
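
The reversible part can be illustrated in plain R. The sketch below swaps a binding in an environment and restores it on exit; it is a simpler stand-in for the C-level replacement described above, and `with_replacement()` is a hypothetical helper.

```r
# Hypothetical helper: temporarily replace `name` in `env` while `code` runs
with_replacement <- function(env, name, new, code) {
  original <- get(name, envir = env, inherits = FALSE)
  assign(name, new, envir = env)
  # Restore the original definition even if `code` fails
  on.exit(assign(name, original, envir = env), add = TRUE)
  force(code)  # `code` is lazy, so it is evaluated after the swap
}

env <- new.env()
env$f <- function() "original"

during <- with_replacement(env, "f", function() "replaced", env$f())
after <- env$f()
```

Unlike this sketch, replacing the function pointer itself also affects references that were captured before the swap.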

S4 Methods

  • Normal functions defined directly in package namespace
  • S4 methods are defined in an environment based on their generic
replacements_S4 <- function(env) {
  generics <- getGenerics(env)

  unlist(recursive = FALSE,
    Map(generics@.Data, generics@package, USE.NAMES = FALSE,
      f = function(name, package) {
      what <- methodsPackageMetaName("T", paste(name, package, sep = ":"))

      table <- get(what, envir = env)

      lapply(ls(table, all.names = TRUE), replacement, env = table)
    })
  )
}
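
To see where S4 methods live, the sketch below defines a small generic and looks at its methods table through exported functions from the methods package. The class `Square` and generic `area` are made up for the example.

```r
library(methods)

# Made-up class and generic for illustration
setClass("Square", representation(side = "numeric"))
setGeneric("area", function(shape) standardGeneric("area"))
setMethod("area", "Square", function(shape) shape@side^2)

# The methods are stored in a table (an environment) tied to the generic,
# keyed by signature, rather than as ordinary functions in the namespace
method_table <- getMethodsForDispatch(getGeneric("area"))
signatures <- ls(method_table, all.names = TRUE)

# A single method definition can be fetched by signature
method <- getMethod("area", "Square")
value <- area(new("Square", side = 3))
```

Instrumenting S4 code therefore means visiting each entry of such a table, which is what `replacements_S4()` does through the table's internal name.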

Compiled Code

  • Gcov
    • built into gcc and clang
      • -fprofile-arcs -ftest-coverage
      • -O0
    • Need to override default and package Makevars
      • PKG_CFLAGS flags are placed before the default -O2, so an -O0 set there is overridden
    • Temporarily point to a different global Makevars (retaining ~/.R/Makevars values)
    • No results until process terminated
      • Call R subprocess
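
Pointing R at a temporary global Makevars can be sketched with the `R_MAKEVARS_USER` environment variable. This is an outline under assumptions, not covr's code: the `LDFLAGS = --coverage` line is an addition for linking, existing `~/.R/Makevars` values are not merged in, and the actual build in a subprocess is left as a comment.

```r
# Write a temporary Makevars with the gcov flags
coverage_makevars <- tempfile("Makevars")
writeLines(c(
  "CFLAGS = -O0 -fprofile-arcs -ftest-coverage",
  "CXXFLAGS = -O0 -fprofile-arcs -ftest-coverage",
  "LDFLAGS = --coverage"  # assumption: links the gcov runtime
), coverage_makevars)

# Remember the current setting so it can be restored afterwards
old <- Sys.getenv("R_MAKEVARS_USER", unset = NA)
Sys.setenv(R_MAKEVARS_USER = coverage_makevars)
active <- Sys.getenv("R_MAKEVARS_USER")

# ... install the package and run its tests in an R subprocess here ...

# Restore the previous state
if (is.na(old)) {
  Sys.unsetenv("R_MAKEVARS_USER")
} else {
  Sys.setenv(R_MAKEVARS_USER = old)
}
restored <- Sys.getenv("R_MAKEVARS_USER", unset = NA)
```

The subprocess matters because gcov only writes its data files when the process that ran the compiled code exits.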

Running Tests

  • base::source() on tests/*.[Rr]
  • Test framework agnostic
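
Running every test file without caring about the framework can be sketched as below. `run_test_files()` is a hypothetical helper using `base::source()`; the temporary directory and its files exist only for the demonstration.

```r
# Hypothetical helper: source every .R/.r file in a directory
run_test_files <- function(test_dir, env = new.env(parent = globalenv())) {
  files <- list.files(test_dir, pattern = "\\.[Rr]$", full.names = TRUE)
  for (file in files) {
    source(file, local = env)
  }
  invisible(files)
}

# Demonstration with a throwaway tests directory
test_dir <- file.path(tempdir(), "covr-sketch-tests")
dir.create(test_dir, showWarnings = FALSE)
writeLines(c("stopifnot(1 + 1 == 2)", "ran <- TRUE"),
  file.path(test_dir, "test-basic.R"))
writeLines("not R code", file.path(test_dir, "notes.txt"))

env <- new.env(parent = globalenv())
files <- run_test_files(test_dir, env)
```

Whatever the test files call (testthat, RUnit, plain `stopifnot()`), the instrumented functions are exercised the same way.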

Running Vignettes

  • knitr::knit(tangle = TRUE) # generate R script
  • Run with base::source()

Running Examples

  • tools:::.createExdotR # generate R script from .Rd
  • Script adjusted to use a temp directory and not quit R.

Coverage Services

  • Track coverage over time
  • Report on coverage drops from contributions.
    • Coveralls.io
    • Codecov.io
  • Expect a JSON file
    • coverage per line
    • source code per line
  • Travis CI
    • Travis Job ID

Covr Usage / Shields

Future

  • Ideas?
  • Comments / Questions?