Testing CouchDB

Testing CouchDB is currently quite hard. The main difficulties are complex setup functions and weak isolation between test cases. The test_util module has some functions that simplify setup, but not by much. This proposal defines requirements for the testing infrastructure and proposes a solution that satisfies most of them.

Why is it so hard to write certain kinds of tests?

Setup / teardown overhead.
Difficult-to-reproduce failure modes.
Tests which require clusters rather than single nodes.
Tests for issues that only manifest at scale (e.g. >100 nodes).
Verifying/manipulating aspects of internal state during / after a test.
PRs for functional changes are not isolated to single repositories, which makes it difficult to integrate with CI (required to enforce any must-have-passing-tests rules we might want) or to manually test multi-repo changes.
Code which is difficult to test due to coupling of IO with pure logic.

What might make things easier?

A high level API for creating test fixtures (e.g. nodes/clusters in specific failure states but also other things like configuration files, logs, etc).
A high level API for manipulating node/cluster state (e.g. stop/start node, simulate packet loss, partition one or more nodes).
Lower-level constructs for directly modifying code under-test e.g.:
- forcing a particular error to be thrown at a particular time
- tapping into the output of a logger
- tapping into metrics collector
- being able to tell that specified function has been called and possibly get arguments passed to that function
- being able to store terms into temporary storage during the execution of a test case
- facilities to group tests
- ability to suppress logging or individual log messages
- being able to run the same behaviour tests against different implementations
- being able to run the same test suites against different configurations
Tooling for testing specific branches of the sub-repositories (e.g. https://cloudup.com/cOgxRPbt9aP).
A manifest repo which links to all proposed changes that span multiple repos (e.g. https://github.com/iilyak/couchdb-manifest).
Refactoring to separate IO from pure logic.
Adding function specifications and increased use of dialyzer as part of the CI chain.
Track test suite execution time to detect performance degradation.
Refactoring to make it possible to specify names for named processes.

Proposal

Create a couch_test app containing helper functions that make eunit easier to use for CouchDB testing. It would include solutions for some of the problems identified earlier.
Write new tests using cdt:setup and cdt:make_cases.
Use riak_test's intercepts.
Update the Makefile to use src/couch_test/bin/cdt ci.
The exact implementation might differ from the design below.
Improve as we go.

couch_test (cdt) design

couch_test/
  +-- include/
    +-- intercept.hrl
    +-- cdt.hrl
  +-- intercepts/
  +-- setups/ - Would contain setup/teardown code for reuse
  +-- src
    +-- intercept.erl
    +-- cdt.erl
    +-- combinatorics.erl
  +-- bin/
    +-- cdt
  +-- rebar.config
  +-- etc/
    +-- cdt.conf.example
    +-- local.conf.example
    +-- test_cluster.conf.example

To illustrate the idea, here is the list of commands cdt would eventually support.

cdt --help

cdt ci
cdt all
cdt unit
cdt integration
cdt system
cdt perf
cdt props
cdt scale -nodes 200
cdt vmargs -period 3600000 -nodes 3
- run the VM with different vmargs (generated using powerset) to find the best combination of options and flags. The VM is restarted with the next set of options after each period (1 hour in this example).
cdt -file file.erl
cdt -module module
cdt -test module:fun
cdt mixed -nodes 10 -old versionA -new versionB

setup modules

-module(cdt_chttpd_setup).

-export([chttpd/4]).

chttpd(A, B, C, D) ->
    {setup(A, B, C, D), teardown()}.

setup(A, B, C, D) ->
    fun(Ctx) ->
        %% use A and B to setup chttpd
        %% store C and D in context for later use in tests
        NewCtx = update_ctx(Ctx, C, D),
        {?MODULE, NewCtx}
    end.

teardown() ->
    fun(Ctx) ->
       %% Clean up the things using Ctx to find out what to do
       {?MODULE, Ctx}
    end.

Setups could be composed into a chain

-module(my_unit_test).

-include_lib("couch_test/include/cdt.hrl").

%% Imports are optional
-import(cdt_chttpd_setup, [chttpd/4]).
-import(cdt_couch_setup, [couch/3]).
-import(cdt_cluster_setup, [cluster/1]).
-import(cdt_fault_setup, [disconnect/2, drop_packets/3]).

setup(Type) ->
    Chain = [
        couch(Type, 1, 2),
        chttpd(backdoor, foo, bar, baz),
        cluster(3),
        disconnect(1, 2),
        drop_packets(2, 3, 30) %% 30% of packets to drop between db2 and db3 nodes
    ],
    Args = [],
    Opts = [],
    cdt:setup(Chain, Args, Opts).

teardown(_Type, Ctx) ->
    cdt:teardown(Ctx).
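
The chaining itself could be implemented as a fold over the {Setup, Teardown} pairs. The following is only a minimal sketch under assumptions: the module name cdt_chain_sketch, the shape of the context (a map with state and teardowns keys) and the step/1 helper are all hypothetical, and the real cdt:setup/3 may look quite different. Teardowns are collected as setups run so that they can be executed in reverse order.

```erlang
-module(cdt_chain_sketch).
-export([setup/3, teardown/1, demo/0]).

%% Hypothetical context: shared state plus the teardown funs collected so far
%% (most recent first). Opts is ignored in this sketch.
setup(Chain, Args, _Opts) ->
    Ctx0 = #{args => Args, state => #{}, teardowns => []},
    lists:foldl(fun run_setup/2, Ctx0, Chain).

%% Each setup fun receives the shared state and returns {Module, NewState},
%% mirroring the {?MODULE, NewCtx} return value of the setup modules.
run_setup({Setup, Teardown}, #{state := State, teardowns := Ts} = Ctx) ->
    {Module, NewState} = Setup(State),
    Ctx#{state := NewState, teardowns := [{Module, Teardown} | Ts]}.

%% Teardowns were pushed onto the head of the list, so a left fold runs them
%% in the reverse order of the setups.
teardown(#{state := State, teardowns := Teardowns}) ->
    lists:foldl(
        fun({_Module, Teardown}, Acc) ->
            {_, NewAcc} = Teardown(Acc),
            NewAcc
        end, State, Teardowns).

%% Hypothetical setup module stand-in: marks Name as started, then removes it.
step(Name) ->
    {fun(State) -> {Name, State#{Name => started}} end,
     fun(State) -> {Name, maps:remove(Name, State)} end}.

demo() ->
    Ctx = setup([step(couch), step(chttpd)], [], []),
    #{state := #{couch := started, chttpd := started}} = Ctx,
    teardown(Ctx).
```

Keeping the teardown next to the setup that produced it means a test only has to hold on to the returned context to clean up everything it started.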

Injecting networking problems

Since a cross-platform solution is required, it is better to use something Erlang-based for fault injection. We could extend epmdpxy to simulate latency or connectivity problems. The extension should make it possible to selectively induce problems between specified nodes (without affecting communication between the test master and the test slaves). In this case nodes should be started using:

ERL_EPMD_PORT=43690 erl
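
To make the "selective" requirement concrete, here is a minimal sketch of how per-link fault rules could be modelled. It does not use or describe the epmdpxy API; the module name cdt_fault_sketch and all of its functions are hypothetical. A link with no rule never drops traffic, which is how the test master would stay unaffected.

```erlang
-module(cdt_fault_sketch).
-export([new/0, disconnect/3, drop_packets/4, should_drop/4]).

%% Rules map an unordered node pair to the percentage of packets to drop.
new() -> #{}.

%% A full disconnect is modelled as dropping 100% of the packets.
disconnect(A, B, Rules) ->
    Rules#{key(A, B) => 100}.

drop_packets(A, B, Percent, Rules) when Percent >= 0, Percent =< 100 ->
    Rules#{key(A, B) => Percent}.

%% Roll is an integer in 1..100, for example rand:uniform(100). Passing it in
%% keeps the decision deterministic and easy to test.
should_drop(From, To, Roll, Rules) ->
    Roll =< maps:get(key(From, To), Rules, 0).

%% Normalise the pair so that {a, b} and {b, a} refer to the same link.
key(A, B) when A =< B -> {A, B};
key(A, B) -> {B, A}.
```

A proxy process would call should_drop/4 with a fresh random roll for every packet it forwards.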

Tapping into the logger

Tests should not log or produce noisy output by default, but verbosity should be controllable. It is also possible to split the output:

stdout: Errors and output from eunit
file per test suite run: For detailed logging

In some cases we should be able to:

suppress log message
check error message produced by logger

The easiest way to achieve both goals is to use a permanent intercept for couch_log.erl. Another approach could be a special couch_log backend.
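
Whichever approach is chosen, the intercept or backend needs somewhere to record messages and a way to filter them. The sketch below shows one possible shape using an ETS table. It is an assumption for illustration only: cdt_log_sketch and its functions are hypothetical names, and nothing here reflects the actual couch_log or riak_test intercept APIs.

```erlang
-module(cdt_log_sketch).
-export([start/0, stop/0, suppress/1, log/3, captured/0]).

-define(TAB, cdt_log_sketch).

%% An ordered_set keeps captured messages in insertion order, because their
%% keys contain a monotonically increasing sequence number.
start() ->
    ?TAB = ets:new(?TAB, [named_table, public, ordered_set]),
    ok.

stop() ->
    true = ets:delete(?TAB),
    ok.

%% Suppress every message whose formatted text contains Substring.
suppress(Substring) ->
    true = ets:insert(?TAB, {{suppress, Substring}, true}),
    ok.

%% What an intercept for a logging call could delegate to.
log(Level, Format, Args) ->
    Text = lists:flatten(io_lib:format(Format, Args)),
    case is_suppressed(Text) of
        true ->
            ok;
        false ->
            Seq = erlang:unique_integer([monotonic]),
            true = ets:insert(?TAB, {{msg, Seq}, Level, Text}),
            ok
    end.

is_suppressed(Text) ->
    Patterns = ets:match(?TAB, {{suppress, '$1'}, '_'}),
    lists:any(fun([P]) -> string:find(Text, P) =/= nomatch end, Patterns).

%% Messages that were logged and not suppressed, oldest first.
captured() ->
    [{Level, Text}
     || [Level, Text] <- ets:match(?TAB, {{msg, '_'}, '$1', '$2'})].
```

A test would call start/0 in its setup, assert on captured/0 in the test body, and call stop/0 in its teardown.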

Fixtures

Store fixtures in tests/fixtures of the applications we are testing. We might also have some common fixtures in couch_test/fixtures. All fixtures should be templates. The couch_test app would have helpers to find and include the fixtures.

It would be helpful to support the following types of fixtures:

module
file
data structure
- <name>.script - similar to file:script but with template rendering
- <name> - similar to file:consult but with template rendering
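
As an illustration of "file:consult but with template rendering", the sketch below renders placeholders in a string and then parses the result into terms. The {{key}} placeholder syntax, the module name cdt_fixture_sketch and both function names are assumptions, since the proposal does not specify a template format. A real helper would read the template from a fixture file first.

```erlang
-module(cdt_fixture_sketch).
-export([render/2, consult_string/2]).

%% Replace every {{key}} placeholder with the string value stored in Vars.
render(Template, Vars) ->
    maps:fold(
        fun(Key, Value, Acc) ->
            Placeholder = "{{" ++ atom_to_list(Key) ++ "}}",
            string:replace(Acc, Placeholder, Value, all)
        end, Template, Vars).

%% Render the template, then parse it as a sequence of dot-terminated terms,
%% returning {ok, Terms} in the same way file:consult/1 does for a file.
consult_string(Template, Vars) ->
    Text = unicode:characters_to_list(render(Template, Vars)),
    {ok, Tokens, _EndLocation} = erl_scan:string(Text),
    parse_terms(Tokens, [], []).

parse_terms([], [], Terms) ->
    {ok, lists:reverse(Terms)};
parse_terms([], _Unterminated, _Terms) ->
    {error, missing_dot};
parse_terms([{dot, _} = Dot | Rest], Current, Terms) ->
    {ok, Term} = erl_parse:parse_term(lists:reverse([Dot | Current])),
    parse_terms(Rest, [], [Term | Terms]);
parse_terms([Token | Rest], Current, Terms) ->
    parse_terms(Rest, [Token | Current], Terms).
```

For example, rendering the template "{port, {{port}}}." with #{port => "5984"} yields the term {port, 5984}.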

Grouping test cases

Use a list of lists as the test name to determine whether grouping is needed. For example:

apply_options_test_() ->
    Funs = [fun ensure_apply_is_called/2],
    Cases = combinatorics:powerset([pipe, concurrent]),
    cdt:make_cases(
        ["apply options tests", "Apply with options: ~p"],
        fun setup/1, fun teardown/2,
        Cases, Funs).

This would generate the following test cases:

{
    "apply options tests",
    [
        {
            "Apply with options: []",
            [
                {
                    foreachx, fun setup/1, fun teardown/2,
                    [
                        {[], fun ensure_apply_is_called/2}
                    ]
                }
            ]
        },
        {
            "Apply with options: [pipe]",
            [
                {
                    foreachx, fun setup/1, fun teardown/2,
                    [
                        {[pipe], fun ensure_apply_is_called/2}
                    ]
                }
            ]
        },
        {
            "Apply with options: [concurrent]",
            [
                {
                    foreachx, fun setup/1, fun teardown/2,
                    [
                        {[concurrent], fun ensure_apply_is_called/2}
                    ]
                }
            ]
        },
        {
            "Apply with options: [pipe, concurrent]",
            [
                {
                    foreachx, fun setup/1, fun teardown/2,
                    [
                        {[pipe, concurrent], fun ensure_apply_is_called/2}
                    ]
                }
            ]
        }
    ]
}
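
A minimal sketch of how this structure could be produced is shown below. The module name cdt_cases_sketch is hypothetical, and the real combinatorics:powerset/1 and cdt:make_cases/5 may differ. In particular, this powerset returns the subsets in a different order from the listing above, and ~p prints lists without a space after the comma.

```erlang
-module(cdt_cases_sketch).
-export([powerset/1, make_cases/5]).

%% All subsets of a list, preserving element order inside each subset.
powerset([]) ->
    [[]];
powerset([H | T]) ->
    Rest = powerset(T),
    Rest ++ [[H | Subset] || Subset <- Rest].

%% The first name is the title of the whole group. The second is a format
%% string that is rendered once per case to title the nested group.
make_cases([GroupName, CaseFormat], Setup, Teardown, Cases, Funs) ->
    {GroupName,
     [make_case(CaseFormat, Setup, Teardown, Case, Funs) || Case <- Cases]}.

%% Each case becomes a titled group with one eunit foreachx fixture, so
%% Setup is called as Setup(Case) and each Fun as Fun(Case, SetupResult).
make_case(CaseFormat, Setup, Teardown, Case, Funs) ->
    Title = lists:flatten(io_lib:format(CaseFormat, [Case])),
    {Title, [{foreachx, Setup, Teardown, [{Case, Fun} || Fun <- Funs]}]}.
```

Because the result is an ordinary eunit test description, it can be returned directly from a generator function such as apply_options_test_/0.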

Test annotations

In order to distinguish kinds of tests we would need to annotate test cases. We could use one of the following, in order of my personal preference (any other ideas?):

Implement a parse transform using merl to support annotations:

-scope([integration, unit, cluster]).
my_tests_() ->
    ok.

Split different kinds of tests into different modules and maybe keep them in different directories.
Pass the scope to cdt:make_cases.
Introduce a naming convention for test names:
- i_my_integration_tests_() -> ok.
- u_my_unit_tests_() -> ok.
Have a module where we add every test case to the appropriate scope.
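
Of these options, the naming convention is the simplest to prototype. The sketch below classifies exported functions by prefix. The module name cdt_scope_sketch and its functions are hypothetical, and the i_ and u_ prefixes are taken from the example above.

```erlang
-module(cdt_scope_sketch).
-export([scope/1, select/2]).

%% Map a test function name to a scope using the prefix convention.
scope(Name) when is_atom(Name) ->
    case atom_to_list(Name) of
        "i_" ++ _ -> integration;
        "u_" ++ _ -> unit;
        _ -> unknown
    end.

%% Pick the zero-arity functions that belong to Scope. Exports is a list of
%% {Name, Arity} pairs, for example the result of Module:module_info(exports).
select(Scope, Exports) ->
    [Name || {Name, 0} <- Exports, scope(Name) =:= Scope].
```

A runner could then call select(integration, Module:module_info(exports)) for each test module to collect only the integration tests.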