Artemis Experiment Tutorial

The scientific method generally involves the following loop:

  1. Come up with a hypothesis.
  2. Create an experiment to test this hypothesis.
  3. Run experiment, observe, record results.
  4. Given results, either celebrate, or return to step 1.

All too often, what we actually do is the following:

  1. Come up with a hypothesis.
  2. Create an experiment to test this hypothesis.
  3. Run experiment, observe results.
  4. Be somehow unsatisfied with results, change some parameter in experiment.
  5. Run experiment again with new parameters.
  6. Observe results again, remain unsatisfied, revert to step 5 several times.
  7. Get tired of waiting for experiments to run just to analyze results. Start saving results of experiments, so they can be loaded and analyzed at any time.
  8. Run several experiments, load and compare results.
  9. Realize we've forgotten which parameters correspond to which results.
  10. Try to run all experiments again and save their results, this time keeping track of their parameters, but realize we've forgotten the parameters we used in our first experiments.
  11. Eventually get to some result, resolve to be more organized the next time around.
  12. Repeat from step 1.

The Artemis Experiment Framework aims to help organize this process. It does this with the following functionality.

  1. Recording Experimental Results: The framework captures the console output, plots, and results of an experiment so that we can review them later without having to run the experiment again.
  2. Creating Variations on Experiments: The framework allows us to define variants on experiments by changing some parameter. This allows us to build and maintain a list of variations on our experiment, and the results of running these variations.
  3. Comparing Experiments: Using the saved results of each experiment, Artemis allows us to easily go back and compare the results of different experiments.

We will demonstrate these in the following tutorial.

Introduction

Suppose we want to run a simple experiment: We kidnap 5 drunks from the local bar, take them to a point in a secluded field or parking lot, then release them, and record their progress. While the ethics committee processes our request, we run a simulation to get some preliminary results. We may code our simulation as follows:

In [16]:
import numpy as np 
from matplotlib import pyplot as plt
%matplotlib notebook


def demo_drunkards_walk(n_steps=500, n_drunkards=5, homing_instinct = 0, n_dim=2, seed=1234):
    """
    Release several drunkards in a field to randomly stumble around.  Record their progress.
    """
    rng = np.random.RandomState(seed)
    drunkards = np.zeros((n_steps+1, n_drunkards, n_dim))
    for t in range(1, n_steps+1):
        drunkards[t] = drunkards[t-1]*(1-homing_instinct) + rng.randn(n_drunkards, n_dim)
        if t%100==0:
            print('Status at step {}: Mean: {}, STD: {}'.format(t, drunkards[t].mean(), drunkards[t].std()))

    plt.plot(drunkards[:, :, 0], drunkards[:, :, 1])
    plt.grid()
    plt.xlabel(r'$\Delta$ Longitude (arcseconds)')  # raw string so that \D is not treated as an escape sequence
    plt.ylabel(r'$\Delta$ Latitude (arcseconds)')
    plt.show()
In [17]:
demo_drunkards_walk()
Status at step 100: Mean: 1.5740582153762035, STD: 9.782813314403361
Status at step 200: Mean: 5.750244697580776, STD: 14.128414176433994
Status at step 300: Mean: 10.259067941176745, STD: 18.33460868775832
Status at step 400: Mean: 8.36816763773262, STD: 20.314295644038435
Status at step 500: Mean: 12.13866345091756, STD: 21.834830623997316

Running Experiments

Now, suppose our simulation takes a long time to run. We would like to record our results so that we can review them later without having to re-run the experiment. We can achieve this by decorating our experiment with the "@experiment_function" decorator. The decorator registers the function demo_drunkards_walk as an "experiment", which allows us to capture its output when it is run:

In [18]:
import numpy as np
from artemis.experiments import experiment_function
from artemis.experiments.ui import browse_experiments
from matplotlib import pyplot as plt
from artemis.experiments.experiments import clear_all_experiments
clear_all_experiments()  # Removes any previous versions of demo_drunkards_walk that may have been registered
%matplotlib notebook


@experiment_function
def demo_drunkards_walk(n_steps=500, n_drunkards=5, homing_instinct = 0, n_dim=2, seed=1234):
    """
    Release several drunkards in a field to randomly stumble around.  Record their progress.
    """
    rng = np.random.RandomState(seed)
    drunkards = np.zeros((n_steps+1, n_drunkards, n_dim))
    for t in range(1, n_steps+1):
        drunkards[t] = drunkards[t-1]*(1-homing_instinct) + rng.randn(n_drunkards, n_dim)
        if t%100==0:
            print('Status at step {}: Mean: {}, STD: {}'.format(t, drunkards[t].mean(), drunkards[t].std()))
    
    plt.plot(drunkards[:, :, 0], drunkards[:, :, 1])
    plt.grid()
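    plt.xlabel(r'$\Delta$ Longitude (arcseconds)')  # the plot shows dimension 0 against dimension 1, not position against step
    plt.ylabel(r'$\Delta$ Latitude (arcseconds)')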
    plt.show()

We can now run this experiment by opening the experiment user interface, either with browse_experiments() or, as below, with demo_drunkards_walk.browse(), and entering run 0, meaning "run experiment 0, record all figures and console output". (We could also do this programmatically with demo_drunkards_walk.run().)

In [20]:
demo_drunkards_walk.browse(close_after=True)
=============================== Experiments ===============================
| #   | Start Time   | Duration   | Status   | Args Changed?   | Result   |
===========================================================================
0  demo_drunkards_walk                                                    |
===========================================================================
Enter command or experiment # to run (h for help) >> run 0
INFO:artemis:========== Running Experiment: demo_drunkards_walk ==========
    Status at step 100: Mean: 1.5740582153762035, STD: 9.782813314403361
    Status at step 200: Mean: 5.750244697580776, STD: 14.128414176433994
    Status at step 300: Mean: 10.259067941176745, STD: 18.33460868775832
    Status at step 400: Mean: 8.36816763773262, STD: 20.314295644038435
    Status at step 500: Mean: 12.13866345091756, STD: 21.834830623997316
INFO:artemis:Saved Figure: /Users/peter/.artemis/experiments/2018.05.10T09.23.08.710713-demo_drunkards_walk/fig-2018.05.10T09.23.08.801297-unnamed.fig.pkl
INFO:artemis:Saving Result for Experiment "2018.05.10T09.23.08.710713-demo_drunkards_walk"
INFO:artemis:========== Done Running Experiment: demo_drunkards_walk ==========
    Finished running 1 experiment.
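
To make the idea of "capturing" a run more concrete, here is a minimal sketch of a decorator that records the console output, arguments, return value, and runtime of each call. It is not how Artemis is implemented; the names recording and noisy_square are hypothetical, and only the Python standard library is assumed.

```python
import contextlib
import functools
import io
import time


def recording(func):
    """Hypothetical decorator: keep a record of every call to func."""
    records = []

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        buffer = io.StringIO()
        start = time.time()
        # Everything printed inside the call goes to the buffer, not the console.
        with contextlib.redirect_stdout(buffer):
            result = func(*args, **kwargs)
        records.append({
            'args': args,
            'kwargs': kwargs,
            'log': buffer.getvalue(),
            'result': result,
            'runtime': time.time() - start,
        })
        return result

    wrapper.records = records
    return wrapper


@recording
def noisy_square(x):
    print('Squaring {}'.format(x))
    return x * x


noisy_square(3)
first = noisy_square.records[0]
print(repr(first['log']), first['result'])
```

Unlike this toy, which swallows the printed text, the run shown above still echoes the output to the console while saving it, and writes the record to disk rather than keeping it in memory.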

Viewing Results of Experiments

Now if we want to go back later and see these results, we can open the UI again and enter show 0, meaning "show the results of experiment 0".

In [21]:
%matplotlib notebook
demo_drunkards_walk.browse(close_after=True)
===================================== Experiments ====================================
|   # | Start Time       | Duration   | Status          | Args Changed?   | Result   |
======================================================================================
0  demo_drunkards_walk                                                               |
|   0 | May 10, 09:23:08 | 100.3ms    | Ran Succesfully | <No Change>     | None     |
======================================================================================
Enter command or experiment # to run (h for help) >> show 0
    ========================== 2018.05.10T09.23.08.710713-demo_drunkards_walk ==========================
    ----------------------------------------------- Info -----------------------------------------------
    ExpInfoFields.NAME: demo_drunkards_walk
    ExpInfoFields.ID: 2018.05.10T09.23.08.710713-demo_drunkards_walk
    ExpInfoFields.DIR: /Users/peter/.artemis/experiments/2018.05.10T09.23.08.710713-demo_drunkards_walk
    ExpInfoFields.ARGS: ['n_steps=500', 'n_drunkards=5', 'homing_instinct=0', 'n_dim=2', 'seed=1234']
    ExpInfoFields.FUNCTION: demo_drunkards_walk
    ExpInfoFields.TIMESTAMP: 2018-05-10 09:23:08.710713
    ExpInfoFields.MODULE: __main__
    ExpInfoFields.FILE: <unknown>
    ExpInfoFields.STATUS: Ran Succesfully
    ExpInfoFields.USER: peter
    ExpInfoFields.MAC: 34:36:3B:87:0A:B6
    ExpInfoFields.PID: 33934
    ExpInfoFields.RUNTIME: 0.10027384757995605
    ExpInfoFields.N_FIGS: 1
    ExpInfoFields.FIGS: ['fig-2018.05.10T09.23.08.801297-unnamed.fig.pkl']
    
    ----------------------------------------------- Logs -----------------------------------------------
    Status at step 100: Mean: 1.5740582153762035, STD: 9.782813314403361
    Status at step 200: Mean: 5.750244697580776, STD: 14.128414176433994
    Status at step 300: Mean: 10.259067941176745, STD: 18.33460868775832
    Status at step 400: Mean: 8.36816763773262, STD: 20.314295644038435
    Status at step 500: Mean: 12.13866345091756, STD: 21.834830623997316
    
    
    
    ---------------------------------------------- Result ----------------------------------------------
    None

This displays all the output of the experiment, and should show the figure that was created.
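
The record shown above bundles the arguments, logs, and result of one run under a timestamped identifier. A rough sketch of that idea, using only the standard library, might look as follows. The helpers save_record and load_record and the JSON layout are assumptions made for illustration; Artemis uses its own on-disk format.

```python
import datetime
import json
import os
import tempfile


def save_record(root, name, args, log, result):
    """Write one run to its own directory and return the record id."""
    stamp = datetime.datetime.now().strftime('%Y.%m.%dT%H.%M.%S.%f')
    record_id = '{}-{}'.format(stamp, name)
    record_dir = os.path.join(root, record_id)
    os.makedirs(record_dir)
    with open(os.path.join(record_dir, 'record.json'), 'w') as f:
        json.dump({'name': name, 'args': args, 'log': log, 'result': result}, f)
    return record_id


def load_record(root, record_id):
    """Read a saved run back without re-running anything."""
    with open(os.path.join(root, record_id, 'record.json')) as f:
        return json.load(f)


root = tempfile.mkdtemp()  # stand-in for a directory such as ~/.artemis/experiments
record_id = save_record(
    root,
    name='demo',
    args={'n_steps': 500, 'seed': 1234},
    log='hello from the experiment\n',
    result=[1.0, 2.0, 3.0],
)
loaded = load_record(root, record_id)
print(record_id)
print(loaded['args'], loaded['result'])
```

Because the arguments are stored next to the result, there is no need to remember which parameters produced which output.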

Creating Variants

We now want to try changing the parameters of our experiment. We could of course simply change the default arguments and run again, but then our saved record would no longer correspond to the current version of the experiment. We also want to be able to re-run our original experiment whenever we want (without having to write down the parameters it was run with the first time). To keep track of our variants without losing the original experiment, we can use the add_variant method.

Suppose, in the following example, that we want to give our drunkards a "homing instinct" that makes them tend towards the origin. We create two variants of our experiment with different degrees of homing instinct:

In [22]:
import numpy as np
from artemis.experiments import experiment_function
from artemis.experiments.ui import browse_experiments
from matplotlib import pyplot as plt
from artemis.experiments.experiments import clear_all_experiments
clear_all_experiments()  # Removes previous versions of demo_drunkards_walk that have been registered

@experiment_function
def demo_drunkards_walk(n_steps=500, n_drunkards=5, homing_instinct = 0, n_dim=2, seed=1234):
    """
    Release several drunkards in a field to randomly stumble around.  Record their progress.
    """
    rng = np.random.RandomState(seed)
    drunkards = np.zeros((n_steps+1, n_drunkards, n_dim))
    for t in range(1, n_steps+1):
        drunkards[t] = drunkards[t-1]*(1-homing_instinct) + rng.randn(n_drunkards, n_dim)
        if t%100==0:
            print('Status at step {}: Mean: {}, STD: {}'.format(t, drunkards[t].mean(), drunkards[t].std()))

    plt.plot(drunkards[:, :, 0], drunkards[:, :, 1])
    plt.grid()
    plt.xlabel(r'$\Delta$ Longitude (arcseconds)')  # raw string so that \D is not treated as an escape sequence
    plt.ylabel(r'$\Delta$ Latitude (arcseconds)')
    plt.show()


demo_drunkards_walk.add_variant(homing_instinct = 0.01)
demo_drunkards_walk.add_variant(homing_instinct = 0.1)
Out[22]:
<artemis.experiments.experiments.Experiment at 0x10697c6d8>

We can now open the experiment browser again and see that our record of experiment 0 is still saved, and that we now have two new experiments which have not yet been run. We can run them by entering run 1,2.

In [24]:
demo_drunkards_walk.browse(close_after = True)
===================================== Experiments ====================================
|   # | Start Time       | Duration   | Status          | Args Changed?   | Result   |
======================================================================================
0  demo_drunkards_walk                                                               |
|   0 | May 10, 09:23:08 | 100.3ms    | Ran Succesfully | <No Change>     | None     |
--------------------------------------------------------------------------------------
1  demo_drunkards_walk.homing_instinct=0.01                                          |
--------------------------------------------------------------------------------------
2  demo_drunkards_walk.homing_instinct=0.1                                           |
======================================================================================
Enter command or experiment # to run (h for help) >> run 1,2
INFO:artemis:========== Running Experiment: demo_drunkards_walk.homing_instinct=0.01 ==========
INFO:artemis:Saved Figure: /Users/peter/.artemis/experiments/2018.05.10T09.23.36.039941-demo_drunkards_walk.homing_instinct=0.01/fig-2018.05.10T09.23.36.067127-unnamed.fig.pkl
INFO:artemis:Saving Result for Experiment "2018.05.10T09.23.36.039941-demo_drunkards_walk.homing_instinct=0.01"
    Status at step 100: Mean: 0.6284074061502792, STD: 6.745529105888236
    Status at step 200: Mean: 3.42914692268251, STD: 6.901844038947912
    Status at step 300: Mean: 4.119219235749395, STD: 6.182039186566688
    Status at step 400: Mean: 0.04952071906583251, STD: 7.198510452516606
    Status at step 500: Mean: 1.9251505932914377, STD: 6.322245223150539
INFO:artemis:========== Done Running Experiment: demo_drunkards_walk.homing_instinct=0.01 ==========
INFO:artemis:========== Running Experiment: demo_drunkards_walk.homing_instinct=0.1 ==========
INFO:artemis:Saved Figure: /Users/peter/.artemis/experiments/2018.05.10T09.23.36.100733-demo_drunkards_walk.homing_instinct=0.1/fig-2018.05.10T09.23.36.126876-unnamed.fig.pkl
INFO:artemis:Saving Result for Experiment "2018.05.10T09.23.36.100733-demo_drunkards_walk.homing_instinct=0.1"
    Status at step 100: Mean: -0.05698281139892725, STD: 2.008052880616599
    Status at step 200: Mean: 1.0761950373348965, STD: 1.9443877580504882
    Status at step 300: Mean: 0.7718960738517964, STD: 1.995496166552934
    Status at step 400: Mean: -0.5917210818614274, STD: 2.575890432995089
    Status at step 500: Mean: -0.04862033895858388, STD: 2.670268295919608
INFO:artemis:========== Done Running Experiment: demo_drunkards_walk.homing_instinct=0.1 ==========
    Finished running 2 experiments.
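
As a sanity check on the printed numbers, the spread of the walk can be worked out by hand. Each coordinate follows x_t = (1 - h) x_{t-1} + noise with unit-variance noise, so its variance obeys v_t = (1 - h)^2 v_{t-1} + 1, starting from v_0 = 0. The sketch below evaluates the closed form of this recursion; the function name expected_std is ours and not part of Artemis. Each printed STD is estimated from only 10 numbers (5 drunkards times 2 dimensions), so only rough agreement should be expected.

```python
import math


def expected_std(t, homing_instinct):
    """Standard deviation of one coordinate after t steps of the walk."""
    a = (1.0 - homing_instinct) ** 2
    if a == 1.0:
        variance = float(t)  # pure random walk: variance grows linearly with t
    else:
        variance = (1.0 - a ** t) / (1.0 - a)  # geometric series 1 + a + ... + a**(t-1)
    return math.sqrt(variance)


for h in (0, 0.01, 0.1):
    print('homing_instinct={}: expected STD at step 500 is {:.2f}'.format(h, expected_std(500, h)))
```

The resulting values (about 22.36, 7.09, and 2.29) are in line with the STDs of roughly 21.8, 6.3, and 2.7 printed at step 500 for the three experiments.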

Note that we could also create variants of our variants if we wanted. For instance, if we wanted to try a drunkard's walk in 3D:

X = demo_drunkards_walk.add_variant(homing_instinct = 0.1)
X.add_variant(n_dim=3)

Separating Display and Computation

The above is fine if our experiments run quickly and we just want to plot what the drunkards are doing. But we may want to do further analysis on our results after running the experiment (without having to start again), or we may simply want to change the way we plot our results without having to re-run everything. In these cases, it becomes beneficial to separate plotting from computing the results. We can use the show argument of ExperimentFunction to do this. As the code below shows, the function passed as show receives the experiment record as its first argument, and the experiment's return value is available through record.get_result().

In [28]:
import numpy as np
from artemis.experiments import experiment_function, ExperimentFunction
from artemis.experiments.ui import browse_experiments
from matplotlib import pyplot as plt
from artemis.experiments.experiments import clear_all_experiments
clear_all_experiments()  # Removes previous versions of demo_drunkards_walk that have been registered
%matplotlib notebook  

def display_drunkards_walk(record):
    print('===== CREATING PLOT OF RECORD {} NOW ===='.format(record.get_id()))
    drunkards = record.get_result()
    plt.plot(drunkards[:, :, 0], drunkards[:, :, 1])
    plt.grid()
    plt.xlabel(r'$\Delta$ Longitude (arcseconds)')  # raw string so that \D is not treated as an escape sequence
    plt.ylabel(r'$\Delta$ Latitude (arcseconds)')
    plt.show()


@ExperimentFunction(show=display_drunkards_walk)
def demo_drunkards_walk(n_steps=500, n_drunkards=5, homing_instinct = 0, n_dim=2, seed=1234):
    """
    Release several drunkards in a field to randomly stumble around.  Record their progress.
    """
    rng = np.random.RandomState(seed)
    drunkards = np.zeros((n_steps+1, n_drunkards, n_dim))
    for t in range(1, n_steps+1):
        drunkards[t] = drunkards[t-1]*(1-homing_instinct) + rng.randn(n_drunkards, n_dim)
        if t%100==0:
            print('Status at step {}: Mean: {}, STD: {}'.format(t, drunkards[t].mean(), drunkards[t].std()))
    return drunkards


demo_drunkards_walk.add_variant(homing_instinct = 0.01)
demo_drunkards_walk.add_variant(homing_instinct = 0.1)
Out[28]:
<artemis.experiments.experiments.Experiment at 0x106706c50>

First, since we've changed the code for our experiment, we delete old experiments and run them all again (output not shown):

In [ ]:
variants = demo_drunkards_walk.get_all_variants()
for experiment in variants:
    for record in experiment.get_records():
        record.delete()
    experiment.run()

Then browse through our results, and view the results of experiment 1: "demo_drunkards_walk.homing_instinct=0.01", by entering show 1.

In [30]:
demo_drunkards_walk.browse(close_after=True)
========================================== Experiments =========================================
|   # | Start Time       | Duration   | Status          | Args Changed?   | Result             |
================================================================================================
0  demo_drunkards_walk                                                                         |
|   0 | May 10, 09:24:49 | 14.8ms     | Ran Succesfully | <No Change>     | <(501,5,2)ndarray> |
------------------------------------------------------------------------------------------------
1  demo_drunkards_walk.homing_instinct=0.01                                                    |
|   0 | May 10, 09:24:49 | 17.6ms     | Ran Succesfully | <No Change>     | <(501,5,2)ndarray> |
------------------------------------------------------------------------------------------------
2  demo_drunkards_walk.homing_instinct=0.1                                                     |
|   0 | May 10, 09:24:49 | 25.5ms     | Ran Succesfully | <No Change>     | <(501,5,2)ndarray> |
================================================================================================
Enter command or experiment # to run (h for help) >> show 1
    ===== CREATING PLOT OF RECORD 2018.05.10T09.24.49.863098-demo_drunkards_walk.homing_instinct=0.01 NOW ====
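
The benefit of this separation is that the expensive part runs once, and any number of views can be computed from the stored result afterwards. The following pure-Python sketch illustrates this with a 1D walk. The names simulate_walk, final_position, and max_distance are hypothetical, and a counter stands in for the cost of re-running the simulation.

```python
import random

run_counter = {'count': 0}  # counts how often the "expensive" simulation actually runs


def simulate_walk(n_steps=500, homing_instinct=0.0, seed=1234):
    """Compute only: return the full trajectory, with no printing or plotting."""
    run_counter['count'] += 1
    rng = random.Random(seed)
    positions = [0.0]
    for _ in range(n_steps):
        positions.append(positions[-1] * (1 - homing_instinct) + rng.gauss(0, 1))
    return positions


def final_position(positions):
    """One view of the result."""
    return positions[-1]


def max_distance(positions):
    """Another view, added later without touching the simulation."""
    return max(abs(p) for p in positions)


saved = simulate_walk()  # run once and keep the result
print('final position: {:.2f}'.format(final_position(saved)))
print('max distance:   {:.2f}'.format(max_distance(saved)))
print('simulation runs:', run_counter['count'])
```

Both summaries are derived from the same saved trajectory, so adding a third view later would not require another run.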

Comparing Results Across Experiments

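Eventually we want to compare the results of different experiments. For this, we can define the compare argument of ExperimentFunction. In the code below, the function passed as compare receives a collection of experiment records; each record gives access to the saved return value through record.get_result() and to the name of its experiment through record.get_experiment().get_id().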

In [31]:
import numpy as np
from artemis.experiments import ExperimentFunction
from artemis.experiments.ui import browse_experiments
from matplotlib import pyplot as plt
from artemis.experiments.experiments import clear_all_experiments
clear_all_experiments()  # Removes previous versions of demo_drunkards_walk that have been registered
%matplotlib notebook  

def compare_drunkards_walk(records):
    
    plot_handles = []
    for i, record in enumerate(records):
        drunkards = record.get_result()
        plot_handles.append(plt.plot(drunkards[:, :, 0], drunkards[:, :, 1], color='C{}'.format(i)))
    plt.grid()
    plt.xlabel(r'$\Delta$ Longitude (arcseconds)')  # raw string so that \D is not treated as an escape sequence
    plt.ylabel(r'$\Delta$ Latitude (arcseconds)')
    plt.legend([p[0] for p in plot_handles], [record.get_experiment().get_id() for record in records])
    plt.show()
    

@ExperimentFunction(compare=compare_drunkards_walk)
def demo_drunkards_walk(n_steps=500, n_drunkards=5, homing_instinct = 0, n_dim=2, seed=1234):
    """
    Release several drunkards in a field to randomly stumble around.  Record their progress.
    """
    rng = np.random.RandomState(seed)
    drunkards = np.zeros((n_steps+1, n_drunkards, n_dim))
    for t in range(1, n_steps+1):  # range rather than xrange, which does not exist in Python 3
        drunkards[t] = drunkards[t-1]*(1-homing_instinct) + rng.randn(n_drunkards, n_dim)
        if t%100==0:
            print('Status at step {}: Mean: {}, STD: {}'.format(t, drunkards[t].mean(), drunkards[t].std()))
    return drunkards


demo_drunkards_walk.add_variant(homing_instinct = 0.01)
demo_drunkards_walk.add_variant(homing_instinct = 0.1)
Out[31]:
<artemis.experiments.experiments.Experiment at 0x1069c26d8>

Now we can compare the results by entering compare all.

In [32]:
demo_drunkards_walk.browse(close_after=True)
========================================== Experiments =========================================
|   # | Start Time       | Duration   | Status          | Args Changed?   | Result             |
================================================================================================
0  demo_drunkards_walk                                                                         |
|   0 | May 10, 09:24:49 | 14.8ms     | Ran Succesfully | <No Change>     | <(501,5,2)ndarray> |
------------------------------------------------------------------------------------------------
1  demo_drunkards_walk.homing_instinct=0.01                                                    |
|   0 | May 10, 09:24:49 | 17.6ms     | Ran Succesfully | <No Change>     | <(501,5,2)ndarray> |
------------------------------------------------------------------------------------------------
2  demo_drunkards_walk.homing_instinct=0.1                                                     |
|   0 | May 10, 09:24:49 | 25.5ms     | Ran Succesfully | <No Change>     | <(501,5,2)ndarray> |
================================================================================================
Enter command or experiment # to run (h for help) >> compare all
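
A comparison function does not have to produce a plot. As a sketch of the same pattern without Artemis or matplotlib, the code below runs three variants of a pure-Python 1D walk and then compares their stored results in a small table. All names here (simulate, spread, compare) are hypothetical, and the results are held in an ordinary dictionary keyed by experiment name rather than in Artemis records.

```python
import random
import statistics


def simulate(homing_instinct, n_steps=500, n_drunkards=50, seed=1234):
    """Return the final 1D position of each drunkard."""
    rng = random.Random(seed)
    positions = [0.0] * n_drunkards
    for _ in range(n_steps):
        positions = [p * (1 - homing_instinct) + rng.gauss(0, 1) for p in positions]
    return positions


def spread(positions):
    """Population standard deviation of the final positions."""
    return statistics.pstdev(positions)


def compare(results):
    """Print one line per experiment and return the name with the smallest spread."""
    for name, positions in results.items():
        print('{:<45} STD = {:6.2f}'.format(name, spread(positions)))
    return min(results, key=lambda name: spread(results[name]))


results = {
    'demo_drunkards_walk': simulate(0),
    'demo_drunkards_walk.homing_instinct=0.01': simulate(0.01),
    'demo_drunkards_walk.homing_instinct=0.1': simulate(0.1),
}
tightest = compare(results)
print('Smallest spread:', tightest)
```

The comparison only reads stored results, so it can be rewritten or extended at any time without repeating the simulations.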

Conclusion

The value of the experiment framework is that it lets you keep track of the things you've tried and their outcomes. This is intended to replace the mish-mash of solutions that people usually use when doing this kind of thing (e.g. saving old commands in the terminal, writing results to file and manually loading them later, etc.).