Navigating results

Once all tasks have completed, the data is uploaded into a database. Currently, both SQLite and PostgreSQL have been tested; MySQL should work in principle as well.

The data is organised into a simple collection of tables. A benchmark is composed of a single run that groups together several instances. Each instance is a combination of a metric, tool, input data and options. Each metric outputs a tab-separated table that is uploaded into its own database table, and runtime performance is recorded in tables whose names end in _timings. A typical collection of tables looks like this:

> .tables
# Maintenance tables
run
instance
# Timing tables
metric_timings
tool_timings
# Metric tables
bedtools_stats_allele_frequency
bedtools_jaccard
bcftools_stats_depth_distribution
bcftools_stats_indel_context_length
bcftools_stats_indel_context_summary
bcftools_stats_indel_distribution
bcftools_stats_quality
bcftools_stats_singleton_stats
bcftools_stats_substitution_types
bcftools_stats_summary_numbers
vcftools_tstv_by_count
vcftools_tstv_summary
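
The same listing can be produced programmatically. The following is a minimal sketch for the SQLite case, using Python's standard sqlite3 module. The helper names list_tables and split_tables are hypothetical, and the grouping rule (names ending in _timings are timing tables, the tables described on this page are maintenance tables, everything else is a metric table) is an assumption based on the listing above. The demonstration runs against an in-memory database with placeholder tables rather than a real results database.

```python
import sqlite3


def list_tables(connection):
    """Return the names of all tables in an SQLite database, sorted."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


def split_tables(names):
    """Group table names into maintenance, timing and metric tables.

    The grouping rule is an assumption derived from the listing above.
    """
    maintenance = {"run", "instance", "tags", "arvados_job"}
    groups = {"maintenance": [], "timings": [], "metric": []}
    for name in names:
        if name in maintenance:
            groups["maintenance"].append(name)
        elif name.endswith("_timings"):
            groups["timings"].append(name)
        else:
            groups["metric"].append(name)
    return groups


# Demonstration on an in-memory database with placeholder tables.
demo = sqlite3.connect(":memory:")
for table in ("run", "instance", "tool_timings", "bedtools_jaccard"):
    demo.execute(f"CREATE TABLE {table} (id INTEGER)")

groups = split_tables(list_tables(demo))
print(groups)
```

To inspect a real results database, the in-memory connection would be replaced by a connection to the database file written by the benchmark.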

Table overview

run

Information about a benchmark run. Columns:

id
Identification number of this run
author
The user name of the person running the pipeline
created
Date the benchmark run was created
pipeline_name
The name of the pipeline
pipeline_version
The pipeline version, typically the current git commit
config
The benchmark configuration file in JSON format
title
The title of the benchmark run, see Configuring a benchmark
description
The description of the benchmark run, see Configuring a benchmark
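
Because the config column holds JSON, a run record can be decoded into a nested structure after it has been fetched. The sketch below assumes SQLite and that config is stored as JSON text. The column types, the helper name load_run and every value in the demonstration row are placeholders, not taken from a real benchmark.

```python
import json
import sqlite3

# In-memory stand-in for the run table; column types are assumptions.
demo = sqlite3.connect(":memory:")
demo.row_factory = sqlite3.Row  # allows access to columns by name
demo.execute(
    "CREATE TABLE run (id INTEGER PRIMARY KEY, author TEXT, created TEXT, "
    "pipeline_name TEXT, pipeline_version TEXT, config TEXT, "
    "title TEXT, description TEXT)"
)
# Placeholder row; none of these values come from a real run.
demo.execute(
    "INSERT INTO run VALUES (1, 'alice', '2018-01-01', 'demo_pipeline', "
    "'abc123', ?, 'Demo run', 'Placeholder row')",
    (json.dumps({"placeholder_key": "placeholder_value"}),),
)


def load_run(connection, run_id):
    """Fetch one run as a dict and decode its JSON configuration."""
    row = connection.execute(
        "SELECT * FROM run WHERE id = ?", (run_id,)
    ).fetchone()
    if row is None:
        return None
    record = dict(row)
    record["config"] = json.loads(record["config"])
    return record


run = load_run(demo, 1)
print(run["title"], run["config"])
```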
instance

An instantiation of a combination of a particular metric, tool, input data and options.

id
Identification number of this instance
run_id
Reference to run
completed
Time that computation was completed
input
Input data
metric_name
Name of the metric
metric_version
Version of the metric
metric_options
Options supplied to the metric
tool_name
Name of the tool
tool_version
Tool version
tool_options
Options supplied to the tool
meta_data
Other environment variables
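
Since every instance points back to its run through run_id, the two tables can be joined to summarise what a run contains. The following sketch counts instances per tool for one run. It assumes SQLite, shows only the columns needed for the query, and uses placeholder tool and metric names; the helper name instances_per_tool is hypothetical.

```python
import sqlite3

# In-memory stand-in containing only the columns used below.
demo = sqlite3.connect(":memory:")
demo.executescript(
    """
    CREATE TABLE run (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE instance (
        id INTEGER PRIMARY KEY,
        run_id INTEGER,
        tool_name TEXT,
        metric_name TEXT
    );
    INSERT INTO run VALUES (1, 'Demo run');
    INSERT INTO instance VALUES (1, 1, 'tool_a', 'metric_x');
    INSERT INTO instance VALUES (2, 1, 'tool_a', 'metric_y');
    INSERT INTO instance VALUES (3, 1, 'tool_b', 'metric_x');
    """
)


def instances_per_tool(connection, run_id):
    """Return a mapping of tool name to number of instances in a run."""
    rows = connection.execute(
        "SELECT i.tool_name, COUNT(*) "
        "FROM instance AS i JOIN run AS r ON r.id = i.run_id "
        "WHERE r.id = ? "
        "GROUP BY i.tool_name ORDER BY i.tool_name",
        (run_id,),
    ).fetchall()
    return dict(rows)


counts = instances_per_tool(demo, 1)
print(counts)
```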
timings

Timing information

instance_id
Reference to instance
host
Execution host
started
Time that job was submitted
completed
Time that job was completed
total_t
Total time of job, including waiting in the queue
wall_t
Total time spent in user/system
user_t
Time spent in user in job script
sys_t
Time spent in system in job script
child_user_t
Time spent in user in child processes. This is typically the tool/metric being executed
child_sys_t
Time spent in system in child processes. This is typically the tool/metric being executed
statement
Command line statement
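
Timing rows refer to instances through instance_id, so they can be aggregated per tool. The sketch below sums child_user_t and child_sys_t for each tool as a rough measure of the CPU time spent in the tool itself. It assumes SQLite, that the tool_timings table from the listing at the top of this page has the columns described here, and that times are stored as numbers; the helper name and all demonstration values are placeholders.

```python
import sqlite3

# In-memory stand-in containing only the columns used below.
demo = sqlite3.connect(":memory:")
demo.executescript(
    """
    CREATE TABLE instance (id INTEGER PRIMARY KEY, tool_name TEXT);
    CREATE TABLE tool_timings (
        instance_id INTEGER,
        child_user_t REAL,
        child_sys_t REAL
    );
    INSERT INTO instance VALUES (1, 'tool_a');
    INSERT INTO instance VALUES (2, 'tool_a');
    INSERT INTO instance VALUES (3, 'tool_b');
    INSERT INTO tool_timings VALUES (1, 10.0, 1.0);
    INSERT INTO tool_timings VALUES (2, 5.0, 0.5);
    INSERT INTO tool_timings VALUES (3, 2.0, 0.25);
    """
)


def child_cpu_time_per_tool(connection):
    """Sum user and system time of child processes for each tool."""
    rows = connection.execute(
        "SELECT i.tool_name, SUM(t.child_user_t + t.child_sys_t) "
        "FROM tool_timings AS t JOIN instance AS i ON i.id = t.instance_id "
        "GROUP BY i.tool_name ORDER BY i.tool_name"
    ).fetchall()
    return dict(rows)


cpu_time = child_cpu_time_per_tool(demo)
print(cpu_time)
```

The same query shape should apply to metric_timings if it shares these columns, which is an assumption rather than something stated on this page.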
tags

List of tags

run_id
Reference to the run
tag
A tag associated with the run
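
Tags make it possible to select runs without knowing their identifiers. As a minimal sketch, assuming SQLite and the columns described above, the following looks up all runs that carry a given tag. The helper name runs_with_tag, the tag values and the run titles are placeholders.

```python
import sqlite3

# In-memory stand-in containing only the columns used below.
demo = sqlite3.connect(":memory:")
demo.executescript(
    """
    CREATE TABLE run (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tags (run_id INTEGER, tag TEXT);
    INSERT INTO run VALUES (1, 'Demo run A');
    INSERT INTO run VALUES (2, 'Demo run B');
    INSERT INTO tags VALUES (1, 'nightly');
    INSERT INTO tags VALUES (1, 'release');
    INSERT INTO tags VALUES (2, 'nightly');
    """
)


def runs_with_tag(connection, tag):
    """Return (id, title) for every run associated with the given tag."""
    return connection.execute(
        "SELECT r.id, r.title "
        "FROM run AS r JOIN tags AS t ON t.run_id = r.id "
        "WHERE t.tag = ? ORDER BY r.id",
        (tag,),
    ).fetchall()


nightly = runs_with_tag(demo, "nightly")
print(nightly)
```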
arvados_job

Arvados job information. This table is only present if --engine=arvados has been used.

run_id
Reference to the run
owner_uuid
Arvados UUID of the owner
job_uuid
Arvados UUID of the job
output_uuid
Arvados UUID of the output

©2015,2016,2017,2018 Andreas Heger.