mirror of https://github.com/ARM-software/workload-automation.git synced 2025-02-20 20:09:11 +00:00

doc: re-write augmentations docs

Rewrite the "Output processors and Instruments" section into
"Augmentations" section.
Sergei Trofimov 2018-05-14 14:34:17 +01:00 committed by Marc Bonnici
parent c6fae6fa55
commit d0368cf176


@ -504,206 +504,185 @@ turn override global settings.
Output Processors and Instruments
----------------------------------
Output Processors
^^^^^^^^^^^^^^^^^
Output processors, as the name suggests, handle the processing of output
generated from running workload specs. By default, WA enables a couple of basic
output processors (e.g. one generates a csv file with all scores reported by
workloads), which you can see in ``~/.workload_automation/config.yaml``. However,
WA has a number of other, more specialized, output processors (e.g. for
uploading to databases). You can list available output processors with the
``wa list output_processors`` command. If you want to permanently enable an
output processor, you can add it to your ``config.yaml``. You can also enable an
output processor for a particular run by specifying it in the ``config`` section
in the agenda. As the name suggests, the ``config`` section mirrors the structure of
``config.yaml``, and anything that can be specified in the latter, can also be
specified in the former.
As with workloads, output processors may have parameters that define their
behaviour. Parameters of output processors are specified a little differently,
however. Output processor parameter values are listed in the config section,
namespaced under the name of the output processor.
For example, suppose we want to be able to easily query the output generated by
the workload specs we've defined so far. We can use the ``sqlite`` output processor
to have WA create an sqlite_ database file with the results. By default, this
file will be generated in WA's output directory (at the same level as
results.csv); but suppose we want to store the results in the same file for
every run of the agenda we do. This can be done by specifying an alternative
database file with the ``database`` parameter of the output processor:
.. code-block:: yaml
config:
augmentations:
- sqlite
sqlite:
database: ~/my_wa_results.sqlite
iterations: 5
workloads:
- id: 01_dhry
name: dhrystone
label: dhrystone_15over6
runtime_params:
cpu0_governor: performance
workload_params:
threads: 6
mloops: 15
- id: 02_memc
name: memcpy
- id: 03_cycl
name: cyclictest
iterations: 10
A couple of things to observe here:
- There is no need to repeat the output processors listed in ``config.yaml``. The
processors listed in ``augmentations`` entry in the agenda will be used
*in addition to* those defined in the ``config.yaml``.
- The database file is specified under the "sqlite" entry in the config section.
Note, however, that this entry alone is not enough to enable the output
processor; it must be listed in ``augmentations``, otherwise the "sqlite"
config entry will be ignored.
- The database file must be specified as an absolute path; however, it may use
the user home specifier '~' and/or environment variables.
.. _sqlite: http://www.sqlite.org/
Augmentations
--------------
Augmentations are plugins that augment the execution of workload jobs with
additional functionality; usually, that takes the form of generating additional
metrics and/or artifacts, such as traces or logs. There are two types of
augmentations:
Instruments
^^^^^^^^^^^
These "instrument" a WA run in order to change its behavior (e.g.
introducing delays between successive job executions), or collect
additional measurements (e.g. energy usage). Some instruments may depend
on particular features being enabled on the target (e.g. cpufreq), or
on additional hardware (e.g. energy probes).
WA can enable various "instruments" to be used during workload execution.
Instruments can be quite diverse in their functionality, but the majority of
instruments available in WA today are there to collect additional data (such as
trace) from the device during workload execution. You can view the list of
available instruments by using ``wa list instruments`` command. As with output
processors, a few are enabled by default in the ``config.yaml`` and additional
ones may be added in the same place, or specified in the agenda using
``augmentations`` entry.
Output processors
These post-process metrics and artifacts generated by workloads or
instruments, as well as target metadata collected by WA, in order to
generate additional metrics and/or artifacts (e.g. generating statistics
or reports). Output processors are also used to export WA output
externally (e.g. upload to a database).
For example, we can collect power events from trace cmd by using the ``trace-cmd``
instrument.
The main practical difference between instruments and output processors is that
the former rely on an active connection to the target to function, whereas the
latter only operate on previously collected results and metadata. This means
that output processors can run "off-line" using the ``wa process`` command.
Both instruments and output processors are configured in the same way in the
agenda, which is why they are grouped together into "augmentations".
Augmentations are enabled by listing them under ``augmentations`` entry in a
config file or ``config`` section of the agenda.
.. code-block:: yaml
config:
augmentations:
- trace-cmd
- csv
trace-cmd:
trace_events: ['power*']
iterations: 5
workloads:
- id: 01_dhry
name: dhrystone
label: dhrystone_15over6
runtime_params:
cpu0_governor: performance
workload_params:
threads: 6
mloops: 15
- id: 02_memc
name: memcpy
- id: 03_cycl
name: cyclictest
iterations: 10
augmentations: [trace-cmd]
Instruments are not "free" and it is advisable not to have too many enabled at
once as that might skew results. For example, you don't want to have power
measurement enabled at the same time as event tracing, as the latter may prevent
cores from going into idle states and thus affect the readings collected by
the former.
The code above illustrates an agenda entry that enables the ``trace-cmd`` instrument.
Instruments, like output processors, may be enabled (and disabled -- see below)
on per-spec basis. For example, suppose we want to collect /proc/meminfo from the
device when we run ``memcpy`` workload, but not for the other two. We can do that using
``sysfs_extractor`` instrument, and we will only enable it for ``memcpy``:
If you have multiple ``augmentations`` entries (e.g. both in your config file
and in the agenda), then they will be combined, so that the final set of
augmentations for the run will be their union.
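
As a minimal sketch of the union rule, assume a config file that enables
``csv`` and ``cpufreq`` (the file contents below are an assumption made for
illustration; your own defaults may differ) and an agenda that lists only
``trace-cmd``:

.. code-block:: yaml

    # config.yaml (assumed contents, for illustration only)
    augmentations:
        - csv
        - cpufreq

    ---
    # agenda: lists only the additional augmentation
    config:
        augmentations:
            - trace-cmd
    workloads:
        - name: dhrystone

Under these assumptions, the run would use ``csv``, ``cpufreq`` and
``trace-cmd``; the agenda does not need to repeat the entries from the config
file.
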
.. note:: WA2 did not have augmentations, and instead supported
"instrumentation" and "result_processors" as distinct configuration
entries. For compatibility, these entries are still supported in
WA3; however, they should be considered deprecated, and their
use is discouraged.
Configuring augmentations
^^^^^^^^^^^^^^^^^^^^^^^^^
Most augmentations will take parameters that modify their behavior. Parameters
available for a particular augmentation can be viewed using ``wa show
<augmentation name>`` command. This will also show the default values used.
Values for these parameters can be specified by creating an entry with the
augmentation's name, and specifying parameter values under it.
.. code-block:: yaml
config:
augmentations:
- trace-cmd
- csv
trace-cmd:
trace_events: ['power*']
iterations: 5
workloads:
- id: 01_dhry
name: dhrystone
label: dhrystone_15over6
runtime_params:
cpu0_governor: performance
workload_params:
threads: 6
mloops: 15
- id: 02_memc
name: memcpy
augmentations: [sysfs_extractor]
- id: 03_cycl
name: cyclictest
iterations: 10
augmentations: [trace-cmd]
trace-cmd:
events: ['sched*', 'power*', irq]
buffer_size: 100000
As with ``config`` sections, the ``augmentations`` entry in the spec needs only to
list additional instruments and does not need to repeat instruments specified
elsewhere.
The code above specifies values for ``events`` and ``buffer_size`` parameters
for the ``trace-cmd`` instrument, as well as enabling it.
You may specify configuration for the same augmentation in multiple locations
(e.g. your config file and the config section of the agenda). These entries will
be combined to form the final configuration for the augmentation used during the
run. If different values for the same parameter are present in multiple entries,
the ones "more specific" to a particular run will be used (e.g. values in the
agenda will override those in the config file).
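
The precedence rule can be sketched as follows. The config file contents and
the ``buffer_size`` value of ``200000`` are assumptions chosen for
illustration; only the ``events`` and ``buffer_size`` parameter names are taken
from the ``trace-cmd`` example above.

.. code-block:: yaml

    # config.yaml (assumed contents): "static" settings for trace-cmd
    trace-cmd:
        events: ['sched*', 'power*', irq]
        buffer_size: 100000

    ---
    # agenda: enables the instrument and overrides one parameter
    config:
        augmentations: [trace-cmd]
        trace-cmd:
            buffer_size: 200000  # hypothetical value; overrides the config file

With these two sources, ``events`` would come from the config file, while
``buffer_size`` would take the agenda's value, since the agenda is the more
specific source for the run.
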
.. note:: Creating an entry for an augmentation alone does not enable it! You
**must** list it under ``augmentations`` in order for it to be enabled
for a run. This makes it easier to quickly enable and disable
augmentations with complex configurations, and also allows defining
"static" configuration in top-level config, without actually enabling
the augmentation for all runs.
.. note:: At present, it is only possible to enable/disable instruments on a
per-spec basis. It is *not* possible to provide configuration on a
per-spec basis in the current version of WA (e.g. in our example, it
is not possible to specify different ``sysfs_extractor`` paths for
different workloads). This restriction may be lifted in future
versions of WA.
Disabling augmentations
^^^^^^^^^^^^^^^^^^^^^^^
As seen above, plugins specified with ``augmentations`` clauses get added to
those already specified previously. Just because an instrument specified in
``config.yaml`` is not listed in the ``config`` section of the agenda, does
not mean it will be disabled. If you do want to disable an instrument, you can
always remove/comment it out from ``config.yaml``. However that will be
introducing a permanent configuration change to your environment (one that can
be easily reverted, but may be just as easily forgotten). If you want to
temporarily disable an output processor or an instrument for a particular run,
you can do that in your agenda by prepending a tilde (``~``) to its name.
Sometimes, you may wish to disable an augmentation for a particular run, but you
want to keep it enabled in general. You *could* modify your config file to
temporarily disable it. However, you must then remember to re-enable it
afterwards. This could be inconvenient and error prone, especially if you're
running multiple experiments in parallel and only want to disable the
augmentation for one of them.
For example, let's say we want to disable the ``cpufreq`` instrument enabled in our
``config.yaml`` (suppose we're going to send results via email and so want to
reduce the total size of the output directory):
Instead, you can explicitly disable an augmentation by specifying its name prefixed
with a tilde (``~``) inside ``augmentations``.
.. code-block:: yaml
config:
iterations: 5
augmentations:
- ~cpufreq
- csv
sysfs_extractor:
paths: [/proc/meminfo]
csv:
use_all_classifiers: True
augmentations: [trace-cmd, ~cpufreq]
The code above enables ``trace-cmd`` instrument and disables ``cpufreq``
instrument (which is enabled in the default config).
If you want to start configuration for an experiment from a "blank slate" and
want to disable all previously-enabled augmentations, without necessarily
knowing what they are, you can use the special ``~~`` entry.
.. code-block:: yaml
config:
augmentations: [~~, trace-cmd, csv]
The code above disables all augmentations enabled up to that point, and enables
``trace-cmd`` and ``csv`` for this run.
.. note:: The ``~~`` only disables augmentations from previously-processed
sources. Its ordering in the list does not matter. For example,
specifying ``augmentations: [trace-cmd, ~~, csv]`` will have exactly
the same effect as above -- i.e. both trace-cmd *and* csv will be
enabled.
Workload-specific augmentation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
It is possible to enable or disable (but not configure) augmentations at
workload or section level, as well as in the global config, in which case, the
augmentations would only be enabled/disabled for that workload/section. If the
same augmentation is enabled at one level and disabled at another, as with all
WA configuration, the more specific settings will take precedence over the less
specific ones (i.e. workloads override sections that, in turn, override global
config).
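
A small sketch of this precedence, assuming a hypothetical section id
``traced`` (the workload and augmentation names are reused from the examples
above):

.. code-block:: yaml

    config:
        augmentations: [csv]          # global: applies to every job
    sections:
        - id: traced                  # hypothetical section id
          augmentations: [trace-cmd]  # enabled for all workloads in this section
    workloads:
        - name: dhrystone
        - name: memcpy
          augmentations: [~trace-cmd] # workload-level setting wins over the section

Under the precedence rule described above, ``dhrystone`` would run with
``csv`` and ``trace-cmd``, while ``memcpy`` would run with ``csv`` only.
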
Augmentations Example
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: yaml
config:
augmentations: [~~, fps]
trace-cmd:
events: ['sched*', 'power*', irq]
buffer_size: 100000
file_poller:
files:
- /sys/class/thermal/thermal_zone0/temp
sections:
- classifiers:
type: energy
augmentations: [energy_measurement]
- classifiers:
type: trace
augmentations: [trace-cmd, file_poller]
workloads:
- id: 01_dhry
name: dhrystone
label: dhrystone_15over6
runtime_params:
cpu0_governor: performance
workload_params:
threads: 6
mloops: 15
- id: 02_memc
name: memcpy
augmentations: [sysfs_extractor]
- id: 03_cycl
name: cyclictest
iterations: 10
- gmail
- geekbench
- googleplaybooks
- name: dhrystone
augmentations: [~fps]
The example above shows an experiment that runs a number of workloads in order
to evaluate their thermal impact and energy usage. All previously-configured
augmentations are disabled with ``~~``, so that only configuration specified in
this agenda is enabled. Since most of the workloads are "productivity" use cases
that do not generate their own metrics, ``fps`` instrument is enabled to get
some meaningful performance metrics for them; the only exception is
``dhrystone`` which is a benchmark that reports its own metrics and has no GUI,
so the instrument is disabled for it using ``~fps``.
Each workload will be run in two configurations: once, to collect energy
measurements, and once to collect thermal data and kernel trace. Trace can give
insight into why a workload is using more or less energy than expected, but it
can be relatively intrusive and might impact absolute energy and performance
metrics, which is why it is collected separately. classifiers_ are used to
separate metrics from the two configurations in the results.
Other Configuration
-------------------