doc: re-write augmentations docs

Rewrite the "Output processors and Instruments" section into "Augmentations" section.
2025-07-04 14:13:35 +01:00 · 2018-05-14 14:34:17 +01:00
parent c6fae6fa55
commit d0368cf176
1 changed files with 152 additions and 173 deletions
--- a/doc/source/how_tos/users/agenda.rst
+++ b/doc/source/how_tos/users/agenda.rst
@@ -504,206 +504,185 @@ turn override global settings.



-Output Processors and Instruments
----------------------------------
-
-Output Processors
-^^^^^^^^^^^^^^^^^
-
-Output processors, as the name suggests, handle the processing of output
-generated form running workload specs. By default, WA enables a couple of basic
-output processors (e.g. one generates a csv file with all scores reported by
-workloads), which you can see in ``~/.workload_automation/config.yaml``. However,
-WA has a number of other, more specialized, output processors (e.g. for
-uploading to databases). You can list available output processors with
-``wa list output_processors`` command. If you want to permanently enable a
-output processor, you can add it to your ``config.yaml``. You can also enable a
-output processor for a particular run by specifying it in the ``config`` section
-in the agenda. As the name suggests, ``config`` section mirrors the structure of
-``config.yaml``, and anything that can be specified in the latter, can also be
-specified in the former.
-
-As with workloads, output processors may have parameters that define their
-behaviour. Parameters of output processors are specified a little differently,
-however. Output processor parameter values are listed in the config section,
-namespaced under the name of the output processor.
-
-For example, suppose we want to be able to easily query the output generated by
-the workload specs we've defined so far. We can use ``sqlite`` output processor
-to have WA create an sqlite_ database file with the results. By default, this
-file will be generated in WA's output directory (at the same level as
-results.csv); but suppose we want to store the results in the same file for
-every run of the agenda we do. This can be done by specifying an alternative
-database file with ``database`` parameter of the output processor:
-
-
-.. code-block:: yaml
-
-        config:
-                augmentations:
-                    - sqlite
-                sqlite:
-                        database: ~/my_wa_results.sqlite
-                iterations: 5
-        workloads:
-                - id: 01_dhry
-                  name: dhrystone
-                  label: dhrystone_15over6
-                  runtime_params:
-                        cpu0_governor: performance
-                  workload_params:
-                        threads: 6
-                        mloops: 15
-                - id: 02_memc
-                  name: memcpy
-                - id: 03_cycl
-                  name: cyclictest
-                  iterations: 10
-
-A couple of things to observe here:
-
- There is no need to repeat the output processors listed in ``config.yaml``. The
-  processors listed in ``augmentations`` entry in the agenda will be used
-  *in addition to* those defined in the ``config.yaml``.
- The database file is specified under "sqlite" entry in the config section.
-  Note, however, that this entry alone is not enough to enable the output
-  processor, it must be listed in ``augmentations``, otherwise the "sqilte"
-  config entry will be ignored.
- The database file must be specified as an absolute path, however it may use
-  the user home specifier '~' and/or environment variables.
-
-.. _sqlite: http://www.sqlite.org/
+Augmentations
+--------------

+Augmentations are plugins that augment the execution of workload jobs with
+additional functionality; usually, that takes the form of generating additional
+metrics and/or artifacts, such as traces or logs. There are two types of
+augmentations:

 Instruments
-^^^^^^^^^^^
+        These "instrument" a WA run in order to change its behavior (e.g.
+        introducing delays between successive job executions), or collect
+        additional measurements (e.g. energy usage). Some instruments may depend
+        on particular features being enabled on the target (e.g. cpufreq), or
+        on additional hardware (e.g. energy probes).

-WA can enable various "instruments" to be used during workload execution.
-Instruments can be quite diverse in their functionality, but the majority of
-instruments available in WA today are there to collect additional data (such as
-trace) from the device during workload execution. You can view the list of
-available instruments by using ``wa list instruments`` command. As with output
-processors, a few are enabled by default in the ``config.yaml`` and additional
-ones may be added in the same place, or specified in the agenda using
-``augmentations`` entry.
+Output processors
+        These post-process metrics and artifacts generated by workloads or
+        instruments, as well as target metadata collected by WA, in order to
+        generate additional metrics and/or artifacts (e.g. generating statistics
+        or reports). Output processors are also used to export WA output
+        externally (e.g. upload to a database).

-For example, we can collect power events from trace cmd by using the ``trace-cmd``
-instrument.
+The main practical difference between instruments and output processors is that
+the former rely on an active connection to the target to function, whereas the
+latter only operate on previously collected results and metadata. This means
+that output processors can run "off-line" using the ``wa process`` command.
+
+Both instruments and output processors are configured in the same way in the
+agenda, which is why they are grouped together into "augmentations".
+Augmentations are enabled by listing them under the ``augmentations`` entry in a
+config file or the ``config`` section of the agenda.

 .. code-block:: yaml

        config:
-            augmentations:
-                - trace-cmd
-                - csv
-            trace-cmd:
-                    trace_events: ['power*']
-            iterations: 5
-        workloads:
-            - id: 01_dhry
-              name: dhrystone
-              label: dhrystone_15over6
-              runtime_params:
-                    cpu0_governor: performance
-              workload_params:
-                    threads: 6
-                    mloops: 15
-            - id: 02_memc
-              name: memcpy
-            - id: 03_cycl
-              name: cyclictest
-              iterations: 10
+                augmentations: [trace-cmd]

-Instruments are not "free" and it is advisable not to have too many enabled at
-once as that might skew results. For example, you don't want to have power
-measurement enabled at the same time as event tracing, as the latter may prevent
-cores from going into idle states and thus affecting the reading collected by
-the former.
+The code above illustrates an agenda entry that enables the ``trace-cmd`` instrument.

-Instruments, like output processors, may be enabled (and disabled -- see below)
-on per-spec basis. For example, suppose we want to collect /proc/meminfo from the
-device when we run ``memcpy`` workload, but not for the other two. We can do that using
-``sysfs_extractor`` instrument, and we will only enable it for ``memcpy``:
+If you have multiple ``augmentations`` entries (e.g. in both your config file
+and the agenda), they will be combined, so that the final set of
+augmentations for the run will be their union.
+
+.. note:: WA2 did not have augmentations, and instead supported
+          "instrumentation" and "result_processors" as distinct configuration
+          entries. For compatibility, these entries are still supported in
+          WA3; however, they should be considered deprecated, and their
+          use is discouraged.
+
+
+Configuring augmentations
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Most augmentations will take parameters that modify their behavior. Parameters
+available for a particular augmentation can be viewed using ``wa show
+<augmentation name>`` command. This will also show the default values used.
+Values for these parameters can be specified by creating an entry with the
+augmentation's name, and specifying parameter values under it.

 .. code-block:: yaml

        config:
-            augmentations:
-                - trace-cmd
-                - csv
-            trace-cmd:
-                    trace_events: ['power*']
-            iterations: 5
-        workloads:
-                - id: 01_dhry
-                  name: dhrystone
-                  label: dhrystone_15over6
-                  runtime_params:
-                        cpu0_governor: performance
-                  workload_params:
-                        threads: 6
-                        mloops: 15
-                - id: 02_memc
-                  name: memcpy
-                  augmentations: [sysfs_extractor]
-                - id: 03_cycl
-                  name: cyclictest
-                  iterations: 10
+                augmentations: [trace-cmd]
+                trace-cmd:
+                        events: ['sched*', 'power*', irq]
+                        buffer_size: 100000

-As with ``config`` sections, the ``augmentations`` entry in the spec needs only to
-list additional instruments and does not need to repeat instruments specified
-elsewhere.
+The code above specifies values for ``events`` and ``buffer_size`` parameters
+for the ``trace-cmd`` instrument, as well as enabling it.
+
+You may specify configuration for the same augmentation in multiple locations
+(e.g. your config file and the config section of the agenda). These entries will
+be combined to form the final configuration for the augmentation used during the
+run. If different values for the same parameter are present in multiple entries,
+the ones "more specific" to a particular run will be used (e.g. values in the
+agenda will override those in the config file).
+
+.. note:: Creating an entry for an augmentation alone does not enable it! You
+          **must** list it under ``augmentations`` in order for it to be enabled
+          for a run. This makes it easier to quickly enable and disable
+          augmentations with complex configurations, and also allows defining
+          "static" configuration in top-level config, without actually enabling
+          the augmentation for all runs.

-.. note:: At present, it is only possible to enable/disable instruments  on
-          per-spec base. It is *not* possible to provide configuration on
-          per-spec basis in the current version of WA (e.g. in our example, it
-          is not possible to specify different ``sysfs_extractor`` paths for
-          different workloads). This restriction may be lifted in future
-          versions of WA.

 Disabling augmentations
 ^^^^^^^^^^^^^^^^^^^^^^^

-As seen above, plugins specified with ``augmentations`` clauses get added to
-those already specified previously. Just because an instrument specified in
-``config.yaml`` is not listed in the ``config`` section of the agenda, does
-not mean it will be disabled. If you do want to disable an instrument, you can
-always remove/comment it out from ``config.yaml``. However that will be
-introducing a permanent configuration change to your environment (one that can
-be easily reverted, but may be just as easily forgotten). If you want to
-temporarily disable a output processor or an instrument for a particular run,
-you can do that in your agenda by prepending a tilde (``~``) to its name.
+Sometimes, you may wish to disable an augmentation for a particular run, but you
+want to keep it enabled in general. You *could* modify your config file to
+temporarily disable it. However, you must then remember to re-enable it
+afterwards. This could be inconvenient and error prone, especially if you're
+running multiple experiments in parallel and only want to disable the
+augmentation for one of them.

-For example, let's say we want to disable ``cpufreq`` instrument enabled in our
-``config.yaml`` (suppose we're going to send results via email and so want to
-reduce to total size of the output directory):
+Instead, you can explicitly disable an augmentation by specifying its name
+prefixed with a tilde (``~``) inside ``augmentations``.

 .. code-block:: yaml

        config:
-                iterations: 5
-                augmentations:
-                    - ~cpufreq
-                    - csv
-                sysfs_extractor:
-                        paths: [/proc/meminfo]
-                csv:
-                    use_all_classifiers: True
+                augmentations: [trace-cmd, ~cpufreq]
+
+The code above enables the ``trace-cmd`` instrument and disables the ``cpufreq``
+instrument (which is enabled in the default config).
+
+If you want to start configuration for an experiment from a "blank slate" and
+want to disable all previously-enabled augmentations, without necessarily
+knowing what they are, you can use the special ``~~`` entry.
+
+.. code-block:: yaml
+
+        config:
+                augmentations: [~~, trace-cmd, csv]
+
+The code above disables all augmentations enabled up to that point, and enables
+``trace-cmd`` and ``csv`` for this run.
+
+.. note:: The ``~~`` only disables augmentations from previously-processed
+          sources. Its ordering in the list does not matter. For example,
+          specifying ``augmentations: [trace-cmd, ~~, csv]`` will have exactly
+          the same effect as above -- i.e. both trace-cmd *and* csv will be
+          enabled.
+
+Workload-specific augmentation
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+It is possible to enable or disable (but not configure) augmentations at
+workload or section level, as well as in the global config, in which case the
+augmentations are only enabled/disabled for that workload/section. If the
+same augmentation is enabled at one level and disabled at another, then, as with
+all WA configuration, the more specific settings take precedence over the less
+specific ones (i.e. workloads override sections that, in turn, override global
+config).
+
+
+Augmentations Example
+^^^^^^^^^^^^^^^^^^^^^
+
+
+.. code-block:: yaml
+
+        config:
+                augmentations: [~~, fps]
+                trace-cmd:
+                        events: ['sched*', 'power*', irq]
+                        buffer_size: 100000
+                file_poller:
+                        files:
+                                - /sys/class/thermal/thermal_zone0/temp
+        sections:
+                - classifiers:
+                        type: energy
+                  augmentations: [energy_measurement]
+                - classifiers:
+                        type: trace
+                  augmentations: [trace-cmd, file_poller]
        workloads:
-                - id: 01_dhry
-                  name: dhrystone
-                  label: dhrystone_15over6
-                  runtime_params:
-                        cpu0_governor: performance
-                  workload_params:
-                        threads: 6
-                        mloops: 15
-                - id: 02_memc
-                  name: memcpy
-                  augmentations: [sysfs_extractor]
-                - id: 03_cycl
-                  name: cyclictest
-                  iterations: 10
+                - gmail
+                - geekbench
+                - googleplaybooks
+                - name: dhrystone
+                  augmentations: [~fps]
+
+The example above shows an experiment that runs a number of workloads in order
+to evaluate their thermal impact and energy usage. All previously-configured
+augmentations are disabled with ``~~``, so that only configuration specified in
+this agenda is enabled. Since most of the workloads are "productivity" use cases
+that do not generate their own metrics, the ``fps`` instrument is enabled to get
+some meaningful performance metrics for them; the only exception is
+``dhrystone``, which is a benchmark that reports its own metrics and has no GUI,
+so the instrument is disabled for it using ``~fps``.
+
+Each workload will be run in two configurations: once to collect energy
+measurements, and once to collect thermal data and a kernel trace. A trace can give
+insight into why a workload is using more or less energy than expected, but it
+can be relatively intrusive and might impact absolute energy and performance
+metrics, which is why it is collected separately. Classifiers_ are used to
+separate metrics from the two configurations in the results.

 Other Configuration
 -------------------