# wa/output_processors/csvproc.py
# (from https://github.com/ARM-software/workload-automation)

import sys

from devlib.utils.csvutil import csvwriter

from wa import OutputProcessor, Parameter
from wa.framework.exception import ConfigError
from wa.utils.types import list_of_strings


class CsvReportProcessor(OutputProcessor):
    name = 'csv'

    description = """
    Creates a ``results.csv`` in the output directory containing results for
    all iterations in CSV format, each line containing a single metric.
    """

    parameters = [
        Parameter('use_all_classifiers', kind=bool, default=False,
                  global_alias='use_all_classifiers',
                  description="""
                  If set to ``True``, this will add a column for every classifier
                  that features in at least one collected metric.

                  .. note:: This cannot be ``True`` if ``extra_columns`` is set.
                  """),
        Parameter('extra_columns', kind=list_of_strings,
                  description="""
                  List of classifiers to use as columns.

                  .. note:: This cannot be set if ``use_all_classifiers`` is
                            ``True``.
                  """),
    ]
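
    # A sketch of how this processor might be enabled and configured from a
    # WA config file (YAML); 'core' and 'freq' are example classifier names:
    #
    #   augmentations:
    #     - csv
    #
    #   csv:
    #     extra_columns: [core, freq]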

    def validate(self):
        super(CsvReportProcessor, self).validate()
        if self.use_all_classifiers and self.extra_columns:
            msg = 'extra_columns cannot be specified when '\
                  'use_all_classifiers is True'
            raise ConfigError(msg)

    def initialize(self):
        self.outputs_so_far = []  # pylint: disable=attribute-defined-outside-init
        self.artifact_added = False
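
    # The CSV is regenerated from all outputs collected so far after every
    # job, so results.csv stays up to date while the run is in progress.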
    def process_job_output(self, output, target_info, run_output):
        self.outputs_so_far.append(output)
        self._write_outputs(self.outputs_so_far, run_output)
        if not self.artifact_added:
            run_output.add_artifact('run_result_csv', 'results.csv', 'export')
            self.artifact_added = True

    def process_run_output(self, output, target_info):
        self.outputs_so_far.append(output)
        self._write_outputs(self.outputs_so_far, output)
        if not self.artifact_added:
            output.add_artifact('run_result_csv', 'results.csv', 'export')
            self.artifact_added = True
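
    # Rewrites results.csv from scratch for every output collected so far.
    # With ``use_all_classifiers``, the classifier columns are the union of
    # classifier keys across all metrics seen up to this point.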
    def _write_outputs(self, outputs, output):
        if self.use_all_classifiers:
            classifiers = set()
            for out in outputs:
                for metric in out.metrics:
                    classifiers.update(list(metric.classifiers.keys()))
            extra_columns = list(classifiers)
        elif self.extra_columns:
            extra_columns = self.extra_columns
        else:
            extra_columns = []

        outfile = output.get_path('results.csv')
        with csvwriter(outfile) as writer:
            writer.writerow(['id', 'workload', 'iteration', 'metric'] +
                            extra_columns + ['value', 'units'])

            for o in outputs:
                if o.kind == 'job':
                    header = [o.id, o.label, o.iteration]
                elif o.kind == 'run':
                    # Should be a RunOutput. Run-level metrics aren't attached
                    # to any job, so we leave 'id' and 'iteration' blank and
                    # use the run name for the 'label' field.
                    header = [None, o.info.run_name, None]
                else:
                    raise RuntimeError(
                        'Output of kind "{}" unrecognised by csvproc'.format(o.kind))

                for metric in o.result.metrics:
                    row = (header + [metric.name] +
                           [str(metric.classifiers.get(c, ''))
                            for c in extra_columns] +
                           [str(metric.value), metric.units or ''])
                    writer.writerow(row)
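

# A minimal, self-contained sketch (illustration only) of the column-selection
# logic in _write_outputs, using plain dicts in place of WA Metric objects.
# The metric names and classifier keys ('cpu', 'freq') are made up.
if __name__ == '__main__':
    example_metrics = [
        {'name': 'score', 'value': 42, 'classifiers': {'cpu': 'big'}},
        {'name': 'power', 'value': 1.3, 'classifiers': {'cpu': 'little', 'freq': 1400}},
    ]
    # Union of classifier keys across all metrics, as use_all_classifiers does
    # (sorted here for a stable demo; the processor keeps the set's own order).
    columns = sorted({key for m in example_metrics for key in m['classifiers']})
    header = ['id', 'workload', 'iteration', 'metric'] + columns + ['value', 'units']
    print(header)
    # -> ['id', 'workload', 'iteration', 'metric', 'cpu', 'freq', 'value', 'units']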