Tutorial

This section introduces the code in the package and how it might be used. It is best to read the preceding sections of the documentation first to get a more detailed introduction to the tool components. See the Jupyter notebooks in the examples folder of the repository for further code examples.

Imports

The first step in using SACO is importing it as a package into a script/notebook. The most important functionality becomes available with:

from saco import Dataset, Calculator, Optimiser

From this we can see that the three most important components that a user might interact with are:

  • Dataset: provides a “container” to group and manipulate multiple data tables.

  • Calculator: takes a Dataset as input, calculates “scenario flows” and assesses their compliance against environmental flow targets.

  • Optimiser: takes a Dataset as input then formulates and solves an optimisation problem to find abstraction impact reductions needed to meet flow targets.

We provide an initial guide to these objects/components below. See the “Reference” sections of the documentation for further details, including information about additional helper functions.

Note

The backgrounds to the Calculator and Optimiser components are explained in more detail in the Calculator and Optimiser sections of the documentation.

Dataset

A Dataset is primarily used to store and group together the relevant data tables (i.e. primarily WRGIS tables). One way to get going with a Dataset (and the tool in general) is to load a folder of data table files:

ds = Dataset(data_folder='/path/to/my/data-folder')
ds.load_data()

Here we have created a Dataset object (as ds) and loaded data into memory. As a Dataset object is the main input to the Calculator and Optimiser components of the tool, we could actually now go ahead and run those components. But first let us look a bit more into what a Dataset consists of.

Tables

A Dataset object has individual data tables as its most important attributes. For example, ds.swabs provides access to the SWABS_NBB table of surface water abstractions from WRGIS (with “swabs” being the “short name” for SWABS_NBB). Similar attributes exist for the other key WRGIS tables listed below:

  • asbs: AbsSensBands_NBB (waterbody abstraction sensitivity bands)

  • asb_percs: ASBPercentages (fractional deviations defining the reference flows - typically environmental flow indicator, EFI)

  • dis: Discharges_NBB (point surface water discharges)

  • gwabs: GWABs_NBB (point groundwater abstractions)

  • qnat: QNaturalFlows_NBB (waterbody natural flows)

  • refs: REFS_NBB (reference flows - typically EFI)

  • sup: SupResGW_NBB (point “complex impacts”)

  • swabs: SWABS_NBB (point surface water abstractions)

  • wbs: IntegratedWBs_NBB (waterbody metadata)

An additional table that is not in WRGIS is derived and included in a Dataset:

  • mt: Master (waterbody summary table - water balance terms, compliance, etc)

The Master table is intended to be the key waterbody-level table that brings together the water balance components with information on surplus/deficit and compliance classifications.

The tables have various properties, but most importantly each table class possesses a data attribute, which is a pandas.DataFrame. Therefore, to access the dataframe of surface water abstractions, we can use ds.swabs.data. We can then query or manipulate the data table as we would any pandas.DataFrame. A description of the column indexes and fields is given in Table Fields.

Changing Numbers

A user may wish to changes numbers in a Dataset to improve the data or to test the implications of known, planned, hypothetical or other types of prescribed changes. Two ways to change the numbers in a Dataset are:

  • Modify the numbers in data table files on disk and then load the Dataset using syntax like the load_data example above.

  • Modify a dataframe directly using a table’s data attribute, as described in the preceding paragraph.

An example of the latter approach to change a surface water abstraction impact would be:

ds.swabs.data.loc[ds.swabs.data.index == 'swab-unique-id', 'SWQ95FLWR'] = 1.23

This would change the abstraction impact under the fully licensed (FL) artificial influences scenario at the 95th (natural) flow percentile to 1.23 (Ml/d). But the dataframe could be queried or manipulated in different ways, including through pandas merge/join operations etc.

Note

The Master (mt) table and its data should not be set or edited directly by a typical user (in general). See below about (1) using the Calculator to obtain an updated Master table and (2) using specific methods to ensure that a Dataset and its Master table are ready to go into the Optimiser.

Note

If a user changes a surface or groundwater abstraction impact in a Dataset under a given artificial influences scenario and at a given (natural) flow percentile, any long-term average abstraction columns in the relevant table are not automatically updated (currently). See Dataset for explanation of the available options to make this calculation if needed.

Other Functionality

The Dataset possesses additional functionality to help set table values, write tables to output files, work with the “network” of waterbodies, and prepare for input to the Optimiser component. This functionality (the “methods” of Dataset) are described in Dataset.

Calculator

Once a Dataset has been loaded or constructed (potentially with modifications relative to the “base” WRGIS), it can be supplied as input to the Calculator. As demonstrated below, the run method of the Calculator can then be executed to calculate scenario flows, surpluses/deficits and compliance bands based on the input Dataset:

calculator = Calculator(ds)
output_dataset = calculator.run()

In the example above we rely on default arguments, which will run the Calculator for all scenario/percentile combinations and for the whole domain in the input Dataset (i.e. all waterbodies present). It will also run with some default methodological choices (see below for more on this).

By default, the run method returns a complete Dataset with an updated Master table (i.e. one that is consistent with all the other tables in the Dataset). If we want to save this Dataset (i.e. all of its component tables) at this point, we could do so as follows (see Dataset for guidance on output options):

output_dataset.write_tables(output_folder='/my/output/folder')

However, if we want to customise the execution of the Calculator, we can provide optional arguments, as described in Calculator. One such argument defined on initialisation of the Calculator is named capping_method. This argument controls the approach to “unfeasible” impacts - prescribed abstraction impacts that cannot be satisfied. See Calculator for a more precise explanation of this point. By default, the Calculator takes a WRGIS-like approach to this issue. Using the Calculator documentation to understand the available options, we could override the default and take WBAT-like approach as follows:

calculator = Calculator(ds, capping_method='simple')

We might then for example ask the Calculator.run method to return only an updated Master table as a pandas.DataFrame:

updated_master_table = calculator.run(master_only=True)

These are just a couple of examples of customisation via optional arguments - see Calculator for more options and details.

Note

If capping has been applied to avoid propagation of unfeasible impacts, scenario flows output from the Calculator may be larger than initially expected from performing a simple “ups” water balance calculation for some waterbodies. This is because capping reduced net impacts upstream to retain physical plausibility. The Master table gives the “target” artificial influences components, which are not capped individually.

Note

Any long-term average fields in a Dataset passed to the Calculator are not changed by the execution of the Calculator.run method currently (i.e. they are unchanged in the Dataset or Master table returned).

Optimiser

The role of the Optimiser is to suggest how impacts could best be adjusted to meet flow targets, given some objective(s) and constraints. The solution to this problem is obtained via mixed integer (binary) linear programming.

Dataset Preparation

The starting point for the Optimiser is again a Dataset. However, in this case we need to ensure that certain columns are present in some tables - columns that are not necessarily relevant to the Calculator. The relevant tables and columns are (currently):

  • Master table: requires a flow target column(s) (optional for the Calculator).

  • GWABs_NBB table: requires a flag column to indicate whether a given impact (row) should be available for change in the optimisation (1) or not (0).

  • SWABS_NBB table: as per GWABs_NBB table.

  • SupResGW_NBB table: requires a flag column - we return to this below.

See Table Fields for a guide to the naming conventions for these columns.

The relevant columns can be added or set using the methods in the example snippet below (assuming still that we have a Dataset instance as ds):

ds.set_flow_targets()
ds.set_optimise_flag()

Called in this way, both of these methods will use their default settings, which are described in Dataset. Both methods have optional arguments that can be used to customise flow targets and flag which abstraction impacts will be included/excluded in optimisation.

Note

If any further manipulation of the inclusion/exclusion flag is needed it could be achieved by working with the relevant dataframes (i.e. ds.swabs.data, ds.gwabs.data and ds.sup.data).

Note

For users interested in Environmental Destination (ED) modelling, it should be noted that the default behaviour of the Dataset.set_optimise_flag method does not mimic exactly the configuration used in ED modelling for the second National Framework for Water Resources. It is important for a user to check that they are happy with the flag setup.

Note

Waterbody targets can be (optionally) specified via a separate Fix_Flags table that sits within a Dataset (accessible via the short name wbfx). If present in a Dataset, the flags in this table will be used to customise flow targets (i.e. permitting further relaxation beyond the flow in REFS_NBB). See Table Fields for more details.

Optional Arguments

Once we are happy that a Dataset is ready for the Optimiser, we could invoke the run method of the Optimiser as below:

optimiser = Optimiser(ds)
output_dataset = optimiser.run()
output_dataset.write_tables(output_folder='/my/output/folder')

However, the Optimiser section provides information on options that we may want to customise when setting up the Optimiser (i.e. before execution). One such option concerns the geographical domain considered. By default, the code above will run the Optimiser for the whole domain contained in the input Dataset. The following lines provide an example of how to run for only part of the domain (referring to the most downstream waterbody of interest as an “outlet”):

outlet_waterbody = 'outlet-waterbody-id'  # could be a list of outlet waterbodies

selected_waterbodies = ds.identify_upstream_waterbodies(outlet_waterbody)

optimiser = Optimiser(ds, domain=selected_waterbodies)
output_dataset = optimiser.run()

Other options can be specified too, such as which artificial influences scenario(s) and flow percentile(s) should be considered. Options also exist concerning the objectives of the optimisation and whether any “relaxation” should be applied when attempting to solve for a secondary objective - see Optimiser.

Outputs

The contents of the output from Optimiser.run are similar to a normal Dataset, apart from (keeping complex impacts to one side for the moment):

  • The SWABS_NBB and GWABs_NBB tables now contain abstraction impacts as they are the optimisation has been completed (i.e. the impacts that remain after the “fix”).

  • Similarly, the Master table summarises the water balance and compliance etc for the solution formulated by the Optimiser.

  • Additional tables are present: SWABS_Changes and GWABS_Changes (accessible via the output Dataset’s attributes swabs_chg and gwab_chg, respectively. These tables contain the impact reductions (Ml/d) required relative to a “reference” Dataset - see Optimiser.

Note

Long-term average abstraction is recalculated after optimisation under the assumption that the relative impact profile across the FDC remains constant. However, SWABS with hands-off flow (HOF) conditions are omitted from the recalculation at present to avoid introducing a conservative bias into the estimates. See Dataset and Optimiser for more details.

Complex Impacts

Exploratory and optional functionality has been included to allow specific types of complex impacts to be included in optimisation. As noted above, the SupResGW_NBB table thus requires a flag column to indicate whether a given complex impact (row) should be:

  • 0: Excluded from optimisation

  • 1: Included as a reservoir compensation flow increase

  • 2: Included as a complex abstraction

The default is for all complex impacts to be excluded from optimisation. Thus, the Dataset.set_optimise_flag method will insert a flag column of zeros, if a column has not already been added to the table.

If we wish to include specific complex impacts in optimisation, we need to manually adjust the flag column for the appropriate rows. This can be achieved in memory using dataframe operations, for example:

ds.sup.data.loc[ds.sup.data.index == 'complex-impact-id', 'Optimise_Flag'] = 1

If we specify that a reservoir compensation flow increase is allowed in optimisation, we also need to indicate that maximum increase permitted. This should be done for the relevant scenarios and percentiles under consideration. For example, to permit a maximum reservoir compensation flow increase of 5.0 Ml/d, we could do the following:

ds.sup.data.loc[ds.sup.data.index == 'complex-impact-id', 'SUPFLQ95_MAX_INCREASE'] = 5.0

Once the flag field and any maximum increase fields have been added to the relevant Dataset table, the Optimiser can be run in the way described above. Outputs are as detailed above, with the addition of a SupResGW_Changes table that gives any changes to complex impacts. Note that positive numbers in this table indicate increases to compensation flows or reductions in complex abstraction impacts. The latter sign convention is opposite to that for SWABS_Changes and GWABS_Changes. It arises because complex impacts can be positive or negative.

Some important points to note about the behaviour of complex impact functionality in the Optimiser are:

  • Use of this functionality will typically require local knowledge of specific impacts and how they have been represented in CAMS ledgers and WRGIS.

  • In its current implementation, the Optimiser seeks to minimise increases to reservoir compensation flows in its solution for several reasons. It will only increase a compensation flow if it helps to meet flow targets that would have otherwise been missed. Other approaches may be explored in future.

  • Complex abstractions (flag = 2) are treated in the same way as normal surface and groundwater abstractions. The Optimiser identifies required impact reductions for the scenario/percentile combination in question, rather than changes to complex licence conditions.

  • A user has the option to explore the implications of different prescribed changes to complex impacts by editing the relevant rows in the data table and excluding them from optimisation.