Tutorial
This section introduces the code in the package and how it might be used. It is best to read the preceding sections of the documentation first to get a more detailed introduction to the tool components. See the Jupyter notebooks in the examples folder of the repository for further code examples.
Imports
The first step in using SACO is importing it as a package into a script/notebook. The most important functionality becomes available with:
from saco import Dataset, Calculator, Optimiser
From this we can see that the three most important components that a user might interact with are:
Dataset: provides a “container” to group and manipulate multiple data tables.
Calculator: takes a Dataset as input, calculates “scenario flows” and assesses their compliance against environmental flow targets.
Optimiser: takes a Dataset as input, then formulates and solves an optimisation problem to find the abstraction impact reductions needed to meet flow targets.
We provide an initial guide to these objects/components below. See the “Reference” sections of the documentation for further details, including information about additional helper functions.
Note
The backgrounds to the Calculator and Optimiser components are explained in more detail in the Calculator and Optimiser sections of the documentation.
Dataset
A Dataset is primarily used to store and group together the relevant data tables
(i.e. primarily WRGIS tables). One way to get going with a Dataset (and the tool
in general) is to load a folder of data table files:
ds = Dataset(data_folder='/path/to/my/data-folder')
ds.load_data()
Here we have created a Dataset object (as ds) and loaded data into memory. As a
Dataset object is the main input to the Calculator and Optimiser components of the
tool, we could actually now go ahead and run those components. But first let us look a
bit more into what a Dataset consists of.
Tables
A Dataset object has individual data tables as its most important attributes. For example,
ds.swabs provides access to the SWABS_NBB table of surface water abstractions from
WRGIS (with “swabs” being the “short name” for SWABS_NBB). Similar attributes exist for
the other key WRGIS tables listed below:
asbs: AbsSensBands_NBB (waterbody abstraction sensitivity bands)
asb_percs: ASBPercentages (fractional deviations defining the reference flows - typically environmental flow indicator, EFI)
dis: Discharges_NBB (point surface water discharges)
gwabs: GWABs_NBB (point groundwater abstractions)
qnat: QNaturalFlows_NBB (waterbody natural flows)
refs: REFS_NBB (reference flows - typically EFI)
sup: SupResGW_NBB (point “complex impacts”)
swabs: SWABS_NBB (point surface water abstractions)
wbs: IntegratedWBs_NBB (waterbody metadata)
An additional table that is not in WRGIS is derived and included in a Dataset:
mt: Master (waterbody summary table - water balance terms, compliance, etc)
The Master table is intended to be the key waterbody-level table that brings together the water balance components with information on surplus/deficit and compliance classifications.
The tables have various properties, but most importantly each table class possesses a
data attribute, which is a pandas.DataFrame. Therefore, to access the dataframe of
surface water abstractions, we can use ds.swabs.data. We can then query or
manipulate the data table as we would any pandas.DataFrame. A description of the column
indexes and fields is given in Table Fields.
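As a minimal sketch of this kind of querying, the snippet below builds a small stand-in dataframe in place of ds.swabs.data, so that it runs without SACO or any WRGIS files. The row identifiers and values are invented for illustration. Only the SWQ95FLWR column name is taken from this tutorial.

```python
import pandas as pd

# Stand-in for ds.swabs.data: hypothetical row IDs and impact values (Ml/d).
swabs = pd.DataFrame(
    {"SWQ95FLWR": [0.5, 2.0, 0.0]},
    index=["swab-a", "swab-b", "swab-c"],
)

# Select the abstractions that have a non-zero impact in this column.
nonzero = swabs[swabs["SWQ95FLWR"] > 0]

# Sum the impact across the selected rows.
total_impact = nonzero["SWQ95FLWR"].sum()

print(nonzero)
print(total_impact)
```

With a loaded Dataset, the same operations would presumably be applied to ds.swabs.data directly.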
Changing Numbers
A user may wish to change numbers in a Dataset to improve the data or to test the
implications of known, planned, hypothetical or other types of prescribed changes. Two
ways to change the numbers in a Dataset are:
Modify the numbers in data table files on disk and then load the Dataset using syntax like the load_data example above.
Modify a dataframe directly using a table’s data attribute, as described in the preceding paragraph.
An example of the latter approach to change a surface water abstraction impact would be:
ds.swabs.data.loc[ds.swabs.data.index == 'swab-unique-id', 'SWQ95FLWR'] = 1.23
This would change the abstraction impact under the fully licensed (FL) artificial influences scenario at the 95th (natural) flow percentile to 1.23 (Ml/d). But the dataframe could be queried or manipulated in different ways, including through pandas merge/join operations etc.
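When several rows need to change at once, one possible approach is to assign a pandas Series that is keyed by row identifier. The sketch below uses a stand-in dataframe rather than a real ds.swabs.data, and all identifiers and numbers are hypothetical.

```python
import pandas as pd

# Stand-in for ds.swabs.data: hypothetical row IDs and impact values (Ml/d).
swabs = pd.DataFrame(
    {"SWQ95FLWR": [0.5, 2.0, 4.0]},
    index=["swab-a", "swab-b", "swab-c"],
)

# Prescribed new values keyed by row ID (hypothetical numbers, Ml/d).
new_values = pd.Series({"swab-a": 0.25, "swab-c": 3.0})

# Keep a copy of the original column so the change can be inspected afterwards.
before = swabs["SWQ95FLWR"].copy()

# Assign by index label; rows not listed in new_values are left untouched.
swabs.loc[new_values.index, "SWQ95FLWR"] = new_values

# Difference between the edited and original values for each row.
change = swabs["SWQ95FLWR"] - before
print(change)
```

Keeping the "before" copy makes it easy to confirm that only the intended rows were altered.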
Note
The Master (mt) table and its data should not be set or edited directly by a
typical user (in general). See below about (1) using the Calculator to obtain an
updated Master table and (2) using specific methods to ensure that a Dataset and
its Master table are ready to go into the Optimiser.
Note
If a user changes a surface or groundwater abstraction impact in a Dataset
under a given artificial influences scenario and at a given (natural) flow
percentile, any long-term average abstraction columns in the relevant table are
not automatically updated (currently). See Dataset for an
explanation of the available options to make this calculation if needed.
Other Functionality
The Dataset possesses additional functionality to help set table values, write
tables to output files, work with the “network” of waterbodies, and prepare for input
to the Optimiser component. This functionality (the “methods” of Dataset) is
described in Dataset.
Calculator
Once a Dataset has been loaded or constructed (potentially with modifications
relative to the “base” WRGIS), it can be supplied as input to the Calculator. As
demonstrated below, the run method of the Calculator can then be executed to
calculate scenario flows, surpluses/deficits and compliance bands based on the input
Dataset:
calculator = Calculator(ds)
output_dataset = calculator.run()
In the example above we rely on default arguments, which will run the Calculator
for all scenario/percentile combinations and for the whole domain in the input
Dataset (i.e. all waterbodies present). It will also run with some default
methodological choices (see below for more on this).
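To give a feel for the quantities involved, the toy calculation below works through a water balance for a single waterbody at one flow percentile. It is a simplified illustration only and not the Calculator's implementation: it ignores upstream accumulation, complex impacts, capping and compliance bands, and every number and variable name is hypothetical.

```python
# Hypothetical values for one waterbody at one flow percentile (Ml/d).
natural_flow = 20.0
abstraction_impacts = [3.0, 1.5]  # e.g. one surface water and one groundwater impact
discharges = [0.5]
flow_target = 17.0  # e.g. an EFI-style reference flow

# Simplified balance: abstractions reduce flow, discharges add to it.
scenario_flow = natural_flow - sum(abstraction_impacts) + sum(discharges)

# Positive values indicate a surplus, negative values a deficit.
surplus_or_deficit = scenario_flow - flow_target
meets_target = surplus_or_deficit >= 0

print(scenario_flow, surplus_or_deficit, meets_target)
```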
By default, the run method returns a complete Dataset with an updated Master
table (i.e. one that is consistent with all the other tables in the Dataset). If we
want to save this Dataset (i.e. all of its component tables) at this point, we could
do so as follows (see Dataset for guidance on output options):
output_dataset.write_tables(output_folder='/my/output/folder')
However, if we want to customise the execution of the Calculator, we can provide
optional arguments, as described in Calculator. One such argument
defined on initialisation of the Calculator is named capping_method. This
argument controls the approach to “unfeasible” impacts - prescribed abstraction impacts
that cannot be satisfied. See Calculator for a more precise explanation of this
point. By default, the Calculator takes a WRGIS-like approach to this issue. Using the
Calculator documentation to understand the available options, we could
override the default and take a WBAT-like approach as follows:
calculator = Calculator(ds, capping_method='simple')
We might then for example ask the Calculator.run method to return only an updated
Master table as a pandas.DataFrame:
updated_master_table = calculator.run(master_only=True)
These are just a couple of examples of customisation via optional arguments - see Calculator for more options and details.
Note
If capping has been applied to avoid propagation of unfeasible impacts, scenario flows output from the Calculator may be larger than initially expected from performing a simple “ups” water balance calculation for some waterbodies. This is because capping reduces net impacts upstream to retain physical plausibility. The Master table gives the “target” artificial influences components, which are not capped individually.
Note
Any long-term average fields in a Dataset passed to the Calculator are not
changed by the execution of the Calculator.run method currently (i.e. they are
unchanged in the Dataset or Master table returned).
Optimiser
The role of the Optimiser is to suggest how impacts could best be adjusted to meet flow targets, given some objective(s) and constraints. The solution to this problem is obtained via mixed integer (binary) linear programming.
Dataset Preparation
The starting point for the Optimiser is again a Dataset. However, in this case
we need to ensure that certain columns are present in some tables - columns that are not
necessarily relevant to the Calculator. The relevant tables and columns are
(currently):
Master table: requires a flow target column(s) (optional for the Calculator).
GWABs_NBB table: requires a flag column to indicate whether a given impact (row) should be available for change in the optimisation (1) or not (0).
SWABS_NBB table: as per GWABs_NBB table.
SupResGW_NBB table: requires a flag column - we return to this below.
See Table Fields for a guide to the naming conventions for these columns.
The relevant columns can be added or set using the methods in the example snippet below
(assuming still that we have a Dataset instance as ds):
ds.set_flow_targets()
ds.set_optimise_flag()
Called in this way, both of these methods will use their default settings, which are described in Dataset. Both methods have optional arguments that can be used to customise flow targets and flag which abstraction impacts will be included/excluded in optimisation.
Note
If any further manipulation of the inclusion/exclusion flag is needed it could be
achieved by working with the relevant dataframes (i.e. ds.swabs.data,
ds.gwabs.data and ds.sup.data).
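A minimal sketch of such manipulation is given below, using a stand-in dataframe in place of ds.gwabs.data. It assumes that the flag column is named Optimise_Flag, as in the complex impacts example later in this tutorial (check Table Fields for the actual naming convention). The row identifiers are hypothetical.

```python
import pandas as pd

# Stand-in for ds.gwabs.data with every row initially available for change.
gwabs = pd.DataFrame(
    {"Optimise_Flag": [1, 1, 1]},
    index=["gwab-a", "gwab-b", "gwab-c"],
)

# IDs that should not be changed by the optimisation (hypothetical list).
protected_ids = ["gwab-b", "gwab-x"]

# isin() simply ignores IDs that are absent from the table, so a stale ID does not raise.
mask = gwabs.index.isin(protected_ids)
gwabs.loc[mask, "Optimise_Flag"] = 0

# Quick check of how many rows remain available for change.
n_available = int((gwabs["Optimise_Flag"] == 1).sum())
print(n_available)
```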
Note
For users interested in Environmental Destination (ED) modelling, it should be noted
that the default behaviour of the Dataset.set_optimise_flag method does not
mimic exactly the configuration used in ED modelling for the second National
Framework for Water Resources. It is important for a user to check that they are
happy with the flag setup.
Note
Waterbody targets can be (optionally) specified via a separate Fix_Flags table
that sits within a Dataset (accessible via the short name wbfx). If present
in a Dataset, the flags in this table will be used to customise flow targets
(i.e. permitting further relaxation beyond the flow in REFS_NBB). See Table Fields
for more details.
Optional Arguments
Once we are happy that a Dataset is ready for the Optimiser, we could invoke the
run method of the Optimiser as below:
optimiser = Optimiser(ds)
output_dataset = optimiser.run()
output_dataset.write_tables(output_folder='/my/output/folder')
However, the Optimiser section provides information on options that we
may want to customise when setting up the Optimiser (i.e. before execution). One
such option concerns the geographical domain considered. By default, the code above
will run the Optimiser for the whole domain contained in the input Dataset. The
following lines provide an example of how to run for only part of the domain (referring
to the most downstream waterbody of interest as an “outlet”):
outlet_waterbody = 'outlet-waterbody-id' # could be a list of outlet waterbodies
selected_waterbodies = ds.identify_upstream_waterbodies(outlet_waterbody)
optimiser = Optimiser(ds, domain=selected_waterbodies)
output_dataset = optimiser.run()
Other options can be specified too, such as which artificial influences scenario(s) and flow percentile(s) should be considered. Options also exist concerning the objectives of the optimisation and whether any “relaxation” should be applied when attempting to solve for a secondary objective - see Optimiser.
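To illustrate what selecting the domain upstream of an outlet means, the sketch below traverses a small hypothetical network in plain Python. It is not the implementation of Dataset.identify_upstream_waterbodies. The network, the function name and the choice to include the outlet itself in the result are all assumptions made for illustration.

```python
from collections import deque

# Hypothetical network: each waterbody maps to the waterbody immediately
# downstream of it (None marks a waterbody with nothing downstream).
downstream_of = {
    "wb-1": "wb-3",
    "wb-2": "wb-3",
    "wb-3": "wb-5",
    "wb-4": "wb-5",
    "wb-5": None,
    "wb-6": None,
}


def upstream_waterbodies(outlet, downstream_of):
    """Return the outlet plus every waterbody that drains to it."""
    # Invert the mapping so each waterbody lists its immediate upstream neighbours.
    upstream_of = {}
    for waterbody, down in downstream_of.items():
        if down is not None:
            upstream_of.setdefault(down, []).append(waterbody)

    # Breadth-first traversal moving upstream from the outlet.
    selected = {outlet}
    queue = deque([outlet])
    while queue:
        current = queue.popleft()
        for waterbody in upstream_of.get(current, []):
            if waterbody not in selected:
                selected.add(waterbody)
                queue.append(waterbody)
    return selected


selected = upstream_waterbodies("wb-3", downstream_of)
print(sorted(selected))
```

In this toy network, choosing wb-3 as the outlet leaves wb-4, wb-5 and wb-6 outside the domain.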
Outputs
The contents of the output from Optimiser.run are similar to a normal Dataset,
apart from (keeping complex impacts to one side for the moment):
The SWABS_NBB and GWABs_NBB tables now contain abstraction impacts as they are after the optimisation has been completed (i.e. the impacts that remain after the “fix”).
Similarly, the Master table summarises the water balance and compliance etc for the solution formulated by the Optimiser.
Additional tables are present: SWABS_Changes and GWABS_Changes (accessible via the output Dataset’s attributes swabs_chg and gwab_chg, respectively). These tables contain the impact reductions (Ml/d) required relative to a “reference” Dataset - see Optimiser.
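The relationship between the reference impacts, the optimised impacts and the reductions can be illustrated with a small stand-in example. The sketch assumes that a reduction is simply the reference impact minus the optimised impact for the same row and column. The identifiers and values are hypothetical, and the layout of the real changes tables may differ.

```python
import pandas as pd

# Hypothetical impacts (Ml/d) for one scenario/percentile column.
reference = pd.Series({"swab-a": 0.5, "swab-b": 2.0}, name="SWQ95FLWR")
optimised = pd.Series({"swab-a": 0.5, "swab-b": 1.5}, name="SWQ95FLWR")

# Assumed convention: a positive number is a reduction in abstraction impact.
reduction = reference - optimised

# Keep only the rows that the optimisation actually changed.
changed = reduction[reduction > 0]
print(changed)
```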
Note
Long-term average abstraction is recalculated after optimisation under the assumption that the relative impact profile across the FDC remains constant. However, SWABS with hands-off flow (HOF) conditions are omitted from the recalculation at present to avoid introducing a conservative bias into the estimates. See Dataset and Optimiser for more details.
Complex Impacts
Exploratory and optional functionality has been included to allow specific types of complex impacts to be included in optimisation. As noted above, the SupResGW_NBB table thus requires a flag column to indicate whether a given complex impact (row) should be:
0: Excluded from optimisation
1: Included as a reservoir compensation flow increase
2: Included as a complex abstraction
The default is for all complex impacts to be excluded from optimisation. Thus, the
Dataset.set_optimise_flag method will insert a flag column of zeros, if a column
has not already been added to the table.
If we wish to include specific complex impacts in optimisation, we need to manually adjust the flag column for the appropriate rows. This can be achieved in memory using dataframe operations, for example:
ds.sup.data.loc[ds.sup.data.index == 'complex-impact-id', 'Optimise_Flag'] = 1
If we specify that a reservoir compensation flow increase is allowed in optimisation, we also need to indicate the maximum increase permitted. This should be done for the relevant scenarios and percentiles under consideration. For example, to permit a maximum reservoir compensation flow increase of 5.0 Ml/d, we could do the following:
ds.sup.data.loc[ds.sup.data.index == 'complex-impact-id', 'SUPFLQ95_MAX_INCREASE'] = 5.0
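Before running the Optimiser, it may be worth checking that every row flagged as a compensation flow increase has a maximum increase set. The sketch below shows one way to do this on a stand-in for ds.sup.data. The column names follow the examples above, while the row identifiers and values are hypothetical. Whether the Optimiser itself performs a similar check is not assumed here.

```python
import pandas as pd

# Stand-in for ds.sup.data: flag values as described above (0, 1 or 2).
sup = pd.DataFrame(
    {
        "Optimise_Flag": [0, 1, 2, 1],
        "SUPFLQ95_MAX_INCREASE": [float("nan"), 5.0, float("nan"), float("nan")],
    },
    index=["sup-a", "sup-b", "sup-c", "sup-d"],
)

# Rows flagged as reservoir compensation flow increases (flag = 1).
is_comp_flow = sup["Optimise_Flag"] == 1

# Of those rows, find the ones with no maximum increase set in this column.
has_no_max = sup["SUPFLQ95_MAX_INCREASE"].isna()
missing_max = sup.loc[is_comp_flow & has_no_max].index.tolist()

print(missing_max)
```

The same check would need repeating for each scenario/percentile column under consideration.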
Once the flag field and any maximum increase fields have been added to the relevant
Dataset table, the Optimiser can be run in the way described above. Outputs are
as detailed above, with the addition of a SupResGW_Changes table that gives any changes
to complex impacts. Note that positive numbers in this table indicate increases to
compensation flows or reductions in complex abstraction impacts. The latter sign
convention is opposite to that for SWABS_Changes and GWABS_Changes. It arises because
complex impacts can be positive or negative.
Some important points to note about the behaviour of complex impact functionality in the
Optimiser are:
Use of this functionality will typically require local knowledge of specific impacts and how they have been represented in CAMS ledgers and WRGIS.
In its current implementation, the Optimiser seeks to minimise increases to reservoir compensation flows in its solution for several reasons. It will only increase a compensation flow if it helps to meet flow targets that would have otherwise been missed. Other approaches may be explored in future.
Complex abstractions (flag = 2) are treated in the same way as normal surface and groundwater abstractions. The Optimiser identifies required impact reductions for the scenario/percentile combination in question, rather than changes to complex licence conditions.
A user has the option to explore the implications of different prescribed changes to complex impacts by editing the relevant rows in the data table and excluding them from optimisation.