Advanced functions to use the MaStR SOAP-API

open_mastr.soap_api.download.MaStRAPI

Bases: object

Access the Marktstammdatenregister (MaStR) SOAP API via a Python wrapper

Read about MaStR account and credentials to learn how to create a user account and a role, including a token, to access the MaStR SOAP API.

Create an MaStRAPI instance with your role credentials

    mastr_api = MaStRAPI(
        user="SOM123456789012",
        key="koo5eixeiQuoi'w8deighai8ahsh1Ha3eib3coqu7ceeg%ies..."
    )

Alternatively, leave user and key empty if user and token are accessible via credentials.cfg. How to configure this is described here.

    mastr_api = MaStRAPI()

Now, you can use the MaStR API instance to call pre-defined SOAP API queries via the class' methods. For example, get a list of units limited to two entries.

   mastr_api.GetListeAlleEinheiten(limit=2)

Note

As the example shows, you don't have to pass credentials for calling wrapped SOAP queries. This is handled internally.

__init__(user=None, key=None)

PARAMETER DESCRIPTION
user

MaStR-ID (MaStR-Nummer) for the account that was created on https://www.marktstammdatenregister.de. Typical format: SOM123456789012

TYPE: str DEFAULT: None

key

Access token of a role (Benutzerrolle). Might look like: "koo5eixeiQuoi'w8deighai8ahsh1Ha3eib3coqu7ceeg%ies..."

TYPE: str DEFAULT: None

open_mastr.soap_api.download.MaStRDownload

Warning

This class is deprecated and will not be maintained from version 0.15.0 onwards. Instead use Mastr.download with parameter method = "bulk" to get bulk downloads of the dataset.

Use the higher level interface for bulk download

MaStRDownload builds on top of MaStRAPI and provides an interface for easier downloading. Use the methods documented below to retrieve specific data. For power plant data of several technologies, starting with nuclear, this looks like

    from open_mastr.soap_api.download import MaStRDownload

    mastr_dl = MaStRDownload()

    for tech in ["nuclear", "hydro", "wind", "solar", "biomass", "combustion", "gsgk"]:
        power_plants = mastr_dl.download_power_plants(tech, limit=10)
        print(power_plants.head())

Warning

Be careful when increasing limit. Typically, your account allows only 10,000 API requests per day.
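To get a feeling for how quickly the daily limit can be reached, the following sketch estimates an upper bound on the number of requests. It is only a rough illustration: the helper estimate_requests is hypothetical, and the assumption of one request per unit and additional data type (extended, EEG, KWK and permit data) is not confirmed by this documentation.

```python
DAILY_REQUEST_LIMIT = 10_000  # typical account limit mentioned in the warning


def estimate_requests(limit, n_technologies, requests_per_unit=4):
    """Rough upper bound on API requests (hypothetical helper).

    Assumption: each unit triggers one request per additional data type
    (extended, EEG, KWK and permit data). Requests for the basic unit
    lists are ignored.
    """
    return limit * n_technologies * requests_per_unit


needed = estimate_requests(limit=10, n_technologies=7)
print(needed, "of", DAILY_REQUEST_LIMIT, "requests")
```

Under these assumptions, the loop over seven technologies with limit=10 stays far below the daily limit, while limit=1000 would already exceed it.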

__init__(parallel_processes=None)

PARAMETER DESCRIPTION
parallel_processes

Number of processes used to download unit data in parallel. Choose False for a single-process download (this avoids the Python multiprocessing package). Defaults to the number of cores (including hyperthreading).

TYPE: int or bool DEFAULT: None
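The rule described for parallel_processes can be sketched as follows. This is a minimal illustration, not the library's implementation; the helper name resolve_process_count is hypothetical.

```python
import multiprocessing


def resolve_process_count(parallel_processes=None):
    """Map the documented parameter values to a process count (hypothetical helper).

    None  -> number of logical cores (including hyperthreading)
    False -> 1, meaning a plain single-process download
    int   -> used as given
    """
    if parallel_processes is None:
        return multiprocessing.cpu_count()
    if parallel_processes is False:
        return 1
    return int(parallel_processes)


print(resolve_process_count(False))  # single-process download
print(resolve_process_count(4))      # four worker processes
```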

additional_data(data, unit_ids, data_fcn, timeout=10)

Retrieve additional information about units.

Extended information on units is available. Depending on the type, additional data from the EEG and KWK subsidy programs is available. Furthermore, for some units, permit data can be retrieved.

PARAMETER DESCRIPTION
data

Data type, see MaStRDownload.download_power_plants

TYPE: str

unit_ids

Unit identifier for additional data

TYPE: list

data_fcn

Name of the MaStRDownload method to be used for querying additional data. Choose from

  • "extended_unit_data" (see MaStRDownload.extended_unit_data): Extended information (i.e. technical, location) about a unit. The exact set of information depends on the data type.
  • "eeg_unit_data" (see MaStRDownload.eeg_unit_data): Unit information from the EEG unit registry. The exact set of information depends on the data type.
  • "kwk_unit_data" (see MaStRDownload.kwk_unit_data): Unit information from the KWK unit registry.
  • "permit_unit_data" (see MaStRDownload.permit_unit_data): Information about the permit process of a unit.

TYPE: str

timeout

Timeout limit for data retrieval for each unit when using multiprocessing. Defaults to 10.

DEFAULT: 10

RETURNS DESCRIPTION
tuple of list of dict and tuple

Returns additional data in dictionaries that are packed into a list.

    result = (
        [additional_unit_data_dict1, additional_unit_data_dict2, ...],
        [("SME930865355925", "Reason for failing download"), ...]
    )
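A returned value of this shape can be unpacked into the downloaded data and the list of failures. The sketch below uses a made-up return value instead of a real API call; the dictionary keys and the second unit ID are illustrative placeholders.

```python
# Hypothetical return value, shaped like the structure documented above.
result = (
    [{"EinheitMastrNummer": "SME930865355925", "Nettonennleistung": 12.5}],
    [("SME000000000000", "Reason for failing download")],
)

unit_data, failed = result

# Collect the identifiers of failed downloads, for example to retry them later.
retry_ids = [unit_id for unit_id, _reason in failed]

print(len(unit_data), "units downloaded")
print("retry:", retry_ids)
```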

basic_location_data(limit=2000, date_from=None, max_retries=3)

Retrieve basic location data in chunks

Retrieves data for all types of locations at once using MaStRAPI.GetListeAlleLokationen. Locations include

  • Electricity generation location (SEL - Stromerzeugungslokation)
  • Electricity consumption location (SVL - Stromverbrauchslokation)
  • Gas generation location (GEL - Gaserzeugungslokation)
  • Gas consumption location (GVL - Gasverbrauchslokation)

PARAMETER DESCRIPTION
limit

Maximum number of locations to download.

Warning

Mind the daily request limit for your MaStR account, usually 10,000 per day.

DEFAULT: 2000

date_from

If specified, only locations with latest change date newer than this are queried.

DEFAULT: None

max_retries

Maximum number of retries for each chunk in case of errors with the connection to the server.

DEFAULT: 3
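The idea behind max_retries can be illustrated with a small retry loop. This is a sketch under assumptions, not the library's code: fetch_with_retries and flaky_chunk are hypothetical, ConnectionError stands in for whatever error the server connection raises, and max_retries is interpreted here as the number of retries after the first attempt.

```python
def fetch_with_retries(fetch_chunk, max_retries=3):
    """Call fetch_chunk() and retry on ConnectionError (hypothetical helper).

    max_retries counts the retries after the first attempt.
    """
    last_error = None
    for _attempt in range(max_retries + 1):
        try:
            return fetch_chunk()
        except ConnectionError as error:
            last_error = error
    raise last_error


calls = {"count": 0}


def flaky_chunk():
    """Stand-in for a chunk request that fails twice before succeeding."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated connection problem")
    return [{"LokationMastrNummer": "SEL927688371072"}]


chunk = fetch_with_retries(flaky_chunk, max_retries=3)
print(chunk, "after", calls["count"], "attempts")
```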

YIELDS DESCRIPTION
generator of generators

For each chunk a separate generator is returned all wrapped into another generator. Access with

    chunks = mastr_dl.basic_location_data(
        date_from=datetime.datetime(2020, 11, 7, 0, 0, 0), limit=2010
        )

    for chunk in chunks:
        for location in chunk:
            print(location) # prints out one dict per location one after another
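If the chunk boundaries are not needed, the nested generators can be flattened into a single stream. The sketch below uses a hypothetical stand-in, fake_chunks, that mimics the generator-of-generators structure without calling the API.

```python
import itertools


def fake_chunks(total, chunksize):
    """Stand-in that yields one generator per chunk, like the nested structure above."""
    for start in range(0, total, chunksize):
        stop = min(start + chunksize, total)
        # Each chunk is itself a generator of dicts.
        yield ({"id": i} for i in range(start, stop))


# Flatten the nested generators into a single list of location dicts.
locations = list(itertools.chain.from_iterable(fake_chunks(total=5, chunksize=2)))
print(len(locations))
```

Because generators are consumed lazily, each chunk is only produced when the loop or chain reaches it.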

basic_unit_data(data=None, limit=2000, date_from=None, max_retries=3)

Download basic unit information for one data type.

Retrieves basic information about units. The number of units is bounded by limit.

PARAMETER DESCRIPTION
data

Technology for which data is requested. See MaStRDownload.download_power_plants for options. Data is retrieved using MaStRAPI.GetGefilterteListeStromErzeuger. If not given, it defaults to None, which means data for all available technologies is retrieved using the web service function MaStRAPI.GetListeAlleEinheiten.

TYPE: str DEFAULT: None

limit

Maximum number of units to download. If not provided, data for all units is downloaded.

Warning

Mind the daily request limit for your MaStR account.

TYPE: int DEFAULT: 2000

date_from

If specified, only units with latest change date newer than this are queried. Defaults to None.

DEFAULT: None

max_retries

Maximum number of retries in case of errors with the connection to the server.

DEFAULT: 3

YIELDS DESCRIPTION
list of dict

A generator of dicts is returned with each dictionary containing information about one unit.

download_power_plants(data, limit=None)

Download power plant unit data for one data type.

Starting from a list with basic information about each unit, additional data is subsequently retrieved:

  • Extended unit data
  • EEG data is collected in the course of supporting renewable energy installations under the Erneuerbare-Energien-Gesetz.
  • KWK data is collected for the support program Kraft-Wärme-Kopplung.
  • Permit data is available for some installations (German: Genehmigungsdaten)

Data is stored in CSV file format in ~/open-MaStR/data/<version>/ by default.

PARAMETER DESCRIPTION
data

Retrieve unit data for one power system unit type. Power plants are grouped by the following technologies:

  • 'nuclear'
  • 'hydro'
  • 'solar'
  • 'wind'
  • 'biomass'
  • 'combustion'
  • 'gsgk'
  • 'storage'

TYPE: str

limit

Maximum number of units to be downloaded.

TYPE: int DEFAULT: None

RETURNS DESCRIPTION
DataFrame

Joined data tables.

eeg_unit_data(unit_specs)

Download EEG (Erneuerbare Energien Gesetz) data for a unit.

Additional data collected during a subsidy program for supporting installations of renewable energy power plants.

PARAMETER DESCRIPTION
unit_specs

EegMastrNummer and data type as a tuple that, for example, looks like

   ("EEG961554380393", "hydro")

TYPE: tuple

RETURNS DESCRIPTION
dict

EEG details about unit, if download successful, otherwise empty dict

tuple

EegMastrNummer and a message that explains why a download failed. Format

   ("EEG961554380393", "Reason for failing download")

extended_unit_data(unit_specs)

Download extended data for a unit.

This extended unit information is provided separately.

PARAMETER DESCRIPTION
unit_specs

EinheitMastrNummer and data type as a tuple that, for example, looks like

   ("SME930865355925", "hydro")

TYPE: tuple

RETURNS DESCRIPTION
dict

Extended information about unit, if download successful, otherwise empty dict

tuple

EinheitMastrNummer and a message that explains why a download failed. Format

   ("SME930865355925", "Reason for failing download")

kwk_unit_data(unit_specs)

Download KWK (German: Kraft-Wärme-Kopplung, English: Combined Heat and Power, CHP) data for a unit.

Additional data collected during a subsidy program for supporting combined heat and power plants.

PARAMETER DESCRIPTION
unit_specs

KwkMastrNummer and data type as a tuple that, for example, looks like

   ("KWK910493229164", "biomass")

TYPE: tuple

RETURNS DESCRIPTION
dict

KWK details about unit, if download successful, otherwise empty dict

tuple

KwkMastrNummer and a message that explains why a download failed. Format

   ("KWK910493229164", "Reason for failing download")

location_data(specs)

Download extended data for a location

Allows downloading additional data for different location types; see specs.

PARAMETER DESCRIPTION
specs

Location MastrNummer and data_name as a tuple that, for example, looks like

   ("SEL927688371072", "location_elec_generation")

TYPE: tuple

RETURNS DESCRIPTION
dict

Detailed information about a location, if download successful, otherwise empty dict

tuple

Location MastrNummer and a message that explains why a download failed. Format

   ("SEL927688371072", "Reason for failing download")

permit_unit_data(unit_specs)

Download permit data for a unit.

PARAMETER DESCRIPTION
unit_specs

GenMastrNummer and data type as a tuple that, for example, looks like

   ("SGE952474728808", "biomass")

TYPE: tuple

RETURNS DESCRIPTION
dict

Permit details about unit, if download successful, otherwise empty dict

tuple

GenMastrNummer and a message that explains why a download failed. Format

   ("SGE952474728808", "Reason for failing download")

open_mastr.soap_api.mirror.MaStRMirror

Warning

This class is deprecated and will not be maintained from version 0.15.0 onwards. Instead use Mastr.download with parameter method = "bulk" to mirror the MaStR dataset to a local database.

Mirror the Marktstammdatenregister database and keep it up-to-date.

A PostgreSQL database is used to mirror the MaStR database. It builds on functionality for bulk data download provided by MaStRDownload.

A rough overview is given by the following schema, using wind power units as an example.

[Figure: Schema on the example of downloading wind power units using the API]

Initially, basic unit data gets backfilled with backfill_basic (the example below downloads basic unit data for 2,000 units of type 'solar').

   from open_mastr.soap_api.mirror import MaStRMirror

   mastr_mirror = MaStRMirror()
   mastr_mirror.backfill_basic("solar", limit=2000)

Based on this, requests for additional data are created. This happens while backfilling basic data, but it is also possible to (re-)create requests for the remaining additional data using MaStRMirror.create_additional_data_requests.

   mastr_mirror.create_additional_data_requests("solar")

Additional unit data (in the case of wind power, this is extended data, EEG data and permit data) can subsequently be retrieved with retrieve_additional_data.

    mastr_mirror.retrieve_additional_data("solar", ["unit_data"])

The data can be joined to one table for each data type and exported to CSV files using Mastr.to_csv.

Also consider using dump and restore for specific purposes.

Note

This feature was built before the bulk download was offered at marktstammdatenregister.de. It can still be used to compare the two datasets received from the API and the bulk download.

__init__(engine, restore_dump=None, parallel_processes=None)

PARAMETER DESCRIPTION
engine

database engine

restore_dump

Save path of SQL dump file including filename. The database is restored from the SQL dump. Defaults to None which means nothing gets restored. Should be used in combination with empty_schema=True.

DEFAULT: None

parallel_processes

Number of parallel processes used to download additional data. Defaults to None.

DEFAULT: None

backfill_basic(data=None, date=None, limit=10 ** 8)

Backfill basic unit data.

Fill database table 'basic_units' with data. It allows specification of which data should be retrieved via the described parameter options.

Under the hood, MaStRDownload.basic_unit_data is used.

PARAMETER DESCRIPTION
data

Specify data types for which data should be backfilled.

  • ['solar']: Backfill data for a single data type.
  • ['solar', 'wind'] (list): Backfill data for multiple technologies given in a list.

DEFAULT: None

date

Specify the backfill date from which data is retrieved.

Only data with a modification time stamp greater than date is retrieved.

  • datetime.datetime(2020, 11, 27): Retrieve data which is newer than this time stamp
  • 'latest': Retrieve data which is newer than the newest data already in the table. It is aware of a different 'latest date' for each data type. Hence, it works in combination with data=None and data=["wind", "solar"], for example.
  • None: Complete backfill

Warning

Don't use 'latest' in combination with limit. This might lead to unexpected results.

DEFAULT: None
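The three options for date can be thought of as resolving to a single cut-off time stamp. The following sketch is an illustration under assumptions, not the library's implementation: resolve_backfill_date is a hypothetical helper, and newest_in_table stands in for the newest time stamp already stored in the database table.

```python
import datetime


def resolve_backfill_date(date, newest_in_table=None):
    """Turn the documented options into a cut-off date (hypothetical helper).

    None      -> no cut-off (complete backfill)
    'latest'  -> newest time stamp already stored (None if the table is empty)
    datetime  -> used as given
    """
    if date is None:
        return None
    if date == "latest":
        return newest_in_table
    if isinstance(date, datetime.datetime):
        return date
    raise ValueError(f"Unsupported date option: {date!r}")


stored = datetime.datetime(2020, 11, 27)
print(resolve_backfill_date("latest", newest_in_table=stored))
print(resolve_backfill_date(None))
```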

limit

Maximum number of units. Defaults to the large number of 10**8 which means all available data is queried. Use with care!

DEFAULT: 10 ** 8

backfill_locations_basic(limit=10 ** 7, date=None, delete_additional_data_requests=True)

Backfill basic location data.

Fill database table 'locations_basic' with data. It allows specification of which data should be retrieved via the described parameter options.

Under the hood, MaStRDownload.basic_location_data is used.

PARAMETER DESCRIPTION
date

Specify the backfill date from which data is retrieved.

Only data with a modification time stamp greater than date is retrieved.

  • datetime.datetime(2020, 11, 27): Retrieve data which is newer than this time stamp
  • 'latest': Retrieve data which is newer than the newest data already in the table.
  • None: Complete backfill

Warning

Don't use 'latest' in combination with limit. This might lead to unexpected results.

DEFAULT: None

limit

Maximum number of locations to download. Defaults to the large number 10**7, which effectively means all available data is queried. Use with care!

DEFAULT: 10 ** 7

delete_additional_data_requests

Useful to speed up the download of data. Ignores the existence of already created requests for additional data and skips deleting these.

DEFAULT: True

create_additional_data_requests(technology, data_types=['unit_data', 'eeg_data', 'kwk_data', 'permit_data'], delete_existing=True)

Create new requests for additional unit data

For units that exist in basic_units but not in the table for additional data of data_type, a new data request is submitted.
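Conceptually, this is a set difference between the units known from the basic data and the units that already have additional data. The sketch below shows the idea with plain Python sets; the unit IDs, the dictionary layout and the variable names are hypothetical and do not reflect the actual database schema.

```python
# Hypothetical unit IDs standing in for the contents of the two tables.
basic_units = {"SME930865355925", "SME000000000001", "SME000000000002"}
already_downloaded = {"SME930865355925"}

# Units known from the basic data but still lacking additional data.
missing = sorted(basic_units - already_downloaded)

# One new request per missing unit for the chosen data type.
new_requests = [
    {"EinheitMastrNummer": unit_id, "data_type": "unit_data"} for unit_id in missing
]
print(new_requests)
```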

PARAMETER DESCRIPTION
technology

Specify the technology for which additional data should be requested.

data_types

Select type of additional data that is to be requested. Defaults to all data that is available for a technology.

DEFAULT: ['unit_data', 'eeg_data', 'kwk_data', 'permit_data']

delete_existing

Toggle deletion of already existing requests for additional data. Defaults to True.

DEFAULT: True

dump(dumpfile='open-mastr-continuous-update.backup')

Dump MaStR database.

PARAMETER DESCRIPTION
dumpfile

Save path for dump including filename. When only a filename is given, the dump is saved to CWD.

TYPE: str or path-like DEFAULT: 'open-mastr-continuous-update.backup'
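The remark about the current working directory (CWD) follows from how relative paths work. As a small standard-library illustration, independent of open_mastr:

```python
from pathlib import Path

dumpfile = Path("open-mastr-continuous-update.backup")

# A bare filename is a relative path, so it is interpreted relative to the
# current working directory.
resolved = Path.cwd() / dumpfile
print(resolved)
```

Passing an absolute path instead avoids any dependence on the directory the script is started from.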

restore(dumpfile)

Restore the MaStR database from an SQL dump.

PARAMETER DESCRIPTION
dumpfile

Save path for dump including filename. When only a filename is given, the dump is restored from CWD.

TYPE: str or path-like

retrieve_additional_data(data, data_type, limit=10 ** 8, chunksize=1000)

Retrieve additional unit data

Execute additional data requests stored in open_mastr.soap_api.orm.AdditionalDataRequested. See also docs of MaStRDownload.additional_data for more information on how data is downloaded.

PARAMETER DESCRIPTION
data

See the list of available technologies in open_mastr.soap_api.download.MaStRDownload.download_power_plants.

data_type

Select type of additional data that is to be retrieved. Choose from "unit_data", "eeg_data", "kwk_data", "permit_data".

limit

Limit the number of units for which data is downloaded. Defaults to the very large number 10**8, which effectively means data is queried for all existing data requests, for example those created by create_additional_data_requests.

DEFAULT: 10 ** 8

chunksize

Data is downloaded and inserted into the database in chunks of chunksize. Defaults to 1000.

DEFAULT: 1000
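Processing in chunks of chunksize can be pictured with a small helper that splits a list of pending requests. This is a generic sketch, not the library's code; iter_chunks and the request names are hypothetical.

```python
def iter_chunks(items, chunksize=1000):
    """Yield successive lists of at most chunksize items."""
    for start in range(0, len(items), chunksize):
        yield items[start:start + chunksize]


pending = [f"request-{i}" for i in range(2500)]
sizes = [len(chunk) for chunk in iter_chunks(pending, chunksize=1000)]
print(sizes)  # [1000, 1000, 500]
```

Smaller chunks mean that less work is lost if a download is interrupted, at the cost of more frequent database inserts.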

retrieve_additional_location_data(location_type, limit=10 ** 8, chunksize=1000)

Retrieve extended location data

Execute additional data requests stored in open_mastr.soap_api.orm.AdditionalLocationsRequested. See also the docs of open_mastr.soap_api.download.MaStRDownload.additional_data for more information on how data is downloaded.

PARAMETER DESCRIPTION
location_type

Select type of location that is to be retrieved. Choose from "location_elec_generation", "location_elec_consumption", "location_gas_generation", "location_gas_consumption".

limit

Limit the number of locations for which data is downloaded. Defaults to the large number 10**8, which effectively means data is queried for all existing data requests.

DEFAULT: 10 ** 8

chunksize

Data is downloaded and inserted into the database in chunks of chunksize. Defaults to 1000.

DEFAULT: 1000