Advanced functions to use the MaStR SOAP-API
open_mastr.soap_api.download.MaStRAPI
Bases: object
Access the Marktstammdatenregister (MaStR) SOAP API via a Python wrapper
Read about MaStR account and credentials to learn how to create a user account and a role, including a token, to access the MaStR SOAP API.
Create a MaStRAPI instance with your role credentials:
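from open_mastr.soap_api.download import MaStRAPI

# Pass the MaStR-ID of the account and the access token of the role
mastr_api = MaStRAPI(
    user="SOM123456789012",
    key="koo5eixeiQuoi'w8deighai8ahsh1Ha3eib3coqu7ceeg%ies..."
)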
Alternatively, leave user and key empty if user and token are accessible via credentials.cfg. How to configure this is described here.
Now, you can use the MaStR API instance to call pre-defined SOAP API queries via the class' methods. For example, get a list of units limited to two entries.
Note
As the example shows, you don't have to pass credentials for calling wrapped SOAP queries. This is handled internally.
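The note above can be illustrated with a minimal sketch of a wrapper that adds stored credentials to every query. It does not use open_mastr: FakeSoapService and CredentialInjectingClient are hypothetical stand-ins, the parameter names apiKey and marktakteurMastrNummer are assumptions, and the query name GetListeAlleEinheiten is assumed by analogy with MaStRAPI.GetListeAlleLokationen.

```python
class FakeSoapService:
    """Hypothetical stand-in for a SOAP service; no network access."""

    def GetListeAlleEinheiten(self, apiKey, marktakteurMastrNummer, limit=None):
        # Return up to `limit` dummy units and echo who called
        units = [{"EinheitMastrNummer": f"SEE{i}"} for i in range(1, 6)]
        return {"caller": marktakteurMastrNummer, "Einheiten": units[:limit]}


class CredentialInjectingClient:
    """Forwards any query to the service and adds the stored credentials."""

    def __init__(self, service, user, key):
        self._service = service
        self._user = user
        self._key = key

    def __getattr__(self, name):
        # Only called for attributes not found normally, i.e. query names
        query = getattr(self._service, name)

        def wrapped(**kwargs):
            # Credential parameter names are assumptions for illustration
            return query(
                apiKey=self._key,
                marktakteurMastrNummer=self._user,
                **kwargs,
            )

        return wrapped


client = CredentialInjectingClient(
    FakeSoapService(), user="SOM123456789012", key="secret-token"
)
# The caller passes only query parameters, never the credentials
response = client.GetListeAlleEinheiten(limit=2)
print(len(response["Einheiten"]))
```

The caller only supplies query parameters such as limit; the wrapper is the single place where credentials are handled.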
__init__(user=None, key=None)
PARAMETER | DESCRIPTION |
---|---|
user | MaStR-ID (MaStR-Nummer) for the account that was created on https://www.marktstammdatenregister.de (typical format: SOM123456789012) |
key | Access token of a role (Benutzerrolle). Might look like: "koo5eixeiQuoi'w8deighai8ahsh1Ha3eib3coqu7ceeg%ies..." |
open_mastr.soap_api.download.MaStRDownload
Warning
This class is deprecated and will not be maintained from version 0.15.0 onwards.
Instead use Mastr.download with parameter method = "bulk" to get bulk downloads of the dataset.
Use the higher level interface for bulk download
MaStRDownload builds on top of MaStRAPI and provides an interface for easier downloading.
Use the methods documented below to retrieve specific data. For example, downloading a small sample of power plant data for each technology looks like this:
from open_mastr.soap_api.download import MaStRDownload

mastr_dl = MaStRDownload()

# Download at most 10 units per technology and preview each result
for tech in ["nuclear", "hydro", "wind", "solar", "biomass", "combustion", "gsgk"]:
    power_plants = mastr_dl.download_power_plants(tech, limit=10)
    print(power_plants.head())
Warning
Be careful with increasing limit. Typically, your account allows only 10,000 API requests per day.
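To get a feeling for how quickly the daily budget is used up, the following rough estimate may help. It is a sketch under stated assumptions, not the library's actual accounting: it assumes one request per chunk of basic data (the chunk size of 2,000 is an assumption), plus one request per unit for each of the four additional data types (extended, EEG, KWK, permit). The helper estimate_requests is hypothetical.

```python
import math

DAILY_REQUEST_BUDGET = 10_000  # typical daily limit mentioned in the warning


def estimate_requests(n_units, n_additional_types=4, chunk_size=2000):
    """Rough request count: chunked basic data plus per-unit additional data.

    Assumes one request per chunk of basic data and one request per unit
    and additional data type; both are assumptions for illustration.
    """
    basic = math.ceil(n_units / chunk_size)
    additional = n_units * n_additional_types
    return basic + additional


per_technology = estimate_requests(10)   # limit=10, as in the example above
total = 7 * per_technology               # seven technologies in the loop
print(per_technology, total)
print(estimate_requests(2500) > DAILY_REQUEST_BUDGET)
```

Under these assumptions, a limit of about 2,500 units for a single technology would already exceed a budget of 10,000 requests.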
__init__(parallel_processes=None)
PARAMETER | DESCRIPTION |
---|---|
parallel_processes | Number of processes used to download unit data in parallel. For single-process download (avoiding the use of Python's multiprocessing package), choose False. Defaults to the number of cores (including hyperthreading). |
additional_data(data, unit_ids, data_fcn, timeout=10)
Retrieve additional information about units.
Extended information on units is available. Depending on the type, additional data from the EEG and KWK subsidy programs is available. Furthermore, permit data can be retrieved for some units.
PARAMETER | DESCRIPTION |
---|---|
data | Data type. |
unit_ids | Unit identifiers for which additional data is retrieved. |
data_fcn | Name of the method used to retrieve the data. |
timeout | Timeout limit for data retrieval for each unit when using multiprocessing. Defaults to 10. |
RETURNS | DESCRIPTION |
---|---|
tuple of list of dict and tuple
|
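The per-unit timeout can be pictured with the following sketch. It swaps in a thread pool from concurrent.futures for the multiprocessing setup described above, and fetch_unit and fetch_all are hypothetical stand-ins; the returned pair (results, missed ids) is an assumption about how such a helper might report failures, not the documented return value.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def fetch_unit(unit_id):
    """Hypothetical per-unit request; ids starting with 'slow' respond late."""
    if unit_id.startswith("slow"):
        time.sleep(0.5)
    return {"unit_id": unit_id}


def fetch_all(unit_ids, timeout=10):
    """Collect per-unit results and record ids whose request timed out."""
    results, missed = [], []
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = {uid: pool.submit(fetch_unit, uid) for uid in unit_ids}
        for uid, future in futures.items():
            try:
                # Wait at most `timeout` seconds for this unit
                results.append(future.result(timeout=timeout))
            except FutureTimeout:
                missed.append(uid)
    return results, missed


results, missed = fetch_all(["SEE1", "slow-SEE2", "SEE3"], timeout=0.1)
print(len(results), missed)
```

Units that time out are not lost silently: they are collected so that a later run can retry them.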
basic_location_data(limit=2000, date_from=None, max_retries=3)
Retrieve basic location data in chunks
Retrieves data for all types of locations at once using MaStRAPI.GetListeAlleLokationen.
Locations include:
- Electricity generation location (SEL - Stromerzeugungslokation)
- Electricity consumption location (SVL - Stromverbrauchslokation)
- Gas generation location (GEL - Gaserzeugungslokation)
- Gas consumption location (GVL - Gasverbrauchslokation)
PARAMETER | DESCRIPTION |
---|---|
limit | Maximum number of locations to download. Default: 2000. Warning: Mind the daily request limit for your MaStR account, usually 10,000 per day. |
date_from | If specified, only locations with a latest change date newer than this are queried. Default: None. |
max_retries | Maximum number of retries for each chunk in case of errors with the connection to the server. Default: 3. |
YIELDS | DESCRIPTION |
---|---|
generator of generators | For each chunk, a separate generator is returned; all of them are wrapped in an outer generator. |
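A generator of generators can be consumed chunk by chunk or flattened into a single stream. The sketch below shows both with a hypothetical stand-in, location_chunks, instead of a real download; the record key LokationMastrNummer is an assumption for illustration.

```python
from itertools import chain


def location_chunks(total, chunk_size):
    """Hypothetical stand-in: yield one generator of records per chunk."""
    for start in range(0, total, chunk_size):
        stop = min(start + chunk_size, total)
        yield (
            {"LokationMastrNummer": f"SEL{i:03d}"} for i in range(start, stop)
        )


chunks = location_chunks(total=5, chunk_size=2)

# Take the first chunk on its own ...
first_chunk = list(next(chunks))

# ... and flatten all remaining chunks into one list of records
remaining = list(chain.from_iterable(chunks))

print(len(first_chunk), len(remaining))
```

Because generators are lazy, nothing is produced until a chunk is actually iterated, which keeps memory use bounded by the chunk size.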
basic_unit_data(data=None, limit=2000, date_from=None, max_retries=3)
Download basic unit information for one data type.
Retrieves basic information about units. The number of units is bounded by limit.
PARAMETER | DESCRIPTION |
---|---|
data | Technology for which data is requested. |
limit | Maximum number of units to download. If not provided, data for all units is downloaded. Warning: Mind the daily request limit for your MaStR account. |
date_from | If specified, only units with a latest change date newer than this are queried. Default: None. |
max_retries | Maximum number of retries in case of errors with the connection to the server. Default: 3. |
YIELDS | DESCRIPTION |
---|---|
list of dict
|
A generator of dicts is returned with each dictionary containing information about one unit. |
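The role of max_retries can be sketched as a simple retry loop. This is an illustration, not the library's implementation: fetch_chunk_with_retries and flaky_fetch are hypothetical, and the sketch assumes that max_retries counts retries after the first attempt and that connection problems surface as ConnectionError.

```python
def fetch_chunk_with_retries(fetch, max_retries=3):
    """Call fetch(); on ConnectionError retry up to max_retries times."""
    for attempt in range(max_retries + 1):
        try:
            return fetch()
        except ConnectionError:
            # Give up once the last allowed retry has failed
            if attempt == max_retries:
                raise


calls = {"count": 0}


def flaky_fetch():
    """Hypothetical request that fails twice before it succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated connection drop")
    return [{"EinheitMastrNummer": "SEE900000000001"}]


chunk = fetch_chunk_with_retries(flaky_fetch, max_retries=3)
print(calls["count"], len(chunk))
```

Note that every retry is an additional request and therefore also counts against the daily request limit.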
download_power_plants(data, limit=None)
Download power plant unit data for one data type.
Based on a list with basic information about each unit, additional data is subsequently retrieved:
- Extended unit data
- EEG data, collected as part of the support for renewable energy installations under the Erneuerbare-Energien-Gesetz
- KWK data, collected for the support program Kraft-Wärme-Kopplung
- Permit data, available for some installations (German: Genehmigungsdaten)
Data is stored in CSV file format in ~/open-MaStR/data/<version>/ by default.
PARAMETER | DESCRIPTION |
---|---|
data | Retrieve unit data for one power system unit. Power plants are grouped by technology. |
limit | Maximum number of units to be downloaded. |
RETURNS | DESCRIPTION |
---|---|
DataFrame
|
Joined data tables. |
eeg_unit_data(unit_specs)
Download EEG (Erneuerbare-Energien-Gesetz) data for a unit.
Additional data collected during a subsidy program for supporting installations of renewable energy power plants.
PARAMETER | DESCRIPTION |
---|---|
unit_specs | EegMastrnummer and data type as a tuple, for example ("EEG961554380393", "hydro") |
RETURNS | DESCRIPTION |
---|---|
dict
|
EEG details about unit, if download successful, otherwise empty dict |
tuple
|
extended_unit_data(unit_specs)
Download extended data for a unit.
This extended unit information is provided separately.
PARAMETER | DESCRIPTION |
---|---|
unit_specs | EinheitMastrNummer and data type as a tuple |
RETURNS | DESCRIPTION |
---|---|
dict
|
Extended information about unit, if download successful, otherwise empty dict |
tuple
|
kwk_unit_data(unit_specs)
Download KWK (German: Kraft-Wärme-Kopplung, English: Combined Heat and Power, CHP) data for a unit.
Additional data collected during a subsidy program for supporting combined heat and power plants.
PARAMETER | DESCRIPTION |
---|---|
unit_specs | KwkMastrnummer and data type as a tuple |
RETURNS | DESCRIPTION |
---|---|
dict
|
KWK details about unit, if download successful, otherwise empty dict |
tuple
|
location_data(specs)
Download extended data for a location
Downloads additional data for different location types; see specs.
PARAMETER | DESCRIPTION |
---|---|
specs | Location Mastrnummer and data_name as a tuple |
RETURNS | DESCRIPTION |
---|---|
dict
|
Detailed information about a location, if download successful, otherwise empty dict |
tuple
|
permit_unit_data(unit_specs)
Download permit data for a unit.
PARAMETER | DESCRIPTION |
---|---|
unit_specs | GenMastrnummer and data type as a tuple |
RETURNS | DESCRIPTION |
---|---|
dict
|
Permit details about unit, if download successful, otherwise empty dict |
tuple
|
open_mastr.soap_api.mirror.MaStRMirror
Warning
This class is deprecated and will not be maintained from version 0.15.0 onwards.
Instead use Mastr.download with parameter method = "bulk" to mirror the MaStR dataset to a local database.
Mirror the Marktstammdatenregister database and keep it up-to-date.
A PostgreSQL database is used to mirror the MaStR database. It builds on functionality for bulk data download provided by MaStRDownload.
A rough overview is given by the following schema on the example of wind power units.
Initially, basic unit data gets backfilled with backfill_basic (the example downloads basic unit data for 2,000 units of type 'solar').
from open_mastr.soap_api.mirror import MaStRMirror

# engine: a database engine created beforehand (required by __init__)
mastr_mirror = MaStRMirror(engine)
mastr_mirror.backfill_basic("solar", limit=2000)
Requests for additional data can be created with MaStRMirror.create_additional_data_requests.
Additional unit data (in the case of wind power: extended data, EEG data and permit data) can be retrieved subsequently with retrieve_additional_data.
The data can be joined to one table for each data type and exported to CSV files using Mastr.to_csv.
Also consider using dump and restore for specific purposes.
Note
This feature was built before the bulk download was offered at marktstammdatenregister.de. It can still be used to compare the two datasets received from the API and the bulk download.
__init__(engine, restore_dump=None, parallel_processes=None)
PARAMETER | DESCRIPTION |
---|---|
engine | Database engine. |
restore_dump | Save path of the SQL dump file, including the filename. The database is restored from the SQL dump. Default: None. |
parallel_processes | Number of parallel processes used to download additional data. Default: None. |
backfill_basic(data=None, date=None, limit=10 ** 8)
Backfill basic unit data.
Fill database table 'basic_units' with data. It allows specification of which data should be retrieved via the described parameter options.
Under the hood, MaStRDownload.basic_unit_data is used.
PARAMETER | DESCRIPTION |
---|---|
data | Specify data types for which data should be backfilled. Default: None. |
date | Backfill date from which on data is retrieved. Only data with a modification time stamp later than this date is retrieved. Default: None. |
limit | Maximum number of units. Defaults to the large number of 10**8, which means all available data is queried. Use with care! |
backfill_locations_basic(limit=10 ** 7, date=None, delete_additional_data_requests=True)
Backfill basic location data.
Fill database table 'locations_basic' with data. It allows specification of which data should be retrieved via the described parameter options.
Under the hood, MaStRDownload.basic_location_data is used.
PARAMETER | DESCRIPTION |
---|---|
date | Backfill date from which on data is retrieved. Only data with a modification time stamp later than this date is retrieved. Default: None. |
limit | Maximum number of locations to download. Default: 10**7. |
delete_additional_data_requests | Useful to speed up the download of data. Ignores the existence of already created requests for additional data and skips the deletion of these. Default: True. |
create_additional_data_requests(technology, data_types=['unit_data', 'eeg_data', 'kwk_data', 'permit_data'], delete_existing=True)
Create new requests for additional unit data
For units that exist in basic_units but not in the table for additional data of data_type, a new data request is submitted.
PARAMETER | DESCRIPTION |
---|---|
technology | Specify the technology for which additional data should be requested. |
data_types | Select the type of additional data that is to be requested. Defaults to all data that is available for a technology: ['unit_data', 'eeg_data', 'kwk_data', 'permit_data']. |
delete_existing | Toggle deletion of already existing requests for additional data. Defaults to True. |
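The rule "exists in basic_units but not in the additional data table" is an anti-join. The following sketch demonstrates it with an in-memory SQLite database instead of PostgreSQL; apart from basic_units, all table and column names (extended_units, additional_data_requested, unit_id) are assumptions made for illustration and do not reflect the actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE basic_units (unit_id TEXT PRIMARY KEY);
    CREATE TABLE extended_units (unit_id TEXT PRIMARY KEY);
    CREATE TABLE additional_data_requested (unit_id TEXT, data_type TEXT);
    INSERT INTO basic_units VALUES ('SEE1'), ('SEE2'), ('SEE3');
    INSERT INTO extended_units VALUES ('SEE2');
    """
)

# Anti-join: request data only for units without an extended data row yet
conn.execute(
    """
    INSERT INTO additional_data_requested (unit_id, data_type)
    SELECT b.unit_id, 'unit_data'
    FROM basic_units AS b
    LEFT JOIN extended_units AS e ON e.unit_id = b.unit_id
    WHERE e.unit_id IS NULL
    """
)

pending = [
    row[0]
    for row in conn.execute(
        "SELECT unit_id FROM additional_data_requested ORDER BY unit_id"
    )
]
print(pending)
```

Units that already have additional data (here SEE2) produce no new request, so repeated runs do not re-download what is already mirrored.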
dump(dumpfile='open-mastr-continuous-update.backup')
Dump MaStR database.
PARAMETER | DESCRIPTION |
---|---|
dumpfile | Save path for the dump, including the filename. When only a filename is given, the dump is saved to the current working directory. |
restore(dumpfile)
Restore the MaStR database from an SQL dump.
PARAMETER | DESCRIPTION |
---|---|
dumpfile | Save path of the dump, including the filename. When only a filename is given, the dump is restored from the current working directory. |
retrieve_additional_data(data, data_type, limit=10 ** 8, chunksize=1000)
Retrieve additional unit data
Execute additional data requests stored in open_mastr.soap_api.orm.AdditionalDataRequested.
See also the docs of MaStRDownload.additional_data for more information on how data is downloaded.
PARAMETER | DESCRIPTION |
---|---|
data | Technology for which additional data is retrieved. |
data_type | Select the type of additional data that is to be retrieved. Choose from "unit_data", "eeg_data", "kwk_data", "permit_data". |
limit | Maximum number of units for which data is downloaded. Defaults to the very large number 10**8, which effectively queries data for all existing data requests. |
chunksize | Data is downloaded and inserted into the database in chunks of this size. Default: 1000. |
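Processing requests in chunks of chunksize can be sketched as follows. This only illustrates the chunking pattern with a hypothetical helper, iter_chunks, and dummy request ids; it makes no claim about how the library splits its work internally.

```python
from itertools import islice


def iter_chunks(items, chunksize):
    """Yield successive lists of at most `chunksize` items."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, chunksize))
        if not chunk:
            return
        yield chunk


# 2,500 dummy request ids, processed with the default chunk size of 1000
requests = [f"SEE{i:04d}" for i in range(2500)]
sizes = [len(chunk) for chunk in iter_chunks(requests, chunksize=1000)]
print(sizes)
```

With chunked inserts, an interruption loses at most the chunk in progress instead of the whole download.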
retrieve_additional_location_data(location_type, limit=10 ** 8, chunksize=1000)
Retrieve extended location data
Execute additional data requests stored in open_mastr.soap_api.orm.AdditionalLocationsRequested.
See also the docs of open_mastr.soap_api.download.MaStRDownload.additional_data for more information on how data is downloaded.
PARAMETER | DESCRIPTION |
---|---|
location_type | Select the type of location that is to be retrieved. Choose from "location_elec_generation", "location_elec_consumption", "location_gas_generation", "location_gas_consumption". |
limit | Maximum number of locations for which data is downloaded. Defaults to the large number 10**8, which effectively queries data for all existing data requests. |
chunksize | Data is downloaded and inserted into the database in chunks of this size. Default: 1000. |