
Revise export ports data #1175

Merged — 15 commits, Nov 28, 2024
Merged
5 changes: 4 additions & 1 deletion Snakefile
@@ -1003,6 +1003,7 @@ rule solve_sector_networks:
rule prepare_ports:
output:
ports="data/ports.csv", # TODO move from data to resources
export_ports="resources/" + SECDIR + "export_ports.csv",
script:
"scripts/prepare_ports.py"

@@ -1154,9 +1155,11 @@ rule add_export:
export_profile=config["export"]["export_profile"],
snapshots=config["snapshots"],
costs=config["costs"],
custom_export=config["custom_data"]["export_data"],
input:
overrides="data/override_component_attrs",
export_ports="data/export_ports.csv",
custom_export_ports="data/custom/export_ports.csv",
Collaborator:
I personally prefer that all files in resources be the ones that are actually used in the workflow. This way the user can track a file easily: if a file is not in resources, then the workflow used the file from the data folder.

Therefore, I suggest having this rule read only export_ports="resources/" + SECDIR + "export_ports.csv",
and letting the decision on using custom data or the workflow-generated file stay in prepare_ports.py.

Thus in prepare_ports.py, either:

  • the file is copied from data/custom to resources if config["custom_data"]["export_data"] is True,
  • or the filter_ports function is applied and its output is saved in resources.

Please note that all my comments are subject to discussion if needed.
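The suggested either/or behaviour for prepare_ports.py can be sketched as a small helper. This is a minimal illustration of the reviewer's proposal, not code from the PR; the function name `prepare_export_ports` and its signature are made up for the example:

```python
import shutil


def prepare_export_ports(use_custom, custom_path, ports_df, filter_fn, output_path):
    """Decide which export-ports file lands in resources: the user's
    custom CSV or the filtered, workflow-generated ports table."""
    if use_custom:
        # config["custom_data"]["export_data"] is True: copy the
        # user-provided file from data/custom to resources verbatim.
        shutil.copy(custom_path, output_path)
    else:
        # Otherwise apply the filtering step (e.g. filter_ports) to the
        # workflow-generated ports table and save the result.
        filter_fn(ports_df).to_csv(output_path, index=False)
```

With this in prepare_ports.py, the add_export rule only ever reads the single resources path, regardless of where the data originated.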

export_ports="resources/" + SECDIR + "export_ports.csv",
costs=COSTDIR + "costs_{planning_horizons}.csv",
ship_profile="resources/" + SECDIR + "ship_profile_{h2export}TWh.csv",
network=RESDIR
1 change: 1 addition & 0 deletions config.default.yaml
@@ -478,6 +478,7 @@ custom_data:
add_existing: false
custom_sectors: false
gas_network: false # If "True" then a custom .csv file must be placed in "resources/custom_data/pipelines.csv" , If "False" the user can choose btw "greenfield" or Model built-in datasets. Please refer to ["sector"] below.
export_data: false # If "True" then a custom .csv file must be placed in "data/custom/export_ports.csv"
Collaborator:
I would suggest naming it export_ports. I think it is more intuitive for users.


industry:
reference_year: 2015
Collaborator:
Here it would be better to have a non-empty example file that shows the required columns and contains at least one example row. That way it is easier for users to insert their custom data in the right format.
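For illustration, such an example file might look like the fragment below. The column names are inferred from the commented-out selection in add_export.py (name, country, fraction, x, y); the row values are entirely made up:

```csv
name,country,fraction,x,y
Example Port,MA,1.0,-5.50,35.88
```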

Collaborator:
We should also delete export_ports.csv from the data folder, as it is no longer needed. Maybe just move it here as-is.

Contributor (author):
What I have done is move the export_ports.csv file to the custom folder, where it should serve as an example. Let me know your thoughts on this.

File renamed without changes.
2 changes: 2 additions & 0 deletions doc/release_notes.rst
@@ -48,6 +48,8 @@ E.g. if a new rule becomes available describe how to use it `make test` and in o

* Drop entries that contain non-string elements in country column of `CO2_emissions_csv` data in `prepare_transport_data_input.py` script `PR #1166 <https://github.com/pypsa-meets-earth/pypsa-earth/pull/1166>`_

* Revise ports data for export in `add_export.py` related to sector model `PR #1175 <https://github.com/pypsa-meets-earth/pypsa-earth/pull/1175>`_

PyPSA-Earth 0.4.1
=================

10 changes: 9 additions & 1 deletion scripts/add_export.py
@@ -33,12 +33,20 @@ def select_ports(n):
This function selects the buses where ports are located.
"""

if snakemake.params.custom_export:
input_port_data = snakemake.input.custom_export_ports
else:
input_port_data = snakemake.input.export_ports

ports = pd.read_csv(
snakemake.input.export_ports,
input_port_data,
index_col=None,
keep_default_na=False,
).squeeze()

# ports = raw_ports[["name", "country", "fraction", "x", "y"]]
# ports.loc[:, "fraction"] = ports.fraction.round(1)

ports = ports[ports.country.isin(countries)]
if len(ports) < 1:
logger.error(
33 changes: 33 additions & 0 deletions scripts/prepare_ports.py
@@ -31,6 +31,36 @@ def download_ports():
return wpi_csv


def filter_ports(dataframe):
"""
Filters ports based on their harbor size and returns a DataFrame containing
only the largest port for each country.
"""
# Filter large sized ports
large_ports = dataframe[dataframe["Harbor Size"] == "Large"]
countries_with_large_ports = large_ports["country"].unique()

# Filter out countries with large ports
remaining_ports = dataframe[~dataframe["country"].isin(countries_with_large_ports)]

# Filter medium sized ports from remaining ports
medium_ports = remaining_ports[remaining_ports["Harbor Size"] == "Medium"]
countries_with_medium_ports = medium_ports["country"].unique()

# Filter out countries with medium ports
remaining_ports = remaining_ports[
~remaining_ports["country"].isin(countries_with_medium_ports)
]

# Filter small sized ports from remaining ports
small_ports = remaining_ports[remaining_ports["Harbor Size"] == "Small"]

# Combine all filtered ports
filtered_ports = pd.concat([large_ports, medium_ports, small_ports])

return filtered_ports


if __name__ == "__main__":
if "snakemake" not in globals():
from _helpers import mock_snakemake
@@ -102,3 +132,6 @@ def download_ports():
ports["fraction"] = ports["Harbor_size_nr"] / ports["Total_Harbor_size_nr"]

ports.to_csv(snakemake.output[0], sep=",", encoding="utf-8", header="true")
filter_ports(ports).to_csv(
Collaborator:
Here we can add the if clause that I suggested in my Snakefile comment.

snakemake.output[1], sep=",", encoding="utf-8", header="true"
)
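The cascading filters in filter_ports amount to a size-priority rule: for each country, keep only the ports of the largest harbor-size class present there. A condensed sketch with a toy DataFrame (behaviorally equivalent to the function in the diff, but not the PR's actual code):

```python
import pandas as pd


def filter_ports(df):
    # Keep, per country, only ports of the largest harbor-size class
    # present in that country (Large beats Medium beats Small).
    kept = []
    remaining = df
    for size in ["Large", "Medium", "Small"]:
        selected = remaining[remaining["Harbor Size"] == size]
        kept.append(selected)
        # Drop countries already served by a bigger size class.
        remaining = remaining[~remaining["country"].isin(selected["country"].unique())]
    return pd.concat(kept)


ports = pd.DataFrame({
    "name": ["P1", "P2", "P3", "P4"],
    "country": ["MA", "MA", "NG", "NG"],
    "Harbor Size": ["Large", "Medium", "Medium", "Small"],
})
print(filter_ports(ports)["name"].tolist())  # → ['P1', 'P3']
```

MA has a Large port, so its Medium port P2 is dropped; NG has no Large port, so its Medium port P3 survives and the Small port P4 is dropped.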
1 change: 1 addition & 0 deletions test/config.test_myopic.yaml
@@ -91,6 +91,7 @@ custom_data:
add_existing: false
custom_sectors: false
gas_network: false # If "True" then a custom .csv file must be placed in "resources/custom_data/pipelines.csv" , If "False" the user can choose btw "greenfield" or Model built-in datasets. Please refer to ["sector"] below.
export_data: false # If "True" then a custom .csv file must be placed in "data/custom/export.csv"


costs: # Costs used in PyPSA-Earth-Sec. Year depends on the wildcard planning_horizon in the scenario section