365 how can post processing function handle nan value from mcs #371

Merged
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -11,6 +11,7 @@ and this project adheres to [Semantic Versioning](http://semver.org/).
- Method to get allow Hazard demands from hazard service [#363](https://github.com/IN-CORE/pyincore/issues/363)

### Fixed
- Post-processing cluster function handles empty rows from mcs [#365](https://github.com/IN-CORE/pyincore/issues/365)
- Expose all the incore client parameters [#295](https://github.com/IN-CORE/pyincore/issues/295)
- Fixed testing datasets not being cleaned in the database [#367](https://github.com/IN-CORE/pyincore/issues/367)
- Space services methods missing timeout parameters [#375](https://github.com/IN-CORE/pyincore/issues/375)
6 changes: 5 additions & 1 deletion pyincore/utils/dataprocessutil.py
@@ -202,6 +202,10 @@ def _sum_average(series):
# unify mcs and bldg func naming
bldg_func.rename(columns={"building_guid": "guid", "samples": "failure"}, inplace=True)

# drop rows with an empty "failure" value, but count how many were dropped
count_null = (bldg_func["failure"] == "").sum()
bldg_func = bldg_func[bldg_func["failure"] != ""]

func_merged = pd.merge(inventory, bldg_func, on="guid")
mapped_df = pd.merge(func_merged, arch_mapping, on=arch_col)
unique_categories = arch_mapping.groupby(by=["category"], sort=False, as_index=False).count()["category"]
@@ -249,7 +253,7 @@ def _group_by(by_column, unique):
json_by_cluster = json.loads(cluster_records)
json_by_category = json.loads(category_records)

- return {"by_cluster": json_by_cluster, "by_category": json_by_category}
+ return {"by_cluster": json_by_cluster, "by_category": json_by_category, "NA": int(count_null)}

@staticmethod
def get_max_damage_state(dmg_result):
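Reviewer note: the drop-and-count step in this change can be sketched on a toy frame as follows. This is a minimal illustration, not the pyincore implementation. The helper name `split_empty_failures` is hypothetical, and treating real NaN values as empty is an assumption that goes beyond the diff, which only compares against `""`.

```python
import pandas as pd


def split_empty_failures(bldg_func, column="failure"):
    """Hypothetical helper: return (rows that have a value, number of rows dropped)."""
    # Assumption: treat both empty strings and NaN as "no samples".
    # The diff itself only checks for the empty string.
    is_empty = bldg_func[column].isna() | (bldg_func[column] == "")
    return bldg_func[~is_empty], int(is_empty.sum())


# Toy data standing in for MCS output; the values are made up for illustration.
toy = pd.DataFrame({
    "guid": ["a", "b", "c", "d"],
    "failure": ["1,0,1", "", None, "0,0,1"],
})

kept, count_null = split_empty_failures(toy)

# Mirrors the shape of the new return value, with the count exposed under "NA".
result = {"by_cluster": [], "by_category": [], "NA": count_null}
```

Casting the count to a plain `int`, as the diff does, keeps the returned dictionary serializable with the standard `json` module, since NumPy integer types are not accepted by `json.dumps`.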
5 changes: 5 additions & 0 deletions tests/pyincore/utils/test_dataprocessutil.py
@@ -68,6 +68,11 @@ def _functionality_cluster(client, archetype_mapping="5fca915fb34b193f7a44059b",
bldg_func_state_dataset_path = bldg_func_state_dataset.get_file_path()
bldg_func_state = pd.read_csv(bldg_func_state_dataset_path)

# manufacture rows with an empty failure value for testing
if "failure" in bldg_func_state.columns:
bldg_func_state.loc[0, "failure"] = ""
bldg_func_state.loc[1, "failure"] = ""

ret_json = util.create_mapped_func_result(buildings, bldg_func_state, arch_mapping, arch_column)

with open(title + "_cluster.json", "w") as f:
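Reviewer note: one thing that may be worth checking is how empty cells arrive in practice. With default settings, `pd.read_csv` parses an empty field as NaN rather than as `""`, so a comparison against `""` only matches rows that were set to an empty string explicitly, as the test above does. The sketch below uses made-up CSV text to show the difference; it makes no claim about what the real MCS dataset contains.

```python
import io

import pandas as pd

# Made-up CSV text; the row for guid "b" has an empty failure field.
csv_text = 'guid,failure\na,"1,0,1"\nb,\nc,"0,0,1"\n'
df = pd.read_csv(io.StringIO(csv_text))

# With default settings, the empty field is parsed as NaN, not as "".
nan_rows = int(df["failure"].isna().sum())
empty_str_rows = int((df["failure"] == "").sum())

# Overwriting a cell with "" produces a row that the "" comparison does catch.
df.loc[0, "failure"] = ""
empty_after = int((df["failure"] == "").sum())
```

If both cases need to be covered, combining `isna()` with the `""` comparison is one option.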