Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Callback Error Updating store-trips.data - TypeError: 'float' object is not subscriptable fixed #121

Merged

Conversation

TeachMeTW
Copy link
Contributor

Issue: Callback Error Updating store-trips.data - TypeError: 'float' object is not subscriptable

Description

We are encountering a TypeError while updating store-trips.data. The error traceback indicates that there is an issue with the function applied to the ble_sensed_summary column in the DataFrame. The error occurs because a float object is being treated as if it were a dictionary.

Traceback

Callback error updating store-trips.data Traceback (most recent call last): File "/usr/src/app/app_sidebar_collapsible.py", line 318, in update_store_trips df, user_input_cols = query_confirmed_trips(start_date, end_date, timezone) File "/usr/src/app/utils/db_utils.py", line 101, in query_confirmed_trips df["data.primary_ble_sensed_mode"] = df.ble_sensed_summary.apply(get_max_mode_from_summary) File "/root/miniconda-23.5.2/envs/emission/lib/python3.9/site-packages/pandas/core/series.py", line 4771, in apply return SeriesApply(self, func, convert_dtype, args, kwargs).apply() File "/root/miniconda-23.5.2/envs/emission/lib/python3.9/site-packages/pandas/core/apply.py", line 1123, in apply return self.apply_standard() File "/root/miniconda-23.5.2/envs/emission/lib/python3.9/site-packages/pandas/core/apply.py", line 1174, in apply_standard mapped = lib.map_infer( File "pandas/_libs/lib.pyx", line 2924, in pandas._libs.lib.map_infer File "/usr/src/app/utils/db_utils.py", line 97, in <lambda> get_max_mode_from_summary = lambda md: max(md["distance"], key=md["distance"].get) if len(md["distance"]) > 0 else "INVALID" TypeError: 'float' object is not subscriptable

Issue

The TypeError occurs because the lambda function get_max_mode_from_summary expects md to be a dictionary with a "distance" key that maps to another dictionary. However, it appears that some entries in ble_sensed_summary are float values, which are not subscriptable and thus lead to this error.

Steps to Reproduce

  1. Call the update_store_trips function with appropriate parameters.
  2. Observe the error traceback indicating the TypeError.

Expected Behavior

The ble_sensed_summary column should be processed correctly, and the get_max_mode_from_summary function should handle various data types properly without raising an error.

Suggested Fix

Update the get_max_mode_from_summary lambda function to handle cases where md is not a dictionary. For example:

get_max_mode_from_summary = lambda md: (
    "INVALID" if not isinstance(md, dict) or "distance" not in md or not isinstance(md["distance"], dict) else
    (max(md["distance"], key=md["distance"].get) if len(md["distance"]) > 0 else "INVALID")
)

Consider adding data validation or cleaning steps before applying the lambda function.

@TeachMeTW
Copy link
Contributor Author

Utilized Black Formatter (for formatting purposes)
Modified mongodump

@TeachMeTW
Copy link
Contributor Author

Use Case of Updated Script
./load_mongodump.sh ~/Downloads/openpath_prod_ca_ebike_may_29.tar.gz Stage_database op-admin-dashboard-db-1 openpath_prod_ca_ebike

@TeachMeTW TeachMeTW changed the title #120 Callback Error Updating store-trips.data - TypeError: 'float' object is not subscriptable fixed + Modified Docker Load Dump Script Sep 5, 2024
Copy link
Contributor

@JGreenlee JGreenlee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error handling is great.

As you have seen, this project doesn't use Black or adhere to PEP8. Although that's not ideal, could you revert the formatting changes for now?
Standardized code format is great but when formatting changes are done at the same time as other changes, it is harder to review

If we were to enforce a code style, we would need to decide on it as a team, then update the whole codebase at once, independently from other changes
(we did this on e-mission-phone using Prettier, the JS equivalent of Black)

When I make changes to python code, I do try to adhere to PEP8 on the lines I change. But I don't worry about the rest of the file

docker/load_mongodump.sh Outdated Show resolved Hide resolved
Copy link
Contributor

@shankari shankari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comments are only about the load_mongodump part

docker/load_mongodump.sh Outdated Show resolved Hide resolved
docker-compose-dev.yml Outdated Show resolved Hide resolved
@shankari
Copy link
Contributor

shankari commented Sep 7, 2024

@TeachMeTW can you pull out the loadmongodump changes into a separate PR (e.g. use git checkout to an earlier point and make a new branch)? I am happy to merge that now.

For the ble_summary, I would like to make sure that your approach is consistent with Jack and Abby's, so that can come in a follow-on PR.

docker-compose-dev.yml.bak Outdated Show resolved Hide resolved
docker-compose-dev.yml Outdated Show resolved Hide resolved
docker/load_mongodump.sh Outdated Show resolved Hide resolved
TeachMeTW added a commit to TeachMeTW/op-admin-dashboard that referenced this pull request Sep 7, 2024
@shankari
Copy link
Contributor

shankari commented Sep 8, 2024

Since I merged #122 to master, this only has the actual functionality changes. @JGreenlee and @Abby-Wheelis, I will let you review this before it comes to me.

@@ -94,7 +94,26 @@ def query_confirmed_trips(start_date: str, end_date: str, tz: str):
# Add primary modes from the sensed, inferred and ble summaries. Note that we do this
# **before** filtering the `all_trip_columns` because the
# *_section_summary columns are not currently valid
get_max_mode_from_summary = lambda md: max(md["distance"], key=md["distance"].get) if len(md["distance"]) > 0 else "INVALID"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same functionality that I had to update in #141. Your checks are much more thorough than mine, so I'll go ahead and update to match this, but I set the result to UNKNOWN instead of INVALID, and I'm wondering if we should do the same thing in both places? I chose UNKNOWN since that is what trips where we were unable to sense a mode for most of the distance are displayed as, what purpose does INVALID serve here? Or maybe, what is the difference between an INVALID sensed mode and an UNKNOWN sensed mode?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Abby-Wheelis I am not entirely sure what INVALID would represent here as opposed to unknown. I believe @shankari was the one who originally wrote this line of code that I refactored. I can change it to UNKNOWN to match #141

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TeachMeTW can you look at the commit where I wrote that code and see if I added in an explanation of why my choice? I have some vague recollection of the decision making, but the comments I wrote at the time are likely to be more clear.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shankari @Abby-Wheelis based on existing comments, this is how I interpreted it.

INVALID: used to signify that the data or result is not just unknown but invalid. It implies that something went wrong or the data provided does not meet the expected format or criteria. It's a more explicit signal that the data is unusable.

UNKNOWN: indicates that the mode is not known, but does not necessarily imply that there was an error or problem with the data. It often means that the information is not available or could not be determined but is expected to be a valid state to encounter.

Should we keep both Invalid and unknown or just use one?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can keep both, thank you for the explanation! Hopefully once we figure out #1088 and resolve the underlying data issues there will be less INVALID sense modes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shankari if possible please resolve the issue if no further reviews/changes are needed

@TeachMeTW TeachMeTW changed the title Callback Error Updating store-trips.data - TypeError: 'float' object is not subscriptable fixed + Modified Docker Load Dump Script Fix Callback Error Updating store-trips.data - TypeError: 'float' object is not subscriptable fixed Sep 10, 2024
Copy link
Contributor

@JGreenlee JGreenlee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with the reasoning of using INVALID rather than UNKNOWN

UNKNOWN exists in the data for trips where a mode could not be determined from the sensed information

INVALID is used to mark trips that are missing the ble summary. INVALID is not stored in the data model

Copy link
Contributor

@shankari shankari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

nit/future fix: If we are creating a multi-line function anyway, I don't think it needs to be a lambda, we can make it a real function which may be easier to read.

But that is a style preference that I don't feel that strongly about so merging this anyway

@shankari shankari merged commit 3d28467 into e-mission:master Sep 12, 2024
@shankari
Copy link
Contributor

@JGreenlee @TeachMeTW I think this will address some of the issues with loading data on older deployments (e.g. ccebike/smart-commute/mm-masscec etc). We should spot check that once I have deployed to production.

TeachMeTW added a commit to TeachMeTW/op-admin-dashboard that referenced this pull request Oct 22, 2024
@JGreenlee JGreenlee mentioned this pull request Oct 22, 2024
shankari added a commit that referenced this pull request Oct 22, 2024
TeachMeTW added a commit to TeachMeTW/op-admin-dashboard that referenced this pull request Oct 30, 2024
TeachMeTW added a commit to TeachMeTW/op-admin-dashboard that referenced this pull request Oct 30, 2024
TeachMeTW added a commit to TeachMeTW/op-admin-dashboard that referenced this pull request Oct 30, 2024
TeachMeTW added a commit to TeachMeTW/op-admin-dashboard that referenced this pull request Oct 30, 2024
TeachMeTW added a commit to TeachMeTW/op-admin-dashboard that referenced this pull request Oct 30, 2024
…al disabling, fixed by modifying with statement positioning and return ordering.

Revert whitespace changes

Revert whitespace changes

Bug Fix featuring the uuids misuse from prior commits

Caught bug in the form of trips_data table being improperly named and not existing, fixed by renaming trips_data table.

Fixed issues brought up and unnecessary changes

Unformatted render

Reverted more unnecessary changes

unfix typo

Reverted to old name

Whitespace changes revert

Whitespace changes revert

Whitespace changes revert

Whitespace changes revert

Added e-mission#121

added timings

Added e-mission#121
TeachMeTW added a commit to TeachMeTW/op-admin-dashboard that referenced this pull request Oct 30, 2024
…al disabling, fixed by modifying with statement positioning and return ordering.

Revert whitespace changes

Revert whitespace changes

Bug Fix featuring the uuids misuse from prior commits

Caught bug in the form of trips_data table being improperly named and not existing, fixed by renaming trips_data table.

Fixed issues brought up and unnecessary changes

Unformatted render

Reverted more unnecessary changes

unfix typo

Reverted to old name

Whitespace changes revert

Whitespace changes revert

Whitespace changes revert

Whitespace changes revert

Added e-mission#121

added timings

Added e-mission#121

Fixed minor bug where uuids loaded infinitely due to misuse of interval disabling, fixed by modifying with statement positioning and return ordering.

Revert whitespace changes

Revert whitespace changes

Bug Fix featuring the uuids misuse from prior commits

Caught bug in the form of trips_data table being improperly named and not existing, fixed by renaming trips_data table.

Fixed issues brought up and unnecessary changes

Unformatted render

Reverted more unnecessary changes

unfix typo

Reverted to old name

Whitespace changes revert

Whitespace changes revert

Whitespace changes revert

Whitespace changes revert

Added e-mission#121

added timings

Added e-mission#121

Delete docker-compose-dev.yml.bak

Delete docker/new_mongo.sh
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

4 participants