diff --git a/README.rst b/README.rst index c03886b..a87601f 100644 --- a/README.rst +++ b/README.rst @@ -35,15 +35,15 @@ Sensitivity and Epsilon Analysis * Sensitivity : In a single time stamp, ``1`` merchant can come only once in a particular zip code but can appear in upto ``3`` zip codes. So, if we wanted to release measures about a single zip code sensitivity would be ``1`` but since we want to release data for all zip codes, the sensitivity used for each zip code is ``3``. * Scaling with Time: For multiple time stamps, sensitivity is ``3 * no_of_time_stamps``. * Epsilon Budget: The epsilon spent for each query is ``∈``. -* Scale Calculation: ``Scale = (sqrt(3) * no_of_time_stamps) / ∈``. +* Scale Calculation: ``Scale = (3 * no_of_time_stamps* upper_bound) / ∈``. -Mobility Detection (Airline Merch Category) -------------------------------------------- +Mobility Detection +------------------ Description -This analysis tracks mobility by monitoring differential private time series release of financial transactions in the "Airlines" category, which reflects the transportation sector. +This analysis tracks mobility by monitoring differential private time series release of financial transactions in the ``retail_and_recreation``, ``grocery_and_pharmacy`` and ``transit_stations`` super categories which matches with google mobility data for easy validation. Assumptions @@ -54,17 +54,18 @@ Assumptions Algorithm #. Add City Column: A new ``city`` column is added based on postal codes (``make_preprocess_location``). +#. Add Super Category Column : A new ``merch_super_category`` column is added for classifying transactions into retail_and_recreation, grocery_and_pharmacy and transit_stations categories (``make_preprocess_merchant_mobility``). #. Filter for City: Data for the selected city is filtered (``make_filter``). -#. Filter for Airline Category: Only transactions in the ``Airline`` category are considered (``make_filter``). +#. 
Filter for super category: data is filtered for retail_and_recreation, grocery_and_pharmacy and transit_stations categories (``make_filter``). #. Filter by Time Frame: Data is filtered for the selected time frame (``make_truncate_time``). #. Transaction Summing & Noise Addition: Sum the number of transactions by postal code for each timestep and add Gaussian noise (``make_private_sum_by``). Sensitivity and Epsilon Analysis -* Sensitivity per Merchant: Sensitivity is 3 for each merchant in the ``Airline`` category. +* Sensitivity per Merchant: Sensitivity is 3 for each merchant. * Scaling with Time: For multiple timesteps, sensitivity is ``3 * no_of_time_steps``. * Epsilon Budget: The epsilon spent per timestep is ∈ . -* Scale Calculation: ``Scale = (3 * no_of_time_steps) / ∈``. +* Scale Calculation: ``Scale = (3 * no_of_time_steps* upper_bound) / ∈``. Validation @@ -100,7 +101,7 @@ Sensitivity and Epsilon Analysis * Sensitivity per Category : Sensitivity is ``3`` for each category (essential or luxurious goods). * Scaling with Time : For multiple timesteps, sensitivity is ``3 * no_of_time_steps``. * Epsilon Budget : The epsilon spent per timestep is ∈. -* Scale Calculation : ``Scale = (3 * no_of_time_steps) / ∈``. +* Scale Calculation : ``Scale = (3 * no_of_time_steps* upper_bound) / ∈``. 
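For reference, the revised scale formula that this change applies across the analyzers can be sketched as below. This is a minimal illustration, not the package's own code: the function name ``noise_scale`` and the ``zip_codes_per_merchant`` parameter are hypothetical, while the weekly timestep count mirrors the ``(end_date - start_date).days // 7`` expression in the sources touched by this diff.

```python
from datetime import datetime


def noise_scale(start_date: datetime, end_date: datetime,
                upper_bound: float, epsilon: float,
                zip_codes_per_merchant: int = 3) -> float:
    """Hypothetical helper: Scale = (3 * no_of_time_steps * upper_bound) / epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    # One timestep per full week in the selected time frame.
    nb_timesteps = (end_date - start_date).days // 7
    # A merchant can appear in up to 3 zip codes per timestep, so the
    # sensitivity grows linearly with the number of timesteps.
    sensitivity = zip_codes_per_merchant * nb_timesteps * upper_bound
    return sensitivity / epsilon


if __name__ == "__main__":
    # Four full weeks, upper_bound of 5 transactions, epsilon of 2.
    print(noise_scale(datetime(2020, 1, 1), datetime(2020, 1, 29), 5, 2.0))
```

Note that the scale is linear in the number of timesteps, so a longer time frame at the same epsilon yields proportionally noisier releases.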
diff --git a/dist/dp_epidemiology-0.0.8-py3-none-any.whl b/dist/dp_epidemiology-0.0.8-py3-none-any.whl deleted file mode 100644 index 70fa40f..0000000 Binary files a/dist/dp_epidemiology-0.0.8-py3-none-any.whl and /dev/null differ diff --git a/dist/dp_epidemiology-0.0.8.tar.gz b/dist/dp_epidemiology-0.0.8.tar.gz deleted file mode 100644 index 11e66c3..0000000 Binary files a/dist/dp_epidemiology-0.0.8.tar.gz and /dev/null differ diff --git a/dist/dp_epidemiology-0.0.9-py3-none-any.whl b/dist/dp_epidemiology-0.0.9-py3-none-any.whl new file mode 100644 index 0000000..f58ac60 Binary files /dev/null and b/dist/dp_epidemiology-0.0.9-py3-none-any.whl differ diff --git a/dist/dp_epidemiology-0.0.9.tar.gz b/dist/dp_epidemiology-0.0.9.tar.gz new file mode 100644 index 0000000..03eb266 Binary files /dev/null and b/dist/dp_epidemiology-0.0.9.tar.gz differ diff --git a/docs/requirements.txt b/docs/requirements.txt index e69de29..ffba590 100644 Binary files a/docs/requirements.txt and b/docs/requirements.txt differ diff --git a/docs/usage.rst b/docs/usage.rst index d6cf342..46413c6 100644 --- a/docs/usage.rst +++ b/docs/usage.rst @@ -61,13 +61,14 @@ For example: To do mobility inference, -you can use the ``mobility_analyzer.mobility_analyzer()`` function to generate differential private time series of trnsactional data in the "Airlines" category: +you can use the ``mobility_analyzer.mobility_analyzer()`` function to generate differential private time series of trnsactional data in the ``retail_and_recreation``, ``grocery_and_pharmacy`` and ``transit_stations`` super categories: .. autofunction:: mobility_analyzer.mobility_analyzer The ``df`` parameter take pandas dataframe as input with columns ``[ "ID", "date", "merch_category", "merch_postal_code", "transaction_type", "spendamt", "nb_transactions"]``. The ``start_date`` and ``end_date`` parameters take the start and end date of the time frame for which the analysis is to be done. 
The ``city`` parameter takes the name of the city for which the analysis is to be done. +The ``category`` parameter takes the value of ``retail_and_recreation``, ``grocery_and_pharmacy`` or ``transit_stations`` for which the analysis is to be done. The ``epsilon`` parameter takes the value of epsilon for differential privacy. For example: @@ -75,7 +76,7 @@ For example: >>> from DP_epidemiology import mobility_analyzer >>> from datetime import datetime >>> df = pd.read_csv('data.csv') ->>> mobility_analyzer.mobility_analyzer(df,datetime(2020, 9, 1),datetime(2021, 3, 31),"Medellin",10) +>>> mobility_analyzer.mobility_analyzer(df,datetime(2020, 9, 1),datetime(2021, 3, 31),"Medellin","retail_and_recreation",10) nb_transactions date 0 1258 2020-09-01 1 1328 2020-09-08 diff --git a/pyproject.toml b/pyproject.toml index d052481..e100188 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -4,7 +4,7 @@ build-backend = "hatchling.build" [project] name = "DP_epidemiology" -version = "0.0.8" +version = "0.0.9" dependencies = [ "pandas>=2.1.4", @@ -16,7 +16,8 @@ dependencies = [ "dash", "nbformat", "scipy", - "matplotlib" + "matplotlib", + "dtw", ] authors = [ diff --git a/src/DP_epidemiology/__pycache__/contact_matrix.cpython-310.pyc b/src/DP_epidemiology/__pycache__/contact_matrix.cpython-310.pyc index 2bf2272..5b18607 100644 Binary files a/src/DP_epidemiology/__pycache__/contact_matrix.cpython-310.pyc and b/src/DP_epidemiology/__pycache__/contact_matrix.cpython-310.pyc differ diff --git a/src/DP_epidemiology/__pycache__/hotspot_analyzer.cpython-310.pyc b/src/DP_epidemiology/__pycache__/hotspot_analyzer.cpython-310.pyc index e4e7260..1037b28 100644 Binary files a/src/DP_epidemiology/__pycache__/hotspot_analyzer.cpython-310.pyc and b/src/DP_epidemiology/__pycache__/hotspot_analyzer.cpython-310.pyc differ diff --git a/src/DP_epidemiology/__pycache__/mobility_analyzer.cpython-310.pyc b/src/DP_epidemiology/__pycache__/mobility_analyzer.cpython-310.pyc index cab8470..06fa566 
100644 Binary files a/src/DP_epidemiology/__pycache__/mobility_analyzer.cpython-310.pyc and b/src/DP_epidemiology/__pycache__/mobility_analyzer.cpython-310.pyc differ diff --git a/src/DP_epidemiology/__pycache__/pandemic_adherence_analyzer.cpython-310.pyc b/src/DP_epidemiology/__pycache__/pandemic_adherence_analyzer.cpython-310.pyc index 7a71a7e..354cf23 100644 Binary files a/src/DP_epidemiology/__pycache__/pandemic_adherence_analyzer.cpython-310.pyc and b/src/DP_epidemiology/__pycache__/pandemic_adherence_analyzer.cpython-310.pyc differ diff --git a/src/DP_epidemiology/__pycache__/utilities.cpython-310.pyc b/src/DP_epidemiology/__pycache__/utilities.cpython-310.pyc index 651b5c4..182b321 100644 Binary files a/src/DP_epidemiology/__pycache__/utilities.cpython-310.pyc and b/src/DP_epidemiology/__pycache__/utilities.cpython-310.pyc differ diff --git a/src/DP_epidemiology/hotspot_analyzer.py b/src/DP_epidemiology/hotspot_analyzer.py index 45a702c..4804d76 100644 --- a/src/DP_epidemiology/hotspot_analyzer.py +++ b/src/DP_epidemiology/hotspot_analyzer.py @@ -25,7 +25,7 @@ def hotspot_analyzer(df:pd.DataFrame, start_date:datetime,end_date:datetime,city nb_timesteps = (end_date - start_date).days // 7 """scale calculation""" - scale=(np.sqrt(3.0)*nb_timesteps*upper_bound)/epsilon + scale=(3.0*nb_timesteps*upper_bound)/epsilon new_df=df.copy() diff --git a/src/DP_epidemiology/mobility_analyzer.py b/src/DP_epidemiology/mobility_analyzer.py index 7ca1cd2..e59144a 100644 --- a/src/DP_epidemiology/mobility_analyzer.py +++ b/src/DP_epidemiology/mobility_analyzer.py @@ -5,6 +5,8 @@ from datetime import datetime import scipy.stats as stats import opendp.prelude as dp +import matplotlib.pyplot as plt +from dtw import dtw,accelerated_dtw dp.enable_features("contrib", "floating-point", "honest-but-curious") @@ -28,7 +30,7 @@ def mobility_analyzer_airline(df:pd.DataFrame,start_date:datetime,end_date:datet nb_timesteps = (end_date - start_date).days // 7 """scale calculation""" - 
scale=(np.sqrt(3.0)*nb_timesteps*upper_bound)/epsilon + scale=(3.0*nb_timesteps*upper_bound)/epsilon new_df=df.copy() @@ -60,7 +62,7 @@ def mobility_analyzer(df:pd.DataFrame,start_date:datetime,end_date:datetime,city nb_timesteps = (end_date - start_date).days // 7 """scale calculation""" - scale=(np.sqrt(3.0)*nb_timesteps*upper_bound)/epsilon + scale=(3.0*nb_timesteps*upper_bound)/epsilon new_df=df.copy() @@ -85,4 +87,15 @@ def mobility_validation_with_google_mobility(df_transactional_data:pd.DataFrame, # print(df_transactional_mobility.head()) # print(df_google_mobility.head()) r, p = stats.pearsonr(df_transactional_mobility['nb_transactions'][:length], df_google_mobility[category][:length]) - print(f"Scipy computed Pearson r: {r} and p-value: {p}") \ No newline at end of file + print(f"Scipy computed Pearson r: {r} and p-value: {p}") + + d1 = df_transactional_mobility['nb_transactions'][:length].interpolate().values + d2 = df_google_mobility[category][:length].interpolate().values + d, cost_matrix, acc_cost_matrix, path = accelerated_dtw(d1,d2, dist='euclidean') + + plt.imshow(acc_cost_matrix.T, origin='lower', cmap='gray', interpolation='nearest') + plt.plot(path[0], path[1], 'w') + plt.xlabel('Subject1') + plt.ylabel('Subject2') + plt.title(f'DTW Minimum Path with minimum distance: {np.round(d,2)}') + plt.show() \ No newline at end of file diff --git a/src/DP_epidemiology/pandemic_adherence_analyzer.py b/src/DP_epidemiology/pandemic_adherence_analyzer.py index 75e966c..64507c0 100644 --- a/src/DP_epidemiology/pandemic_adherence_analyzer.py +++ b/src/DP_epidemiology/pandemic_adherence_analyzer.py @@ -26,7 +26,7 @@ def pandemic_adherence_analyzer(df:pd.DataFrame,start_date:datetime,end_date:dat nb_timesteps = (end_date - start_date).days // 7 """scale calculation""" - scale=(np.sqrt(3.0)*nb_timesteps*upper_bound)/epsilon + scale=(3.0*nb_timesteps*upper_bound)/epsilon new_df=df.copy() diff --git a/src/DP_epidemiology/utilities.py 
b/src/DP_epidemiology/utilities.py index 99555c4..3c663c7 100644 --- a/src/DP_epidemiology/utilities.py +++ b/src/DP_epidemiology/utilities.py @@ -118,15 +118,15 @@ def function(df): def make_private_sum_by(column, by, bounds, scale): """Create a measurement that computes the grouped bounded sum of `column`""" - space = dp.vector_domain(dp.atom_domain(T=int)), dp.l2_distance(T=float) - m_gauss = space >> dp.m.then_gaussian(scale) + space = dp.vector_domain(dp.atom_domain(T=int)), dp.l1_distance(T=int) + m_lap = space >> dp.m.then_laplace(scale) t_sum = make_sum_by(column, by, bounds) def function(df): exact = t_sum(df) # print(exact) noisy_sum = pd.Series( - np.maximum(m_gauss(exact.to_numpy().flatten()), 0), + np.maximum(m_lap(exact.to_numpy().flatten()), 0), ) # print(noisy_sum) noisy_sum=noisy_sum.to_frame(name=column) @@ -138,7 +138,7 @@ def function(df): input_metric=dp.symmetric_distance(), output_measure=dp.zero_concentrated_divergence(T=float), function=function, - privacy_map=lambda d_in: m_gauss.map(t_sum.map(d_in)), + privacy_map=lambda d_in: m_lap.map(t_sum.map(d_in)), ) def make_filter(column,entry): diff --git a/src/DP_epidemiology/viz.py b/src/DP_epidemiology/viz.py index 8369230..4e62d24 100644 --- a/src/DP_epidemiology/viz.py +++ b/src/DP_epidemiology/viz.py @@ -96,60 +96,62 @@ def update_graph(start_date, end_date, epsilon, city): return app -def create_mobility_dash_app(df:pd.DataFrame): +def create_mobility_dash_app(df: pd.DataFrame): cities = { "Medellin": (6.2476, -75.5658), "Bogota": (4.7110, -74.0721), "Brasilia": (-15.7975, -47.8919), "Santiago": (-33.4489, -70.6693) - } + } + app = dash.Dash(__name__) - category_list = ['grocery_and_pharmacy', 'transit_stations', 'retail_and_recreation',"other"] + category_list = ['grocery_and_pharmacy', 'transit_stations', 'retail_and_recreation', "other"] + app.layout = html.Div([ - dcc.DatePickerSingle( - id='start-date-picker', - date='2019-01-01' - ), - dcc.DatePickerSingle( - id='end-date-picker', - 
date='2019-12-31' - ), - dcc.Slider( - id='epsilon-slider', - min=0, - max=10, - step=0.1, - value=1, - marks={i: str(i) for i in range(11)} - ), - dcc.Dropdown( - id='city-dropdown', - options=[{'label': city, 'value': city} for city in cities.keys()], - value='Medellin' - ), - dcc.Dropdown( - id='category-list-dropdown', - options=[{'label': category, 'value': category} for category in category_list], - value='transit_stations' - ), - dcc.Graph(id='mobility-graph') - ]) + dcc.DatePickerSingle( + id='start-date-picker', + date='2019-01-01' + ), + dcc.DatePickerSingle( + id='end-date-picker', + date='2019-12-31' + ), + dcc.Slider( + id='epsilon-slider', + min=0, + max=10, + step=0.1, + value=1, + marks={i: str(i) for i in range(11)} + ), + dcc.Dropdown( + id='city-dropdown', + options=[{'label': city, 'value': city} for city in cities.keys()], + value='Medellin' + ), + dcc.Dropdown( + id='category-list-dropdown', + options=[{'label': category, 'value': category} for category in category_list], + value='transit_stations' + ), + dcc.Graph(id='mobility-graph') + ]) # Callback to update the graph based on input values @app.callback( Output('mobility-graph', 'figure'), [Input('start-date-picker', 'date'), - Input('end-date-picker', 'date'), - Input('city-dropdown', 'value'), - Input('category-list-dropdown', 'value'), - Input('epsilon-slider', 'value')] + Input('end-date-picker', 'date'), + Input('city-dropdown', 'value'), + Input('category-list-dropdown', 'value'), + Input('epsilon-slider', 'value')] ) - def update_graph(start_date, end_date, city_filter,category, epsilon): + def update_graph(start_date, end_date, city_filter, category, epsilon): # Convert date strings to datetime objects start_date = datetime.strptime(start_date, '%Y-%m-%d') end_date = datetime.strptime(end_date, '%Y-%m-%d') - # Call the mobility_analyser function + # Call the mobility_analyzer function filtered_df = mobility_analyzer(df, start_date, end_date, city_filter, category, epsilon) # Plot 
using Plotly Express @@ -161,76 +163,186 @@ def update_graph(start_date, end_date, city_filter,category, epsilon): labels={'nb_transactions': 'Number of Transactions', 'date': 'Date'} ) + # Add events for Bogotá + if city_filter == "Bogota": + events = [ + ("Isolation Start Drill", "2020-03-20"), + ("National Quarantine", "2020-03-26"), + ("Gender Restriction", "2020-04-16"), + ("Day Without VAT (IVA)", "2020-06-19"), + ("Lockdown 1", "2020-07-15"), + ("Lockdown 2", "2020-07-30"), + ("Lockdown 3", "2020-08-13"), + ("Lockdown 4", "2020-08-20"), + ("End of National Quarantine", "2020-09-04"), + ("Day Without VAT", "2020-11-19"), + ("Candle Day", "2020-12-07"), + ("Start of Novenas", "2020-12-16"), + ("Lockdown 1 (2021)", "2021-01-05"), + ("Lockdown 2 (2021)", "2021-01-12"), + ("Lockdown 3 (2021)", "2021-01-18"), + ("Lockdown 4 (2021)", "2021-01-28"), + ("Holy Week", "2021-03-28"), + ("Model 4x3", "2021-04-06"), + ("Model 4x3 (Extension)", "2021-04-06"), + ("Vaccination Stage 1", "2021-02-18"), + ("Vaccination Stage 2", "2021-03-08"), + ("Vaccination Stage 3", "2021-05-22"), + ("Vaccination Stage 4", "2021-06-17"), + ("Vaccination Stage 5", "2021-07-17"), + ("Riots and Social Unrest", "2021-05-01") + ] + + for event, date in events: + fig.add_shape( + type="line", + x0=date, + y0=0, + x1=date, + y1=1, + xref='x', + yref='paper', + line=dict(color="Red", width=2, dash="dash") + ) + fig.add_annotation( + x=date, + y=1, + xref='x', + yref='paper', + text=event, + showarrow=True, + arrowhead=1, + ax=-10, + ay=-40, + font=dict(color="Red") + ) + return fig + return app -def create_pandemic_adherence_dash_app(df:pd.DataFrame): +def create_pandemic_adherence_dash_app(df: pd.DataFrame): cities = { "Medellin": (6.2476, -75.5658), "Bogota": (4.7110, -74.0721), "Brasilia": (-15.7975, -47.8919), "Santiago": (-33.4489, -70.6693) - } - entry_types=["luxury","essential","other"] + } + entry_types = ["luxury", "essential", "other"] app = dash.Dash(__name__) - + app.layout = 
html.Div([ - dcc.DatePickerSingle( - id='start-date-picker', - date='2019-01-01' - ), - dcc.DatePickerSingle( - id='end-date-picker', - date='2019-12-31' - ), - dcc.Slider( - id='epsilon-slider', - min=0, - max=10, - step=0.1, - value=1, - marks={i: str(i) for i in range(11)} - ), - dcc.Dropdown( - id='city-dropdown', - options=[{'label': city, 'value': city} for city in cities.keys()], - value='Medellin' - ), - dcc.Dropdown( - id='entry-type-dropdown', - options=[{'label': entry_type, 'value': entry_type} for entry_type in entry_types], - value='luxury' - ), - dcc.Graph(id='pandemic-adherence-graph') - ]) + dcc.DatePickerSingle( + id='start-date-picker', + date='2019-01-01' + ), + dcc.DatePickerSingle( + id='end-date-picker', + date='2019-12-31' + ), + dcc.Slider( + id='epsilon-slider', + min=0, + max=10, + step=0.1, + value=1, + marks={i: str(i) for i in range(11)} + ), + dcc.Dropdown( + id='city-dropdown', + options=[{'label': city, 'value': city} for city in cities.keys()], + value='Medellin' + ), + dcc.Dropdown( + id='entry-type-dropdown', + options=[{'label': entry_type, 'value': entry_type} for entry_type in entry_types], + value='luxury' + ), + dcc.Graph(id='pandemic-adherence-graph') + ]) # Callback to update the graph based on input values @app.callback( Output('pandemic-adherence-graph', 'figure'), [Input('start-date-picker', 'date'), - Input('end-date-picker', 'date'), - Input('city-dropdown', 'value'), - Input('entry-type-dropdown', 'value'), - Input('epsilon-slider', 'value')] + Input('end-date-picker', 'date'), + Input('city-dropdown', 'value'), + Input('entry-type-dropdown', 'value'), + Input('epsilon-slider', 'value')] ) - def update_graph(start_date, end_date, city_filter,essential_or_luxury, epsilon): + def update_graph(start_date, end_date, city_filter, essential_or_luxury, epsilon): # Convert date strings to datetime objects start_date = datetime.strptime(start_date, '%Y-%m-%d') end_date = datetime.strptime(end_date, '%Y-%m-%d') - # Call the 
mobility_analyser function - filtered_df = pandemic_adherence_analyzer(df, start_date, end_date, city_filter,essential_or_luxury, epsilon) + # Call the pandemic_adherence_analyzer function + filtered_df = pandemic_adherence_analyzer(df, start_date, end_date, city_filter, essential_or_luxury, epsilon) # Plot using Plotly Express fig = px.line( filtered_df, x='date', y='nb_transactions', - title=f"Pandemic Stage Analysis for {city_filter} from {start_date.date()} to {end_date.date()} with epsilon={epsilon}", + title=f"Pandemic adherence Analysis for {city_filter} from {start_date.date()} to {end_date.date()} with epsilon={epsilon}", labels={'nb_transactions': 'Number of Transactions', 'date': 'Date'} ) + # Add events for Bogotá + if city_filter == "Bogota": + events = [ + ("Isolation Start Drill", "2020-03-20"), + ("National Quarantine", "2020-03-26"), + ("Gender Restriction", "2020-04-16"), + ("Day Without VAT (IVA)", "2020-06-19"), + ("Lockdown 1", "2020-07-15"), + ("Lockdown 2", "2020-07-30"), + ("Lockdown 3", "2020-08-13"), + ("Lockdown 4", "2020-08-20"), + ("End of National Quarantine", "2020-09-04"), + ("Day Without VAT", "2020-11-19"), + ("Candle Day", "2020-12-07"), + ("Start of Novenas", "2020-12-16"), + ("Lockdown 1 (2021)", "2021-01-05"), + ("Lockdown 2 (2021)", "2021-01-12"), + ("Lockdown 3 (2021)", "2021-01-18"), + ("Lockdown 4 (2021)", "2021-01-28"), + ("Holy Week", "2021-03-28"), + ("Model 4x3", "2021-04-06"), + ("Model 4x3 (Extension)", "2021-04-06"), + ("Vaccination Stage 1", "2021-02-18"), + ("Vaccination Stage 2", "2021-03-08"), + ("Vaccination Stage 3", "2021-05-22"), + ("Vaccination Stage 4", "2021-06-17"), + ("Vaccination Stage 5", "2021-07-17"), + ("Riots and Social Unrest", "2021-05-01") + ] + + for event, date in events: + fig.add_shape( + type="line", + x0=date, + y0=0, + x1=date, + y1=1, + xref='x', + yref='paper', + line=dict(color="Red", width=2, dash="dash") + ) + fig.add_annotation( + x=date, + y=1, + xref='x', + yref='paper', + 
text=event, + showarrow=True, + arrowhead=1, + ax=-10, + ay=-40, + font=dict(color="Red") + ) + return fig + return app def create_contact_matrix_dash_app(df:pd.DataFrame): @@ -375,7 +487,7 @@ def update_graph(start_date, end_date, city_filter, category, epsilon): offset = filtered_df_transactional["date"].iloc[0] filtered_df_google = preprocess_google_mobility(df_google_mobility_data, start_date, end_date, city_filter, category, offset) - # Create the plot + # Create the plot with two y-axes fig = go.Figure() # Add transactional mobility data @@ -383,7 +495,8 @@ def update_graph(start_date, end_date, city_filter, category, epsilon): x=filtered_df_transactional['date'], y=filtered_df_transactional['nb_transactions'], mode='lines', - name='Transactional Mobility' + name='Transactional Mobility', + yaxis='y1' )) # Add Google mobility data @@ -391,15 +504,14 @@ def update_graph(start_date, end_date, city_filter, category, epsilon): x=filtered_df_google['date'], y=filtered_df_google[category], mode='lines', - name='Google Mobility' + name='Google Mobility', + yaxis='y2' )) - # Update layout + # Update layout for two y-axes fig.update_layout( title=f"Mobility Analysis for {city_filter} and category {category} from {start_date.date()} to {end_date.date()} with epsilon={epsilon}", xaxis_title='Date', - # yaxis_title='Mobility Change', - # legend_title='Data Source' yaxis=dict( title='Transactional Mobility', titlefont=dict(color='blue'), diff --git a/tests/test_viz.ipynb b/tests/test_viz.ipynb index 3e3d84a..60fe32d 100644 --- a/tests/test_viz.ipynb +++ b/tests/test_viz.ipynb @@ -28,21 +28,261 @@ "outputs": [], "source": [ "path = \"C:\\\\Users\\kshub\\\\OneDrive\\\\Documents\\\\PET_phase_2\\\\Technical_Phase_Data\\\\technical_phase_data.csv\"\n", - "df_tran = pd.read_csv(path)\n", - "df_mobility = pd.read_csv(\"C:\\\\Users\\\\kshub\\\\OneDrive\\\\Documents\\\\PET_phase_2\\\\Global_Mobility_Report (1).csv\", low_memory=False)\n" + "df_tran = pd.read_csv(path)" ] }, { 
"cell_type": "code", "execution_count": 3, "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
" + ], + "text/plain": [ + " country_region_code country_region sub_region_1 sub_region_2 \\\n", + "0 AE United Arab Emirates NaN NaN \n", + "1 AE United Arab Emirates NaN NaN \n", + "2 AE United Arab Emirates NaN NaN \n", + "3 AE United Arab Emirates NaN NaN \n", + "4 AE United Arab Emirates NaN NaN \n", + "\n", + " metro_area iso_3166_2_code census_fips_code place_id \\\n", + "0 NaN NaN NaN ChIJvRKrsd9IXj4RpwoIwFYv0zM \n", + "1 NaN NaN NaN ChIJvRKrsd9IXj4RpwoIwFYv0zM \n", + "2 NaN NaN NaN ChIJvRKrsd9IXj4RpwoIwFYv0zM \n", + "3 NaN NaN NaN ChIJvRKrsd9IXj4RpwoIwFYv0zM \n", + "4 NaN NaN NaN ChIJvRKrsd9IXj4RpwoIwFYv0zM \n", + "\n", + " date retail_and_recreation_percent_change_from_baseline \\\n", + "0 2020-02-15 0.0 \n", + "1 2020-02-16 1.0 \n", + "2 2020-02-17 -1.0 \n", + "3 2020-02-18 -2.0 \n", + "4 2020-02-19 -2.0 \n", + "\n", + " grocery_and_pharmacy_percent_change_from_baseline \\\n", + "0 4.0 \n", + "1 4.0 \n", + "2 1.0 \n", + "3 1.0 \n", + "4 0.0 \n", + "\n", + " parks_percent_change_from_baseline \\\n", + "0 5.0 \n", + "1 4.0 \n", + "2 5.0 \n", + "3 5.0 \n", + "4 4.0 \n", + "\n", + " transit_stations_percent_change_from_baseline \\\n", + "0 0.0 \n", + "1 1.0 \n", + "2 1.0 \n", + "3 0.0 \n", + "4 -1.0 \n", + "\n", + " workplaces_percent_change_from_baseline \\\n", + "0 2.0 \n", + "1 2.0 \n", + "2 2.0 \n", + "3 2.0 \n", + "4 2.0 \n", + "\n", + " residential_percent_change_from_baseline \n", + "0 1.0 \n", + "1 1.0 \n", + "2 1.0 \n", + "3 1.0 \n", + "4 1.0 " + ] + }, + "execution_count": 3, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "import pandas as pd\n", + "\n", + "# Define the path to the CSV file\n", + "file_path = \"C:\\\\Users\\\\kshub\\\\OneDrive\\\\Documents\\\\PET_phase_2\\\\Global_Mobility_Report (1).csv\"\n", + "\n", + "# Initialize an empty list to store the chunks\n", + "chunks = []\n", + "\n", + "# Read the CSV file in chunks\n", + "chunk_size = 10000 # Adjust the chunk size as needed\n", + "for chunk in 
pd.read_csv(file_path, chunksize=chunk_size, low_memory=False):\n", + " chunks.append(chunk)\n", + "\n", + "# Concatenate the chunks into a single DataFrame\n", + "df_mobility = pd.concat(chunks, ignore_index=True)\n", + "\n", + "# Display the first few rows of the DataFrame\n", + "df_mobility.head()" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [], + "source": [ + "df_mobility = pd.read_csv(\"C:\\\\Users\\\\kshub\\\\OneDrive\\\\Documents\\\\PET_phase_2\\\\Global_Mobility_Report (1).csv\", low_memory=False)\n" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "Scipy computed Pearson r: 0.3056463348470299 and p-value: 0.03886009367628722\n" + "Scipy computed Pearson r: 0.2726605353538447 and p-value: 0.06675963402694846\n" ] + }, + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" } ], "source": [ @@ -51,7 +291,7 @@ }, { "cell_type": "code", - "execution_count": 4, + "execution_count": 5, "metadata": {}, "outputs": [ { @@ -69,7 +309,7 @@ " " ], "text/plain": [ - "" + "" ] }, "metadata": {}, @@ -83,7 +323,7 @@ }, { "cell_type": "code", - "execution_count": 4, + "execution_count": 7, "metadata": {}, "outputs": [ { @@ -101,21 +341,378 @@ " " ], "text/plain": [ - "" + "" ] }, "metadata": {}, "output_type": "display_data" + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "---------------------------------------------------------------------------\n", + "ValueError Traceback (most recent call last)\n", + "File c:\\Users\\kshub\\OneDrive\\Documents\\PET_phase_2\\PETs_for_Public_Health_Challenge\\src\\DP_epidemiology\\viz.py:76, in create_hotspot_dash_app..update_graph(\n", + " start_date=datetime.datetime(2019, 1, 1, 0, 0),\n", + " end_date=datetime.datetime(2019, 12, 31, 0, 0),\n", + " epsilon=1,\n", + " city='Medellin'\n", + ")\n", + " 73 filtered_df = get_coordinates(output)\n", + " 75 # Plot using Plotly Express\n", + "---> 76 fig = px.scatter_geo(\n", + " px = \n", + " filtered_df = nb_transactions merch_postal_code Latitude Longitude\n", + "0 306566 500001 6.389072 -74.596536\n", + "1 296614 500002 5.523995 -76.536749\n", + "2 342304 500003 5.982925 -75.812286\n", + "3 307235 500004 5.130301 -76.138346\n", + "4 237251 500005 5.605763 -74.716468\n", + "5 458127 500006 4.801478 -75.121902\n", + "6 554476 500007 7.180048 -76.071168\n", + "7 373792 500008 7.463546 -74.963595\n", + "8 324664 500009 7.743818 -76.738181\n", + "9 576898 500010 5.648854 -74.132482\n", + "10 273212 500011 6.727315 -76.635165\n", + "11 404836 500012 6.861639 -76.746068\n", + "12 498471 500013 7.362958 -74.297787\n", + "13 445097 500014 6.130727 -75.667267\n", + "14 397807 500015 5.825303 -76.559407\n", + "15 580316 500016 7.165207 -76.975321\n", + "16 421731 500017 6.236910 -74.487897\n", + 
"17 926698 500020 7.652337 -75.576271\n", + "18 370391 500021 7.236503 -75.530987\n", + "19 777796 500022 6.854339 -75.455203\n", + "20 394755 500023 7.241663 -76.453039\n", + "21 470054 500024 6.091720 -76.950376\n", + "22 438161 500025 7.261668 -77.019802\n", + "23 183703 500026 5.103643 -74.756495\n", + "24 447787 500027 4.874871 -74.121468\n", + "25 462685 500028 6.066869 -75.693614\n", + "26 384548 500030 5.627784 -76.472683\n", + "27 443880 500031 7.509737 -76.784860\n", + "28 350224 500032 6.206165 -74.107333\n", + "29 358644 500033 6.058493 -76.241435\n", + "30 406516 500034 7.417255 -75.579524\n", + "31 292610 500035 7.335878 -74.620644\n", + "32 339361 500036 5.158304 -74.901021\n", + "33 461339 500037 5.490366 -75.888601\n", + "34 334532 500040 7.124754 -75.053544\n", + "35 327973 500041 6.637287 -74.737167\n", + "36 332156 500042 7.137278 -75.044710\n", + "37 518508 500043 7.327858 -75.105685\n", + "38 353982 500044 4.986360 -75.365658\n", + "39 440938 500046 5.015590 -75.221307\n", + "40 497444 500047 5.814341 -75.149734\n", + "41 243727 55411 5.462628 -74.823107\n", + " start_date = datetime.datetime(2019, 1, 1, 0, 0)\n", + " end_date = datetime.datetime(2019, 12, 31, 0, 0)\n", + " city = 'Medellin'\n", + " epsilon = 1\n", + " px.colors.sequential.Plasma = ['#0d0887', '#46039f', '#7201a8', '#9c179e', '#bd3786', '#d8576b', '#ed7953', '#fb9f3a', '#fdca26', '#f0f921']\n", + " px.colors.sequential = \n", + " px.colors = \n", + " 77 filtered_df,\n", + " 78 lat='Latitude',\n", + " 79 lon='Longitude',\n", + " 80 color='nb_transactions',\n", + " 81 size='nb_transactions',\n", + " 82 hover_name='merch_postal_code',\n", + " 83 hover_data={'merch_postal_code': True, 'nb_transactions': True, 'Latitude': False, 'Longitude': False},\n", + " 84 projection='mercator',\n", + " 85 title=f\"Transaction Locations in {city} from {start_date.date()} to {end_date.date()} with epsilon={epsilon}\",\n", + " 86 color_continuous_scale=px.colors.sequential.Plasma\n", + " 87 )\n", 
+ " 89 # Center the map around the selected city\n", + " 90 fig.update_geos(\n", + " 91 center=dict(lat=cities[city][0], lon=cities[city][1]),\n", + " 92 projection_scale=2.5 # Zoom level\n", + " 93 )\n", + "\n", + "File c:\\Users\\kshub\\OneDrive\\Documents\\PET_phase_2\\.venv\\lib\\site-packages\\plotly\\express\\_chart_types.py:1148, in scatter_geo(\n", + " data_frame= nb_transactions merch_postal_code Latitude ... 243727 55411 5.462628 -74.823107,\n", + " lat='Latitude',\n", + " lon='Longitude',\n", + " locations=None,\n", + " locationmode=None,\n", + " geojson=None,\n", + " featureidkey=None,\n", + " color='nb_transactions',\n", + " text=None,\n", + " symbol=None,\n", + " facet_row=None,\n", + " facet_col=None,\n", + " facet_col_wrap=0,\n", + " facet_row_spacing=None,\n", + " facet_col_spacing=None,\n", + " hover_name='merch_postal_code',\n", + " hover_data={'Latitude': False, 'Longitude': False, 'merch_postal_code': True, 'nb_transactions': True},\n", + " custom_data=None,\n", + " size='nb_transactions',\n", + " animation_frame=None,\n", + " animation_group=None,\n", + " category_orders=None,\n", + " labels=None,\n", + " color_discrete_sequence=None,\n", + " color_discrete_map=None,\n", + " color_continuous_scale=['#0d0887', '#46039f', '#7201a8', '#9c179e', '#bd3786', '#d8576b', '#ed7953', '#fb9f3a', '#fdca26', '#f0f921'],\n", + " range_color=None,\n", + " color_continuous_midpoint=None,\n", + " symbol_sequence=None,\n", + " symbol_map=None,\n", + " opacity=None,\n", + " size_max=None,\n", + " projection='mercator',\n", + " scope=None,\n", + " center=None,\n", + " fitbounds=None,\n", + " basemap_visible=None,\n", + " title='Transaction Locations in Medellin from 2019-01-01 to 2019-12-31 with epsilon=1',\n", + " template=None,\n", + " width=None,\n", + " height=None\n", + ")\n", + " 1101 def scatter_geo(\n", + " 1102 data_frame=None,\n", + " 1103 lat=None,\n", + " (...)\n", + " 1142 height=None,\n", + " 1143 ) -> go.Figure:\n", + " 1144 \"\"\"\n", + " 1145 In 
a geographic scatter plot, each row of `data_frame` is represented\n", + " 1146 by a symbol mark on a map.\n", + " 1147 \"\"\"\n", + "-> 1148 return make_figure(\n", + " go = \n", + " locationmode = None\n", + " 1149 args=locals(),\n", + " 1150 constructor=go.Scattergeo,\n", + " 1151 trace_patch=dict(locationmode=locationmode),\n", + " 1152 )\n", + "\n", + "File c:\\Users\\kshub\\OneDrive\\Documents\\PET_phase_2\\.venv\\lib\\site-packages\\plotly\\express\\_core.py:2115, in make_figure(\n", + " args={'animation_frame': None, 'animation_group': None, 'basemap_visible': None, 'category_orders': None, 'center': None, 'color': 'nb_transactions', 'color_continuous_midpoint': None, 'color_continuous_scale': ['#0d0887', '#46039f', '#7201a8', '#9c179e', '#bd3786', '#d8576b', '#ed7953', '#fb9f3a', '#fdca26', '#f0f921'], 'color_discrete_map': None, 'color_discrete_sequence': None, ...},\n", + " constructor=,\n", + " trace_patch={'locationmode': None},\n", + " layout_patch={}\n", + ")\n", + " 2113 trace_patch = trace_patch or {}\n", + " 2114 layout_patch = layout_patch or {}\n", + "-> 2115 apply_default_cascade(args)\n", + " args = {'data_frame': nb_transactions merch_postal_code Latitude Longitude\n", + "0 306566 500001 6.389072 -74.596536\n", + "1 296614 500002 5.523995 -76.536749\n", + "2 342304 500003 5.982925 -75.812286\n", + "3 307235 500004 5.130301 -76.138346\n", + "4 237251 500005 5.605763 -74.716468\n", + "5 458127 500006 4.801478 -75.121902\n", + "6 554476 500007 7.180048 -76.071168\n", + "7 373792 500008 7.463546 -74.963595\n", + "8 324664 500009 7.743818 -76.738181\n", + "9 576898 500010 5.648854 -74.132482\n", + "10 273212 500011 6.727315 -76.635165\n", + "11 404836 500012 6.861639 -76.746068\n", + "12 498471 500013 7.362958 -74.297787\n", + "13 445097 500014 6.130727 -75.667267\n", + "14 397807 500015 5.825303 -76.559407\n", + "15 580316 500016 7.165207 -76.975321\n", + "16 421731 500017 6.236910 -74.487897\n", + "17 926698 500020 7.652337 -75.576271\n", + "18 
370391 500021 7.236503 -75.530987\n", + "19 777796 500022 6.854339 -75.455203\n", + "20 394755 500023 7.241663 -76.453039\n", + "21 470054 500024 6.091720 -76.950376\n", + "22 438161 500025 7.261668 -77.019802\n", + "23 183703 500026 5.103643 -74.756495\n", + "24 447787 500027 4.874871 -74.121468\n", + "25 462685 500028 6.066869 -75.693614\n", + "26 384548 500030 5.627784 -76.472683\n", + "27 443880 500031 7.509737 -76.784860\n", + "28 350224 500032 6.206165 -74.107333\n", + "29 358644 500033 6.058493 -76.241435\n", + "30 406516 500034 7.417255 -75.579524\n", + "31 292610 500035 7.335878 -74.620644\n", + "32 339361 500036 5.158304 -74.901021\n", + "33 461339 500037 5.490366 -75.888601\n", + "34 334532 500040 7.124754 -75.053544\n", + "35 327973 500041 6.637287 -74.737167\n", + "36 332156 500042 7.137278 -75.044710\n", + "37 518508 500043 7.327858 -75.105685\n", + "38 353982 500044 4.986360 -75.365658\n", + "39 440938 500046 5.015590 -75.221307\n", + "40 497444 500047 5.814341 -75.149734\n", + "41 243727 55411 5.462628 -74.823107, 'lat': 'Latitude', 'lon': 'Longitude', 'locations': None, 'locationmode': None, 'geojson': None, 'featureidkey': None, 'color': 'nb_transactions', 'text': None, 'symbol': None, 'facet_row': None, 'facet_col': None, 'facet_col_wrap': 0, 'facet_row_spacing': None, 'facet_col_spacing': None, 'hover_name': 'merch_postal_code', 'hover_data': {'merch_postal_code': True, 'nb_transactions': True, 'Latitude': False, 'Longitude': False}, 'custom_data': None, 'size': 'nb_transactions', 'animation_frame': None, 'animation_group': None, 'category_orders': None, 'labels': None, 'color_discrete_sequence': None, 'color_discrete_map': None, 'color_continuous_scale': ['#0d0887', '#46039f', '#7201a8', '#9c179e', '#bd3786', '#d8576b', '#ed7953', '#fb9f3a', '#fdca26', '#f0f921'], 'range_color': None, 'color_continuous_midpoint': None, 'symbol_sequence': None, 'symbol_map': None, 'opacity': None, 'size_max': None, 'projection': 'mercator', 'scope': None, 
'center': None, 'fitbounds': None, 'basemap_visible': None, 'title': 'Transaction Locations in Medellin from 2019-01-01 to 2019-12-31 with epsilon=1', 'template': None, 'width': None, 'height': None}\n", + " 2117 args = build_dataframe(args, constructor)\n", + " 2118 if constructor in [go.Treemap, go.Sunburst, go.Icicle] and args[\"path\"] is not None:\n", + "\n", + "File c:\\Users\\kshub\\OneDrive\\Documents\\PET_phase_2\\.venv\\lib\\site-packages\\plotly\\express\\_core.py:970, in apply_default_cascade(\n", + " args={'animation_frame': None, 'animation_group': None, 'basemap_visible': None, 'category_orders': None, 'center': None, 'color': 'nb_transactions', 'color_continuous_midpoint': None, 'color_continuous_scale': ['#0d0887', '#46039f', '#7201a8', '#9c179e', '#bd3786', '#d8576b', '#ed7953', '#fb9f3a', '#fdca26', '#f0f921'], 'color_discrete_map': None, 'color_discrete_sequence': None, ...}\n", + ")\n", + " 968 if \"symbol_sequence\" in args:\n", + " 969 if args[\"symbol_sequence\"] is None and args[\"template\"].data.scatter:\n", + "--> 970 args[\"symbol_sequence\"] = [\n", + " args = {'data_frame': nb_transactions merch_postal_code Latitude Longitude\n", + "0 306566 500001 6.389072 -74.596536\n", + "1 296614 500002 5.523995 -76.536749\n", + "2 342304 500003 5.982925 -75.812286\n", + "3 307235 500004 5.130301 -76.138346\n", + "4 237251 500005 5.605763 -74.716468\n", + "5 458127 500006 4.801478 -75.121902\n", + "6 554476 500007 7.180048 -76.071168\n", + "7 373792 500008 7.463546 -74.963595\n", + "8 324664 500009 7.743818 -76.738181\n", + "9 576898 500010 5.648854 -74.132482\n", + "10 273212 500011 6.727315 -76.635165\n", + "11 404836 500012 6.861639 -76.746068\n", + "12 498471 500013 7.362958 -74.297787\n", + "13 445097 500014 6.130727 -75.667267\n", + "14 397807 500015 5.825303 -76.559407\n", + "15 580316 500016 7.165207 -76.975321\n", + "16 421731 500017 6.236910 -74.487897\n", + "17 926698 500020 7.652337 -75.576271\n", + "18 370391 500021 7.236503 
-75.530987\n", + "19 777796 500022 6.854339 -75.455203\n", + "20 394755 500023 7.241663 -76.453039\n", + "21 470054 500024 6.091720 -76.950376\n", + "22 438161 500025 7.261668 -77.019802\n", + "23 183703 500026 5.103643 -74.756495\n", + "24 447787 500027 4.874871 -74.121468\n", + "25 462685 500028 6.066869 -75.693614\n", + "26 384548 500030 5.627784 -76.472683\n", + "27 443880 500031 7.509737 -76.784860\n", + "28 350224 500032 6.206165 -74.107333\n", + "29 358644 500033 6.058493 -76.241435\n", + "30 406516 500034 7.417255 -75.579524\n", + "31 292610 500035 7.335878 -74.620644\n", + "32 339361 500036 5.158304 -74.901021\n", + "33 461339 500037 5.490366 -75.888601\n", + "34 334532 500040 7.124754 -75.053544\n", + "35 327973 500041 6.637287 -74.737167\n", + "36 332156 500042 7.137278 -75.044710\n", + "37 518508 500043 7.327858 -75.105685\n", + "38 353982 500044 4.986360 -75.365658\n", + "39 440938 500046 5.015590 -75.221307\n", + "40 497444 500047 5.814341 -75.149734\n", + "41 243727 55411 5.462628 -74.823107, 'lat': 'Latitude', 'lon': 'Longitude', 'locations': None, 'locationmode': None, 'geojson': None, 'featureidkey': None, 'color': 'nb_transactions', 'text': None, 'symbol': None, 'facet_row': None, 'facet_col': None, 'facet_col_wrap': 0, 'facet_row_spacing': None, 'facet_col_spacing': None, 'hover_name': 'merch_postal_code', 'hover_data': {'merch_postal_code': True, 'nb_transactions': True, 'Latitude': False, 'Longitude': False}, 'custom_data': None, 'size': 'nb_transactions', 'animation_frame': None, 'animation_group': None, 'category_orders': None, 'labels': None, 'color_discrete_sequence': None, 'color_discrete_map': None, 'color_continuous_scale': ['#0d0887', '#46039f', '#7201a8', '#9c179e', '#bd3786', '#d8576b', '#ed7953', '#fb9f3a', '#fdca26', '#f0f921'], 'range_color': None, 'color_continuous_midpoint': None, 'symbol_sequence': None, 'symbol_map': None, 'opacity': None, 'size_max': None, 'projection': 'mercator', 'scope': None, 'center': None, 'fitbounds': 
None, 'basemap_visible': None, 'title': 'Transaction Locations in Medellin from 2019-01-01 to 2019-12-31 with epsilon=1', 'template': None, 'width': None, 'height': None}\n", + " args[\"symbol_sequence\"] = None\n", + " args[\"template\"] = None\n", + " 971 scatter.marker.symbol for scatter in args[\"template\"].data.scatter\n", + " 972 ]\n", + " 973 if not args[\"symbol_sequence\"] or not any(args[\"symbol_sequence\"]):\n", + " 974 args[\"symbol_sequence\"] = [\"circle\", \"diamond\", \"square\", \"x\", \"cross\"]\n", + "\n", + "File c:\\Users\\kshub\\OneDrive\\Documents\\PET_phase_2\\.venv\\lib\\site-packages\\plotly\\express\\_core.py:971, in (.0=)\n", + " 968 if \"symbol_sequence\" in args:\n", + " 969 if args[\"symbol_sequence\"] is None and args[\"template\"].data.scatter:\n", + " 970 args[\"symbol_sequence\"] = [\n", + "--> 971 scatter.marker.symbol for scatter in args[\"template\"].data.scatter\n", + " Exception trying to inspect frame. No more locals available.\n", + " 972 ]\n", + " 973 if not args[\"symbol_sequence\"] or not any(args[\"symbol_sequence\"]):\n", + " 974 args[\"symbol_sequence\"] = [\"circle\", \"diamond\", \"square\", \"x\", \"cross\"]\n", + "\n", + "File c:\\Users\\kshub\\OneDrive\\Documents\\PET_phase_2\\.venv\\lib\\site-packages\\plotly\\graph_objs\\scatter\\_marker.py:1214, in Marker.symbol(\n", + " self= instance\n", + ")\n", + " 1109 @property\n", + " 1110 def symbol(self):\n", + " 1111 \"\"\"\n", + " 1112 Sets the marker symbol type. Adding 100 is equivalent to\n", + " 1113 appending \"-open\" to a symbol name. Adding 200 is equivalent to\n", + " (...)\n", + " 1212 Any|numpy.ndarray\n", + " 1213 \"\"\"\n", + "-> 1214 return self[\"symbol\"]\n", + " Exception trying to inspect frame. 
No more locals available.\n", + "\n", + "File c:\\Users\\kshub\\OneDrive\\Documents\\PET_phase_2\\.venv\\lib\\site-packages\\plotly\\basedatatypes.py:4742, in BasePlotlyType.__getitem__(\n", + " self= instance,\n", + " prop='symbol'\n", + ")\n", + " 4739 self._compound_array_props[prop] = []\n", + " 4741 return validator.present(self._compound_array_props[prop])\n", + "-> 4742 elif self._props is not None and prop in self._props:\n", + " prop = 'symbol'\n", + " Exception trying to inspect frame. No more locals available.\n", + " 4743 return validator.present(self._props[prop])\n", + " 4744 elif self._prop_defaults is not None:\n", + "\n", + "File c:\\Users\\kshub\\OneDrive\\Documents\\PET_phase_2\\.venv\\lib\\site-packages\\plotly\\basedatatypes.py:4437, in BasePlotlyType._props(\n", + " self= instance\n", + ")\n", + " 4434 return self._orphan_props\n", + " 4435 else:\n", + " 4436 # Get data from parent's dict\n", + "-> 4437 return self.parent._get_child_props(self)\n", + " Exception trying to inspect frame. No more locals available.\n", + "\n", + "File c:\\Users\\kshub\\OneDrive\\Documents\\PET_phase_2\\.venv\\lib\\site-packages\\plotly\\basedatatypes.py:4451, in BasePlotlyType._get_child_props(\n", + " self= instance,\n", + " child= instance\n", + ")\n", + " 4439 def _get_child_props(self, child):\n", + " 4440 \"\"\"\n", + " 4441 Return properties dict for child\n", + " 4442 \n", + " (...)\n", + " 4449 dict\n", + " 4450 \"\"\"\n", + "-> 4451 if self._props is None:\n", + " Exception trying to inspect frame. 
No more locals available.\n", + " 4452 # If this node's properties are uninitialized then so are its\n", + " 4453 # child's\n", + " 4454 return None\n", + " 4455 else:\n", + " 4456 # ### Child a compound property ###\n", + "\n", + "File c:\\Users\\kshub\\OneDrive\\Documents\\PET_phase_2\\.venv\\lib\\site-packages\\plotly\\basedatatypes.py:4437, in BasePlotlyType._props(\n", + " self= instance\n", + ")\n", + " 4434 return self._orphan_props\n", + " 4435 else:\n", + " 4436 # Get data from parent's dict\n", + "-> 4437 return self.parent._get_child_props(self)\n", + " Exception trying to inspect frame. No more locals available.\n", + "\n", + "File c:\\Users\\kshub\\OneDrive\\Documents\\PET_phase_2\\.venv\\lib\\site-packages\\plotly\\basedatatypes.py:4471, in BasePlotlyType._get_child_props(\n", + " self=layout.template.Data({\n", + " 'bar': [{'error_x': {...': 'white'}},\n", + " 'type': 'table'}]\n", + "}),\n", + " child= instance\n", + ")\n", + " 4469 elif isinstance(validator, CompoundArrayValidator):\n", + " 4470 children = self[child.plotly_name]\n", + "-> 4471 child_ind = BaseFigure._index_is(children, child)\n", + " Exception trying to inspect frame. 
No more locals available.\n", + " 4472 assert child_ind is not None\n", + " 4474 children_props = self._props.get(child.plotly_name, None)\n", + "\n", + "File c:\\Users\\kshub\\OneDrive\\Documents\\PET_phase_2\\.venv\\lib\\site-packages\\plotly\\basedatatypes.py:3965, in BaseFigure._index_is(\n", + " iterable=(Scatter({\n", + " 'fillpattern': {'fillmode': 'overlay', 'size': 10, 'solidity': 0.2}\n", + "}),),\n", + " val= instance\n", + ")\n", + " 3963 index_list = [i for i, curr_val in enumerate(iterable) if curr_val is val]\n", + " 3964 if not index_list:\n", + "-> 3965 raise ValueError(\"Invalid value\")\n", + " 3967 return index_list[0]\n", + "\n", + "ValueError: Invalid value\n", + "\n" + ] } ], "source": [ - "app=create_hotspot_dash_app(df)\n", + "app=create_hotspot_dash_app(df_tran)\n", "app.run_server(debug=True)" ] }, { "cell_type": "code", - "execution_count": 5, + "execution_count": 8, "metadata": {}, "outputs": [ { @@ -133,7 +730,7 @@ " " ], "text/plain": [ - "" + "" ] }, "metadata": {}, @@ -141,13 +738,13 @@ } ], "source": [ - "app=create_mobility_dash_app(df)\n", + "app=create_mobility_dash_app(df_tran)\n", "app.run_server(debug=True)" ] }, { "cell_type": "code", - "execution_count": 9, + "execution_count": 10, "metadata": {}, "outputs": [ { @@ -165,7 +762,7 @@ " " ], "text/plain": [ - "" + "" ] }, "metadata": {}, @@ -173,7 +770,7 @@ } ], "source": [ - "app=create_pandemic_adherence_dash_app(df)\n", + "app=create_pandemic_adherence_dash_app(df_tran)\n", "app.run_server(debug=True)" ] },