Add example to create graph from pandas data frame and change style interactively #149

Open
joseberlines opened this issue Aug 29, 2020 · 14 comments
Labels
hacktoberfest mentored This issue is going to be mentored by one of the repo's maintainers

Comments

@joseberlines
Contributor

Playing around with example:
https://github.com/QuantStack/ipycytoscape/blob/master/examples/Labels%20example.ipynb

I have been wondering whether it would be worth implementing a way to create a graph from a pandas DataFrame.

A graph is not easily represented as a table, but for some straightforward cases this might be useful, since in data science we all end up working with pandas eventually.

I can imagine one column with the names of the nodes, another column for the directed edges, and further columns for presentation (colours, etc.).

Some thoughts about this are welcome.

@ianhi
Collaborator

ianhi commented Aug 29, 2020

Hey @joseberlines, happily Mariana has already implemented this! Check out the DF example:
https://github.com/QuantStack/ipycytoscape/blob/master/examples/DataFrame%20interaction.ipynb

@joseberlines
Contributor Author

joseberlines commented Aug 29, 2020

Hi @ianhi, thanks for the answer. If that is already implemented, then some information is missing from the example.
I am thinking about something like this:

| Node name | Label | Color node | Edges | edge_colors | Label edges | Thickness edges |
|---|---|---|---|---|---|---|
| Mallorca | MNLL | red | Berlin | blue | fly | 34 |
| Berlin | BE | Blue | Paris | green | road | 4 |
| Paris | PA | black | London,Berlin | Yellow | Road, train | 3,4 |
| London | LO | red | Paris, Berlin | Black | road,train | 4,5 |
| Paris | PA | blue | Mallorca, Moscow | green | Fly,train | 6,7 |
| Moscow | MO | black | Berlin, Mallorca | Blue | road, fly | 8,9 |

So it would be possible to play around with pandas and, for instance, apply functions to the whole data set in order to derive the colours of the nodes, etc.

The example you mentioned shows that a DataFrame can be the input to ipycytoscape, which suggests the DataFrame might carry more data for styling, etc. I don't know if that is the case.

That poses some problems, since plausibility checks are necessary to make sure the data is compatible.

This opens up many possibilities, because by combining graph plotting with ipywidgets it's possible to filter the graph easily using pandas functionality.
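
A minimal sketch of how rows like the ones in the table above could be turned into Cytoscape-style elements, including the plausibility check. The column names (`name`, `label`, `color`, `edges`, `edge_color`, `edge_labels`, `edge_widths`) and both helper functions are hypothetical, chosen only for this illustration; the rows are plain dicts of the kind `df.to_dict("records")` returns.

```python
def split_cell(cell):
    """Split a comma-separated table cell into a list of trimmed strings."""
    return [part.strip() for part in str(cell).split(",") if part.strip()]


def rows_to_elements(rows):
    """Turn table rows into a Cytoscape-style {"nodes": [...], "edges": [...]} dict."""
    nodes, edges = [], []
    for row in rows:
        nodes.append({"data": {"id": row["name"], "label": row["label"],
                               "color": row["color"].lower()}})
        targets = split_cell(row["edges"])
        labels = split_cell(row["edge_labels"])
        widths = split_cell(row["edge_widths"])
        # Plausibility check: one label and one thickness per target.
        if not (len(targets) == len(labels) == len(widths)):
            raise ValueError(f"row {row['name']!r}: edge columns differ in length")
        for target, label, width in zip(targets, labels, widths):
            edges.append({"data": {"source": row["name"], "target": target,
                                   "label": label.lower(), "width": int(width),
                                   "color": row["edge_color"].lower()}})
    return {"nodes": nodes, "edges": edges}


# Two rows taken from the table above (hypothetical column names).
rows = [
    {"name": "Mallorca", "label": "MNLL", "color": "red", "edges": "Berlin",
     "edge_color": "blue", "edge_labels": "fly", "edge_widths": "34"},
    {"name": "Paris", "label": "PA", "color": "black", "edges": "London,Berlin",
     "edge_color": "Yellow", "edge_labels": "Road, train", "edge_widths": "3,4"},
]
elements = rows_to_elements(rows)
```

The resulting dict has the nodes/edges shape that, as far as I can tell, `graph.add_graph_from_json` accepts, so it might be loadable that way until the DataFrame path supports styling columns directly.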

@ianhi
Collaborator

ianhi commented Aug 29, 2020

Yeah, unfortunately the from_dataframe method currently doesn't look for every possible attribute. For example, for Nodes it only looks for which other Nodes each one is connected to and what the tooltip should say.

https://github.com/QuantStack/ipycytoscape/blob/c1e002b52fab0db65136502e5ff09371ffa5ff55/ipycytoscape/cytoscape.py#L470-L471

This has been implemented for both JSON and networkx, but not yet for DataFrames. See #64 (comment)

@joseberlines
Contributor Author

The idea here is also that the user should not have to touch any CSS or HTML code. I see ipycytoscape as part of the ipywidgets family: it should be possible to deploy dashboards with voila using just one language, Python.

@marimeireles
Collaborator

Thanks for the reply @ianhi, I completely forgot about this. I reopened the issue! :)
It should be an easy fix!

@joseberlines, what you mean is that you would like to set the color of the nodes and edges based on what's in the columns, right? (That's an amazing idea! Never thought of that.)

@joseberlines
Contributor Author

joseberlines commented Aug 30, 2020

Yes @marimeireles, this way you can basically work in pandas and do all the conditional formatting there. For instance
(pseudocode!):

df = pd.DataFrame(...)
df['color'] = df['<whatever column>'].apply(<whatever function>)
df['text'] = df['<whatever column>'].apply(<whatever function>)

You have total power to redefine the characteristics of the graph without leaving the pandas world. This is actually related to a suggestion I made for ipysheet,
jupyter-widgets-contrib/ipysheet#173

Same philosophy.
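
A minimal runnable version of that idea, under stated assumptions: the `score` column and its values are made up for illustration, and the style rule relies on Cytoscape.js `data(...)` mappers, so the colour and label are read from each node's data rather than written as per-node CSS.

```python
import pandas as pd

# Made-up data; "score" is a hypothetical column used only for illustration.
df = pd.DataFrame({
    "id": ["Berlin", "Paris", "Moscow"],
    "score": [10, 55, 80],
})

# Conditional formatting done entirely in pandas.
df["color"] = df["score"].apply(lambda s: "red" if s > 50 else "blue")
df["text"] = df["id"].apply(lambda name: name[:2].upper())

# One Cytoscape-style node per row; every column ends up in the node's data.
nodes = [{"data": record} for record in df.to_dict("records")]

# A single style rule that reads colour and label from each node's data.
style = [{
    "selector": "node",
    "style": {"background-color": "data(color)", "label": "data(text)"},
}]
```

A list like `style` is the kind of value the widget's `set_style` takes, so recomputing a column and rebuilding the nodes would be enough to restyle the graph from pandas.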

If relying on the user to create a DataFrame that complies with all the requirements is too much, another approach would be:

Pseudocode:

import ipycytoscape

list_nodes = ['Berlin', 'Barcelona', 'Paris']
list_texts = ['P', 'B', 'P']
list_colors = ['blue', 'red', 'green']
list_sizes = [34, 56, 67]

cytoscapeobj = ipycytoscape.CytoscapeWidget()
# add_graph_from_lists is the proposed method; it does not exist yet.
cytoscapeobj.graph.add_graph_from_lists(nodes=list_nodes, texts=list_texts, colors=list_colors, sizes=list_sizes)

Obviously, if the lists do not all have the same length, an error is raised.
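
A sketch of what such a list-based helper could do internally, written as a standalone function so it runs without ipycytoscape. The name `nodes_from_lists` and the data keys (`text`, `color`, `size`) are assumptions for illustration, not an existing API.

```python
def nodes_from_lists(nodes, texts, colors, sizes):
    """Build Cytoscape-style node dicts from parallel lists (hypothetical helper)."""
    columns = {"nodes": nodes, "texts": texts, "colors": colors, "sizes": sizes}
    lengths = {name: len(values) for name, values in columns.items()}
    # All lists describe the same nodes, so they must have the same length.
    if len(set(lengths.values())) != 1:
        raise ValueError(f"all lists must have the same length, got {lengths}")
    return [
        {"data": {"id": node, "text": text, "color": color, "size": size}}
        for node, text, color, size in zip(nodes, texts, colors, sizes)
    ]


node_elements = nodes_from_lists(
    nodes=["Berlin", "Barcelona", "Paris"],
    texts=["P", "B", "P"],
    colors=["blue", "red", "green"],
    sizes=[34, 56, 67],
)
```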

@marimeireles
Collaborator

This might be a good example for people willing to try it on hacktoberfest.

@marimeireles marimeireles added hacktoberfest enhancement New feature or request labels Sep 30, 2020
@marimeireles marimeireles changed the title create graph from pandas data frame Add example to create graph from pandas data frame and change style interactively Sep 30, 2020
@marimeireles marimeireles added mentored This issue is going to be mentored by one of the repo's maintainers and removed enhancement New feature or request labels Sep 30, 2020
@joseberlines
Contributor Author

Hi, can someone tell me about hacktoberfest? where? how? thanks

@marimeireles
Collaborator

Hey @joseberlines sure! :)
It's a global online event where people interested in contributing to tech open pull requests for open source projects. If you open 4 PRs this year you get a cool t-shirt + some stickers, plus my gratitude and that of everybody else who uses the project. :D
Here's more info about it: https://hacktoberfest.digitalocean.com/
You're super welcome to join. I'm around if you need anything.

@ianhi
Collaborator

ianhi commented Oct 12, 2020

@marimeireles per https://hacktoberfest.digitalocean.com/hacktoberfest-update PRs will only count if we add the hacktoberfest topic.

I was about to go add it, but then noticed that no other QuantStack repo has any topics. Is there a QuantStack policy against having topics?

@marimeireles
Collaborator

No probs @ianhi, I missed this update. Thanks! :)

@marimeireles
Collaborator

@ianhi I just did, actually. I wasn't sure if you could do it. Thanks again! <3

@joseberlines
Contributor Author

Hi @marimeireles & @ianhi, I am still coding this idea, which is taking more time than I expected. Can either of you provide me with the complete set of parameters that can be handled by ipycytoscape (as pointed out in issue #175; see minute 15:06 of the JupyterCon YouTube video)? Thanks.

@joseberlines
Contributor Author

joseberlines commented Nov 26, 2020

Dear all, I was about to open a discussion issue about this, but we might as well continue the discussion here.

So far my idea is the following
(pseudocode):

def make_complete_graph_from_df(nodes_df, edges_df=None, class_df=None):
    # Check the compulsory columns of the nodes DataFrame.
    if "id" not in nodes_df.columns:
        raise ValueError('"id" should be a column of the nodes DataFrame.')
    if "name" not in nodes_df.columns:
        raise ValueError('"name" should be a column of the nodes DataFrame.')
    # Positions only make sense as an (x, y) pair.
    if "position_x" in nodes_df.columns and "position_y" not in nodes_df.columns:
        raise ValueError('"position_x" in columns but "position_y" missing.')
    if "position_y" in nodes_df.columns and "position_x" not in nodes_df.columns:
        raise ValueError('"position_y" in columns but "position_x" missing.')

where nodes_df is a DataFrame containing the following columns, which correspond to the node attributes:
all_node_attributes = ["id","idInt","parent","name","score",
"position_x","position_y",
"group","removed","selected","selectable","locked","grabbed","grabbable", "classes"]

and some others related to style (background colour, shape if possible, tooltip, etc.).

The same applies to edges_df, where there is a check that every source and target of a connection is a node present in the nodes DataFrame. Otherwise an error is raised.
The edges_df also contains columns with the style characteristics of the edges (colour, thickness, label, whatever).

The method will build the graph node by node and edge by edge, and build a style object that is added to the ipycytoscape object.
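
A rough sketch of the edge check and the element/style construction described above, operating on plain records such as `nodes_df.to_dict("records")` would give. The function names and the style columns (`color`, `width`) are assumptions for illustration only.

```python
def check_edges(node_records, edge_records):
    """Raise ValueError if any edge endpoint is not a known node id."""
    known_ids = {node["id"] for node in node_records}
    for edge in edge_records:
        for end in ("source", "target"):
            if edge[end] not in known_ids:
                raise ValueError(
                    f"edge {end} {edge[end]!r} is not present in the nodes table"
                )


def build_graph(node_records, edge_records):
    """Return (elements, style) built node by node and edge by edge."""
    check_edges(node_records, edge_records)
    elements = {
        "nodes": [{"data": dict(node)} for node in node_records],
        "edges": [{"data": dict(edge)} for edge in edge_records],
    }
    # Style rules read their values from each element's data fields.
    style = [
        {"selector": "node",
         "style": {"label": "data(name)", "background-color": "data(color)"}},
        {"selector": "edge",
         "style": {"line-color": "data(color)", "width": "data(width)"}},
    ]
    return elements, style


# Illustrative records (hypothetical style columns "color" and "width").
node_records = [
    {"id": "n1", "name": "Berlin", "color": "blue"},
    {"id": "n2", "name": "Paris", "color": "green"},
]
edge_records = [{"source": "n1", "target": "n2", "color": "black", "width": 4}]
elements, style = build_graph(node_records, edge_records)
```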

What do you think? @marimeireles @ianhi @sven5s

NOTE: this is the reason why I asked #192 in order to facilitate the node construction.


3 participants