Customer Return Prediction

Prerequisites

  1. You should have an instance of MySQL server up and running.

    Create a table called "return" in MySQL and load the data into this table from return.csv (a Python loading sketch follows this list).

  2. To run this pipeline you should have an S3 account.

    Upload customer.csv and product.csv to an S3 bucket.

  3. A HANA system should be up and running.

    Create a table called "soHeader" in HANA and load the data from the soHeader.csv file.
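
The tutorial does not prescribe a particular way of staging these files. Purely as an illustrative sketch (not part of the original repository), the following Python snippet loads return.csv into MySQL with pandas/SQLAlchemy/PyMySQL and uploads the other files with boto3; the host, credentials, database, and bucket names are placeholders you must replace.

```python
# Minimal sketch for staging the prerequisite data (placeholder credentials and names).
import pandas as pd
import boto3
from sqlalchemy import create_engine

# 1. Load return.csv into a MySQL table called "return".
engine = create_engine("mysql+pymysql://user:password@mysql-host:3306/returns_db")
pd.read_csv("return.csv").to_sql("return", engine, if_exists="replace", index=False)

# 2. Upload customer.csv and product.csv to your S3 bucket.
s3 = boto3.client(
    "s3",
    aws_access_key_id="YOUR_ACCESS_KEY",
    aws_secret_access_key="YOUR_SECRET_KEY",
)
for name in ("customer.csv", "product.csv"):
    s3.upload_file(name, "your-bucket-name", name)

# 3. soHeader.csv can be loaded into the HANA "soHeader" table with the hdbcli
#    client or any HANA import tool; that part is not shown here.
```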

Creating pipeline for return prediction

  1. Open the Data Hub dashboard and open the Modeler.

  2. Create a new graph by clicking the "+" sign at the top.

  3. Search for the "MySQL Table Consumer" operator in the Operators section.

  4. Drag and drop the "MySQL Table Consumer" operator into the graph.

  5. Click on "Open Configuration" and provide the connection details for your MySQL server. Select the source table "return", which you created in the prerequisites section.

  6. Next, drag and drop the "Flowagent CSV Producer" operator into the graph.

  7. Connect the above two operators.

  8. The next step is to connect to the HANA system. For that, Data Hub provides the "HANA Table Consumer" operator, which allows us to consume a HANA table. Drag and drop this operator into the graph.

  9. Click on "Open Configuration" and provide the connection details for your HANA system. Select the source table "soHeader", which you created in the prerequisite steps.

  10. Drag and drop another "Flowagent CSV Producer" into the graph and connect it with the "HANA Table Consumer".

  11. Drag and drop two "ReadFile" operators to read data from your S3 bucket. For this scenario we are consuming the product and customer data from S3.

  12. Select the first "ReadFile" operator and assign the following properties:

    Service: s3

    Connection: Connection details for s3

    Bucket: Bucket name

    Path: product.csv

  13. Similarly, select the second "ReadFile" operator and assign the following properties:

    Service: s3

    Connection: Connection details for s3

    Bucket: Bucket name

    Path: customer.csv

  14. Now we need to install the required Python libraries to run our Python code. To do that, select the Repository tab, expand dockerfiles, and create a folder named "Return_Prediction_Docker_File" under it.

  15. Right-click on "Return_Prediction_Docker_File" and select "Create File".

  16. A create Docker file window will pop up. Name it "dockerfile" and click on "Create".

  17. Copy the code from this file and paste it into the script section.

  18. Select the configuration for this Docker file. Click the "+" icon on the right side of Tags and add the required tags to the configuration by simply entering each library's name and pressing Enter.

  19. Save the file and build the Docker file by clicking the build button. Once completed, the build status is shown as completed and the orange circle turns green.

  20. Go back to the graph and search for the "multiplexer" in the Operators section.

  21. Drag and drop the "1:2 Multiplexer" operator into the graph.

  22. Connect the "Flowagent CSV Producer" that is attached to the "MySQL Table Consumer" to this multiplexer, so that the return data can later be read by the python operator.

  23. Now search for the "python2operator" in the Operators section and drag and drop it into the graph.

  24. Now let's add 1 input port and 2 output ports to the python operator. To do that, select the python operator and click on "add port".

  25. Set the name and type for the input port and then click OK.

  26. Similarly, add the two output ports ("output" and "output1") and set their properties.

  27. Connect the input port of the python operator to the output port of the multiplexer.

  28. At this point the return data flows from the "MySQL Table Consumer" through the "Flowagent CSV Producer" and the multiplexer into the python operator.

  29. Now select the python operator. It will show you all the options available for this operator; choose the "Open Script" option.

  30. A new page will open where you can write Python code. Copy the code from here and paste it in. This code runs a decision tree algorithm on the return data and creates a tree for it (an illustrative sketch of such a script follows).
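
The script linked in this step is not reproduced in this snapshot of the repository. Purely as an illustration, here is a minimal sketch of what such a Python operator script might look like. It assumes the Data Hub Python operator API (the injected api object with set_port_callback and send), that the "input" port receives the return table as CSV text, that the dataset has a label column literally named "return", and that the output ports are named "output" and "output1" as in the later steps; adjust all of these to your own data and graph.

```python
# Illustrative sketch only - not the original script from the tutorial.
import io
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_graphviz

def on_input(data):
    # The Flowagent CSV Producer delivers the table content as CSV text.
    csv_text = data.body if hasattr(data, "body") else data
    if isinstance(csv_text, bytes):
        csv_text = csv_text.decode("utf-8")
    df = pd.read_csv(io.StringIO(csv_text))

    # Train a simple decision tree on the numeric features (placeholder logic).
    label = df["return"]
    features = df.drop(columns=["return"]).select_dtypes(include="number")
    tree = DecisionTreeClassifier(max_depth=4).fit(features, label)

    # Export the tree as Graphviz DOT text for the Wiretap ("output") and as a
    # very small HTML page for the HTML Viewer ("output1").
    dot = export_graphviz(tree, feature_names=list(features.columns), out_file=None)
    api.send("output", dot)
    api.send("output1", "<html><body><pre>{}</pre></body></html>".format(dot))

api.set_port_callback("input", on_input)
```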

Go back to the graph.

  31. The next thing is to tell the graph where it can find the Python libraries that we installed. For that, right-click on the python operator and select "Group".

  32. Select the entire group and open its configuration.

  33. The next step is to add tags. Tags describe the runtime requirements of the operator and force the execution in a specific Docker image instance whose Docker file was annotated with the same tag and version.

  34. Click on the "+" button to add tags. Add the same tags that you added to the Docker file.

  1. Now add "Wiretap" and "HTML Viewer" operators to the graph and connect it to the "output" and "output1" port of the python operator as show below. Here "HTML Viewer" operator is use to render html code to the browser.

Alt text

  1. Drag and drop another python operator in the graph.

Alt text

  1. Create 4 input port namely "input1" , "input2", "input3", "input4" and one output port "output" in this python operator.

Alt text

  1. connect the different data sources to this python operator as shown in below diagram.

Alt text

  1. Again add "Group" to this python operator. For adding Group please follow step 22 to step 25.

Alt text

  40. Now open the script section of this python operator and copy and paste this code here (an illustrative sketch of such a join script follows).
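
Again, the linked script is not included in this snapshot, so the sketch below only illustrates the idea: read the four CSV inputs, join them, write /vrep/vflow/data/masterData.csv, and emit a short status message. The port names match step 37, but the join key columns (salesOrderID, customerID, productID) are placeholders that depend on your CSV files, and the Data Hub api object is assumed as in the earlier sketch.

```python
# Illustrative sketch only - join keys and column names are placeholders.
import io
import pandas as pd

def _to_frame(data):
    # Each port delivers CSV text (or a message whose body is CSV text).
    csv_text = data.body if hasattr(data, "body") else data
    if isinstance(csv_text, bytes):
        csv_text = csv_text.decode("utf-8")
    return pd.read_csv(io.StringIO(csv_text))

def on_input(return_data, so_header, customer, product):
    # Placeholder joins - replace the key columns with the ones in your files.
    master = (_to_frame(return_data)
              .merge(_to_frame(so_header), on="salesOrderID", how="left")
              .merge(_to_frame(customer), on="customerID", how="left")
              .merge(_to_frame(product), on="productID", how="left"))

    # Persist the joined data where step 45 expects it and report on "output".
    master.to_csv("/vrep/vflow/data/masterData.csv", index=False)
    api.send("output", "masterData.csv written with {} rows".format(len(master)))

# The callback fires once all four input ports have received data.
api.set_port_callback(["input1", "input2", "input3", "input4"], on_input)
```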

  41. Add a terminal to this operator's output.

  42. The final graph now contains the MySQL, HANA, and S3 sources, the multiplexer, both python operators, the "Wiretap", the "HTML Viewer", and the terminal.

  43. Save the pipeline and run it.

  44. Once the graph is running, you can select the "HTML Viewer" and select "Open UI". Here you will see the decision tree that has been created for the return dataset.

  45. This pipeline also joins the data from the different data sources and saves it in the "masterData.csv" file at /vrep/vflow/data/masterData.csv. To see this file, just open "System Management" from your Data Hub launchpad.

  46. Choose Files. Under files -> vflow -> data, you can see the "masterData.csv" file.

Creating graphs using SAP Analytics Cloud

  1. Log in to your SAC account.

  2. Create a new story by clicking the Create -> New Story button on the right-hand side panel.

  3. Select "Access & Explore Data".

  4. Upload the "masterData.csv" file to SAC.

  5. By default, SAC creates dimensions and measures automatically. If you want to change one, click on the column and then select the property from the left-hand side. For example, if you want to change "return" from a dimension to a measure, you can simply click on return and change the property to measure.

  6. Now go to the story tab and add a chart. Here you can create different types of graphs such as pie charts, bar charts, or donut charts.

  7. For more information about how to create graphs in SAC, please refer to the SAP Analytics Cloud documentation.

  8. Add a few such charts to the story to visualize the return data.