Google Play rating prediction

Dataset

From Kaggle googleplaystore.csv

Algorithm: SVM vs SVR

Support vector machine (SVM) is a supervised machine learning algorithm for classification problems in multi-dimensional space. It is widely used in mining research, such as text mining and opinion mining, where it has produced strong results. Compared with other classification algorithms, SVM often performs better in high-dimensional settings, where the number of features is large and the data are sparse. SVM uses g(x) = w^T x + b as the linear separating hyperplane, where w is the weight vector, b is the bias, and x is the input after a high-dimensional non-linear transformation. w and b are determined from the training data by optimizing the following formula:

image

where ε_i are the slack variables, C is the penalty coefficient, and (x_i, y_i) are the training samples.
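For reference, the standard soft-margin SVM objective can be written with the symbols above as follows. This is the textbook formulation, assuming labels y_i ∈ {−1, +1}; it may differ in detail from the formula shown in the figure.

```latex
\begin{aligned}
\min_{w,\,b,\,\varepsilon} \quad & \frac{1}{2} w^T w + C \sum_{i} \varepsilon_i \\
\text{subject to} \quad & y_i \left( w^T x_i + b \right) \ge 1 - \varepsilon_i, \\
& \varepsilon_i \ge 0 .
\end{aligned}
```

A larger C penalizes margin violations more heavily, while a smaller C tolerates more of them in exchange for a wider margin.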

Support Vector Regression (SVR) applies the SVM algorithm to regression problems. The goal of SVM is to find a separating hyperplane; the goal of SVR is to find a regression hyperplane.

For the given training set: {(x_1,z_1), …, (x_i,z_i)}

where x_i ∈ R^n is a feature vector and z_i ∈ R^1 is the target output. To find the hyperplane, two parameters C > 0 and ε > 0 must be given, and support vector regression is then defined as:

image
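For reference, the standard ε-insensitive SVR formulation is sketched below. It is the textbook form and may differ in detail from the formula in the figure. The slack variables ξ_i and ξ_i^* are notation assumed here, chosen to avoid a clash with the tube width ε; the linear form w^T x_i + b follows g(x) above.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi,\,\xi^*} \quad & \frac{1}{2} w^T w + C \sum_{i} \xi_i + C \sum_{i} \xi_i^* \\
\text{subject to} \quad & w^T x_i + b - z_i \le \varepsilon + \xi_i, \\
& z_i - w^T x_i - b \le \varepsilon + \xi_i^*, \\
& \xi_i,\ \xi_i^* \ge 0 .
\end{aligned}
```

Predictions that fall within ε of the target z_i incur no loss; only deviations beyond ε are penalized, weighted by C.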

In our experiment, we train the SVR model with sklearn, a free toolkit that includes SVM implementations, using C = 1.
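A minimal sketch of training an SVR with C = 1 in sklearn might look like the following. The data here is synthetic and the three feature columns are hypothetical stand-ins, not the columns of googleplaystore.csv. The kernel and epsilon are left at sklearn defaults, which is an assumption, since the text only specifies C.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Synthetic stand-in for numeric app features (hypothetical, not the real dataset).
X = rng.normal(size=(200, 3))
# Synthetic ratings, clipped to the 1-5 range.
noise = rng.normal(scale=0.1, size=200)
y = np.clip(4.0 + 0.3 * X[:, 0] - 0.2 * X[:, 1] + noise, 1.0, 5.0)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# C = 1 as stated in the text; kernel and epsilon use sklearn defaults (assumption).
model = SVR(C=1.0)
model.fit(X_train, y_train)

# Evaluate with mean absolute error on the held-out synthetic samples.
pred = model.predict(X_test)
mae = float(np.mean(np.abs(pred - y_test)))
print(f"Mean absolute error on held-out synthetic data: {mae:.3f}")
```

On the real dataset, categorical columns would first need to be encoded as numbers, and feature scaling usually helps SVR; neither step is shown here.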

Method

image

Result

image

Conclusion

The comparison of the three proposed methods is shown in Table I. Among the three methods, KNN with K = 15 has the best accuracy, and the Decision Tree method has the worst performance. Hence, for rating prediction on Google Play apps, SVR and KNN are the more suitable algorithms. With the trained model, we can predict the rating of an app from its features, which can improve the user experience when browsing the app market and provide an early evaluation of potential products during development.
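The kind of comparison described above could be set up roughly as follows. This sketch assumes all three methods are used as regressors and scored by mean absolute error, neither of which the text confirms. The data is synthetic, so the printed numbers say nothing about the results in Table I.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(1)

# Synthetic features and ratings (hypothetical, not the real dataset).
X = rng.normal(size=(300, 4))
y = np.clip(4.0 + 0.4 * X[:, 0] + rng.normal(scale=0.2, size=300), 1.0, 5.0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1
)

# The three methods named in the text, with the stated C and K values.
models = {
    "SVR (C=1)": SVR(C=1.0),
    "KNN (K=15)": KNeighborsRegressor(n_neighbors=15),
    "Decision Tree": DecisionTreeRegressor(random_state=1),
}

# Fit each model and record its mean absolute error on the held-out split.
errors = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    errors[name] = float(np.mean(np.abs(model.predict(X_test) - y_test)))

for name, err in errors.items():
    print(f"{name}: MAE = {err:.3f}")
```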