
Classifying millions of shopping products into 2,000 categories using a product name, brand name, and image.


itemgiver/Kakao-Shopping-Classification


Kakao-Shopping-Classification


Introduction

In this project, I classified millions of shopping products into 2,000 categories. Each product comes with a product name, brand name, model name, price, and image information. The product, brand, and model names are written in Korean. I used NLP techniques and pattern matching to cluster similar product, brand, and model names. Accurate classification of shopping products has direct real-world use, so I expected the experience from this project to be helpful when I work in industry.
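As a rough illustration of what pattern matching on product names can look like, here is a minimal sketch using Python's `re` module. The patterns and the `normalize_name` helper are hypothetical and are not taken from the project's notebooks; they only show one plausible way to make variants of the same name collapse to the same tokens.

```python
import re
from typing import List

# Hypothetical patterns; the rules actually used in the project are not documented.
BRACKETED = re.compile(r"[\[\(\{<].*?[\]\)\}>]")  # promo tags such as "[무료배송]"
NON_WORD = re.compile(r"[^0-9a-z가-힣]+")          # keep digits, Latin letters, Hangul syllables


def normalize_name(name: str) -> List[str]:
    """Lowercase a name, drop bracketed tags, and split it into tokens."""
    text = name.lower()
    text = BRACKETED.sub(" ", text)
    text = NON_WORD.sub(" ", text)
    return text.split()


print(normalize_name("[무료배송] 나이키 Air-Max 270"))
# → ['나이키', 'air', 'max', '270']
```

After normalization, names that differ only in case, punctuation, or promotional tags produce identical token lists, which makes them easier to group before embedding.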

Research Method

  1. Printed product names, brand names, and model names to get an idea of how to cluster them.
  2. Recognized that additional preprocessing steps were needed before embedding.
  3. Found that pattern matching is an efficient way to preprocess the words in Korean product names.
  4. Combined state-of-the-art NLP techniques and pattern matching in the preprocessing step.
  5. Built a neural network model consisting of two hidden layers and a softmax function.
  6. Tuned hyperparameters in both the preprocessing and training steps.
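The model in step 5 can be sketched as a forward pass through two hidden layers followed by a softmax. This is an illustrative sketch only: the layer sizes, the ReLU activation, and the random initialization are assumptions, not the settings used in the project, and it uses only the standard library so that it runs without a deep learning framework.

```python
import math
import random


def dense(x, weights, bias):
    """Affine layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    # Assumed activation; the original activation function is not stated.
    return [max(0.0, v) for v in x]


def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases for one layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(x, layers):
    """Two hidden layers, then a softmax over the output classes."""
    (w1, b1), (w2, b2), (w3, b3) = layers
    h1 = relu(dense(x, w1, b1))
    h2 = relu(dense(h1, w2, b2))
    return softmax(dense(h2, w3, b3))


rng = random.Random(0)
# Hypothetical sizes: 16 input features, two hidden layers of 8 units, 5 classes.
layers = [init_layer(16, 8, rng), init_layer(8, 8, rng), init_layer(8, 5, rng)]
probs = forward([rng.random() for _ in range(16)], layers)
predicted_class = max(range(len(probs)), key=probs.__getitem__)
```

For the task described here, the output layer would have 2,000 units, one per category, and the input would be the embedded product features rather than random numbers.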

Result

Test Score: 1.11923
This was the most accurate shopping product classification model among 50 students.


References

KAIST CoE202: Fundamentals of Artificial Intelligence
http://kalman.kaist.ac.kr/coe202/

Kakao Shopping Product Classification Contest
https://arena.kakao.com/c/5
https://github.com/kakao-arena/shopping-classification

Unfortunately, the final versions of preprocess.ipynb and train.ipynb were lost, so the initial version of the code was uploaded instead.
