Build an ML model that uses NLP text classification to predict the industry from the job title column
You can think of the job industry as the category or general field in which you work. On a job application, "industry" refers to a broad category under which a number of job titles can fall. For example, sales is an industry; job titles under this category can include sales associate, sales manager, manufacturing sales rep, pharmaceutical sales and so on.
This is a supervised text classification problem, and our goal is to investigate which supervised machine learning methods are best suited to solve it. Given a new job title, we want to assign it to one of 4 industry categories. The classifier assumes that each job title belongs to one and only one category, which makes this a multi-class (rather than multi-label) text classification problem.
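To make the single-label, multi-class setup concrete, here is a minimal sketch of one candidate method: a bag-of-words multinomial Naive Bayes classifier written with the standard library only. It is an illustration, not the method this project settles on. The training pairs and the four industry names (IT, Marketing, Education, Accountancy) are made up for the example, and the helper names tokenize, train_naive_bayes and predict are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Made-up training pairs for illustration only; the real titles and the
# real four industry labels come from the dataset.
TRAIN = [
    ("senior software engineer", "IT"),
    ("java developer", "IT"),
    ("digital marketing manager", "Marketing"),
    ("marketing executive", "Marketing"),
    ("secondary school teacher", "Education"),
    ("maths teacher", "Education"),
    ("chartered accountant", "Accountancy"),
    ("accounts assistant", "Accountancy"),
]


def tokenize(title):
    """Lowercase the title and split it on whitespace."""
    return title.lower().split()


def train_naive_bayes(pairs):
    """Count titles per class and word occurrences per class."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for title, label in pairs:
        class_counts[label] += 1
        for tok in tokenize(title):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab


def predict(title, model):
    """Return exactly one class: the one with the highest log-posterior."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        # Log prior of the class.
        score = math.log(class_counts[label] / total_docs)
        # Laplace (add-one) smoothed log likelihood of each known word.
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(title):
            if tok in vocab:  # words never seen in training are skipped
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train_naive_bayes(TRAIN)
prediction = predict("junior software developer", model)
print(prediction)
```

Because predict returns the single best-scoring label, every job title ends up in one and only one category, which matches the assumption stated above. Other supervised methods can be compared by swapping out the training and prediction functions while keeping the same input pairs.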
The dataset is a CSV file with two variables (job title and industry) and more than 8,500 samples.
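A two-column CSV like this can be read into (job title, industry) pairs with the standard library. The sketch below is only an assumption about the file layout: the header names "job title" and "industry", the sample rows, and the helper name load_pairs are all hypothetical, and an in-memory string stands in for the real file.

```python
import csv
import io

# Stand-in for the real file; header names and rows are assumptions.
SAMPLE_CSV = """job title,industry
senior software engineer,IT
"teacher, maths",Education
marketing executive,Marketing
"""


def load_pairs(handle):
    """Read (job title, industry) pairs, skipping rows with a missing field."""
    reader = csv.DictReader(handle)
    pairs = []
    for row in reader:
        title = (row.get("job title") or "").strip()
        industry = (row.get("industry") or "").strip()
        if title and industry:
            pairs.append((title, industry))
    return pairs


pairs = load_pairs(io.StringIO(SAMPLE_CSV))
print(len(pairs), pairs[1])
```

For a file on disk, the same function can be given a handle opened with newline="" and an explicit encoding, which is what the csv module expects. Quoted fields matter here because job titles can contain commas.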
The dataset is imbalanced: the number of samples available differs from one industry class to another.
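One way to quantify the imbalance is to count the samples per class and derive inverse-frequency class weights, using the formula n_samples / (n_classes * class_count), which is the same heuristic scikit-learn applies for its "balanced" class weights. The label list below is made up for illustration; the real counts come from the industry column of the dataset, and the class names are assumptions.

```python
from collections import Counter

# Made-up label column for illustration; not the real class distribution.
labels = ["IT"] * 6 + ["Marketing"] * 3 + ["Education"] * 2 + ["Accountancy"] * 1

counts = Counter(labels)
n_samples = len(labels)
n_classes = len(counts)

# Inverse-frequency weights: rare classes get a weight above 1,
# frequent classes get a weight below 1.
weights = {
    label: n_samples / (n_classes * count)
    for label, count in counts.items()
}

for label, count in counts.most_common():
    share = count / n_samples
    print(f"{label:12s} count={count:2d} share={share:.2f} weight={weights[label]:.2f}")
```

Weights like these can be passed to a classifier that supports per-class weighting, so that errors on the rarer industries count for more during training. Resampling the data is an alternative; which option works better here is something to check empirically.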