Support Ticket Classification and Key Phrases Extraction
- Identify the main issues in the ticket description
- Extract the key phrases in the ticket description
- Topic Modeling with LDA model
- Key Phrases Extraction with pytextrank (combining spaCy and networkx)
- Construct a graph, sentence by sentence, based on the spaCy part-of-speech tags tags
- Use matplotlib to visualize the lemma graph
- Use PageRank – which is approximately eigenvalue centrality – to calculate ranks for each of the nodes in the lemma graph
-
$a_{v,t}=1$ if vertex$v$ is linked to vertex$t$ , and$a_{v,t}=0$ otherwise -
$M(v)$ is a set of the neighbors of$v$ and$\lambda$ is a constant
-
- Collect the top-ranked phrases from the lemma graph based on the noun chunks
- Find a minimum span for each phrase based on combinations of lemmas
permission 1 0.17555037929471423 requisitions 1 0.1742458175386728 recruiter 1 0.1416381454134179
Scores a single topic by measuring the degree of semantic similarity between high scoring words in the topic
Given the # documents, # words, and # topics, output:
- distribution of words for each topic K
- distribution of topics for each document i