Resolving abbreviated malware names #9

Open
So-Cool opened this issue Jun 9, 2016 · 2 comments

Comments

@So-Cool
Collaborator

So-Cool commented Jun 9, 2016

Right now, the first mapping found, i.e. the longest matching string, is used. To improve labelling, all possible matches need to be considered, and the most probable abbreviation combination, i.e. the one that uses all of the sub-strings, should be chosen.
For example, "adload" is currently split into "a" and "dload", with the latter mapped to downloader. A better split would be "ad" (adware) and "load" (downloader).
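
A rough sketch of what I mean, not tied to the current code: try every way of covering the name with known abbreviations and keep the split that leaves the fewest characters unmapped. The `ABBREVIATIONS` table and the `best_split` name are made up for illustration; the real mapping would come from the project's own data.

```python
from functools import lru_cache

# Hypothetical abbreviation table, for illustration only.
ABBREVIATIONS = {
    "ad": "adware",
    "load": "downloader",
    "dload": "downloader",
}


def best_split(name, abbreviations=ABBREVIATIONS):
    """Return the split of `name` that leaves the fewest characters unmapped.

    Ties are broken in favour of fewer pieces. Each piece is a
    (substring, label) pair; an unmapped character becomes its own
    piece with label None.
    """

    @lru_cache(maxsize=None)
    def solve(start):
        # Returns (unmapped_count, piece_count, pieces) for name[start:].
        if start == len(name):
            return (0, 0, ())
        # Option 1: leave the current character unmapped.
        unmapped, pieces, rest = solve(start + 1)
        best = (unmapped + 1, pieces + 1, ((name[start], None),) + rest)
        # Option 2: consume any known abbreviation that starts here.
        for abbr, label in abbreviations.items():
            if name.startswith(abbr, start):
                unmapped, pieces, rest = solve(start + len(abbr))
                candidate = (unmapped, pieces + 1, ((abbr, label),) + rest)
                # Compare only the two counts, never the pieces themselves.
                if candidate[:2] < best[:2]:
                    best = candidate
        return best

    return list(solve(0)[2])


print(best_split("adload"))
```

With the table above, "adload" is covered completely by "ad" and "load", so that split is preferred over "a" plus "dload", which leaves one character unmapped.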

@hgascon
Member

hgascon commented Jun 10, 2016

How often does this occur? If there are not too many cases, such mappings can be added manually.

@So-Cool
Collaborator Author

So-Cool commented Jun 10, 2016

Not too often in the samples that I have, to be honest. Nevertheless, as there are quite a number of possible combinations, this could be quite useful in general.
Let's see what happens with labels when we're at the stage of clustering.
