Dump organization
Now that you have fetched the dump files, you have to separate the interesting ones from the useless ones. This is where the magic happens: we have downloaded our dumps and calculated their scores, but how can we tell whether we're looking at an ugly debug file or a juicy hash list?
Simple, you have to run the classifier:
dumpscraper classify <arguments>
Arguments can be any of the following:
- since: Starting date in the format YYYY-MM-DD. If the until argument is not supplied, Dump Scraper will only process this day.
- until: Stop date in the format YYYY-MM-DD.
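To make the date arguments concrete, here is a minimal sketch of how a since/until pair could be expanded into the list of days to process. It is not taken from the Dump Scraper source: the function name resolve_date_range is hypothetical, and treating the until date as inclusive is an assumption.

```python
from datetime import date, timedelta


def resolve_date_range(since, until=None):
    """Return the days to process as date objects (hypothetical helper)."""
    start = date.fromisoformat(since)
    # With no `until`, only the `since` day is processed.
    end = date.fromisoformat(until) if until else start
    if end < start:
        raise ValueError("until must not be earlier than since")
    # Assumption: the stop date itself is included in the range.
    total_days = (end - start).days + 1
    return [start + timedelta(days=offset) for offset in range(total_days)]
```

Calling resolve_date_range("2016-01-30") yields a single day, while passing both dates yields every day between them.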
This script looks at the score of each dump, decides which type it belongs to (trash, hash, or plain), and copies it into the matching section of the organized folder.
organized
`- hash
`- plain
`- trash
We copy each file instead of moving it because we always want to keep the original information. This way, if the classifier is improved, you can safely run it again without losing your precious dump files.
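The copy step can be sketched as follows. This is an illustration only, not Dump Scraper's actual code: the organize function and its parameters are hypothetical, and the label is assumed to have been chosen already from the dump's score.

```python
import shutil
from pathlib import Path

# The three sections of the organized folder described above.
LABELS = ("hash", "plain", "trash")


def organize(dump_path, label, organized_root="organized"):
    """Copy one dump into <organized_root>/<label>/ (hypothetical helper)."""
    if label not in LABELS:
        raise ValueError(f"unknown label: {label}")
    target_dir = Path(organized_root) / label
    target_dir.mkdir(parents=True, exist_ok=True)
    # Copy rather than move: the original dump stays where it was,
    # so the classification can be repeated later from the same input.
    return Path(shutil.copy2(dump_path, target_dir / Path(dump_path).name))
```

Because the source file is never touched, running the same call again with a different label simply places another copy in the new section.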
Dump Scraper and its documentation are Copyright © 2015-2016 Davide Tampellini / FabbricaBinaria.
Dump Scraper is Open Source Software, distributed under the GNU General Public License, version 3 of the license, or (at your option) any later version.
The Dump Scraper Wiki content is provided under the GNU Free Documentation License, version 1.3 of the license, or (at your option) any later version.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license can be found on the GNU site.