Skip to content
Matthias Hecker edited this page Feb 6, 2015 · 12 revisions

url.discover

The entrypoint for submitted urls, each url is delegated to a url specific task. If no suitable task was found the url is delegated to the url.generic task.

Signature

Name Type Description
item_id int The id of an item. It must exist.
urls *list List of URLs.

url.generic

Fetch magic numbers to determine the file type and proceed.

In order to determine the type of a generic url the first few bytes are loaded and observed to include either one of the supported magic numbers (aka the file signature) or only valid latin1/utf-8 codepoints.

In case a valid file signature was found the corresponding task is scheduled. The signatures and tasks are defined in the task.file module (in the FileTasks class).

In case only valid ascii/utf-8 was found it is assumed that the url is a regular website in which case the web.download(?) task is scheduled.

Parameters:

Name Type Description
item_id int The id of an existing item.
url string a URL

image.discover

Image processing, it validates and collects metadata about the image. Then schedules storing the image at their final destination.

Signature

Name Type Description
item_id int The id of an item. It must exist.
data.tempfile string Absolute path to a temporary image file.
mimetype string Detected mimetype of the file.
url string Url this file was downloaded from.
upload string Original upload filename.
Clone this wiki locally