NetSurf public suffix list handling
===================================

A library to generate a static code representation of the public suffix
list.

Public suffix list
------------------
The public suffix list is a database of public suffixes: top level
domains and other suffixes, such as co.uk, under which names are
registered [1]. The database allows an application to determine if a
domain name requires an additional label to be valid.

Uses
----
The principal use in a web browser is to restrict supercookies being
set [2], although it can also serve secondary purposes in the UI such
as domain highlighting.

Implementation
--------------
This implementation uses a directly mapped tree to give a relatively
compact representation (~70k compiled vs ~185k source) for determining
whether a given hostname is a public suffix. There is no memory
allocation: the data tables all reside in the read-only data section
and cannot be updated after compilation.
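
As an illustration of the idea only, the sketch below stores a tiny
suffix tree in a const array and walks it label by label from the right
of the hostname, without allocating. The node layout, the three rules
and the names toy_node, toy_nodes and toy_public_suffix are all
hypothetical; the format of the generated psl.inc may well differ, and
wildcard and exception rules are not handled here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical node layout, not the real psl.inc format. */
struct toy_node {
    const char *label;      /* one domain label, e.g. "uk" */
    unsigned short child;   /* index of first child in toy_nodes[] */
    unsigned short nchild;  /* number of children */
};

/* The whole table is const so it can live in read-only data. */
static const struct toy_node toy_nodes[] = {
    /* 0 */ { "",    1, 2 },  /* root: children at 1..2 */
    /* 1 */ { "com", 0, 0 },
    /* 2 */ { "uk",  3, 1 },  /* children at 3 */
    /* 3 */ { "co",  0, 0 },
};

/*
 * Returns a pointer into hostname at the start of the longest suffix
 * found in the toy table, or NULL if no rule matches.
 */
static const char *toy_public_suffix(const char *hostname)
{
    const struct toy_node *node = &toy_nodes[0];
    const char *end, *best = NULL;

    assert(hostname != NULL);
    end = hostname + strlen(hostname);

    while (end > hostname) {
        const struct toy_node *next = NULL;
        const char *start = end;
        unsigned short i;
        size_t len;

        /* find the start of the rightmost unprocessed label */
        while (start > hostname && start[-1] != '.')
            start--;
        len = (size_t)(end - start);

        /* look for that label among the current node's children */
        for (i = 0; i < node->nchild; i++) {
            const struct toy_node *c = &toy_nodes[node->child + i];
            if (strlen(c->label) == len &&
                memcmp(c->label, start, len) == 0) {
                next = c;
                break;
            }
        }
        if (next == NULL)
            break;

        best = start;
        node = next;
        /* step over the '.' separator, if there is one */
        end = (start > hostname) ? start - 1 : start;
    }
    return best;
}
```

With this toy table, "www.example.co.uk" yields a pointer to "co.uk"
and "example.org" yields NULL because no rule covers it.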

There is a single API, nspsl_getpublicsuffix(), which returns the
public suffix of a hostname if it is valid, or NULL if not. The
hostname passed must be normalised (i.e. lower case, with domain labels
in punycode).
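
The lower-casing half of that normalisation could be sketched as below.
toy_normalise_host is a hypothetical helper, not part of nspsl; it
copies into a caller-supplied buffer so no allocation is needed, and it
rejects non-ASCII bytes rather than attempting the punycode conversion,
which would have to be done separately beforehand.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of nspsl. Copies hostname into out in
 * lower case. Returns 0 on success, or -1 if the buffer is too small
 * or a non-ASCII byte is found (such labels need converting to
 * punycode first).
 */
static int toy_normalise_host(const char *hostname, char *out, size_t outlen)
{
    size_t i;

    assert(hostname != NULL && out != NULL);

    for (i = 0; hostname[i] != '\0'; i++) {
        unsigned char c = (unsigned char)hostname[i];

        /* need room for this byte and the terminator */
        if (i + 1 >= outlen || c > 0x7f)
            return -1;
        if (c >= 'A' && c <= 'Z')
            c = (unsigned char)(c - 'A' + 'a');
        out[i] = (char)c;
    }
    if (i >= outlen)
        return -1;
    out[i] = '\0';
    return 0;
}
```

The resulting buffer is what would then be passed to
nspsl_getpublicsuffix(); check the library's header for the exact
prototype before relying on it.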

The data tables are statically generated from a given input with the
genpubsuffix.pl Perl program. This program takes a downloaded suffix
database and generates a file that the main library lookup code
includes directly.

The generated psl.inc file and the public suffix list used to generate
it are included in the source release as a convenience, but can be
regenerated simply by deleting the src/public_suffix_list.dat file and
making any target.

If you wish to update the public suffix list data file, the build will
run a Perl script that is not normally needed during a build. This
script depends on libidna-punycode-perl and libtie-ixhash-perl.

Other libraries
---------------
This library was created for a very limited purpose in NetSurf and is
therefore possibly not useful as a generic solution. Other C libraries
might be more appropriate:

 - Registered domain libs
   A small C library which builds a pre-processed database in dynamic
   memory. It uses a tree based lookup similar to nspsl. It has not
   been updated for a while.

   https://github.com/usrflo/registered-domain-libs/

 - libpsl
   Has a very compact data representation (~35k vs nspsl ~70k) using
   a DAFSA taken from the Chromium project. It can load (preprocessed)
   database files at runtime as well as using a compile time builtin
   set.

   It can be compiled with runtime IDNA support. It uses the PSL git
   repo directly as a submodule, so its PSL database is always up to
   date.

   If the 0.14 release were more readily available in distribution
   packages, NetSurf would use it in preference to nspsl.

   https://github.com/rockdaboot/libpsl

[1] https://publicsuffix.org
[2] https://en.wikipedia.org/wiki/HTTP_cookie#Supercookie