a talispam is a program held to act as a charm to avert spam and bring good messages
get the sources:
$ git clone https://github.com/saidone75/talispam.git -b v0.3.0
build an uberjar with Leiningen:
$ cd talispam
$ lein uberjar
Compiling talispam.config
Compiling talispam.core
[...]
Compiling talispam.whitelist
Created /home/saidone/talispam/target/uberjar/talispam-0.3.0.jar
Created /home/saidone/talispam/target/uberjar/talispam-0.3.0-standalone.jar
create a native binary (requires a GraalVM toolchain installed and configured):
$ lein native-image
Build on Server(pid: 20771, port: 40355)
[./target/talispam:20771] classlist: 4,185.77 ms, 2.15 GB
[...]
[./target/talispam:20771] [total]: 244,502.81 ms, 1.98 GB
then copy the executable binary (target/talispam) somewhere in your PATH
clone the sample configuration from talispam-config into your ~/.talispam folder:
$ cd ~
$ git clone https://github.com/saidone75/talispam-config.git -b v0.3.0 .talispam
(WARNING: it contains quite a large spam/ham training corpus; you may want to train the filter against your own collections instead)
train the Bayesian classifier:
$ talispam learn
talispam 0.3.0
building classifier db ✓
done!
print the spam/ham score (a lower score means ham, a higher one means spam):
$ cat .talispam/easy_ham/02051.58e196144807bd76d7b77d4b7efb6d32 | talispam score
14
$ cat .talispam/spam/00460.8996dc28ab56dd7b6f35b956deceaf22 | talispam score
98
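to give an idea of how a Bayesian filter can arrive at a 0-100 score like the ones above, here is a minimal Python sketch. It is not talispam's actual code: the tokenizer, the Laplace smoothing, the equal priors and the function names `tokenize`, `train` and `score` are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def train(ham_messages, spam_messages):
    """Count in how many ham and spam messages each word appears."""
    ham_counts, spam_counts = Counter(), Counter()
    for message in ham_messages:
        ham_counts.update(set(tokenize(message)))
    for message in spam_messages:
        spam_counts.update(set(tokenize(message)))
    return {
        "ham": ham_counts,
        "spam": spam_counts,
        "n_ham": len(ham_messages),
        "n_spam": len(spam_messages),
    }


def score(db, text):
    """Return an integer from 0 (certainly ham) to 100 (certainly spam)."""
    log_odds = 0.0  # equal priors for ham and spam (an assumption)
    for word in set(tokenize(text)):
        # Laplace smoothing keeps unseen words from producing log(0).
        p_word_spam = (db["spam"][word] + 1) / (db["n_spam"] + 2)
        p_word_ham = (db["ham"][word] + 1) / (db["n_ham"] + 2)
        log_odds += math.log(p_word_spam) - math.log(p_word_ham)
    # Clamp so that exp() cannot overflow on very long messages.
    log_odds = max(-50.0, min(50.0, log_odds))
    probability = 1.0 / (1.0 + math.exp(-log_odds))
    return round(100 * probability)


db = train(
    ["meeting tomorrow at noon", "lunch with the team tomorrow"],
    ["win free money now", "free prize click now"],
)
print(score(db, "free money now"))         # high score: looks like spam
print(score(db, "team meeting tomorrow"))  # low score: looks like ham
```

with this toy corpus the first message lands near the top of the scale and the second near the bottom; a message with no known words stays at 50.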
I designed it as a drop-in replacement for SpamAssassin on my personal mail server: invoked without arguments, it adds the same spam identification headers and writes the message to stdout:
$ cat .talispam/spam/00460.8996dc28ab56dd7b6f35b956deceaf22 | talispam | head -n 5
From ilug-admin@linux.ie Wed Sep 25 10:29:22 2002
X-Spam-Checker-Version: talispam 0.3.0 on kugelmass
X-Spam-Flag: YES
X-Spam-Score: 98
Return-Path: <ilug-admin@linux.ie>
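the tagging step shown above can be pictured with a short Python sketch. It only mirrors the visible output (new headers placed right after the mbox "From " line); the function name `tag_message`, the threshold of 50 and the `X-Spam-Flag: NO` value for ham are assumptions, not documented talispam behaviour.

```python
def tag_message(raw, score, threshold=50, checker="talispam 0.3.0 on localhost"):
    """Return the message with X-Spam-* headers added ahead of its own headers.

    threshold and checker are hypothetical defaults chosen for this sketch.
    """
    lines = raw.split("\n")
    spam_headers = [
        f"X-Spam-Checker-Version: {checker}",
        f"X-Spam-Flag: {'YES' if score >= threshold else 'NO'}",
        f"X-Spam-Score: {score}",
    ]
    # Keep the mbox "From " separator line first, if the message has one.
    insert_at = 1 if lines and lines[0].startswith("From ") else 0
    return "\n".join(lines[:insert_at] + spam_headers + lines[insert_at:])


message = (
    "From ilug-admin@linux.ie Wed Sep 25 10:29:22 2002\n"
    "Return-Path: <ilug-admin@linux.ie>\n"
    "\n"
    "body text\n"
)
print(tag_message(message, 98))
```

the rest of the message is passed through untouched, which is what lets a filter like this sit in a mail delivery pipeline.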
integration with procmail is pretty much the same as well; just add these lines to your .procmailrc:
:0fw
| talispam
:0e
{ EXITCODE=$? }
:0:
* ^X-Spam-Flag: YES
$HOME/Mail/spam
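for readers less familiar with procmail, the last recipe boils down to a header check on the already filtered message. A rough Python equivalent, with the hypothetical function `destination` and made-up folder names, could look like this:

```python
from email import message_from_string


def destination(raw, spam_folder="Mail/spam", default_folder="inbox"):
    """Pick a folder the way the X-Spam-Flag recipe does (folder names are assumed)."""
    msg = message_from_string(raw)
    flag = (msg.get("X-Spam-Flag") or "").strip().upper()
    # procmail matches header conditions case-insensitively, so compare in upper case.
    return spam_folder if flag.startswith("YES") else default_folder


flagged = (
    "From ilug-admin@linux.ie Wed Sep 25 10:29:22 2002\n"
    "X-Spam-Flag: YES\n"
    "Return-Path: <ilug-admin@linux.ie>\n"
    "\n"
    "body text\n"
)
print(destination(flagged))
```

messages without the flag fall through to the default folder, just as they would continue to normal delivery in procmail.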
$ talispam -?
NAME:
talispam - a Bayesian mail filter
USAGE:
talispam [global-options] command [command options] [arguments...]
VERSION:
0.3.0
COMMANDS:
learn      train talispam classifier
score      print ham/spam score for stdin
whitelist  print a list of addresses in ham corpus
print-db   print all words from classifier db by spam score
stats      print stats summary for a mbox
GLOBAL OPTIONS:
-?, --help
on my little mail server (Intel Atom D2550 @ 1.86GHz) it is very fast, especially compared with SpamAssassin: about 0.27 s versus 7.3 s of wall-clock time for the same message (to be fair, the two are not directly comparable because SpamAssassin performs many more checks):
$ time cat .talispam/spam/00460.8996dc28ab56dd7b6f35b956deceaf22 | spamassassin | head -n 5
From ilug-admin@linux.ie Wed Sep 25 10:29:22 2002
Return-Path: <ilug-admin@linux.ie>
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on kugelmass.local
X-Spam-Flag: YES
X-Spam-Level: ***********
real 0m7.332s
user 0m7.170s
sys 0m0.156s
$ time cat .talispam/spam/00460.8996dc28ab56dd7b6f35b956deceaf22 | talispam | head -n 5
From ilug-admin@linux.ie Wed Sep 25 10:29:22 2002
X-Spam-Checker-Version: talispam 0.3.0 on kugelmass
X-Spam-Flag: YES
X-Spam-Score: 98
Return-Path: <ilug-admin@linux.ie>
real 0m0.269s
user 0m0.238s
sys 0m0.055s
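a single `time` run can be noisy, so if you want to repeat the comparison on your own machine, a small Python helper along these lines may help. The name `time_filter` and the choice of the median over five runs are assumptions for this sketch; it simply pipes a message into any filter command and measures wall-clock time.

```python
import statistics
import subprocess
import time


def time_filter(command, message, runs=5):
    """Return the median wall-clock seconds needed to pipe message through command.

    command is an argument list such as ["talispam"]; message is the raw mail as bytes.
    """
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard the filtered message; only the elapsed time matters here.
        subprocess.run(command, input=message, stdout=subprocess.DEVNULL, check=True)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)
```

for example, after reading a sample message into `data` as bytes, comparing `time_filter(["talispam"], data)` with `time_filter(["spamassassin"], data)` gives figures that are less sensitive to one slow run.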
Copyright (c) 2020-2022 Saidone
Distributed under the MIT License