## Using hard mathematical problems against spam

^{Dec 13, 2006 by Ilya Levin}

This is not something new exactly. I first came up with this idea few years ago and it was published and discussed many times since then. The idea is quite simply and many people came up with the same idea all by themselves as well. I claim neither rights nor priority for this idea.

There are three major approaches to prevent spam today: content filtering, sender verification and a "proof of effort".

The last approach is based on making mail send procedure slower and hence ineffective for a mass bulk spamming. It is about forcing a sender to spend some of his time and computation resources to send a message and prove such effort to receiver.

Besides advantages, all three approaches have their own drawbacks.

Content filtering is a subject to false positive and false negative alarms.

Sender verification (e.g. SenderID) requires modification of existent network infrastructure and workflow and so on.

The approach with a proof of effort (e.g. CPU-bound and memory-bound functions) seems to be more robust yet may simply be implemented on top of the current infrastructure or even be implemented entirely as client-to-client. However, its effectiveness has a tendency to degrade in a long term because of hardware getting better.

Hard mathematical problems such as prime factoring or computing discrete logarithm may be used instead of bound functions. The result would be simpler spam preventing schemes which may be transparently adjusted according to any further hardware improvements without refractoring.

Let's describe the proposed method on example of prime factoring problem.
Assume *f()* is a one-way function that return N-bit odd integer, for example:
*f(x) = Truncate(SHA1(x), N) ∨ 1*

This function recomputed for each recipient and for each message. To send a mail, the
sender must compute *f(recipient||date||message)* and factorize the result to p_{1}…p_{n}.
If result of *f()* itself is a prime number then sender must alter the message and retry.

The factorization of *f()* is a required sending effort.

The prime factors p_{1}…p_{n} should be send together with
the message in an extra header tag or in an additional MIME section.

When receiving the message, the recepient simply reject messages without prime factors or if any of these factors is less than 3.

Next, the recepient computes *f(recipient||date||message)* and multiplies p_{1}*p_{2}*…p_{n}
If both results are same and the size of the multiplication product is N bits then
the message is accepted else the message is rejected.

The recipient may also cache *f()* values of received messages for some period
and reject all new incoming messages with the same *f()* without detailed checks.
This will effectively prevent a mailbox flooding attack when an attacker simply
replays an intercepted legitimate message to initiate out of space denial of service.

The adjusting workload factor here is N - the size of an integer. Bigger N gives heavier workload at the sender's side. The size may easily be changed once workload need to be adjusted in future.

Although, using hard mathematical problems to prevent spam is not a silver bullet.

The actual solution to the spam problem is beyond any technical methods.