
More fun with message threading

2007/12/23

When I write email-related code and the result falls short of my expectations, I fall back on plan B: write it in c-client. The fact that I do not start with c-client in the first place is probably a symptom of Not Invented Here syndrome.

The other day I was trying to decipher the semantics of Thread-Index: and Thread-Topic:, since Microsoft does not seem to have published any information on them. Apostolos suggested that Thread-Index: takes Base64 values, and I disagreed. After all, decoding

AcdyY+a08VX8xfobTsy61v9NHPZ7QA==

using perl -MMIME::Base64 -ne 'print decode_base64($_);' does not produce anything meaningful.

However, I dug a little deeper, following this piece of advice from the imap-protocol list:

“Look at the evolution source code, it contains quite a bit of
information on this.”

camel-exchange-folder.c from the Evolution Exchange package reveals the following gem:

/* A new post to a folder gets a 27-byte-long thread index. (The value
 * is apparently unique but meaningless.) Each reply to a post gets a
 * 32-byte-long thread index whose first 27 bytes are the same as the
 * parent's thread index. Each reply to any of those gets a
 * 37-byte-long thread index, etc. The Thread-Index header contains a
 * base64 representation of this value.
 */
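
To check that comment against real headers, here is a minimal sketch of a decoder plus a parent/reply test. The names thread_index_decode() and thread_index_is_reply_to() are hypothetical helpers of mine, not part of Evolution or c-client, and the "parent plus five bytes" rule is taken only from the comment quoted above. The decoder assumes an ASCII character set.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Map one base64 character to its 6-bit value, or -1 if it is not
 * part of the alphabet. Assumes ASCII. */
static int b64_value(int c)
{
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

/* Decode base64 text into out (capacity cap bytes). Decoding stops at
 * the first '=' and characters outside the alphabet (such as
 * whitespace) are skipped. Returns the number of bytes written, or -1
 * if out is too small. */
long thread_index_decode(const char *text, unsigned char *out, size_t cap)
{
    unsigned long acc = 0;   /* bit accumulator */
    int bits = 0;            /* number of pending bits in acc */
    size_t n = 0;

    assert(text != NULL && out != NULL);

    for (; *text && *text != '='; text++) {
        int v = b64_value((unsigned char)*text);
        if (v < 0)
            continue;
        acc = (acc << 6) | (unsigned long)v;
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            if (n == cap)
                return -1;
            out[n++] = (unsigned char)((acc >> bits) & 0xFF);
        }
    }
    return (long)n;
}

/* Going by the quoted comment, a reply carries its parent's index
 * followed by five extra bytes. Returns 1 if child looks like a direct
 * reply to parent under that assumption, 0 otherwise. */
int thread_index_is_reply_to(const unsigned char *child, size_t child_len,
                             const unsigned char *parent, size_t parent_len)
{
    return child_len == parent_len + 5 &&
           memcmp(child, parent, parent_len) == 0;
}
```

The decoded value is raw binary rather than text, so it is better inspected as a hex dump than printed to a terminal as-is.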

[ Update: Message Threading in Public Folders ]

Enough with trying to work with Thread-Index:, then! jwz has documented a very effective algorithm for message threading, and c-client implements it (read docs/draft/sort.txt in the source code distribution):

SEARCHPGM *spg;
THREADNODE *thr;
:
spg = mail_newsearchpgm();      /* empty search program: consider every message */
thr = mail_thread(ds, "REFERENCES", NIL, spg, 0);      /* ds is the open MAILSTREAM */
walk_thread(thr, NIL);

(You are advised to read docs/internal.txt.)

The “REFERENCES” argument to mail_thread() instructs it to use jwz’s algorithm. The other option is “ORDEREDSUBJECT” (or, as draft-ietf-imapext-sort-19.txt calls it, “poor man’s threading”). walk_thread() just prints the edges of the graph (actually, it is a tree):

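/* Print one "parent child" line per edge of the thread tree. */
void
walk_thread(THREADNODE *thr, THREADNODE *prev)
{
        if (thr) {
                if (prev) {
                        /* num is an unsigned long, hence %lu */
                        printf("%lu %lu\n", prev->num, thr->num);
                }

                if (thr->next) {
                        /* descend to the first child */
                        walk_thread(thr->next, thr);
                } else {
                        /* no children: mark the leaf */
                        printf("%lu NIL\n", thr->num);
                }

                if (thr->branch) {
                        /* a sibling shares the same parent */
                        walk_thread(thr->branch, prev);
                }
        }
}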

You may wish to take the output of the above routine (slightly modified) and feed it to dot, so that you get a picture of the threads in the email collection you are studying.
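
One possible shape for that modification is sketched below. To keep it compilable without c-client, it uses a stand-in struct tnode with the same three fields used in this post (num, next, branch); struct tnode, emit_edges() and emit_dot() are hypothetical names of mine, and with the real library you would swap in THREADNODE.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for c-client's THREADNODE, reduced to the fields used here
 * (an assumption made so the sketch compiles on its own). */
struct tnode {
    unsigned long num;      /* message number */
    struct tnode *branch;   /* next sibling */
    struct tnode *next;     /* first child */
};

/* Write one "parent -> child;" line per edge and return the number of
 * edges written. Thread roots have no parent and are emitted as bare
 * nodes so that single-message threads still show up. */
static unsigned long emit_edges(FILE *out, const struct tnode *thr,
                                const struct tnode *parent)
{
    unsigned long edges = 0;

    for (; thr; thr = thr->branch) {            /* walk the siblings */
        if (parent) {
            fprintf(out, "  %lu -> %lu;\n", parent->num, thr->num);
            edges++;
        } else {
            fprintf(out, "  %lu;\n", thr->num);
        }
        edges += emit_edges(out, thr->next, thr);   /* then the children */
    }
    return edges;
}

/* Wrap the edge list in a digraph so that dot can read it. */
unsigned long emit_dot(FILE *out, const struct tnode *roots)
{
    unsigned long edges;

    assert(out != NULL);
    fprintf(out, "digraph threads {\n");
    edges = emit_edges(out, roots, NULL);
    fprintf(out, "}\n");
    return edges;
}
```

Calling emit_dot(stdout, thr) and piping the program's output through dot -Tpng should then render one tree per thread.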

What is left to discuss in a little more detail is the THREADNODE structure. You can go from a THREADNODE to its first child via the next pointer (thr->next in the above example). If the THREADNODE has two children, the second is a branch of the first (thr->next->branch). If it has three, the third is a branch of the second child (thr->next->branch->branch), and so on.
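
In other words, the children of a node form a linked list: it starts at next and continues along branch. A minimal sketch of walking that list follows, again with a stand-in struct tnode instead of the real THREADNODE; count_children() and nth_child() are hypothetical helpers, not c-client functions.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for c-client's THREADNODE, reduced to the fields used here
 * (an assumption made so the sketch compiles on its own). */
struct tnode {
    unsigned long num;      /* message number */
    struct tnode *branch;   /* next sibling */
    struct tnode *next;     /* first child */
};

/* Count the direct children of thr: start at next, then follow branch. */
size_t count_children(const struct tnode *thr)
{
    const struct tnode *child;
    size_t n = 0;

    assert(thr != NULL);
    for (child = thr->next; child; child = child->branch)
        n++;
    return n;
}

/* Return the k-th child of thr (0-based), or NULL if it has fewer
 * than k + 1 children. */
const struct tnode *nth_child(const struct tnode *thr, size_t k)
{
    const struct tnode *child;

    assert(thr != NULL);
    for (child = thr->next; child && k > 0; child = child->branch)
        k--;
    return child;
}
```

With these helpers, nth_child(thr, 2) reaches the same node as thr->next->branch->branch, without dereferencing a NULL pointer when the node has fewer children.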
