The Problem
Some ignorant portals, as well as spammers, send multipart MIME emails whose parts are marked as text/plain. The part in question actually contains HTML, which is very inconvenient to read. Leaving the reasons aside, what mattered more to me was how to solve the problem.
Exercise
The exercise here is to locate the affected part in the email and convert the HTML in it to plain text. Depending on what was wrong, the declared content type or the content itself, it is also possible to change the declared type from text/plain to text/html instead. In either case, the rest of the email is left untouched.
Solution
The easy solution is to pipe the entire body of the MIME email through lynx and convert all of it to text:
# ~/.procmailrc
:0 fbw
* ^From:.*ignorant@domain.tld
|/usr/bin/lynx -dump -stdin -force_html
However, this is not very clever, because it also converts the genuine HTML parts we want to keep.
A cleverer approach is to parse the email, locate the mislabeled part and convert only that part to text.
# ~/.procmailrc
:0 fw
* ^From:.*ignorant@domain.tld
|/usr/local/bin/mime-html-to-text.pl
The following Perl script uses MIME::Parser to split the email, locates the text/plain parts and filters their HTML through lynx into readable text:
#!/usr/bin/perl
use strict;
use MIME::Parser;
use IPC::Open2;
use FileHandle;

# Use a fixed, minimal PATH.
$ENV{"PATH"} = '/usr/bin';

# Parse the message from STDIN, keeping everything in memory.
my $parser = MIME::Parser->new;
$parser->ignore_errors(1);
$parser->extract_uuencode(1);
$parser->tmp_recycling(0);
$parser->output_to_core(1);
$parser->decode_headers(0);
my $entity = $parser->parse(\*STDIN);

# Pass non-multipart messages through unchanged; they have no boundary.
unless ($entity->is_multipart) {
    $entity->print(\*STDOUT);
    exit 0;
}

print $entity->head->as_string, "\n";
my $boundary = '--' . $entity->head->multipart_boundary;

my @parts = $entity->parts;
for my $part (@parts) {
    if ($part->parts) {
        push @parts, $part->parts;    # for nested multi parts
        next;
    }
    if ($part->head->mime_type =~ m,text/plain,) {
        # Pipe the mislabeled HTML body through lynx to get readable text.
        my $textmsg;
        my ($chld_rd, $chld_wr) = (FileHandle->new, FileHandle->new);
        my $pid = open2($chld_rd, $chld_wr,
            qw{/usr/bin/lynx -dump -force_html -stdin});
        print $chld_wr $part->bodyhandle->as_string;
        close $chld_wr;
        {
            local $/ = undef;    # slurp the whole lynx output
            $textmsg = <$chld_rd>;
            close $chld_rd;
        }
        waitpid $pid, 0;    # reap the lynx child process
        print $boundary, "\n";
        print $part->head->as_string, "\n";
        print $textmsg;
    } else {
        print $boundary, "\n";
        print $part->as_string, "\n";
    }
}
print $boundary, "--\n";
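The other option mentioned in the exercise, correcting the declared type instead of converting the body, could look roughly like the sketch below. It is not part of the script above. The helper names looks_like_html and relabel_html_parts, the tag list in the regular expression, and the sample message are all made up for illustration. The heuristic is deliberately crude and will misjudge some mails. The sketch parses an inline sample so that it runs on its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use MIME::Parser;

# Hypothetical heuristic: the body counts as HTML if it contains
# one of a few common tags. Adjust the tag list to taste.
sub looks_like_html {
    my ($text) = @_;
    return $text =~ m{<(?:html|body|div|table|p|br|a)\b[^>]*>}i ? 1 : 0;
}

# Walk the entity and all nested parts; relabel text/plain leaf parts
# whose body looks like HTML. Returns the number of parts changed.
sub relabel_html_parts {
    my ($entity) = @_;
    my $changed = 0;
    for my $part ($entity->parts_DFS) {
        next if $part->parts;    # multipart container, no body of its own
        next unless $part->head->mime_type eq 'text/plain';
        my $body = $part->bodyhandle or next;
        next unless looks_like_html($body->as_string);
        # Changes only the type; other parameters such as charset stay.
        $part->head->mime_attr('content-type' => 'text/html');
        $changed++;
    }
    return $changed;
}

# Made-up sample: one mislabeled HTML part, one genuine text part.
my $sample = <<'EOM';
From: ignorant@domain.tld
To: me@example.org
Subject: demo
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="XYZ"

--XYZ
Content-Type: text/plain; charset=us-ascii

<html><body><p>Hello <b>world</b></p></body></html>
--XYZ
Content-Type: text/plain; charset=us-ascii

Just ordinary text.
--XYZ--
EOM

my $parser = MIME::Parser->new;
$parser->output_to_core(1);
my $entity  = $parser->parse_data($sample);
my $changed = relabel_html_parts($entity);
$entity->print(\*STDOUT);
```

To try this as a procmail filter, parse_data($sample) would be replaced by parse(\*STDIN), as in the script above. The same looks_like_html check could also guard the lynx conversion, so that text/plain parts that really are plain text are left alone.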