Welcome to MacForumz.com!

MT-NW Regular Expression Filtering

 
Dave Allen

(Msg. 1) Posted: Wed Dec 26, 2007 7:26 pm
Post subject: MT-NW Regular Expression Filtering
Imported from groups: comp>sys>mac>comm

This message is not archived

Michelle Steiner

(Msg. 2) Posted: Wed Dec 26, 2007 7:26 pm
Post subject: Re: MT-NW Regular Expression Filtering
Imported from groups: per prev. post

This message is not archived

Neill Massello

(Msg. 3) Posted: Wed Dec 26, 2007 7:26 pm
Post subject: Re: MT-NW Regular Expression Filtering
Archived from groups: per prev. post

Dave Allen wrote:

> I'm not great with regular expressions. Is there a regular expression I
> can use with MT-NewsWatcher to match these hashed up subject lines?

The regular expression "M.*I.*5" should do it, but it would also catch a
subject line like "Let's hope there's no 'Mission: Impossible' 4 or 5".
"M.?I.?5" will catch the current variations, but won't work if the
spammer puts more than one character between the M, the I, or the 5.
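
A quick way to see the trade-off between the two patterns is to run them over a few subject lines. This is only a sketch using Python's re module as a stand-in (MT-NewsWatcher's own regex engine may differ in details), and the wider_gap subject is a made-up variant, not one seen in the wild:

```python
import re

loose = re.compile(r"M.*I.*5")   # any run of characters between M, I and 5
tight = re.compile(r"M.?I.?5")   # at most one character between them

spam = "M'I.5'P ersecution BBC Newsca sters L ie & De ny T heyre Watc hing Me"
innocent = "Let's hope there's no 'Mission: Impossible' 4 or 5"
wider_gap = "M--I--5 Persecution"  # hypothetical: two separators per gap

for subject in (spam, innocent, wider_gap):
    # Columns: loose matched?, tight matched?, subject
    print(bool(loose.search(subject)), bool(tight.search(subject)), subject)
```

The loose pattern flags all three subjects, including the innocent one; the tight pattern flags only the first.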

Tom Stiller

(Msg. 4) Posted: Wed Dec 26, 2007 8:17 pm
Post subject: Re: MT-NW Regular Expression Filtering
Archived from groups: per prev. post

Michelle Steiner wrote:

> Dave Allen wrote:
>
> > I'm not great with regular expressions. Is there a regular
> > expression I can use with MT-NewsWatcher to match these hashed up
> > subject lines?
>
> I haven't used regex at all, but after studying some documentation on it
> for the past few minutes, I wonder whether this will do it:
>
> M[^]I[^]5

I think you misunderstood something. The expression [^] is not valid
syntax. If it were it would mean all characters not matching the empty
set.
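
How an engine treats [^] varies, so it is worth testing rather than guessing. As a hedged illustration only: Python's re module (used here as a stand-in, not MT-NewsWatcher's engine) does not reject the pattern. It treats a "]" placed right after "[^" as a literal, so the whole of "[^]I[^]" is read as a single negated set:

```python
import re

# In Python, "[^]I[^]" is ONE negated set excluding the characters ] I [ ^
# so the pattern means: M, one character outside that set, then 5.
pattern = re.compile(r"M[^]I[^]5")

print(bool(pattern.search("MX5")))    # M, one allowed character, 5
print(bool(pattern.search("M-I-5")))  # the spam-style subject is not caught
```

Either way, the pattern does not do what was intended: it is either rejected or silently means something else.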

--
Tom Stiller

PGP fingerprint = 5108 DDB2 9761 EDE5 E7E3 7BDA 71ED 6496 99C0 C7CF

Michelle Steiner

(Msg. 5) Posted: Wed Dec 26, 2007 8:47 pm
Post subject: Re: MT-NW Regular Expression Filtering
Imported from groups: per prev. post

user638

(Msg. 6) Posted: Wed Dec 26, 2007 9:29 pm
Post subject: Re: MT-NW Regular Expression Filtering
Archived from groups: per prev. post

Dave Allen wrote:

> The MI5 idiot used to be pretty easy to filter out. Nearly all of his
> posts could be killed by matching simple filters like:
>
> "from" contains "MI5Victim@mi5.gov.uk"
> "subject" contains "MI5 Persecution"
>
> The subject filter also caught replies to his posts.
>
> Unfortunately, his more recent posts all have different "from" addresses
> and his latest subject lines are all hashed up.
>
> I'm not great with regular expressions. Is there a regular expression I
> can use with MT-NewsWatcher to match these hashed up subject lines?
>
> Subject: M'I.5'P ersecution BBC Newsca sters L ie & De ny T heyre
> Watc hing Me
>
> Subject: M-I'5.Persecuti on . BBC Newscasters L ie & D eny Theyre
> Watchin g Me
>
> Subject: M,I-5'Persecutio n , B BC New scasters Li e & Deny T heyre Wa
> tching Me
>
> Subject: M-I,5.Perse cution - Molestat ion dur ing Trav el
>
> Subject: M`I'5-Persecuti on . Mole station dur ing Tr avel
>
> Subject: M-I 5.Perse cution . Four Ye ars of MI5 Persecuti on Po sts
> on I nternet Newsgr oups

This seems to be working

M[^a-z0-9]*I[^a-z0-9]*5[^a-z0-9]*P[^a-z0-9]*e[^a-z0-9]*r[^a-z0-9]*s

It is looking for MI5Pers with zero or more characters that are not
a-z or 0-9 (or A-Z, when the filter is case-insensitive) between each
letter.

I guess I chose to be less aggressive and included the beginning
of Persecution "Pers". If you are satisfied that MI5 is
sufficient, then drop the stuff from the 'P' on.
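
For anyone who wants to check the pattern before installing it as a filter, here is a small sketch that runs it against the subject lines quoted above. It uses Python's re module as a stand-in for MT-NewsWatcher, and the IGNORECASE flag is an assumption made so that [^a-z0-9] also excludes A-Z:

```python
import re

# The pattern from the post above. IGNORECASE is an assumption, see lead-in.
pattern = re.compile(
    r"M[^a-z0-9]*I[^a-z0-9]*5[^a-z0-9]*P[^a-z0-9]*e[^a-z0-9]*r[^a-z0-9]*s",
    re.IGNORECASE,
)

# Subject lines quoted in the thread (wrapped lines joined).
spam_subjects = [
    "M'I.5'P ersecution BBC Newsca sters L ie & De ny T heyre Watc hing Me",
    "M-I'5.Persecuti on . BBC Newscasters L ie & D eny Theyre Watchin g Me",
    "M,I-5'Persecutio n , B BC New scasters Li e & Deny T heyre Wa tching Me",
    "M-I,5.Perse cution - Molestat ion dur ing Trav el",
    "M`I'5-Persecuti on . Mole station dur ing Tr avel",
    "M-I 5.Perse cution . Four Ye ars of MI5 Persecuti on Po sts on I nternet Newsgr oups",
]
# Subjects that should be left alone.
other_subjects = [
    "Let's hope there's no 'Mission: Impossible' 4 or 5",
    "MT-NW Regular Expression Filtering",
]

for subject in spam_subjects + other_subjects:
    print(bool(pattern.search(subject)), subject)
```

All six quoted spam subjects match and the two ordinary subjects do not.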

Bob Harris

Dave Allen

(Msg. 7) Posted: Thu Dec 27, 2007 12:26 am
Post subject: Re: MT-NW Regular Expression Filtering
Imported from groups: per prev. post

Warren Oates

(Msg. 8) Posted: Thu Dec 27, 2007 8:58 am
Post subject: Re: MT-NW Regular Expression Filtering
Archived from groups: per prev. post

Bob Harris wrote:

> This seems to be working
>
> M[^a-z0-9]*I[^a-z0-9]*5[^a-z0-9]*P[^a-z0-9]*e[^a-z0-9]*r[^a-z0-9]*s

I don't think he'd use anything but punctuation that lets his subject
line be reasonably easy to read, so I use this which seems to work:

.*M[~`'" -_,]I[~`'" -_,]5.*

There's a space in the character class. Note that I don't care _what_
comes after the 5. You may be able to get away with [[:punct:]] (the
POSIX class name goes inside a bracket expression); I'm not sure it
works in MTNW.
--
W. Oates

Tom Stiller

(Msg. 9) Posted: Thu Dec 27, 2007 1:07 pm
Post subject: Re: MT-NW Regular Expression Filtering
Archived from groups: per prev. post

Michelle Steiner wrote:

> Tom Stiller wrote:
>
> > > I haven't used regex at all, but after studying some documentation
> > > on it for the past few minutes, I wonder whether this will do it:
> > >
> > > M[^]I[^]5
> >
> > I think you misunderstood something. The expression [^] is not valid
> > syntax. If it were it would mean all characters not matching the
> > empty set.
>
> That's what I thought it meant. If that were a valid construct, it
> would catch everything containing MI5, with anything between the
> letters, wouldn't it?

I think the syntax you meant is:
M[^I]*I[^5]*5

which is an 'M', followed by zero or more characters other than an 'I',
followed by an 'I', followed by zero or more characters other than a
'5', followed by a '5'.
>
> BTW, as soon as I read "sets" in the article, things began to make sense
> to me because I know a bit about set theory. Or at least, I think I do.

--
Tom Stiller

PGP fingerprint = 5108 DDB2 9761 EDE5 E7E3 7BDA 71ED 6496 99C0 C7CF

Michelle Steiner

(Msg. 10) Posted: Thu Dec 27, 2007 1:07 pm
Post subject: Re: MT-NW Regular Expression Filtering
Imported from groups: per prev. post

Hugh Watkins

(Msg. 11) Posted: Thu Dec 27, 2007 6:31 pm
Post subject: Re: MT-NW Regular Expression Filtering
Archived from groups: per prev. post

Dave Allen wrote:

> The MI5 idiot used to be pretty easy to filter out. Nearly all of his
> posts could be killed by matching simple filters like:
>
> "from" contains "MI5Victim@mi5.gov.uk"
> "subject" contains "MI5 Persecution"
>
> The subject filter also caught replies to his posts.
>
> Unfortunately, his more recent posts all have different "from" addresses
> and his latest subject lines are all hashed up.
>

[snip]

The Berlin University Usenet news server filters him out.

I never heard of him until I saw him in Google Groups.

Hugh W


--
For genealogy and help with family and local history in Bristol and
district http://groups.yahoo.com/group/Brycgstow/

http://snaps4.blogspot.com/ photographs and walks

GENEALOGE http://hughw36.blogspot.com/ MAIN BLOG

Megadave

(Msg. 12) Posted: Sat Dec 29, 2007 12:37 am
Post subject: Re: MT-NW Regular Expression Filtering
Imported from groups: per prev. post

Heath Raftery

(Msg. 13) Posted: Wed Jan 02, 2008 12:23 am
Post subject: Re: MT-NW Regular Expression Filtering
Archived from groups: per prev. post

Warren Oates wrote:
> Bob Harris wrote:
>
>> This seems to be working
>>
>> M[^a-z0-9]*I[^a-z0-9]*5[^a-z0-9]*P[^a-z0-9]*e[^a-z0-9]*r[^a-z0-9]*s
>
> I don't think he'd use anything but punctuation that lets his subject
> line be reasonably easy to read, so I use this which seems to work:
>
> .*M[~`'" -_,]I[~`'" -_,]5.*

FWIW, I just developed something similar for tin using the wildmat
syntax. Here's how it appears in my ~/.tin/filter file:

comment=MI5 wildcard
group=*
case=0
score=kill
subj=*M[-`'., ]I[-`',. ]5[-`',. ]*

Note that the placement of the dash first in the set is significant
(i.e., it means something else if it is not first) and that I haven't
included a few of the characters that Warren has, but it would be
trivial to do so.
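
The dash placement matters in ordinary regular expressions too, not just in wildmat. As a rough sketch (Python's re module standing in for both tin and MT-NewsWatcher, which is an assumption), the snippet below lists which printable ASCII characters a class accepts when the dash sits between a space and an underscore, versus when it comes first. The variable names are made up for the illustration:

```python
import re

# Dash in the middle: " -_" is read as a RANGE from space (32) to underscore (95).
range_class = re.compile(r"""[~`'" -_,]""")
# Dash first: every character in the class is taken literally.
literal_class = re.compile(r"""[-~`'" _,]""")

printable = [chr(code) for code in range(32, 127)]
accepted_by_range = [c for c in printable if range_class.fullmatch(c)]
accepted_by_literal = [c for c in printable if literal_class.fullmatch(c)]

print(len(accepted_by_range), "".join(accepted_by_range))
print(len(accepted_by_literal), "".join(accepted_by_literal))

# Consequence: with the range, digits and capital letters count as "punctuation".
ranged = re.compile(r""".*M[~`'" -_,]I[~`'" -_,]5.*""")
print(bool(ranged.search("MAIL5")))
```

With the range form, a harmless string such as "MAIL5" matches, because A and L both fall between space and underscore in ASCII.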

--
*--------------------------------------------------------*
| ^Nothing is foolproof to a sufficiently talented fool^ |
| Heath Raftery, HRSoftWorks _\|/_ |
*______________________________________m_('.')_m_________*

max

(Msg. 14) Posted: Tue Jan 29, 2008 9:56 am
Post subject: Re: MT-NW Regular Expression Filtering
Archived from groups: per prev. post

Heath Raftery wrote:

> Warren Oates wrote:
> > Bob Harris wrote:
> >
> >> This seems to be working
> >>
> >> M[^a-z0-9]*I[^a-z0-9]*5[^a-z0-9]*P[^a-z0-9]*e[^a-z0-9]*r[^a-z0-9]*s
> >
> > I don't think he'd use anything but punctuation that lets his subject
> > line be reasonably easy to read, so I use this which seems to work:
> >
> > .*M[~`'" -_,]I[~`'" -_,]5.*
>
> FWIW, I just developed something similar for tin using the wildmat
> syntax. Here's how it appears in my ~/.tin/filter file:
>
> comment=MI5 wildcard
> group=*
> case=0
> score=kill
> subj=*M[-`'., ]I[-`',. ]5[-`',. ]*
>
> Note that the placement of the dash first in the set is significant
> (i.e., it means something else if it is not first) and that I haven't
> included a few of the characters that Warren has, but it would be
> trivial to do so.


Back when this [MI5] became a problem in december i crafted this:
kill if subj= m.i.5.p (case insensitive)

and it worked pretty well. I tested it on about 10k newsgroup articles
and didn't get any false hits (although they are certainly possible with
this filter) nor any misses.

I chose this route so i wouldn't have to find all the non-alpha
characters on my keyboard. :-) That, and i don't really know how to use
regexps.
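
To make the behaviour of that filter concrete, here is a sketch that assumes each "." acts as a regex wildcard for exactly one character (an assumption about the newsreader, tested here with Python's re module). The last subject is invented purely to show what a false hit could look like:

```python
import re

# One arbitrary character is required in each gap: m ? i ? 5 ? p
dots = re.compile(r"m.i.5.p", re.IGNORECASE)

subjects = [
    "M'I.5'P ersecution BBC Newsca sters L ie & De ny T heyre Watc hing Me",
    "M-I 5.Perse cution . Four Ye ars of MI5 Persecuti on Po sts",
    "MI5 Persecution",   # the old, unhashed subject: no separators at all
    "domain5 pricing",   # hypothetical false hit: m-a-i-n-5-space-p
]

for subject in subjects:
    print(bool(dots.search(subject)), subject)
```

The hashed subjects are caught, the plain "MI5 Persecution" subject is not (it still needs the older "contains" filter), and the invented subject shows the kind of false hit that is possible.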

..max

--
The part of betatron @ earthlink . net was played by a garden gnome

Tom Stiller

(Msg. 15) Posted: Tue Jan 29, 2008 11:09 am
Post subject: Re: MT-NW Regular Expression Filtering
Archived from groups: per prev. post

max wrote:

> Heath Raftery wrote:
>
> > Warren Oates wrote:
> > > Bob Harris wrote:
> > >
> > >> This seems to be working
> > >>
> > >> M[^a-z0-9]*I[^a-z0-9]*5[^a-z0-9]*P[^a-z0-9]*e[^a-z0-9]*r[^a-z0-9]*s
> > >
> > > I don't think he'd use anything but punctuation that lets his subject
> > > line be reasonably easy to read, so I use this which seems to work:
> > >
> > > .*M[~`'" -_,]I[~`'" -_,]5.*
> >
> > FWIW, I just developed something similar for tin using the wildmat
> > syntax. Here's how it appears in my ~/.tin/filter file:
> >
> > comment=MI5 wildcard
> > group=*
> > case=0
> > score=kill
> > subj=*M[-`'., ]I[-`',. ]5[-`',. ]*
> >
> > Note that the placement of the dash first in the set is significant
> > (i.e., it means something else if it is not first) and that I haven't
> > included a few of the characters that Warren has, but it would be
> > trivial to do so.
>
>
> Back when this [MI5] became a problem in december i crafted this:
> kill if subj= m.i.5.p (case insensitive)
>
> and it worked pretty well. I tested it on about 10k newsgroup articles
> and didn't get any false hits (although they are certainly possible with
> this filter) nor any misses.
>
> I chose this route so i wouldn't have to find all the non-alpha
> characters on my keyboard. :-) That, and i don't really know how to use
> regexps.
>
Try

M[^I]*I[^5]*5

which is an 'M', followed by zero or more characters other than an 'I',
followed by an 'I', followed by zero or more characters other than a
'5', followed by a '5'.
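
One thing worth knowing before relying on this pattern: when it is used to search a single-line subject, it should flag the same subjects as the simpler "M.*I.*5", because any subject with an M, a later I, and a later 5 satisfies both. The sketch below checks that on a few subjects using Python's re module as a stand-in for MT-NewsWatcher (an assumption):

```python
import re

negated = re.compile(r"M[^I]*I[^5]*5")  # the pattern from this post
dotted = re.compile(r"M.*I.*5")         # the looser pattern suggested earlier

subjects = [
    "M-I,5.Perse cution - Molestat ion dur ing Trav el",
    "MI5 Persecution",
    "Let's hope there's no 'Mission: Impossible' 4 or 5",
    "MT-NW Regular Expression Filtering",
]

for subject in subjects:
    # Columns: negated-class result, dot-star result, subject
    print(bool(negated.search(subject)), bool(dotted.search(subject)), subject)
```

Both patterns catch the spam subjects, both also catch the "Mission: Impossible" line, and neither touches this thread's own subject, so a tighter character class or a longer anchor such as "Pers" is still needed to avoid false hits.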

--
Tom Stiller

PGP fingerprint = 5108 DDB2 9761 EDE5 E7E3 7BDA 71ED 6496 99C0 C7CF

All times are: Pacific Time (US & Canada)