SpamParameters Class Reference

#include <SpamParameters.h>

Public Types

enum  SpamEnum {
  to_list = 0, from_address, from_kill, spam_words,
  kill_words, flags, my_domain, valid_users,
  SpamEnumMax
}

Public Member Functions

 SpamParameters (const char *paramFileName) throw ( SpamException )
 SpamParameters ()
 ~SpamParameters ()
std::vector< const char * > & getSection (SpamEnum sec) throw ( SpamException )
bool hasFlag (const char *name)
void print ()

Private Types

enum  paramState { BadState = 0, sectionDef, beginBracket, spamPhrase }

Private Member Functions

bool skipLine (const char *buf)
void finiteStateMachine (const char *buf) throw ( SpamException )
SpamEnum findEnum (const char *str)
void enterPhrase (const char *buf)

Private Attributes

SpamEnum currentSection
paramState currentState

Static Private Attributes

std::vector< const char * > section [(size_t) SpamEnumMax]
const enumTableElem enumTable []
bool initialized = false


Detailed Description

Structure of the spam parameters file:
to_list [ Allowed e-mail addresses in the "To:" part of the email. For example: iank@bearcave.com. Spammers frequently either don't have a "To:" (it is filled in via SMTP) ]

from_address [ allowed from e-mail addresses. These are email addresses that will not be checked for SPAM content. ]

from_kill [ strings that, when found in the from address will result in the email being marked as garbage. ]

spam_words [ words that probably mark the email as spam, but the email will still be placed in the junk_mail file. ]

kill_words [ email containing these words will be marked as garbage. Examples of such words are drug names, like viagra and xanax and penis. ]

flags [ The flags section contains various flags which control the operation of the spam filter. By default, if a "kill" option is not selected, the e-mail will be sent to the spam file. The flags section may be omitted, in which case no flags are active.

kill_base64: Mark email that contains a base64 section as garbage.
debug: Run the spam filter in debug mode.
keep_garbage: Don't delete email that is marked as garbage; put it in the garbage_mail file.
trace_garbage: Generate a trace file that tracks the email that is marked as garbage and deleted. ]

my_domain [ A single line with your domain address (for example, "bearcave.com") ]

valid_users [ Valid user names for the domain specified in my_domain ]

Comments: lines beginning with '#'

Words or phrases are listed, one per line. For example:

kill_words
[
fuck
hot sluts
]

Leading spaces are ignored, as are blank lines.
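
The layout described above (a section name, a "[" line, one phrase per line, a "]" line, with comments and blank lines skipped) can be illustrated with a small stand-alone parser. This is only a sketch under stated assumptions, not the class's own code: the names Sections, trimLine and parseParams are hypothetical, std::string replaces the class's const char * storage, and std::runtime_error stands in for SpamException.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the class's static section array.
typedef std::map<std::string, std::vector<std::string> > Sections;

// Strip leading and trailing blanks, tabs, and line terminators.
static std::string trimLine(const std::string &s) {
  const char *ws = " \t\r\n";
  std::string::size_type b = s.find_first_not_of(ws);
  if (b == std::string::npos) return "";
  std::string::size_type e = s.find_last_not_of(ws);
  return s.substr(b, e - b + 1);
}

// Parse parameter-file text: a section name, "[", phrases, "]".
Sections parseParams(const std::string &text) {
  Sections out;
  std::istringstream in(text);
  std::string raw, current;
  enum { WantName, WantOpen, InBody } state = WantName;
  while (std::getline(in, raw)) {
    std::string line = trimLine(raw);
    if (line.empty() || line[0] == '#') continue;  // blank or comment line
    switch (state) {
    case WantName:
      current = line;
      out[current];              // create the section, possibly empty
      state = WantOpen;
      break;
    case WantOpen:
      if (line[0] != '[') throw std::runtime_error("'[' expected");
      state = InBody;
      break;
    case InBody:
      assert(!current.empty());  // a section name was seen first
      if (line[0] == ']') state = WantName;
      else out[current].push_back(line);
      break;
    }
  }
  return out;
}
```

Unlike the real class, this sketch accepts any section name; checking the name against a fixed table is a separate step.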

Definition at line 125 of file SpamParameters.h.


Constructor & Destructor Documentation

SpamParameters::SpamParameters (const char *paramFileName) throw ( SpamException )
 

Open the parameter file and read it into the SpamParameters object.

Definition at line 250 of file SpamParameters.C.

References Logger::getLogger(), and Logger::log().

{
  currentState = sectionDef;
  currentSection = SpamEnumMax;

  const size_t BUF_SIZE = 1024;
  const char *mode = "r";
  FILE *file = fopen( paramFileName, mode );
  if (file != 0) {
    size_t lineNum = 0;

    char buf[BUF_SIZE];
    while (fgets(buf, sizeof(buf), file) != 0) {
      lineNum++;
      SpamUtil().trim(buf); // remove leading and trailing spaces
      if (!skipLine(buf)) {
        try {
          finiteStateMachine( buf );
        }
        catch (SpamException &e) { // catch by reference to avoid a copy
          fclose(file); // do not leak the file handle when rethrowing
          static char msg[128];
          // lineNum is a size_t, so cast it to match the %lu conversion
          snprintf(msg, sizeof(msg), "%s, line %lu", e.what(), (unsigned long)lineNum );
          throw SpamException( msg );
        }
      }
    }  // while
    if (lineNum > 0) {
      Logger log = pLogger->getLogger("SpamParameters");
      log.log(Logger::DEBUG, "SpamParameters", "finished reading parameter file");
    }
    // fgets() returns 0 at end of file and on a read error; tell them apart
    bool atEof = (feof(file) != 0);
    fclose(file); // close the file on both paths
    if (!atEof) {
      throw SpamException("SpamParameters::SpamParameters: error reading file");
    }
  }
  else {
    throw SpamException("SpamParameters::SpamParameters: could not open parameter file");
  }
  initialized = true;
} // SpamParameters
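
The constructor's pattern of catching a per-line error and rethrowing it with the line number attached can be shown in isolation. The following is a minimal sketch, not the class's code: LineHandler, processLines and rejectEmpty are hypothetical names, and std::runtime_error stands in for SpamException.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical per-line handler type; stands in for finiteStateMachine().
typedef void (*LineHandler)(const std::string &line);

// Feed each line to the handler; on failure, rethrow with the line number.
void processLines(const std::vector<std::string> &lines, LineHandler handler) {
  assert(handler != 0);
  for (std::size_t i = 0; i < lines.size(); ++i) {
    try {
      handler(lines[i]);
    } catch (const std::exception &e) {  // catch by reference: no slicing
      char msg[160];
      // snprintf bounds the write; the cast matches the %lu conversion.
      std::snprintf(msg, sizeof(msg), "%s, line %lu", e.what(),
                    static_cast<unsigned long>(i + 1));
      throw std::runtime_error(msg);     // runtime_error copies the text
    }
  }
}

// Example handler (an assumption for illustration): rejects empty lines.
void rejectEmpty(const std::string &line) {
  if (line.empty()) throw std::runtime_error("empty line");
}
```

Because std::runtime_error copies its message, a local buffer is enough here; an exception type that keeps only the pointer would need storage that outlives the throw.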

SpamParameters::SpamParameters ()
 

The default constructor, which takes no arguments, should be called only after the static parts of the object have been initialized. That initialization is done by the constructor that takes the parameter file name.

Definition at line 241 of file SpamParameters.C.

{
  assert( initialized );
}

SpamParameters::~SpamParameters ()
 

The section array of vector<const char *> objects is a container for strings which were allocated via new. Recover this storage.

Definition at line 221 of file SpamParameters.C.

{
  for (size_t i = 0; i < (size_t)SpamEnumMax; i++) {
    // The sections are static: work on the stored vector, not on a copy
    vector<const char *> &oneSec = section[i];
    size_t len = oneSec.size();
    for (size_t j = 0; j < len; j++) {
      const char *ptr = oneSec[j];
      if (ptr != 0) {
        delete [] (char *)ptr;
      }
    }
    // Drop the freed pointers so that a later destructor call cannot delete them again
    oneSec.clear();
  }
} // SpamParameters destructor
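
For comparison, manual new[] and delete [] bookkeeping of this kind can be avoided by letting std::string own each phrase. The sketch below shows that alternative design; it is not how the class is written, and PhraseList with its members is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical holder for one section's phrases.
class PhraseList {
public:
  // Copy the caller's buffer; std::string owns and frees the copy.
  void enterPhrase(const char *buf) {
    assert(buf != 0);
    phrases_.push_back(std::string(buf));
  }
  std::size_t size() const { return phrases_.size(); }
  // The pointer stays valid until the list is next modified.
  const char *at(std::size_t i) const { return phrases_.at(i).c_str(); }
private:
  std::vector<std::string> phrases_;  // freed automatically, no destructor needed
};
```

With this layout the read buffer can be reused for the next line immediately, and no explicit destructor is required.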


Member Function Documentation

void SpamParameters::enterPhrase (const char *buf) [private]
 

Enter a parameter phrase (a word or set of words) in the current section (for example, spam_words).

Definition at line 91 of file SpamParameters.C.

{
  size_t len = strlen( buf );
  char *storage = new char[ len + 1 ];
  SpamUtil().strncpy(storage, buf, len + 1);
  section[ (size_t)currentSection ].push_back( storage );
} // enterPhrase

SpamParameters::SpamEnum SpamParameters::findEnum (const char *str) [private]
 

Given a string such as "kill_words", return the associated enumeration value (in this case, the enumeration kill_words). If the string does not name a section, SpamEnumMax is returned.

Definition at line 72 of file SpamParameters.C.

{
  SpamEnum enumVal = SpamEnumMax;
  size_t tableSize = sizeof(enumTable) / sizeof(enumTableElem);
  for (size_t i = 0; i < tableSize; i++) {
    const char *enumName = enumTable[i].name;
    if (strcmp(str, enumName) == 0) {
      enumVal = enumTable[i].enumVal;
      break;
    }
  }
  return enumVal;
} // findEnum
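
A stand-alone version of this name-to-enumeration lookup might look like the sketch below. The layout of EnumTableElem is an assumption (the class's enumTableElem is only seen being constructed and read), and kEnumTable is a hypothetical name for the table.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

enum SpamEnum {
  to_list = 0, from_address, from_kill, spam_words,
  kill_words, flags, my_domain, valid_users,
  SpamEnumMax
};

// Assumed layout: a section name paired with its enumeration value.
struct EnumTableElem {
  const char *name;
  SpamEnum enumVal;
};

static const EnumTableElem kEnumTable[] = {
  { "to_list", to_list },
  { "from_address", from_address },
  { "from_kill", from_kill },
  { "spam_words", spam_words },
  { "kill_words", kill_words },
  { "flags", flags },
  { "my_domain", my_domain },
  { "valid_users", valid_users }
};

// Linear search by exact, case-sensitive name.
SpamEnum findEnum(const char *str) {
  assert(str != 0);
  const std::size_t tableSize = sizeof(kEnumTable) / sizeof(kEnumTable[0]);
  for (std::size_t i = 0; i < tableSize; ++i) {
    if (std::strcmp(str, kEnumTable[i].name) == 0) {
      return kEnumTable[i].enumVal;
    }
  }
  return SpamEnumMax;  // sentinel: no section has this name
}
```

Dividing by sizeof(kEnumTable[0]) rather than by the element type's name keeps the size computation correct if the element type is ever renamed.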

void SpamParameters::finiteStateMachine (const char *buf) throw ( SpamException ) [private]
 

A finite state machine that drives the processing of the sections: a section definition is followed by a begin bracket, which is followed by zero or more spam phrases, which are followed by an end bracket. Each call processes one trimmed line of the parameter file.

Definition at line 106 of file SpamParameters.C.

{
  paramState nextState = BadState;
  switch (currentState) {
  case sectionDef:
    {
      SpamEnum sec = findEnum(buf);
      if (sec != SpamEnumMax) {
        currentSection = sec;
        nextState = beginBracket;
      }
      else {
        throw SpamException("finiteStateMachine: section definition expected");
      }
    }
    break;
  case beginBracket:
    {
      if (*buf == '[') {
        nextState = spamPhrase;
      }
      else {
        throw SpamException("finiteStateMachine: '[' expected");
      }
    }
    break;
  case spamPhrase:
    {
      if (*buf == ']') {
        nextState = sectionDef;
      }
      else {
        enterPhrase( buf );
        nextState = spamPhrase;
      }
    }
    break;
  default:
    {
      throw SpamException("finiteStateMachine: unexpected state");
    }
    break;
  } // switch

  currentState = nextState;
} // finiteStateMachine

vector< const char * > & SpamParameters::getSection (SpamEnum sec) throw ( SpamException )
 

Given a section enumeration, return the associated vector which contains the data for that section.

Definition at line 182 of file SpamParameters.C.

Referenced by MailHeader::checkAddressSection(), MailHeader::checkDomainAddrs(), MailHeader::checkFrom(), SpamUtil::checkLine(), hasFlag(), and print().

{
  // Validate first. The exception type matches the throw ( SpamException ) specification.
  if (sec >= SpamEnumMax) {
    throw SpamException("getSection: bad enumeration argument");
  }
  // Return the stored vector itself. Assigning through a reference would
  // copy the section into another object instead of rebinding the reference.
  return section[ (size_t)sec ];
} // getSection
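
One C++ subtlety matters for any getter of this shape: a reference cannot be rebound, so assigning to it copies into the object it already refers to. The sketch below contrasts the two behaviours. The names sections, viaAssignment and viaDirectReturn are hypothetical, and int elements are used only to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

static std::vector<int> sections[2];  // hypothetical stand-in for the section array

// Assigning through a reference copies into the referent; it never rebinds.
std::vector<int> &viaAssignment(std::size_t i) {
  assert(i < 2);
  static std::vector<int> scratch;
  std::vector<int> &vec = scratch;
  vec = sections[i];  // copies sections[i] into scratch
  return vec;         // still refers to scratch
}

// Returning the element directly hands back the stored vector itself.
std::vector<int> &viaDirectReturn(std::size_t i) {
  if (i >= 2) throw std::out_of_range("bad section index");
  return sections[i];
}
```

A caller that modifies the result of viaAssignment changes only the scratch copy, and every call pays for a full copy of the section.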

bool SpamParameters::hasFlag (const char *name)
 

Look for a flag in the flags section. The function returns true if the flag is found, false otherwise.

Definition at line 201 of file SpamParameters.C.

References getSection().

Referenced by MailFilter::MailFilter(), main(), MailHeader::parseContentType(), and MailBody::processBySection().

{
  bool foundFlag = false;
  // Bind a reference so the flags vector is not copied on every call
  vector<const char *> &flagVec = getSection( flags );
  size_t len = flagVec.size();
  for (size_t i = 0; i < len; i++) {
    if (strcmp(flagVec[i], name) == 0) {
      foundFlag = true;
      break;
    }
  }
  return foundFlag;
} // hasFlag

void SpamParameters::print ()
 

Print the parameter vectors for debugging.

Definition at line 158 of file SpamParameters.C.

References getSection().

{
  size_t tableSize = sizeof(enumTable) / sizeof(enumTableElem);
  for (size_t i = 0; i < tableSize; i++) {
    const char *name = enumTable[i].name;
    printf("%s\n", name );
    printf("[\n");
    // Use the table's own enumeration value rather than assuming it equals the index
    vector<const char *> &vec = getSection( enumTable[i].enumVal );
    size_t len = vec.size();
    for (size_t j = 0; j < len; j++) {
      const char *ptr = vec[j];
      printf("   %s\n",  ptr );
    }

    printf("]\n");
  }
} // print

bool SpamParameters::skipLine (const char *buf) [private]
 

Skip a blank line or a comment line. Note that leading and trailing spaces are "trimmed" from the line, so a blank line will consist of just the null character.

Definition at line 58 of file SpamParameters.C.

{
  bool skipIt = false;
  char ch = *buf;
  if (ch == '\0' || ch == '#') {
    skipIt = true;
  }
  return skipIt;
}  // skipLine
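
The interaction between trimming and skipping can be sketched as follows. The trimInPlace function is a hypothetical stand-in, since the implementation of SpamUtil::trim is not shown in this reference; it is assumed to remove leading and trailing white space, including the newline left by fgets.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>

// Hypothetical in-place trim of leading and trailing white space.
void trimInPlace(char *buf) {
  assert(buf != 0);
  std::size_t len = std::strlen(buf);
  while (len > 0 && std::isspace(static_cast<unsigned char>(buf[len - 1]))) {
    buf[--len] = '\0';  // cut trailing white space
  }
  std::size_t start = 0;
  while (buf[start] != '\0' && std::isspace(static_cast<unsigned char>(buf[start]))) {
    ++start;            // count leading white space
  }
  if (start > 0) {
    std::memmove(buf, buf + start, len - start + 1);  // +1 moves the terminator
  }
}

// After trimming, a blank line is empty and a comment starts with '#'.
bool skipLine(const char *buf) {
  assert(buf != 0);
  return *buf == '\0' || *buf == '#';
}
```

Because trimming runs first, an indented comment such as "   # note" is also skipped.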


Member Data Documentation

const SpamParameters::enumTableElem SpamParameters::enumTable [static, private]
 

Initial value:

 {
  enumTableElem("to_list", to_list ),
  enumTableElem("from_address", from_address ),
  enumTableElem("from_kill", from_kill ),
  enumTableElem("spam_words", spam_words),
  enumTableElem("kill_words", kill_words ),
  enumTableElem("flags", flags),
  enumTableElem("my_domain", my_domain ),
  enumTableElem("valid_users", valid_users ),
}

Definition at line 38 of file SpamParameters.C.


The documentation for this class was generated from the following files:
Generated on Sat Mar 27 13:07:38 2004 for Mail Filter by doxygen 1.3.3