Microsoft Indexing Service Noise Words.
Noise words
One of the underlying concepts behind Indexing Service is the use of 'noise words'. Almost every language has them: they are the very common words, such as "the", "and", "to" and "at". Take this sentence, for example:
"On Monday I took the dog for a walk to the park."
When Indexing Service processes the file that contains this sentence, the first thing it does is ignore the noise words. What actually gets processed is:
"Monday dog walk park"
When someone submits a query, Indexing Service does the same thing to the query text, stripping out the noise words. So if you typed:
"Who took the dog for a walk to the park on Monday"
Indexing Service would strip that down to:
"dog walk park Monday"
As you can see, although the two strings are different, stripping out the noise words leaves them with the same significant words. Indexing Service can then match the document to the query, and the document would probably appear in the query's result set.
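The matching idea above can be illustrated with a short Python sketch. This is not how Indexing Service is implemented; the noise list here is a small hand-picked set chosen to reproduce the example, and the names NOISE_WORDS and strip_noise are made up for illustration.

```python
# Illustrative noise list only; it is NOT the contents of a real noise file.
NOISE_WORDS = {"on", "i", "took", "the", "for", "a", "to", "who"}

def strip_noise(text, noise_words=NOISE_WORDS):
    """Return the words of text that are not noise words, in their original order."""
    # Split on whitespace, trim surrounding punctuation, and compare in lowercase.
    words = [w.strip(".,?!\"'").lower() for w in text.split()]
    return [w for w in words if w and w not in noise_words]

document = "On Monday I took the dog for a walk to the park."
query = "Who took the dog for a walk to the park on Monday"

doc_terms = strip_noise(document)
query_terms = strip_noise(query)

# Both strings reduce to the same set of significant words.
same_terms = set(doc_terms) == set(query_terms)
```

With this list, both the document sentence and the query reduce to the words "monday", "dog", "walk" and "park", only in a different order.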
Editing the noise word lists
Indexing Service keeps its noise words in text files, one for each supported language. The languages supported by Index Server 2.0 are English, Chinese, French, German, Korean, Spanish, Italian, Dutch, Swedish, and Japanese.
Editing a noise word list is simple: just edit the contents of the appropriate file.
1) Stop Indexing Service.
2) Open the relevant file with Notepad.
Noise files are kept in the WINNT\system32 folder. They are all named 'noise.something', where 'something' is the three-character code for the language.
3) Make the changes you require, save the file, and close Notepad.
4) Start Indexing Service.
All done. It really is that simple.
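If you would rather script the edit in step 3 than use Notepad, something like the following Python sketch could work. It assumes the noise file is plain text with words separated by whitespace and readable in the default encoding; check your own file before relying on that. The function name add_noise_words is hypothetical, and the service still has to be stopped beforehand and started afterwards as in steps 1 and 4.

```python
from pathlib import Path

def add_noise_words(noise_file, new_words):
    """Append words that are not already in the noise file; return the words added.

    Assumption: the file holds whitespace-separated words in plain text.
    """
    path = Path(noise_file)
    existing = path.read_text().split()
    seen = {w.lower() for w in existing}

    added = []
    for word in new_words:
        key = word.lower()
        if key not in seen:
            seen.add(key)
            added.append(key)

    # Rewrite the file only when there is something new, one word per line.
    if added:
        path.write_text("\n".join(existing + added) + "\n")
    return added

# Example call (the path and the 'enu' suffix are assumptions for illustration):
# add_noise_words(r"C:\WINNT\system32\noise.enu", ["took", "who"])
```

Keep a backup copy of the original file before running anything like this, so you can restore the default list if needed.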