In the first part of this series, we looked at how to create content sources for the crawler. You can also control how the crawler treats those content sources by setting exclusion and inclusion rules. A rule applies to all content whose URL matches the rule's pattern. Let's take an example: you don't want the indexer to index anything under http://yourserver.com/sites/, with one exception: the subsite http://yourserver.com/sites/yourblogs.
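To make the interplay between the two rules concrete, here is a minimal Python sketch of how such rules might be evaluated. It is not the crawler's actual implementation: the names `CrawlRule`, `should_crawl`, and `RULES` are hypothetical, and the sketch assumes that rules are checked from top to bottom, that the first matching rule wins, that `*` is the only wildcard, and that a URL matched by no rule is crawled.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class CrawlRule:
    """Hypothetical model of a crawl rule (not a real API)."""
    pattern: str   # URL pattern; "*" matches any run of characters
    include: bool  # True = inclusion rule, False = exclusion rule


def should_crawl(url: str, rules: list[CrawlRule]) -> bool:
    """Return True if the URL would be crawled under the given rules.

    Assumption: rules are evaluated in list order and the first
    matching rule decides the outcome.
    """
    for rule in rules:
        # Compare case-insensitively by lowercasing both sides.
        # Note: fnmatch also treats "?" and "[" as special characters,
        # so this sketch only suits simple "*" patterns.
        if fnmatchcase(url.lower(), rule.pattern.lower()):
            return rule.include
    # Assumption: a URL that matches no rule is crawled.
    return True


# The example from the text: exclude everything under /sites/
# except the yourblogs subsite. The inclusion rule is listed first
# so that it is matched before the broader exclusion rule.
RULES = [
    CrawlRule("http://yourserver.com/sites/yourblogs*", include=True),
    CrawlRule("http://yourserver.com/sites/*", include=False),
]

if __name__ == "__main__":
    for url in (
        "http://yourserver.com/sites/yourblogs/post1.aspx",
        "http://yourserver.com/sites/hr/default.aspx",
        "http://yourserver.com/default.aspx",
    ):
        print(url, should_crawl(url, RULES))
```

Under these assumptions, order matters: if the two rules were swapped, the broad exclusion pattern would match the blog URLs first and the subsite would never be indexed.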
First, you need to set an exclusion rule that covers everything under http://yourserver.com/sites/.