To exclude your website from all search engines and ban all robots from crawling it in the future, create a robots.txt file in the root directory of your web server with the following content:
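A minimal sketch of such a file, assuming the usual Robots exclusion standard syntax, in which `*` matches every robot and `/` covers the whole site:

```text
User-agent: *
Disallow: /
```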
To ban only the Quintura robot from crawling your website, create a robots.txt file in the root directory of your web server with the following content:
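Assuming the same syntax and the "Quintura-Crw" user agent name used by the Quintura crawler, such a file might look like this:

```text
User-agent: Quintura-Crw
Disallow: /
```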
To exclude folders and certain individual pages from indexing, create a robots.txt file in the root directory of your web server. The robots.txt file is organized in accordance with the Robots exclusion standard, so follow the rules of that standard when creating the file. The Quintura crawler follows the restrictions in the robots.txt records where the "User-agent" parameter equals "Quintura-Crw". If there are no such records, it follows the restrictions where "User-agent" equals "*". If there are no records with the "*" parameter either, it follows the restrictions where "User-agent" equals "Googlebot".
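As an illustration of this order of precedence, consider a hypothetical file with three records (the folder names are made up for the example). Because a "Quintura-Crw" record is present, the Quintura crawler would be expected to follow only that record and skip the other two:

```text
# Followed by the Quintura crawler
User-agent: Quintura-Crw
Disallow: /private/

# Used only if there is no Quintura-Crw record
User-agent: *
Disallow: /tmp/

# Used only if there is neither a Quintura-Crw record nor a * record
User-agent: Googlebot
Disallow: /drafts/
```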
To exclude from the index all pages in a certain folder (e.g., "limurs"), add the following record to your robots.txt file:
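A sketch of such a record, assuming the folder sits at the top level of the site and the rule is addressed to the Quintura crawler:

```text
User-agent: Quintura-Crw
Disallow: /limurs/
```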
If you add the following restriction:
Here is an example of a more complex robots.txt file:
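A sketch consistent with the description that follows. The exact paths and the use of the `Crawl-delay` directive (a common extension, not part of the original Robots exclusion standard) are assumptions:

```text
User-agent: Quintura-Crw
Disallow: /users/
Disallow: /forum/
Disallow: /login.php?action=login
# Assumed directive: wait 5 seconds between page requests
Crawl-delay: 5
```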
With these robots.txt parameters, the Quintura crawler would index your website, omitting the following sections: users, forum and login.php?action=login, and crawling one page every 5 seconds. Here, the login.php pages are excluded from indexing only if the link contains the 'action=login' parameter (other parameters do not matter).
A sitemap is an instrument for indicating which pages of the website the crawler should index. In this case the robot does not scan your whole website, but visits only the pages listed in the sitemap file.
For example:
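A sketch of a robots.txt file with sitemap references, assuming the `Sitemap` directive from the sitemap standard and a placeholder domain (www.example.com). The Disallow line is included only to show a rule alongside the sitemap entries:

```text
User-agent: Quintura-Crw
Disallow: /users/

Sitemap: http://www.example.com/products.xml
Sitemap: http://www.example.com/services.txt
```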
In this case the Quintura crawler would ignore all of the Disallow rules and would index the website following only the rules written in the two files /products.xml and /services.txt. For details on sitemaps, please refer to the sitemap standard.
Note:
If the sitemap file is not available or is not compatible with the standard, the Quintura crawler would index your website in scan mode.
The other standard, which is more convenient for individual web pages, uses the HTML <META> tag on your pages to disallow robots from indexing the page. See the description of this standard.
To ban robots from indexing a page of your website, add the following meta tag to the <HEAD> section of the page:
To allow robots to index a page of your website but disallow them from following the external links, use the following meta tag:
The Quintura crawler supports the noindex tag, which disallows indexing of certain (auxiliary) parts of the text. Place the opening <noindex> tag at the beginning of such a part and the closing </noindex> tag at its end, and Quintura will not index that part.
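Assuming the conventional robots meta tag, in which `NOINDEX` asks robots not to index the page, the tag might look like this:

```html
<META NAME="ROBOTS" CONTENT="NOINDEX">
```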
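Assuming the same convention, in which `INDEX` permits indexing and `NOFOLLOW` asks robots not to follow links from the page, the tag might look like this:

```html
<META NAME="ROBOTS" CONTENT="INDEX, NOFOLLOW">
```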
Example:
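A hypothetical page fragment (the text is invented for illustration) in which only the auxiliary block is wrapped in the tag:

```html
<p>Main article text that should be indexed.</p>
<noindex>
  <p>Navigation links, advertisements and other auxiliary text.</p>
</noindex>
```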