Please document crawler use

Mostly:

  • Are you honoring robots.txt when crawling a given website?

  • If you do, what is the crawler's name?
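For context, a crawler that honors robots.txt checks the file before fetching each page. A minimal sketch using Python's standard `urllib.robotparser`, with `JournalistCafeBot` as a hypothetical user-agent name (the real name is exactly what this post asks to have documented):

```python
from urllib import robotparser

# Example robots.txt rules: block everyone by default, then carve out an
# exception for one bot. "JournalistCafeBot" is a made-up name.
rules = [
    "User-agent: *",
    "Disallow: /",
    "",
    "User-agent: JournalistCafeBot",
    "Disallow:",  # an empty Disallow means this bot may fetch everything
]

rp = robotparser.RobotFileParser()
rp.parse(rules)

# The named bot is allowed by its exception; others hit the catch-all block.
print(rp.can_fetch("JournalistCafeBot", "https://example.com/articles/1"))  # True
print(rp.can_fetch("Googlebot", "https://example.com/articles/1"))          # False
```

This is how most polite crawlers behave: match the most specific `User-agent` group, fall back to `*` otherwise.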

Use case: I work on a number of websites, adding articles that I want journalist.cafe to use internally. However, my robots.txt is not open at the moment, because I want Google etc. kept out until I have a longer content list.

My robots.txt has exceptions for the tools I use. I would like to tell journalist.cafe that it may crawl my site, in case it is currently being blocked.
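The kind of exception described above might look like this in robots.txt, assuming (hypothetically) that the crawler identifies itself as `JournalistCafeBot`; the actual user-agent string is the thing being requested here:

```
# Block all crawlers by default
User-agent: *
Disallow: /

# Exception for one specific bot (name is hypothetical)
User-agent: JournalistCafeBot
Disallow:
```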


Status: Closed

Board: 💡 Feature Request

Date: Over 2 years ago

Author: Thomas Tomiczek
