Mainly:
Are you honoring robots.txt when crawling a given website?
If you do, what is the crawler's name (user agent)?
Use case: I work on a number of websites, adding articles (which I want journalist.cafe to use internally), but the robots.txt is not open at the moment, as I want Google etc. out until I have a longer content list.
My robots.txt has exceptions for the tools I use; I would like to tell journalist.cafe it can crawl my site if it is currently blocked.
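For illustration, assuming the crawler identifies itself with a user-agent token such as JournalistCafeBot (a hypothetical name, pending an answer here), such an exception could look like:

```
# Hypothetical token -- replace with the crawler's real user-agent name
User-agent: JournalistCafeBot
Disallow:

# Everyone else (Google etc.) stays blocked for now
User-agent: *
Disallow: /
```

An empty Disallow line grants that agent access to the whole site, while the wildcard group keeps all other crawlers out.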
Closed
Feature Request
Over 2 years ago

Thomas Tomiczek