A robots.txt file tells search engine crawlers which URLs on your site they can access. It is used mainly to avoid overloading your site with requests; it is not a mechanism for keeping a web page out of Google. To keep a page out of Google, block indexing with noindex or password-protect the page.
A robots.txt file lives at the root of your website. For example, for the website www.example.com, the robots.txt file is located at www.example.com/robots.txt. It is a plain text file that follows the Robots Exclusion Standard and contains one or more rules. Each rule blocks or allows access, for all crawlers or for one specific crawler, to a given URL path (and everything beneath it) on the domain or subdomain where the robots.txt file is hosted. Unless you declare otherwise in your robots.txt file, all files are assumed to be crawlable.
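Because the file always sits at the root of the host, its address can be worked out from any page URL on that host. The following is a minimal Python sketch of that idea; the helper name robots_txt_url is made up for illustration and is not part of any library.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt URL for the host that serves page_url.

    Hypothetical helper for illustration: it keeps the scheme and host
    (including any port) and replaces everything else with /robots.txt.
    """
    parts = urlsplit(page_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"expected an absolute URL, got {page_url!r}")
    # Drop the original path, query string, and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("https://www.example.com/shop/item?id=3"))
# prints https://www.example.com/robots.txt
```

Note that each subdomain has its own file: a page on blog.example.com would map to blog.example.com/robots.txt, not to the file on www.example.com.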
Here is a simple robots.txt file with two rules:

User-agent: Googlebot
Disallow: /nogooglebot/

User-agent: *
Allow: /

Sitemap: https://www.example.com/sitemap.xml
That robots.txt file means the following:
The user agent named Googlebot is not permitted to crawl any URL that begins with https://www.example.com/nogooglebot/.
All other user agents may crawl the entire site. This rule could have been left out and the result would have been the same; the default behaviour is to allow user agents to crawl the entire site.
The sitemap file can be found at https://www.example.com/sitemap.xml.
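One way to check this reading of the file is to feed the same rules to Python's standard-library parser, urllib.robotparser. This is only a sketch: the bot name OtherBot and the page paths are invented for the example, site_maps() needs Python 3.8 or later, and Python's parser is not Google's, so the two may disagree on files with more complicated, overlapping rules.

```python
from urllib.robotparser import RobotFileParser

# The example robots.txt file from this article.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /nogooglebot/

User-agent: *
Allow: /

Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
# parse() takes an iterable of lines, so no network request is made here.
parser.parse(ROBOTS_TXT.splitlines())

# Googlebot is blocked from anything under /nogooglebot/ ...
print(parser.can_fetch("Googlebot", "https://www.example.com/nogooglebot/page.html"))  # False
# ... but may crawl the rest of the site.
print(parser.can_fetch("Googlebot", "https://www.example.com/index.html"))  # True
# Any other crawler (the name OtherBot is made up) may crawl everything.
print(parser.can_fetch("OtherBot", "https://www.example.com/nogooglebot/page.html"))  # True

# Sitemap lines are collected separately from the crawl rules.
print(parser.site_maps())  # ['https://www.example.com/sitemap.xml']
```

To check a live site instead, the same parser can be pointed at a real file with set_url() followed by read(), which does fetch the file over the network.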
More examples can be found in the syntax section.
If you want to increase your website's visibility and traffic and need SEO services for your website, contact us for SEO services and social media management.
If you need any assistance, you can contact us:
Email: info@getsetseo.com
Contact: 7530817898