{"id":823,"date":"2024-06-26T12:25:55","date_gmt":"2024-06-26T12:25:55","guid":{"rendered":"https:\/\/designforweb.org\/notes\/?p=823"},"modified":"2025-01-23T17:21:17","modified_gmt":"2025-01-23T17:21:17","slug":"robots-to-the-rescue","status":"publish","type":"post","link":"https:\/\/www.designforweb.org\/notes\/robots-to-the-rescue\/","title":{"rendered":"Robots to the rescue"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">Much of the web has been hoovered up by now, but with AI training still ongoing, it&#8217;s worth putting some measures in place to stop our sites from being used as fodder. While there might not be a reliable way to prevent our sites from being accessed by these scrapers, there&#8217;s no harm in adding scripts that might help. I don&#8217;t understand it all well enough, but I will gladly follow the lead of some of the smart web people out there \u30c4<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This post will serve as an archive of useful articles and links, as well as examples of how to address this issue. 
Or just food for thought on the subject of AI with regard to our own sites.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Useful links, examples &amp; articles<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/robotstxt.com\/ai\">AI \/ LLM User-Agents: Blocking Guide<\/a>, as seen 23\/01\/25, recommended by <a href=\"https:\/\/explainers.dev\/\">David<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/darkvisitors.com\/docs\/robots-txt\">Set Up Your Robots.txt<\/a>, as seen on 01\/07\/2024, <a href=\"https:\/\/darkvisitors.com\">Dark Visitors<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/ericwbailey.website\/published\/consent-llm-scrapers-and-poisoning-the-well\/\">Consent, LLM scrapers, and poisoning the well<\/a>, published 26\/06\/2024, <a href=\"https:\/\/ericwbailey.website\/\">Eric Bailey<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/ethanmarcotte.com\/wrote\/blockin-bots\/\">Blockin\u2019 bots.<\/a>, published 12\/04\/2024,&nbsp;<a href=\"https:\/\/ethanmarcotte.com\/\">Ethan Marcotte<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/seirdy.one\/robots.txt\">robots.txt<\/a>, as seen 10\/2023, <a href=\"https:\/\/seirdy.one\/about\/\">Seirdy<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/github.com\/ai-robots-txt\">ai.robots.txt<\/a> &#8211; as seen 09\/2023, A community effort to identify and block AI crawlers.<\/li>\n\n\n\n<li><a href=\"https:\/\/www.coywolf.news\/seo\/google-announces-method-for-sites-to-opt-out-of-llm-training\/\">Google allows sites to opt out of training its LLMs for GenAI<\/a>, as seen 09\/2023, <a href=\"https:\/\/www.coywolf.news\/jon-henshaw\/\">Jon Henshaw<\/a><br \/>+ <strong><a href=\"https:\/\/gist.github.com\/henshaw\/aa8b68ad8b7f897c709bd0ef4fd03b48\">disallow-genai-bots.txt<\/a><\/strong><\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">New icons for human created content<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Just like we did in the old days \u30c4 When web standards were not yet a default state of the web, we often added 
little icons to our sites to state that they were built following web standards with valid code. This practice has now all but disappeared ~ and it seems a new trend is rising. This is a screenshot (taken June 2024) of <a href=\"https:\/\/no-ai-icon.com\/\">no-ai-icon.com<\/a> offering different versions for a little icon badge which we can use to indicate that our content is made by a human without any AI tools involved \u30c4 nice \u30c4<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/no-ai-icon.com\/\"><img loading=\"lazy\" decoding=\"async\" width=\"2032\" height=\"4718\" src=\"https:\/\/www.designforweb.org\/notes\/wp-content\/uploads\/2024\/06\/no-ai-icons.png\" alt=\"\" class=\"wp-image-1084\" srcset=\"https:\/\/www.designforweb.org\/notes\/wp-content\/uploads\/2024\/06\/no-ai-icons.png 2032w, https:\/\/www.designforweb.org\/notes\/wp-content\/uploads\/2024\/06\/no-ai-icons-129x300.png 129w, https:\/\/www.designforweb.org\/notes\/wp-content\/uploads\/2024\/06\/no-ai-icons-441x1024.png 441w, https:\/\/www.designforweb.org\/notes\/wp-content\/uploads\/2024\/06\/no-ai-icons-768x1783.png 768w, https:\/\/www.designforweb.org\/notes\/wp-content\/uploads\/2024\/06\/no-ai-icons-662x1536.png 662w, https:\/\/www.designforweb.org\/notes\/wp-content\/uploads\/2024\/06\/no-ai-icons-882x2048.png 882w\" sizes=\"auto, (max-width: 2032px) 100vw, 2032px\" \/><\/a><\/figure>\n","protected":false},"excerpt":{"rendered":"<p>Refusing to help train the 
nonsense<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"inline_featured_image":false,"footnotes":""},"categories":[18],"tags":[],"class_list":["post-823","post","type-post","status-publish","format-standard","hentry","category-geekery"],"jetpack_featured_media_url":"","_links":{"self":[{"href":"https:\/\/www.designforweb.org\/notes\/wp-json\/wp\/v2\/posts\/823","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.designforweb.org\/notes\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.designforweb.org\/notes\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.designforweb.org\/notes\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.designforweb.org\/notes\/wp-json\/wp\/v2\/comments?post=823"}],"version-history":[{"count":33,"href":"https:\/\/www.designforweb.org\/notes\/wp-json\/wp\/v2\/posts\/823\/revisions"}],"predecessor-version":[{"id":1146,"href":"https:\/\/www.designforweb.org\/notes\/wp-json\/wp\/v2\/posts\/823\/revisions\/1146"}],"wp:attachment":[{"href":"https:\/\/www.designforweb.org\/notes\/wp-json\/wp\/v2\/media?parent=823"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.designforweb.org\/notes\/wp-json\/wp\/v2\/categories?post=823"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.designforweb.org\/notes\/wp-json\/wp\/v2\/tags?post=823"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}