# robots.txt for https://www.cuatrecasas.com/

# General crawling rules for search engines
User-agent: *
Disallow: /admin/
Disallow: /bundles/
Disallow: /bundles_old/
Disallow: /erecruiting/
Disallow: /images/
Allow: /images/cache/
Disallow: /img/
Disallow: /media_repository/
Allow: /media_repository/images/
Allow: /media_repository/docs/
Allow: /summernote/
Allow: /resources/
Disallow: /web/
Allow: /web/assets/
Allow: /web/vendor/

Sitemap: https://www.cuatrecasas.com/sitemap.xml
LLMS: https://www.cuatrecasas.com/llms.txt

# Social media crawler exception to render shared images
User-agent: Twitterbot
Allow: /images/

# AI search crawlers (allowed with same restrictions as the general block)
User-agent: OAI-SearchBot
User-agent: ChatGPT-User
User-agent: PerplexityBot
Disallow: /admin/
Disallow: /bundles/
Disallow: /bundles_old/
Disallow: /erecruiting/
Disallow: /images/
Allow: /images/cache/
Disallow: /img/
Disallow: /media_repository/
Allow: /media_repository/images/
Allow: /media_repository/docs/
Allow: /summernote/
Allow: /resources/
Disallow: /web/
Allow: /web/assets/
Allow: /web/vendor/

# GPTBot (OpenAI crawler): allow only selected content sections
User-agent: GPTBot
Allow: /es/*/art/
Allow: /en/*/art/
Allow: /pt/*/art/
Allow: /*/conocimiento
Allow: /*/servicios
Allow: /*/services
Disallow: /

# AI training crawlers (blocked)
User-agent: Google-Extended
User-agent: ClaudeBot
User-agent: anthropic-ai
User-agent: CCBot
User-agent: Bytespider
User-agent: Applebot-Extended
User-agent: Meta-ExternalAgent
User-agent: Amazonbot
User-agent: cohere-ai
Disallow: /