bee4/robots.txt
This package is abandoned and no longer maintained.
No replacement package was suggested.
Robots.txt parser and matcher
Maintainers
v2.0.3
2016-02-24 22:36 UTC
Requires
- php: >=5.4
- ext-pcre: *
Requires (Dev)
- atoum/atoum: ~2.5
- hoa/console: ~3.0
- squizlabs/php_codesniffer: ~2.5
Suggests
None
Provides
None
Conflicts
None
Replaces
None
Apache-2.0 5fa1a9a66ce5dd11ca686c23290d87d9a2ec68a8
- Stephane HULARD <s.hulard.woop@chstudio.fr>
README
👁 Build Status
👁 Scrutinizer Code Quality
👁 Code Coverage
👁 SensiolabInsight
This library allow to parse a Robots.txt file and then check for URL status according to defined rules. It follow the rules defined in the RFC draft visible here: http://www.robotstxt.org/norobots-rfc.txt
Installing
👁 Latest Stable Version
👁 Total Downloads
This project can be installed using Composer. Add the following to your composer.json:
{
"require": {
"bee4/robots.txt": "~2.0"
}
}
or run this command:
composer require bee4/robots.txt:~2.0
Usage
<?php use Bee4\RobotsTxt\ContentFactory; use Bee4\RobotsTxt\Parser; // Extract content from URL $content = ContentFactory::build("https://httpbin.org/robots.txt"); // or directly from robots.txt content $content = new Content(" User-agent: * Allow: / User-agent: google-bot Disallow: /forbidden-directory "); // Then you must parse the content $rules = Parser::parse($content); //or with a reusable Parser $parser = new Parser(); $rules = $parser->analyze($content); //Content can also be parsed directly as string $rules = Parser::parse('User-Agent: Bing Disallow: /downloads'); // You can use the match method to check if an url is allowed for a give user-agent... $rules->match('Google-Bot v01', '/an-awesome-url'); // true $rules->match('google-bot v01', '/forbidden-directory'); // false // ...or get the applicable rule for a user-agent and match $rule = $rules->get('*'); $result = $rule->match('/'); // true $result = $rule->match('/forbidden-directory'); // true
