delight-im/git-scraper

Downloads entire Git repositories from publicly accessible ".git" folders over HTTP

Maintainers

delight-im

Package info

github.com/delight-im/PHP-GitScraper

pkg:composer/delight-im/git-scraper

v1.0.0 2015-10-24 14:39 UTC

Requires

  • php: >=5.5.0

Apache-2.0 14b00e4fdc52679ee941c089dff722c3b2c14d73

security, git, scraper, vcs, scm


README

Downloads entire Git repositories from publicly accessible .git folders over HTTP

  • Directory indexes or directory browsing on the web server are not required
  • Running git update-server-info on the server is not required
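A minimal sketch of why directory indexes are not needed: Git stores every loose object at a path derived from its hash, so once a hash is known (for example by following a fixed entry point such as HEAD), the URL can be computed instead of discovered. The helper name `looseObjectUrl` is hypothetical and not part of this library's documented API. The sketch covers loose objects only, not objects stored in packfiles, and it is not necessarily how the library works internally.

```php
<?php

// Hypothetical helper for illustration only (not part of the library's API):
// maps an object hash to the URL of the corresponding loose object.
function looseObjectUrl($baseUrl, $hash)
{
    if (!preg_match('/\A[0-9a-f]{40}\z/', $hash)) {
        throw new InvalidArgumentException('Expected 40 lowercase hex characters');
    }

    // Git stores a loose object at "objects/<first 2 hex chars>/<remaining 38 hex chars>"
    return rtrim($baseUrl, '/') . '/objects/' . substr($hash, 0, 2) . '/' . substr($hash, 2);
}

// The hash below is the well-known hash of the empty blob
echo looseObjectUrl('http://www.example.com/.git/', 'e69de29bb2d1d6434b8b29ae775ad8c2e48c5391'), "\n";
// prints "http://www.example.com/.git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391"
```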

Requirements

  • PHP 5.5.0+

Installation

  1. Include the library via Composer:

    $ composer require delight-im/git-scraper
    
  2. Include the Composer autoloader:

    require __DIR__ . '/vendor/autoload.php';

Usage

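    // Point the scraper at the publicly accessible ".git" folder
    $scraper = new \Delight\GitScraper\GitScraper('http://www.example.com/.git/');
    $scraper->fetch();
    // Optionally inspect the files that were found
    // var_dump($scraper->getFiles());
    // Save the files to the given target directory (here: the current directory)
    $scraper->download('./');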

Terminology

  • hash
    • used to identify objects in Git
    • uses the SHA-1 algorithm by default (newer versions of Git can optionally use SHA-256 instead)
    • with SHA-1, has a length of 20 bytes, 40 hex characters or 160 bits
    • ensures file integrity
  • object
    • stored in .git/objects
    • addressable by its unique hash
    • has a small header describing the type and length of its content
    • compressed with zlib
    • can be previewed (in a slightly modified version) by running the command git cat-file -p {hash}
  • commit object
    • points to a single tree object (stored as 40 hex characters)
    • contains the name and email address of the committer as well as the commit time
    • includes the name and email address of the author (who may differ from the committer) as well as the author time, analogous to the committer data
    • holds the commit message or description of the commit
    • points to its parent commit(s) as well (none for the initial commit, several for a merge) so that you can browse the history
  • tree object
    • corresponds to a directory on the file system
    • contains pointers to other objects (stored as 20 bytes)
    • may list tree objects (i.e. sub-directories) and blob objects (i.e. files inside the directory)
  • blob object
    • similar to a file on the file system
    • simply a binary representation of the file
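
The terms above can be tied together in a short sketch, assuming SHA-1 hashes and PHP's zlib extension. It builds the uncompressed form of an object (header plus content), hashes it, decodes a zlib-compressed loose object, and reads the 20-byte pointers from a tree object. All function names are hypothetical and chosen for illustration; this is not the library's API or implementation.

```php
<?php

// Uncompressed form of an object: "<type> <length>\0<content>"
function encodeObject($type, $content)
{
    return $type . ' ' . strlen($content) . "\0" . $content;
}

// The hash of an object is the SHA-1 of its uncompressed form, header included
function hashObject($type, $content)
{
    return sha1(encodeObject($type, $content));
}

// Decompresses a loose object and splits it into header fields and content
function decodeLooseObject($compressed)
{
    $raw = @gzuncompress($compressed);
    if ($raw === false) {
        throw new RuntimeException('Not valid zlib data');
    }
    $headerEnd = strpos($raw, "\0");
    if ($headerEnd === false || !preg_match('/\A([a-z]+) (\d+)\z/', substr($raw, 0, $headerEnd), $m)) {
        throw new RuntimeException('Malformed object header');
    }
    $content = (string) substr($raw, $headerEnd + 1);
    if ((int) $m[2] !== strlen($content)) {
        throw new RuntimeException('Length in header does not match content');
    }

    return ['type' => $m[1], 'content' => $content, 'hash' => sha1($raw)];
}

// Tree content is a sequence of entries: "<mode> <name>\0<20 raw hash bytes>"
function parseTree($content)
{
    $entries = [];
    $offset = 0;
    $total = strlen($content);
    while ($offset < $total) {
        $space = strpos($content, ' ', $offset);
        $nul = ($space === false) ? false : strpos($content, "\0", $space);
        if ($nul === false || $nul + 21 > $total) {
            throw new RuntimeException('Malformed tree entry');
        }
        $mode = substr($content, $offset, $space - $offset);
        $entries[] = [
            'mode' => $mode,
            'name' => substr($content, $space + 1, $nul - $space - 1),
            'hash' => bin2hex(substr($content, $nul + 1, 20)), // 20 bytes -> 40 hex characters
            'isDirectory' => ($mode === '40000'),              // mode of a sub-tree
        ];
        $offset = $nul + 21;
    }

    return $entries;
}

// Round trip: store a blob the way a loose object is stored, then read it back
$stored = gzcompress(encodeObject('blob', "hello\n"));
$object = decodeLooseObject($stored);

// A tree with a single file entry pointing at that blob
$treeContent = '100644 hello.txt' . "\0" . hex2bin($object['hash']);
print_r(parseTree($treeContent));
```

Note that the hash covers the header as well as the content, which is why the hash of a blob differs from a plain SHA-1 of the file.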

Contributing

All contributions are welcome! If you wish to contribute, please create an issue first so that your feature, problem or question can be discussed.

Disclaimer

You should probably use this library with your own websites and repositories only.

License

Copyright (c) delight.im <info@delight.im>

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

 http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.