Downloading All The GitHub Repositories

I needed to grab every GitHub repository belonging to Cookbooks, a GitHub user maintained by the Chef community to collect many cookbooks in one place for development. All of these cookbooks should also be on the Opscode Community site, which is where you should go if you’re browsing for cookbooks to use yourself. But I wanted to grep through a large number of cookbooks to develop statistics on Chef cookbook usage patterns, so I needed All The Things.

#!/usr/bin/env ruby
# 2012-01-11 Bryan McLellan <btm@loftninjas.org>
# Fetch the list of repositories from a Github user and 'git clone' them all

require 'rubygems'
require 'json'
require 'net/http'

url = "http://github.com/api/v2/json/repos/show/cookbooks"
dir = "cookbooks"

if File.basename(Dir.getwd) != dir
 if File.exist?(dir)
   puts "Target directory of '#{dir}' already exists."
   exit 1
 end

 Dir.mkdir(dir)
 Dir.chdir(dir)
end

resp = Net::HTTP.get_response(URI.parse(url))
data = resp.body

result = JSON.parse(data)

result['repositories'].each { |repo|
 puts "Fetching #{repo['url']}"
 system "git clone #{repo['url']}"
}
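
If the v2 endpoint above no longer responds, the same idea can be sketched against https://api.github.com/users/cookbooks/repos. This is only a rough sketch, not a tested replacement. It assumes the response is a JSON array in which each repository carries a 'clone_url' key, and that long listings are paged through 'page' and 'per_page' query parameters. The helper names repos_uri, clone_urls and clone_all are made up for illustration.

```ruby
#!/usr/bin/env ruby
# Sketch only: list a user's repositories via api.github.com and clone each one.
# Assumes the response is a JSON array whose entries carry a 'clone_url' key.

require 'json'
require 'net/http'
require 'uri'

# Build the listing URL for one page of results (hypothetical helper).
def repos_uri(user, page, per_page = 100)
  URI("https://api.github.com/users/#{user}/repos?per_page=#{per_page}&page=#{page}")
end

# Pull the clone URLs out of one page of the JSON response body,
# skipping any entry that has no 'clone_url'.
def clone_urls(body)
  JSON.parse(body).map { |repo| repo['clone_url'] }.compact
end

# Walk the pages until an empty one comes back, cloning as we go.
def clone_all(user)
  page = 1
  loop do
    resp = Net::HTTP.get_response(repos_uri(user, page))
    abort "Request failed: #{resp.code}" unless resp.is_a?(Net::HTTPSuccess)

    urls = clone_urls(resp.body)
    break if urls.empty?

    urls.each do |url|
      puts "Fetching #{url}"
      # Argument-list form avoids passing the URL through a shell.
      system('git', 'clone', url)
    end
    page += 1
  end
end

# Uncomment to run from inside the target directory:
# clone_all('cookbooks')
```

Passing the URL to system as a separate argument, rather than interpolating it into one string, keeps odd characters in a repository URL from being interpreted by the shell.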

2 thoughts on “Downloading All The Github Repositories”

  1. Sean

    I believe the api has changed since this was written.

    I had to change the url (line 9):
    url = "https://api.github.com/users/cookbooks/repos"

    And I had to change the last loop (lines 27-30):

    result.each { |repo|
      url = repo['url'].gsub("api.github.com/repos", "github.com")
      puts "Fetching #{url}"
      system "git clone #{url}"
    }
