Monthly Archives: February 2009

Writing binary data to CouchDB

I’m doing some performance testing with CouchDB and jcouchdb, and I wanted to know whether I should write binary data as a bytearray or as a base64 encoded string. The latter is definitely the correct answer. I initially tried couchdb4j, but I found that its exception handling is flawed, or well, doesn’t exist, so I dropped it after a day of tinkering. I’ve since been writing a performance testing tool in Java, so that once I’m satisfied with the results I can reuse some of the code in a Java product we have. You can find the source that produced these numbers on github for now. I’ve got some more tests to add, and will spend some time thinking about where to put the final tool.
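To make the two options concrete, here is a minimal sketch of the two payload shapes. It is not code from the test tool: the class and method names are made up for illustration, it assumes the bytearray ends up serialized as a JSON array of numbers (a guess about what the JSON layer does), and it uses `java.util.Base64`, which needs Java 8 or later.

```java
import java.util.Base64;

public class PayloadSketch {
    // Encode raw bytes as a base64 string (4 output chars per 3 input bytes).
    static String asBase64(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Render raw bytes as a JSON array of signed byte values, e.g. [1,2,3].
    // Assumption: this is roughly what a JSON serializer does with a byte[].
    static String asJsonArray(byte[] data) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < data.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(data[i]);
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        byte[] data = new byte[9500]; // stand-in for the 9.5k image
        System.out.println("base64 chars: " + asBase64(data).length());
        System.out.println("array chars:  " + asJsonArray(data).length());
    }
}
```

The base64 form is a single string about a third larger than the raw bytes, while the array form costs at least two characters per byte and has to be parsed element by element, which may be part of why it is slower.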

I’m using couchdb 0.8.0-1 as installed out of the box on Ubuntu 8.10 from the package. The graph on the left (which I quickly made in OO and is terrible) is the result of four total runs. Each run was ten threads each writing one hundred documents. The first two runs write a binary array and then a base64 encoded string of an 88k image; the second two do the same with a 9.5k image. The base64 runs include the time it took to encode the file, yet the binary array runs still took three times longer to complete. Futon also hates displaying the binary array data.
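The shape of one such run (ten threads, one hundred writes each, timed as a whole) can be sketched roughly as follows. This is an illustration rather than the actual tool: `timeRun` and the `write` callback are hypothetical names, and the callback here only bumps a counter where a real run would create a document in CouchDB.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class WriteLoadSketch {
    // Runs `threads` workers, each calling `write` `docsPerThread` times,
    // and returns the elapsed wall-clock time in milliseconds.
    static long timeRun(int threads, int docsPerThread, Runnable write) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            long start = System.nanoTime();
            for (int t = 0; t < threads; t++) {
                pending.add(pool.submit(() -> {
                    for (int i = 0; i < docsPerThread; i++) {
                        write.run();
                    }
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for the worker and surface any exception it threw
            }
            return (System.nanoTime() - start) / 1_000_000;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger written = new AtomicInteger();
        // Placeholder write: a real run would store a document here.
        long ms = timeRun(10, 100, written::incrementAndGet);
        System.out.println(written.get() + " writes in " + ms + " ms");
    }
}
```

Calling `get()` on every future matters here: without it, an exception thrown inside a worker would be swallowed and the run would look faster than it was.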

I’ll be adding another method to test reads since that’s what we’ll be doing primarily. I want to test the concurrency on the reads, then compare those numbers to the results of running multiple couchdb nodes behind nginx to ensure the overhead is low and performance really increases. I know Tim Dysinger has been doing some testing and that he and other folks from #couchdb on irc.freenode.net are going to test some pretty large clusters, so it will be interesting to see how our numbers compare.

The number of threads changes the results quite a bit. Tuning may make a significant difference or none at all. The one hundred iterations of the 9.5k image take:

number of threads: [base64 seconds, bytearray seconds]
1: [2, 5]
5: [8, 21]
20: [34, 90]
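As a quick check on those figures, the bytearray runs come out between 2.5 and about 2.65 times slower than base64 at every thread count. A small sketch of the arithmetic, using only the numbers from the table (the class and method names are just for illustration):

```java
public class SlowdownSketch {
    // Ratio of bytearray time to base64 time for one row of the table.
    static double slowdown(int base64Seconds, int byteArraySeconds) {
        return (double) byteArraySeconds / base64Seconds;
    }

    public static void main(String[] args) {
        // Each row: threads, base64 seconds, bytearray seconds.
        int[][] rows = { {1, 2, 5}, {5, 8, 21}, {20, 34, 90} };
        for (int[] r : rows) {
            System.out.printf("%d threads: %.2fx slower%n", r[0], slowdown(r[1], r[2]));
        }
    }
}
```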

I’ll make another post next week after more testing is done.

The Public Domain

I just finished reading The Public Domain. Before I had even finished the book, I had purchased multiple copies online, tried to arrange to get more copies in the library [and failed], and began scheming up ways to get others to read it.

I’ve always had a community-oriented mindset. Having limits on copyright, patents and their ilk has always been an important issue to me. However, this book frames the issue from many directions, helping you see just how much we stand to lose if the tides do not change.

Songs written by Ray Charles, who played a part in the birth of soul, may never have been released in today’s environment, where copyright extends far beyond the life of the artist.

Do you remember life before Wikipedia? An excellent question: when was the last time you looked something up in a regular encyclopedia? What would the Internet be like today if we had argued about net neutrality fifteen years ago? Would you have put your faith in a world-wide band of individual software developers to change the way blue chip companies like IBM do business? Really?

The book touches on mashups in music and how the sampling you could do a few years ago is now nearly impossible. We’re not just talking about sampling new music either: copyright has been retroactively extended beyond the life of the artist so that the few copyrights with a viable business model can be maintained. That was never the reason for the monopoly power behind copyright; it exists to fuel innovation, not to create new business models. If today’s rules would have put so many musical genres of the past at risk (like soul, mentioned above), what are we losing out on because of the limits today?

What about all the music, books, and material that cannot be archived and digitized because of the copyrights? We can’t begin to fathom how immensely important this information could be to us in fifteen years. The Internet is a perfect example of an amazing source of creativity that couldn’t have been planned for in a study.

Read this book; it’s even online under a Creative Commons license. Pass it on. I’ll even send you a copy if you promise to.

stack level too deep with rcov on Ubuntu 8.10

/usr/lib/ruby/1.8/rexml/formatters/pretty.rb:129:in `wrap': stack level too deep (SystemStackError)

I’ve had this issue for a while but just started looking for a solution. There are a number of REXML workarounds in ‘/usr/lib/ruby/1.8/rcov/report.rb’ of debian rcov package version 0.8.1.2-2, written for Ruby 1.8.6. Since we’re using ubuntu ruby package 1.8.7.72-1ubuntu0.1 now, these workarounds aren’t used. The cheap workaround is to edit this file directly, changing 1.8.6 to 1.8.7 on line 15 so that it reads:

if RUBY_VERSION == "1.8.7" && defined? REXML::Formatters::Transitive

Chef 0.5.2 and Ohai 0.1.4 released

We contributed a lot of work to the latest Chef release, which made it out over the weekend. Most notably we got a lot of FreeBSD support in, and it looks like a few people are going to give that a shot. The release notes are the best source of information about what was added. As we move puppet recipes over to chef, we stumble across pieces of configuration that we’d rather have as a resource, and we try to add that support. We’re really excited about what we’re getting into Ohai. I tested support that Ben Black is working on for reading libvirt data through their ruby API, and it’s just going to be awesome. With puppet+iclassify I had some convoluted ways of getting guest information, but this implementation is going to be first class enterprise stuff.

Writing to the clipboard from the command line in Linux

I needed to paste a bunch of data to my browser to get it into a gist and didn’t want to copy and paste a page at a time. Install the ‘xsel’ package and you can use it to manipulate the clipboards.

ohai | xsel -b

This takes the output of the program and puts it on the ‘clipboard selection’ instead of the ‘primary selection’, which I needed to do to make firefox happy. You can also see the selections from the command prompt with ‘xsel -o’.