Blog Archives

Writing a good README

Posted in Inside TFG, Tips and Tricks

What’s the problem?

As a developer, you’ll work on hundreds of different projects in your career. The biggest pain in the ass when changing projects is having to get started in a new development environment. If you’re moving to a different Rails project you have to install new Rubies and bundle – plus you might need to install Redis, Solr or jam your public key on a server somewhere. Even worse, after bundling you notice this is a Rails 2.3 project not a Rails 3.2 and you haven’t worked with Rails 2.3 for years and can’t remember how to start a server.

Then you have to work out who is in charge of this damn project and bug them to find out why your PostgreSQL is complaining about some “PostGIS” thing you’ve never heard of. Once you get that sorted you run the specs to see what shape the app is in and half your tests fail because you’re missing some `config/some_random_config.yml` file. You talk to the guy in charge of the project and he tells you to scp it from the staging environment. “But dude, how do I even get to the staging environment? What server is it on?”. Turns out it’s on an Amazon server and you need a special permissions file to be able to access it.

Imagine this process happens 12 months after anybody last worked on the project, and the last girl to hack on the codebase has long since left the company. Your company has just lost the knowledge she spent months gaining, and you now have to re-acquire it. That knowledge should have been documented somewhere so it wasn’t lost.

What’s the solution?

Although I did reveal the solution in the title (because I’m a bad writer, sorry) – the answer to the question is to write and maintain a comprehensive README file.

How do I write a good README?

Writing a good README is easy. You just need to know what information is required for developers to use and understand the application. Here’s some Rails-centric information I include in the READMEs I write for The Frontier Group:

README: General Information

The “General Information” section should give a new developer an idea of what the project is about and who is involved with it.

Information you might want to include is:

  • The name of the project
  • The name and contact details of the client and any 3rd party vendors
  • The names of the developers on the project
  • A brief description of the project. You should include the answer to the age-old question “What problem is this project solving?”
  • An outline of the technologies in the project. e.g.: Framework (Rails/iOS/Android/Gameboy Colour), programming language, database, ORM.
  • Links to any related projects (e.g.: Is this a Rails API that has corresponding iOS and Android clients?)
  • Links to online tools related to the application (e.g.: Links to the Basecamp project, a link to the dropbox where all the wireframes are stored, a link to the Pivotal Tracker project)

README: Getting Started

The “Getting Started” section outlines the process of getting the app installed and usable for a developer. I define ‘usable’ in this context as being able to log in to the application and access all of the functionality available.

Information you might want to include is:

  • A detailed spin-up process. This should include:
    • Instructions on installing any software the application is dependent on: e.g.: wkhtmltopdf, PostgreSQL, XQuartz.
    • Instructions on running the app. For Rails apps you’ll want to include the rake db:create db:migrate db:seed process here, as well as instructions on starting a server (e.g. are we using pow, or just the default `rails s`?)
  • A list of credentials that can be used to log in with each user type in the system and ideally the URL that a developer can log in from.
  • Any information about subdomains in the app (e.g.: api.myapp.dev/)

When writing instructions pretend you’re writing them for someone who knows next to nothing about developing in the framework/language your application uses.

README: Testing

All you need to include in the “Testing” section is the commands to run any of the test suites you have (e.g.: RSpec, Jasmine, Cucumber, Spinach) and any setup you need to do before-hand (e.g.: rake db:test:prepare). This section will be small but vital.

README: Staging and Production environments

The staging and production environment sections (one section per environment) should provide any information a developer might need to know about these environments.

Information you might want to include is:

  • Which server is the application on? Is it on Amazon Sydney? A server in the office? A data-centre down the road?
  • How can a developer connect to the server? Do they need particular permissions? Who do they need to talk to to get those permissions?
  • Where on the server is the application located?
  • What is the deploy process for this server?
  • Are there any other services on the box related to the app a developer will need to know about? Any cron jobs? Some Resque workers?

Maintaining the README

Worst case scenario, you don’t maintain a README. You’ll burn in developer hell, sorry.

Slightly better than that, your changes to the README will be reactive – you’ve had to work out how to install some required software and you’ve put the details of doing that in the README for future developers.

If you’re having to make reactive changes, you’re probably spending 30 minutes that another developer already had to spend working out a solution to the same problem. This equates to 30 completely wasted minutes that could have been spent implementing a feature for your client.

In an ideal world, maintaining the README is proactive and becomes part of your development life-cycle. As an example, your development life-cycle could look like this:

  1. Plan feature
  2. Write tests
  3. Implement functionality
  4. Update README if required
  5. Get code reviewed and ensure all your tests pass
  6. Merge your feature
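
One way to make step 4 harder to forget is a small check that runs alongside the test suite. The sketch below is only an illustration: the section names mirror the headings suggested in this post, and `missing_readme_sections` is a hypothetical helper, not part of Rails or any gem.

```ruby
# Section names this post suggests for a README (an assumption; adjust to taste).
REQUIRED_SECTIONS = [
  'General Information',
  'Getting Started',
  'Testing',
  'Staging',
  'Production'
].freeze

# Returns the required section names that do not appear in the README text.
# This is a case-insensitive substring check, so it cannot tell a heading
# from a passing mention in prose. It is a nudge, not a guarantee.
def missing_readme_sections(readme_text, required = REQUIRED_SECTIONS)
  text = readme_text.downcase
  required.reject { |section| text.include?(section.downcase) }
end
```

Calling `missing_readme_sections(File.read('README.md'))` from a spec or a rake task and failing when the result is not empty would be enough to flag a README that has fallen behind.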

What am I getting by writing a good README?

With a comprehensive, well-written README any developer should be able to hop on to your project and begin hacking away within 10 minutes. If you consider that over the course of developing an app you’ll likely see multiple developers set up the app multiple times, you’ll cumulatively save hours of developer time with just minutes of work.

Importantly: If you can set yourself up with this fantastic habit early, other developers will love working with you. Documentation is what separates us from the animals, ladies and gentlemen.

I recommend you invest 30 minutes of your week updating the READMEs of the projects you’re working on. You may not see any benefits straight away, but in 12-18 months’ time whoever has to maintain that code will be thankful you did it. Spoiler alert – that person will probably be you.

Wherefore art thou libgeos?

Posted in Inside TFG, Tips and Tricks

We have created quite a few location-aware applications for our clients over the last two years, using Rails with PostGIS on Ubuntu 10.04. However, today one of them started throwing HTTP 500 errors after working fine for a year. We investigated and found the cause to be a new error:

RGeo::Error::UnsupportedOperation (Method Surface#centroid not defined.)

followed by a stack trace. This was unusual as the code had not changed for months. A quick check at the rails console revealed a deeper issue:

1.9.3p0 :001 > RGeo::Geos.supported?
=> false

This result means that RGeo could not find the libgeos library. I reinstalled the gem with gem install rgeo, which did not fix it, but the mkmf.log file provided a hint:

/usr/bin/ld: cannot find -lgeos

It was odd, as the library had been there when the app was deployed. Searching /var/log/ for “geos” showed that the UbuntuGIS PPA we are using had updated libgeos last week, from 3.2.2 to 3.3.3. The upgrade had removed the libgeos-3.2.2 package and installed the libgeos-3.3.3 package. However, the upgrade script had not added any symlinks, so while libgeos-3.3.3.so existed in /usr/lib/, it could not be found by the linker. The fix was simple:

cd /usr/lib
sudo ln -s libgeos-3.3.3.so libgeos.so
sudo ln -s libgeos-3.3.3.so libgeos.so.1

followed by a reinstall of the rgeo gem.
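
To catch this sooner next time, a small script can check that the unversioned names still resolve. This is a minimal sketch using only Ruby's standard library; `missing_library_links` is a hypothetical helper, and the two link names are simply the ones created in the fix above.

```ruby
# Names expected to resolve in the library directory (taken from the fix above).
GEOS_LINK_NAMES = %w[libgeos.so libgeos.so.1].freeze

# Returns the names from link_names that do not resolve to a file in lib_dir.
# File.exist? follows symlinks, so a dangling symlink is reported as missing.
def missing_library_links(lib_dir, link_names = GEOS_LINK_NAMES)
  link_names.reject { |name| File.exist?(File.join(lib_dir, name)) }
end
```

Running `missing_library_links('/usr/lib')` after a package upgrade and alerting on a non-empty result would have pointed at the missing symlinks before the app started returning 500s.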

Making vsftpd with chrooted users work again on Ubuntu 12.04

Posted in Inside TFG, Tips and Tricks

At The Frontier Group, we use vsftpd with chrooted users for clients that require FTP access. It worked well for four years; however, after a recent upgrade to Ubuntu 12.04 we started receiving this error message:

500 OOPS: vsftpd: refusing to run with writable root inside chroot ()

Ben Scobie has a good overview of the problem. One solution is adding the following to your vsftpd config file:

allow_writeable_chroot=YES

Unfortunately vsftpd 2.3.5, which is packaged in Ubuntu 12.04, doesn’t support this feature. It is only available in vsftpd 3 onwards.

As an alternative solution, we have backported it from vsftpd 3 into Ubuntu’s 2.3.5 package and made it available as a vsftpd PPA on Launchpad. To use it, run the following:

sudo add-apt-repository ppa:thefrontiergroup/vsftpd
sudo apt-get update
sudo apt-get install vsftpd

Update (31 Jan 2013): @JeyeNooks has backported the feature to Ubuntu 12.10, and has uploaded his package here.

Just Add a Dash of Include to Your Spinach

Posted in Agile Development, Inside TFG, Ruby on Rails, Tips and Tricks

Recently, I switched over from using Cucumber to Spinach in my Rails and mobile projects. I have been enjoying using Spinach so far and thought I’d share a pattern I’ve been using that seems to work for me.

Now I would like to preface this by saying that:

  • I’m not an expert on the topic
  • I’m not saying Spinach is better than Cucumber (or vice versa)

One of the things I found myself doing often when I was naming Cucumber steps was writing names that became very convoluted. I did this because Cucumber scopes all steps globally and naming two steps identically will cause the feature to fail with “Ambiguous match of “step name””. This meant I had to name almost identical steps slightly differently which would get compounded the more similar steps I had.

For example:

And I fill out the object create form with valid details
And I fill out the object create form with valid details using the google auto complete
And I fill out the object create form with valid details using the google auto complete and marking it as draft

When what I really wanted to say was:

And I fill out the form with valid details

All of that other information was covered in the feature name and/or scenario description.

When I moved to Spinach I found that I wasn’t having that problem because each feature’s steps were available only to that feature. I thought this was great… until I found myself defining identical steps in 20 different files.

My friend Tony Issakov then mentioned that you could define modules, include the Spinach DSL and share them between features. I was also warned, by my friend Jordan Maguire, that if you do this too much, you can end up with step definition files that are just includes and aren’t very readable.

For Example:

Feature: Updating an existing object

Background:
  Given I am logged in as an admin
  And there is an object in the system
  And I am on the object index page

Scenario: Successfully updating object
  When I click the object's name
  And I fill out the form with valid details
  Then I should be on the object's show page
  And I should see the object's details have been updated

Scenario: Failed to update object
  When I click the object's name
  And I fill out the form with invalid details
  Then I should see the form again
  And I should see any form errors

class UpdatingAnExistingObject < Spinach::FeatureSteps
  include SharedLinks
  include SharedPath
  include SharedForm
  include SharedObject
  include SharedViews
end

While this is an extreme example you can see that it quickly becomes difficult to know what step is defined where.

After a bit of playing around I found a happy balance between too much and too little step sharing. My general rule of thumb is to only share the following types of steps:

Authentication

module SharedAuthentication

  include Spinach::DSL

  Given 'I am logged in as an admin' do
    visit new_user_session_path
    user = FactoryGirl.create :admin
    login_with user.email, 'secret'
  end

  def login_with email, password
    fill_in 'user[email]', with: email
    fill_in 'user[password]', with: password
    click_button :submit
  end

end

Paths

module SharedPath

  include Spinach::DSL

  And 'I am on the object index page' do
    visit objects_path
  end

  Then "I should be on the object's show page" do
    current_path.should == object_path(@object)
  end

  Then 'I should be on the objects index page' do
    current_path.should == objects_path
  end

end

Forms

Note that I don’t have any of the “And I fill in the form” steps in here. This is because I want to separate them out and just have ‘And I fill out the form with valid details’ in my step definition.

module SharedForm

  include Spinach::DSL

  And 'I should see any form errors' do
    page.should have_css 'p.inline-errors'
  end

  def fill_in_object_edit_form attributes = nil
    if attributes.present?
      fill_in 'object[name]', with: attributes[:name]
      # fill in other attributes
    else
      fill_in 'object[name]', with: ''
    end
    click_button :submit
  end
end

Flash

module SharedFlash

  include Spinach::DSL

  And 'I should see a confirmation message' do
    page.should have_css 'p.notice'
  end

  And 'I should see a flash alert' do
    page.should have_css 'p.alert'
  end

end

Object creation

I generally like to have one per model that handles basic object creation.

module SharedObject

  include Spinach::DSL

  And 'there is an object in the system' do
    @object = FactoryGirl.create :object
  end

  ## Where mk_two_object is another factory for Object
  And 'there is a mark two object in the system' do
    @mk_two_object = FactoryGirl.create :mk_two_object
  end

end

If we put it all together using the above example we get something that looks like this:

Feature: Updating an existing object

Background:
  Given I am logged in as an admin
  And there is an object in the system
  And I am on the object index page

Scenario: Successfully updating object
  When I click the object's name
  And I fill out the form with valid details
  Then I should be on the object's show page
  And I should see the details have been updated

Scenario: Failed to update object
  When I click the object's name
  And I fill out the form with invalid details
  Then I should see the form again
  And I should see any form errors

class UpdatingAnExistingObject < Spinach::FeatureSteps
  include SharedPath
  include SharedForm
  include SharedObject
  include SharedAuthentication

  When "I click the object's name" do
    click_link @object.name
  end

  And 'I fill out the form with valid details' do
    @attributes = FactoryGirl.attributes_for :object
    fill_in_object_edit_form @attributes
  end

  And 'I should see the details have been updated' do
    page.should have_content @attributes[:name]
    ## ...
  end

  And 'I fill out the form with invalid details' do
    fill_in_object_edit_form
  end

  Then 'I should see the form again' do
    page.should have_css "form#edit_object_#{@object.id}"
  end
end

Using this pattern I’ve found that I get the step separation that I was looking for in Spinach while keeping the shared functionality that I enjoyed in Cucumber.
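
If you’re curious why the same step text can live in two features without clashing, here’s a toy model in plain Ruby. To be clear, `MiniDSL` is a hypothetical stand-in I wrote for illustration and not Spinach’s real implementation; the class and step names are made up too. It assumes each step is stored as an instance method on the class or module that defines it.

```ruby
# Toy stand-in for a step DSL. Hypothetical: this is not Spinach's real code.
module MiniDSL
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Store each step as an instance method named after the step text.
    def step(name, &block)
      define_method(name, &block)
    end

    alias_method :Given, :step
    alias_method :And, :step
  end
end

# Shared steps live in a module, in the same spirit as SharedAuthentication.
module SharedLogin
  include MiniDSL

  Given 'I am logged in as an admin' do
    log << 'shared: logged in'
  end
end

class UpdatingFeature
  include MiniDSL
  include SharedLogin

  def log
    @log ||= []
  end

  And 'I fill out the form with valid details' do
    log << 'updating: filled form'
  end
end

class CreatingFeature
  include MiniDSL
  include SharedLogin

  def log
    @log ||= []
  end

  # Same step text as UpdatingFeature, with no clash: each class owns its steps.
  And 'I fill out the form with valid details' do
    log << 'creating: filled form'
  end
end
```

Both classes pick up the login step from the shared module, while each keeps its own version of the form step. That is the balance the pattern above is aiming for.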

I hope other people find this useful.

HTTP Status Codes and RESTful API crafting

Posted in Agile Development, Code, Inside TFG, Ruby on Rails, Tips and Tricks

These days there’s a lot of money in mobile applications – and where there’s money and new technologies there’s web developers primed and ready to argue about how best to implement these new technologies.

As one of this dedicated crowd I’ve found myself recently working with a lot of Rails RESTful APIs that talk to my mobile applications. This has given me the opportunity to both create my own RESTful APIs and use RESTful APIs written by other developers.

As a result, I’ve gained a much better understanding of the HTTP Status Codes and how they can help you write both better APIs and client applications.

For clarity, I use the term ‘client’ to refer to the mobile application that is communicating with the Rails RESTful API backend.

When I say API I’m referring to a Rails RESTful API.

Before I begin, here are some good sites for reading up on the HTTP Status Codes:

  1. ietf.org (Internet Engineering Task Force) RFC2616
  2. w3.org RFC2616
  3. httpstatus.es
  4. Wikipedia List of HTTP Status Codes
  5. REST Patterns HTTP Status Codes

Here’s a good article by Alex Rodriguez over at IBM from 2008 on REST APIs. If you aren’t clear on what a RESTful API actually is, give Alex’s article a read.

Here’s an example on how Twitter handles HTTP Status Codes and a RESTful API

For more examples you can do a quick Google Search and you’ll find the REST API docs from Microsoft and Dropbox, among others.

Finally, for the super-keen, here’s a list of the stackoverflow questions on ‘rest’ and ‘api’ sorted by votes.

Now, back on point!

Why are Status Codes important?

The status code works hand-in-hand with the response body to help the client application process the request.

Status codes tell the client application what it needs to do with the response in general terms.

For example, if your API returns a 401 you know you need a logged in user. If the API returns a 403 you know the user you’re working with doesn’t have sufficient authorization. If the API returns a 418 you know that the API is telling you it’s a teapot and was probably written by knobs.

The response body will tell your client application in specifics what it needs to do with the response. If the client is trying to update a record and the API responds with a 400, the response body will inform the client why the request failed. On the other hand, if the API responds with a 200 the response body will provide the client with the updated entity.
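
Here’s a rough sketch of that split, written in Ruby rather than in whatever your client is built with. The action names are hypothetical placeholders for client-side handlers, and the sketch assumes the body is always JSON.

```ruby
require 'json'

# Maps a status code to the general action the client should take.
# The action names are hypothetical placeholders for client-side handlers.
def action_for(status)
  case status
  when 200..299 then :use_response_body
  when 400      then :show_request_errors
  when 401      then :prompt_for_login
  when 403      then :show_not_authorized
  else               :report_unexpected_status
  end
end

# Combines the general action (from the status code) with the specifics
# (parsed from the response body).
def handle_response(status, body)
  { action: action_for(status), details: JSON.parse(body) }
end
```

The point of the design is that the client can decide what to do from the status code alone, and only then reach into the body for the details.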

That seems pretty straightforward, right? The difficulty arises when the API returns a status code and a response body that don’t match up.

I worked with an API that returned a 200 status code but the response body had an errors array in it. That was an absolute trip – I had to parse the response body and look for the presence of content in an errors array first. Then I called an error handler from within the success callback of a jQuery ajax function. This is an example of a bad RESTful API. I hope by reading the remainder of this article you can avoid doing that.

Which status codes should I be using?

As long as you’re not too pedantic status codes are easy to work with when writing an API.

Here’s a rough guide to how I handle mine:

(For the purpose of all of the examples below, any response content I refer to will be as JSON.)

(2xx) Success Codes

200 OK

The 200 OK status code can be your go-to for any successful response. There are many success codes but for a very basic API I’ve found the 200 *can* be expressive enough. The distinction between 200 and the other success codes hasn’t proved valuable in my APIs so far.

See the IETF docs on the 200 status code.

However, some other Success Codes you might be interested in:

  1. 201 Created for when you’re creating a new resource.
  2. 202 Accepted for when you’ve successfully set the request to be performed in a background task. Useful if your client is requesting something on the API that is time-consuming and you don’t want the client to have to wait.

What should I return in the 200 response content?

The content of the response is dependent on the HTTP Method used in the request.

Quoting from the IETF RFC2616 docs:

GET an entity corresponding to the requested resource is sent in the response;
POST an entity describing or containing the result of the action;

So consider an example where you are performing a request something like:

GET api.example.com/1/users?pirate_or_developer=true

Your response JSON will look like:

[{"id": 1, "name": "Jordan"}, {"id": 2, "name": "Guybrush"}]

This JSON should allow the client to update all the records they have that match the records provided in the response. Possibly, it will also allow the client to determine which records have been deleted and can prune its local storage accordingly.

Now, consider an example where you are updating a resource with a request like:

PUT api.example.com/1/users/1

Your response JSON will look like:

{"id": 1, "name": "Jordan"}

This allows the client to update its data for the record.
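
As a sketch of what the client might do with the list response from the GET example, here’s one way to merge it into local storage. `sync_records` is a hypothetical helper, local storage is modelled as a plain hash keyed by id, and pruning assumes the response is the complete result set for the query.

```ruby
# Merges a list response into the client's local records, keyed by id.
# Records in the response update or add to the local copies, keeping any
# local-only fields. Local records absent from the response are treated as
# deleted, which is only safe if the response is the complete result set.
def sync_records(local, remote)
  synced = remote.each_with_object({}) do |record, by_id|
    id = record['id']
    by_id[id] = local.fetch(id, {}).merge(record)
  end
  { records: synced, pruned_ids: local.keys - synced.keys }
end
```

Returning the pruned ids separately lets the client clean up anything else that referenced those records.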

(4xx) Client Error Codes

Here I’ll cover the 3 client error codes I find myself using most in my APIs:

400 Bad Request

The 400 Bad Request status indicates that the request ‘could not be understood by the server due to malformed syntax’. See the IETF docs.

There is also the 422 Unprocessable Entity status code that appears to be popular for roughly the same purpose. I’ve read from a couple of sources that, since the 422 is part of the WebDAV extension, 400 is preferable over 422. I don’t have the knowledge to weigh in on this argument. Perhaps someone can enlighten us in the comments section.

Some example scenarios where I’ve had an API return a 400:

  1. The client is trying to create a resource with data that fails validation rules on the API
  2. The client is requesting resources using invalid params (e.g.: the date is malformed)
  3. The client request is missing fields the API requires as indicated in the thoroughly written docs! (e.g.: a search might require an end date if a start date is passed through)

What should I return in the 400 response content?

The response of a 400 will depend on why the request failed.

Likely, you’ll want to keep a consistent format across all API end-points. I nest my failure reasons in an errors attribute in the JSON.

As an example, let’s say the client request failed because the name provided was too long. The API will respond with:

{"errors": {"name": "must be fewer than 18 characters"}}

In fact, given this is being returned with a 400 status code, you can probably do away with the errors wrapper entirely and only return:

{"name": "must be fewer than 18 characters"}

The only important thing is to be consistent across your API in how you represent these errors.
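
One way to stay consistent is to build every error response through a single helper. This is a minimal sketch using Ruby’s standard `json` library; `error_response` is a hypothetical name, and it uses the wrapped `errors` format from the first example.

```ruby
require 'json'

# Builds a [status, body] pair with every failure nested under "errors",
# so each endpoint reports problems in the same shape.
def error_response(status, errors)
  [status, JSON.generate({ errors: errors })]
end

status, body = error_response(400, { name: 'must be fewer than 18 characters' })
puts status # 400
puts body   # {"errors":{"name":"must be fewer than 18 characters"}}
```

If you later decide to drop the wrapper, there is exactly one place to change.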

401 Unauthorized

The 401 Unauthorized status indicates that the request had missing/invalid authentication credentials. See the IETF docs.

To demonstrate the use of the 401 in an API, I’ll use one of my apps as an example.

In the client application we have a login screen for entering email/password to log in. On all other pages in the application we use token-based authentication (each request must have a token passed through that is checked by the API before performing any request). When the client successfully logs in the API passes a token back that can be persisted on the client device.

In this scenario, the API will return a 401 if:

  1. The client app attempts to authenticate a user by passing invalid username or password
  2. The client app makes a request to an action that requires authentication without providing the token (To be honest, I think this has only ever happened to me in development)
  3. The client app makes a request to an action that requires authentication providing an invalid token (For example: the user tries to log in to the client app after their user has been disabled on the API)

What should I return in the 401 response content?

I’ve found that so far I’ll just return one of:

{"errors": {"invalid_credentials": "Authentication credentials provided were invalid"}}

or

{"errors": {"no_token": "No authentication token was provided in request"}}

or

{"errors": {"invalid_token": "Token provided was invalid"}}
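
The two token-related cases could be sketched like this. `authentication_failure` is a hypothetical helper, and `valid_tokens` stands in for whatever token store your API really uses.

```ruby
# Returns nil when the token is acceptable, otherwise a [status, body] pair
# describing why authentication failed. valid_tokens is a stand-in for the
# API's real token store.
def authentication_failure(token, valid_tokens)
  if token.nil? || token.empty?
    [401, { errors: { no_token: 'No authentication token was provided in request' } }]
  elsif !valid_tokens.include?(token)
    [401, { errors: { invalid_token: 'Token provided was invalid' } }]
  end
end
```

Both failures share the 401 status, so the client knows to send the user back to the login screen, while the body tells a developer which of the two problems occurred.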

403 Forbidden

The 403 Forbidden status indicates that although the request was valid the action requested failed authorization constraints. See the IETF docs.

Pretty straightforward scenario: user x tries to update resource y. User x is not authorized to update resource y.

As a note, if you are making a client where you strictly control the interactions with the API I feel that your client ever getting an authorization error is at best a code smell but more likely it’s actually just a bug.

Consider an interface that has a button that the user can interact with which returns a message like ‘You are not authorized to perform this action’. Ideally, in a mobile interface the user should never see an authorization error message since it’s not something the user can do anything to fix.

What should I return in the 403 response content?

The API will return a simple error message like:

{"errors": {"authorization": "You are not authorized to perform this action"}}

What about the rest of the client error status codes?

So I’ve given you the top 3 error status codes I use in a bit more detail but that’s not an exhaustive guide to the error codes. Here’s a whirlwind summary of the remaining error codes that I’ve used in my APIs:

  1. 404 Not Found for when the URL is wrong or the requested resource does not exist.
  2. 410 Gone for when the resource has been permanently (and intentionally) deleted. This informs the client application that it should remove any references to the resource in question.
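
Telling the two apart means the API has to remember what it deleted. Here’s a minimal sketch of the decision; `status_for_lookup` is a hypothetical helper, and `records` and `deleted_ids` are stand-ins for the API’s data store (for example, a table with soft-deleted rows).

```ruby
# Picks a status for a lookup by id. records holds live records and
# deleted_ids the ids of records that were intentionally removed; both are
# hypothetical stand-ins for the API's data store.
def status_for_lookup(id, records, deleted_ids)
  return 200 if records.key?(id)
  return 410 if deleted_ids.include?(id)

  404
end
```

If the API hard-deletes rows and keeps no record of them, it cannot distinguish the two cases and a 404 is all it can honestly return.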

Summary

I hope I’ve provided some insight into the various status codes you can leverage to create an accurate, informative API. I also hope I’ve illustrated the strong relationship between the status code and the response body and the importance of providing a descriptive response body.

If you have any thoughts or disagree with me on this let me know in the comments and we can discuss!

Thanks for reading.
