Codebrahma

Work

Implementing Upsert in Rails: Tutorial

An upsert is update or insert. Upsert in database terms refers to an operation which should insert a row into a database table if it doesn’t not already exist, or update it if it does. One can implement this function in Rails by writing __anActiveRecordExtension__. This will allow you to call the upsert method on any model.

The ActiveRecordExtension

  1. Create a file active_record_extension.rb in app/extensions.
module ActiveRecordExtension
  extend ActiveSupport::Concern

  #All extensions go here.

  def self.upsert(attributes)
    #Upsert implementation goes here.
  end
end

ActiveRecord::Base.send(:include, ActiveRecordExtension)
  1. Set up an initializer extenstions.rb in config/initializers which initializes your extension.
require "active_record_extension"

As I see it there are two ways to implement upsert. Lets look at both and run through some numbers.

1. Playing it safe

def upsert(attributes)
  where(:primary_key => attributes['primary_key']).
  first_or_initialize.
  update(attributes)
end

2. Rescue from failure

def upsert(attributes)
  begin
  	create(attributes)
  rescue ActiveRecord::RecordNotUnique, PG::UniqueViolation => e
    find_by_primary_key(attributes['primary_key']).
    update(attributes)
  end
end

So just after looking at the code, you’d notice that the first implementation makes 2 queries everytime regardless of whether the record is present or not. The second implementation makes 1 query when the record is not present, but3 queries if the record is present.

Some numbers

Ideally an INSERT or UPDATE query takes around 100ms. However I’m stubbing the create, update and where methods with sleep(0.00001) for benchmarking. Actual numbers will be 10,000 times higher.

n = 5000

case | user                      | system    | total     | real
------------------------------------------------------------------------------
  1  | new-record-upsert-1       | 1.020000  | 0.050000  | 1.070000 (1.232924)
  2  | new-record-upsert-2       | 0.020000  | 0.020000  | 0.040000 (0.098955)
  3  | existing-record-upsert-1  | 0.890000  | 0.060000  | 0.950000 (1.098148)
  4  | existing-record-upsert-2  | 1.100000  | 0.070000  | 1.170000 (1.384870)
  5  | random-upsert-1           | 1.010000  | 0.150000  | 1.160000 (1.311646)
  6  | random-upsert-2           | 0.560000  | 0.060000  | 0.620000 (0.764272)

The numbers seem to suggest that Implementation#2 is faster. The best case scenario is 37% faster while the worst case is only 18% slower. Even when best/worst case scenarios are randomized Implementation#2 wins hands down.

Gist: upsert.rb

Refrences: nithinkrishna.github.io

Written by
Nithin Krishna
Posted in
Tutorial
Tags
#ruby-on-rails-development
#upsert-in-rails
If you want to get more posts like this, join our newsletter