Programmatically Defer Sidekiq Jobs

Published: July 31, 2020
Sidekiq will happily chug along and process any and all jobs that are in the queue. But, what if you want it to take a break for a bit?

Why Would You Want This?

Let's say you have a Rails app processing a bunch of jobs of different types - most of them reach out to some 3rd party API. One day, that API breaks for some reason and the dead queue starts filling up. Perhaps you wish to move your app into a brief pseudo-maintenance period and want jobs to "hold up" until that period has passed. Or, maybe you really can't deal with Heroku's mandatory once-a-day dyno restarts and would rather pause jobs, restart the dyno yourself during "off hours," and then continue on.

Here are three options for setting up Sidekiq job deferrals through code.

Option One: Use the Sidekiq API

Sidekiq PRO offers an API to access a given queue to pause and unpause it. The code is rather simple and straightforward...

q = Sidekiq::Queue.new('queue_name')
q.pause!
q.paused? # => true
q.unpause!

This is great... except there are two issues:

  1. This is a Sidekiq PRO feature. While I 100% think Sidekiq PRO is worth the price if just to support Mike Perham's work, purchasing a PRO license may not be in the cards.
  2. This code applies to an entire queue. Your needs may involve targeting certain job classes to defer rather than everything going into a queue. Granted, you could work around this by putting such classes in their own queue (see the sketch below).
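If you go the dedicated-queue route, it's a one-line change on the job class. Here's a sketch assuming ActiveJob; ThirdPartyApiJob and the queue name are made-up examples:

class ThirdPartyApiJob < ApplicationJob
  # Route these jobs to their own queue so that queue can be paused on its own.
  queue_as :third_party_api

  def perform(*args)
    # Your worker code...
  end
end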

Option Two: Sidekiq Middleware

Sidekiq supports adding middleware to both the client and the server, allowing you to append data to jobs, make decisions, and report errors (this is what the Honeybadger gem uses for its Sidekiq integration). For this option, we'll make a very simple server middleware. Why server middleware? The Sidekiq server is what processes jobs, while the client is what adds jobs to the queues.

Assuming this is a relatively up-to-date Rails app and jobs are using ActiveJob...

# lib/sidekiq_middleware/defer_jobs.rb

class DeferJobs
  def call(worker, job, queue)
    if defer_job?(job)
      Rails.logger.info("Deferring #{job['class']} with arguments (#{job['args'].inspect})")
      job['class'].constantize.set(wait: 3.minutes).perform_later(*job['args'])
      return
    end

    yield
  end

  def defer_job?(job)
    # Your logic to determine if the job should be deferred.
  end
end

When the Sidekiq server pulls a job off of the queue, it will eventually hit the #call method in this class. At this point, the job has been pulled off the queue but has not been executed yet (i.e., #perform has not been called on the job class). Three arguments are passed in. We are most interested in the job argument. This hash is the standard "job payload" which Sidekiq works with throughout pretty much the entire library. The full breakdown of what's in this hash can be found on this wiki page. What we are most interested in are the "class" and "args" keys, which are a string of the job class name and an array of positional arguments to pass into the class's #perform method.
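For reference, an abbreviated job payload looks roughly like this (illustrative values only - the real hash carries more keys such as retry settings and timestamps):

{
  "class" => "ThirdPartyApiJob",
  "args"  => [42],
  "queue" => "default",
  "jid"   => "b4a577edbccf1d805744efa9"
}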

Instead of returning like a normal method, you call yield. This will pass the job arguments to the next middleware until there are no more middleware to run and #perform is called on the job class. Returning early without calling yield will result in #perform never being called - effectively sending the job to /dev/null.

In this case, we call #defer_job? in our middleware and pass in the job hash. At this point, you can do whatever business logic you wish to determine if the job should be deferred. You could look at a database table for a flag that's set, a redis key, the presence of a file, whether the job class starts with the letter "B", the current temperature in Intercourse, Pennsylvania, ... whatever you wish. Once you determine the job should be deferred, the middleware takes the job class and arguments and effectively re-enqueues it for later. In this case, we use the built-in Sidekiq job scheduling to run the job in three minutes. After three minutes, the server will pull the job off and run it through this middleware again. Repeat until the job is allowed to run.
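As a concrete example, here's a minimal sketch of #defer_job? keyed off a Redis flag. The ThirdPartyApiJob class and the jobs:deferred key are made-up names - swap in whatever signal your app already has:

def defer_job?(job)
  # Only consider deferring the jobs that talk to the flaky third-party API...
  return false unless job['class'] == 'ThirdPartyApiJob'

  # ...and only while someone has set the "jobs:deferred" flag in Redis.
  Sidekiq.redis { |redis| !redis.get('jobs:deferred').nil? }
end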

Finally, we add the middleware in the Sidekiq server setup file...

# config/initializers/sidekiq.rb

require 'sidekiq_middleware/defer_jobs'

Sidekiq.configure_client do |config|
  # ...
end

Sidekiq.configure_server do |config|
  # ...
  config.server_middleware do |chain|
    chain.add DeferJobs
  end
end

... and we're done. This approach is relatively straightforward, allows full control over what to defer and when, and integrates nicely into Sidekiq's flow.

One possible downside to this option is that it could be a little too out of the way. I don't see custom middleware in Rails apps very often, and it's possible a senior developer writes something like this, it works great for the longest time, and everyone forgets about it. Later in the app's life, other developers may be perplexed as to how this "deferral logic" works and where it happens, since it slips itself so nicely into Sidekiq.

Option Three: Good Old Fashioned Prepended Module

This is a rather "rustic" but simple approach to solving this. Instead of hooking into Sidekiq directly, take control of your own code and stop #perform from doing anything!

# app/jobs/job_deferral.rb

module JobDeferral
  def perform(*args)
    if defer_job?
      Rails.logger.info "Deferring '#{self.class}' (with args: #{args.inspect})"
      self.class.set(wait: 3.minutes).perform_later(*args)
      return
    end

    super
  end

  def defer_job?
    # Logic here...
  end
end


# app/jobs/my_worker.rb

class MyWorker < ApplicationJob
  queue_as :default
  prepend JobDeferral

  def perform(id)
    # Your worker code...
  end
end

This is very similar to the Sidekiq middleware, except we are intercepting the call to #perform with our own method via Module#prepend. The method runs through whatever logic you want to determine whether to defer, and schedules the job for later. Instead of yield, we call super, which is the original #perform method of the job class. You may also pick and choose which job classes to prepend the module to instead of every class gaining the extra logic. This allows any developer to look at a job class, see a module being prepended, and easily jump to said code to see what it does.
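If the prepend mechanics feel magical, the ancestor chain makes it clearer - the prepended module sits in front of the class itself, so its #perform runs first and super falls through to the class's own method:

MyWorker.ancestors.first(3)
# => [JobDeferral, MyWorker, ApplicationJob]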

There is one rather annoying side effect. To Sidekiq, #perform was called. Deferred or not, Sidekiq sees #perform run and complete without issue. So if you do defer the job, your Rails log will still include the "enqueued job" and "completed job" log entries from Sidekiq. This could lead to confusion while viewing logs to debug issues.

Wrapping It Up

All three options are viable and offer their own pros and cons. I personally prefer the first option, where some code can trigger a pause on queues and then unpause them later. The other two options have the benefit of being 100% free and give you more fine-grained control over the logic, but they come at the price of added complexity.

One More Note...

The astute reader may recall seeing a "Quiet" button in the non-pro Sidekiq UI for each worker process. This is NOT the same as pause. Putting a process in "quiet" mode is typically reserved for prepping Sidekiq for restarts. Once you quiet a process, you cannot "unquiet" it until Sidekiq is restarted.

Tags: ruby

Building, installing, and configuring your own Ruby RPM for CentOS

Published: July 23, 2016
If you're running your Ruby apps on your own server, you obviously need Ruby installed. The problem is the version of Ruby provided by the OS is commonly not what you want.

While Ruby managers like rbenv and rvm are great, I personally believe they should only be used for development purposes and not in a production environment. However, the packages that Linux distros provide are usually outdated. Luckily, there's another option: roll your own Ruby package. This allows you to use just about any version of Ruby you'd want AND have it handled by the distro's package manager. CentOS is my preferred distro for production, so I'll be going over the process of creating an RPM for Ruby 2.3 along with installing and configuring it on a production environment.

These instructions will work for both CentOS 6 and 7 and, in theory, should work for Red Hat Enterprise Linux.

Preparation

I prefer to create my package on a separate system from production. To accomplish this, I have a VirtualBox VM set up with the same distro version production uses (this is a requirement). In my case, I have a VM with CentOS 7 with all package updates applied. You can have either a minimal install (no GUI) or a Desktop Environment such as GNOME. All commands (both in the VM and on production) are done by a non-root user with sudo access.

The majority of the packages needed to create an RPM come from the "Development Tools" group. The group includes programs such as gcc, rpm-build, and redhat-rpm-config, among others. Not all are needed, but this is a good way to cover all your bases for basic package building in one go. All of them can be installed with the following command:

$ sudo yum groupinstall 'Development Tools'

Setting Up the Build Environment

Run these two commands to set up the needed directories and default configuration for RPM making:

$ mkdir -p ~/rpmbuild/{BUILD,BUILDROOT,RPMS,SOURCES,SPECS,SRPMS}
$ echo '%_topdir %(echo $HOME)/rpmbuild' > ~/.rpmmacros

The first command creates a series of directories in ~/rpmbuild/. The three directories we are most interested in are RPMS, SOURCES, and SPECS:

SOURCES - This directory will hold the compressed Ruby source code.
SPECS - In here, we'll put a file that describes how the package is to be built. A spec file has metadata about the package along with instructions on how to compile Ruby.
RPMS - When the package is built, the resulting .rpm file is placed in here.

The second command creates a configuration "dot file" for rpmbuild to tell it that all package-related files, by default, are all in ~/rpmbuild/.

Get Ruby Source Code and .spec File

Now we need the actual source code from ruby-lang.org, placed in the SOURCES directory. This can be done with one command (replace "2.3.1" with the version of Ruby you want to build):

$ cd ~/rpmbuild/SOURCES && wget ftp://ftp.ruby-lang.org/pub/ruby/ruby-2.3.1.tar.gz

Next, we need a spec file. You're kind of on your own at this point as there is no official spec file people use. The one I use is created by feedforce on GitHub. The repo can be viewed here, which has a spec file for Ruby versions (by switching tags) all the way back to 2.1.0. I'm going to use their latest spec file for Ruby 2.3.1.

Let's download it to our SPECS directory. After downloading, you should view it in your favorite text editor and verify it looks on the up and up. You should never blindly download and use a spec file unless you understand what it's doing.

$ cd ~/rpmbuild/SPECS && wget https://raw.githubusercontent.com/feedforce/ruby-rpm/2.3.1/ruby.spec

I'm not going to go over all the details of the file, but I'll point out a couple key areas. If you'd like to follow along by just viewing the file, you can do so here.

%define rubyver         2.3.1

# This is the name of the package.
Name:           ruby

...

# These are packages that are required to be installed when the package installs. 
# Yum handles this for us.
Requires:       readline ncurses gdbm glibc openssl libyaml libffi zlib

# These packages are needed to create the RPM (In our case, to compile Ruby). 
# Building will fail if they are not all installed
BuildRequires:  readline-devel ncurses-devel gdbm-devel glibc-devel gcc openssl-devel make libyaml-devel libffi-devel zlib-devel

# This is a reference for the package to tell it where the source code came from. 
# I use this to know where to grab the source code to put in the SOURCES directory
Source0:        ftp://ftp.ruby-lang.org/pub/ruby/ruby-%{rubyver}.tar.gz

...

# This tells yum what packages and functions our RPM will provide.
Provides: ruby(abi) = 2.3
Provides: ruby-irb
Provides: ruby-rdoc
Provides: ruby-libs
Provides: ruby-devel
Provides: rubygems

...

# These are the commands, switches, and settings used to compile Ruby.
# For example, our Ruby won't have any of the tk hooks baked in (--without-tk).
%configure \
  --enable-shared \
  --disable-rpath \
  --without-X11 \
  --without-tk \
  --includedir=%{_includedir}/ruby \
  --libdir=%{_libdir}

make %{?_smp_mflags}

%install
make install DESTDIR=$RPM_BUILD_ROOT

...

# Rest of the file

Build the Package

Ok, let's try building. We use rpmbuild to create the package.

$ rpmbuild -ba ~/rpmbuild/SPECS/ruby.spec

Hopefully, if you've been following along, this command should fail:

error: Failed build dependencies:
    readline-devel is needed by ruby-2.3.1-1.el7.centos.x86_64
    ncurses-devel is needed by ruby-2.3.1-1.el7.centos.x86_64
    gdbm-devel is needed by ruby-2.3.1-1.el7.centos.x86_64
    openssl-devel is needed by ruby-2.3.1-1.el7.centos.x86_64
    libyaml-devel is needed by ruby-2.3.1-1.el7.centos.x86_64
    libffi-devel is needed by ruby-2.3.1-1.el7.centos.x86_64
    zlib-devel is needed by ruby-2.3.1-1.el7.centos.x86_64

This is rpmbuild letting us know that there are packages we specified as build requirements (needed to compile Ruby) that aren't installed. So, let's install them and try building again.

$ sudo yum install -y readline-devel ncurses-devel gdbm-devel openssl-devel libyaml-devel libffi-devel zlib-devel
$ rpmbuild -ba ~/rpmbuild/SPECS/ruby.spec

If all goes well, the package should build without a hitch. You'll see a bunch of blonds, brunettes, and redheads scroll by in your terminal while Ruby compiles. It's normal and things should complete eventually.

When it's all done, you should have a complete rpm file in ~/rpmbuild/RPMS. Depending on your version of CentOS, the names will vary slightly...

CentOS 6: ruby-2.3.1-1.el6.x86_64.rpm

CentOS 7: ruby-2.3.1-1.el7.centos.x86_64.rpm

Install on Production

Now that you have an RPM file, we can install it on the production server. I use scp to push the file up from my build VM to the server, but any standard file transfer process will work.

scp ~/rpmbuild/RPMS/ruby-2.3.1-1.el7.centos.x86_64.rpm myserver:~/ruby-2.3.1-1.el7.centos.x86_64.rpm

SSH into production. To have yum install the package, we use the --nogpgcheck flag since we did not sign our package.

$ sudo yum --nogpgcheck install ~/ruby-2.3.1-1.el7.centos.x86_64.rpm

If installation is successful, verify both the ruby and gem binaries are available and the versions are what you expect them to be.

$ ruby -v
ruby 2.3.1p112 (2016-04-26 revision 54768) [x86_64-linux]
$ gem -v
2.5.1

Hooray! We got Ruby installed and ready to go! We're done... right?

Allow Gems to Install

Our package provides irb, ruby, and gem, but we don't have bundle available. This is because, well, bundler is a gem! So we should probably install it...

$ gem install bundler
Fetching: bundler-1.12.5.gem (100%)
ERROR:  While executing gem ... (Gem::FilePermissionError)
    You don't have write permissions for the /usr/lib64/ruby/gems/2.3.0 directory.

Huh... we can't install bundler... in fact, we can't install any gems! This is due to how Ruby is installed by default. The default path for gems is in /usr/lib64/ruby/gems/2.3.0/ which is owned by root. We could use sudo to allow gems to install... but we really shouldn't need root access to install them.

Luckily, there's another way. We're going to tell Ruby to install gems in the home directory of the user. Each user on the system will be able to install gems for themselves because they will be installed to directories where they have write permissions. We can do this globally by creating a file at /etc/profile.d/ruby.sh. All files in /etc/profile.d/ are processed for all users to set up their environment. Our ruby.sh file will contain the following two lines:

export PATH=$HOME/.gem/ruby/2.3.0/bin:$PATH
export GEM_HOME=$HOME/.gem/ruby/2.3.0

These two lines do the following: set ~/.gem/ruby/2.3.0/ as the directory where all gems for the user will be installed, and add ~/.gem/ruby/2.3.0/bin/ to PATH, which is where gem binaries (such as bundle, rake, etc) will go. Now, log out and log back in to production to get the new environment set. With all that done, gems should install no problem for all users.
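If you'd like to double-check that the new environment took hold before installing anything (optional - just one way to verify), ask rubygems where it now plans to put gems. The path will reflect your own home directory:

$ gem env gemdir
/home/youruser/.gem/ruby/2.3.0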

$ gem install bundler
Successfully installed bundler-1.12.5
Parsing documentation for bundler-1.12.5
Installing ri documentation for bundler-1.12.5
Done installing documentation for bundler after 3 seconds
1 gem installed

Success! The Ruby version we want is installed as the system Ruby and gems can be installed. Congratulations. You've successfully built your first RPM and installed it on a production environment. High five!

Tags: ruby

Migrating from unicorn to puma with no downtime

Published: June 26, 2016
I decided to migrate a Rails 4.2 side project from unicorn to puma. The app has relatively high traffic throughout the day, so I want to avoid any visible downtime.

How production is set up

  • Ruby 2.2.4
  • unicorn (20 workers)
  • nginx proxying requests through the unicorn socket
  • systemd for starting and monitoring unicorn master process
  • capistrano for code deployment
  • App runs under an unprivileged user on a VPS (meaning, this blog post is not for Heroku users)

The prep work

The app is running Rails 4.2 plus a handful of various gems. Because puma is a multithreaded web server, the app and supporting gems need to be as thread-safe as possible.

My first step was to perform a brief audit of the app code to eliminate possible code paths that are not thread-safe. Using this excellent article by Jarkko Laine as a guide, I found that the majority of the already-written app code should be thread-safe, and I made minor edits on any rough spots I came across.
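To give a flavor of what a "rough spot" might look like (a made-up example, not code from my app), lazy class-level memoization is a classic hazard once multiple threads are in play:

# Hypothetical example of a pattern to watch for before going multithreaded.
class CurrencySymbols
  # Risky under threads: two requests can race through the ||= and both
  # build the hash; if the build step also touched shared state, the
  # results could interleave unpredictably.
  def self.lazy_lookup
    @lookup ||= { "USD" => "$", "EUR" => "€" }
  end

  # Safer: build the shared data once when the class loads and freeze it,
  # so threads only ever read it.
  LOOKUP = { "USD" => "$", "EUR" => "€" }.freeze
end

puts CurrencySymbols::LOOKUP["USD"] # => $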

Confident the app code is thread-safe, I turned my attention to the 3rd party gems. I proceeded to audit each gem (and its dependencies) to see if there were any open issues regarding thread-safety. I "sledgehammered" some of the simpler gems I use, such as acts_as_list, with quick Google searches of "[gem name] thread safe" to see if anything came up. For other, larger, and more mission-critical gems like devise, I dove into their respective GitHub issues to see if there were any open reports of thread-safety problems. Only the paperclip gem came up as a red flag. The cocaine gem, which is a dependency of paperclip, had a history of not being thread-safe until version 0.5.5 (specifically this commit). For me, paperclip was the only gem I had to make sure was up to date. Your mileage may vary as everyone's gem list is different.

After updating the app code (where applicable) and gems, I committed and deployed the changes to production.

Swap out the gems and set up configuration

With the code and gems audited, we're ready to set up puma. First, I swapped out the unicorn gem for puma in the Gemfile.

-gem 'unicorn', '~> 5.1.0'
+gem 'puma', '~> 3.4.0'

I found the default puma.rb configuration file provided with Rails 5 works well enough for local development, so I included it in my repo.

For production, I use a custom file:

bind "unix:///path/to/my/app-puma.sock"
pidfile "/path/to/my/app-puma.pid"

threads 2, 2
workers 10

environment "production"

prune_bundler

directory '/path/to/my/app/current'

A few points of interest with my production configuration:

  • I'm having the puma process create a pid and a sock file. This allows systemd to monitor the process via the pid and for nginx to connect to the upstream application through the Unix socket.
  • I'm enabling workers (puma's clustering mode) with each worker process having a max of two threads. This effectively gives me the same concurrency as I had with unicorn for half the memory. As I get more confident in my thread-safety, I can begin to lower the number of workers while increasing the number of threads per worker (see the example after this list).
  • I do not use the preload_app! option. This allows me to issue rolling restarts of the application code during deploys without requiring a full restart of puma. The trade-off of this is I lose the ability to take advantage of copy-on-write.
  • The prune_bundler option allows the main puma process to run detached from bundler. This allows for gem updates during deploys.
  • Since I'm using capistrano, which uses symlinks to expose the updated app code, the directory option is needed to make the puma executable properly follow the new symlink during deploys.
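For example (illustrative numbers, not my current settings), a later step in that direction could halve the worker count again while keeping total concurrency at 20:

# Same total concurrency (5 workers x 4 threads = 20), but fewer processes.
threads 4, 4
workers 5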

The production configuration is written out to /path/to/my/app/shared/config/ and deploy.rb for capistrano is configured to symlink the production puma configuration to the right spot for deploys.

After running bundle install to install puma and remove unicorn, I committed and pushed up the changes. Note that I did not deploy.

In preparation for the change-over, I temporarily commented out the hook in config/deploy.rb that would normally tell unicorn to reload.

namespace :deploy do
  after :publishing, :restart
  desc 'Restart application'
  task :restart do
    on roles(:app), in: :sequence, wait: 5 do
      # execute "kill -s USR2 `cat /path/to/my/app-unicorn.pid`"
    end
  end
end

I then set up a new systemd service file that will manage starting and monitoring puma. I used the example systemd configuration provided by puma as a base.

[Unit]
Description=My App Puma Server
Requires=redis.service postgresql-9.4.service
Wants=redis.service postgresql-9.4.service memcached.service
After=redis.service postgresql-9.4.service

[Service]
Type=simple
User=appuser
PIDFile=/path/to/my/app-puma.pid
WorkingDirectory=/path/to/my/app/current
Environment=RAILS_ENV=production
ExecStart=/path/to/my/app/current/bin/bundle exec puma -e production -C ./config/puma.rb config.ru
ExecReload=/bin/kill -s USR1 $MAINPID
ExecStop=/bin/kill -s QUIT $MAINPID
Restart=always

[Install]
WantedBy=multi-user.target

In my service file, I specified a PIDFile so I can send signals to the pid for reloading and stopping the puma process from systemd. Note that for puma, we use USR1 for rolling restarts instead of USR2, which is what unicorn uses.
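With ExecReload wired up, a rolling restart can also be triggered through systemd itself (assuming the unit file is installed as myapp-puma.service, the name used below):

$ sudo systemctl reload myapp-puma.service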

In my nginx configuration, I pointed the upstream section at where the new puma socket will be, but I did not reload the nginx configuration at this point.

The plan is to run both unicorn and puma side-by-side. While I did not incur downtime, I temporarily sacrificed some concurrency by cutting the number of unicorn workers in half. I issued the TTOU signal to the master unicorn process ten times to decrement the number of children. The stage is now set.
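If you're following along, the decrement looks something like this (using the same placeholder pid file path as the deploy task above):

$ for i in {1..10}; do kill -s TTOU `cat /path/to/my/app-unicorn.pid`; done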

Executing the switch over

Show time! With the capistrano deploy file's restart task stubbed out, I issued a production deploy. This pushes the new Gemfile to production, installs the new puma gem, and removes unicorn. However, since the restart task does nothing, the old unicorn process is still running in memory.

Now that the new code, configurations, and gems are in place, I started up the puma cluster via systemd:

$ sudo systemctl start myapp-puma.service
# Wait a bit... let's check to make sure it's up...
$ sudo systemctl status myapp-puma.service
* myapp-puma.service - My App Puma Server
   Loaded: loaded (/etc/systemd/system/myapp-puma.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2016-06-24 02:14:19 UTC; 2 days ago
 Main PID: 4025 (ruby)
   CGroup: /system.slice/myapp-puma.service
           |-- 308 puma: cluster worker 0: 4025 [20160624015128]
           |-- 467 puma: cluster worker 1: 4025 [20160624015128]
            ....
           |-- 4025 puma 3.4.0 (unix:///path/to/my/app-puma.sock,tcp://0.0.0.0:3000) [20160624015128]

Awesome! I now have puma and unicorn running side-by-side. Now's the time for the actual switchover.

Remember before that I updated my nginx config to point to the new socket? I never reloaded the nginx daemon, so the in-memory process is still pointing to the still-running unicorn socket. Switching over is now just a simple reload command to nginx:

$ sudo systemctl reload nginx

Boom! The app still comes up in the browser. We're now running on puma without any user noticing a thing.

Cleanup

Time to say goodbye to unicorn and cement the transfer to puma.

I stopped the unicorn process and disabled the service to prevent it from trying to start up the next time the server reboots. Then, I enabled puma so it starts automatically.

$ sudo systemctl stop myapp-unicorn.service
$ sudo systemctl disable myapp-unicorn.service
$ sudo systemctl enable myapp-puma.service

Finally, I updated and committed the change to the config/deploy.rb capistrano file to issue a rolling restart of the puma cluster.

namespace :deploy do
  after :publishing, :restart
  desc 'Restart application'
  task :restart do
    on roles(:app), in: :sequence, wait: 5 do
       execute "kill -s USR1 `cat /path/to/my/app-puma.pid`"
    end
  end
end

MISSION ACCOMPLISHED

Recap

I did a lot of pre-planning and setup before I actually got puma running on production. Good planning and organization are key for a successful, seamless switchover. I was fortunate enough that this plan worked nearly flawlessly. Your experience will probably be different from mine, but hopefully what I showed you above can act as a starting point for how you do your own switchover.

Tags: ruby