Migrating from Paperclip to ActiveStorage: a different approach
In this article we will discuss how to migrate hundreds of thousands of attachments from Paperclip to ActiveStorage without downtime.
At Sortlist, one of my first tasks after joining the team was to migrate from Paperclip to the Rails built-in ActiveStorage or to Shrine (another candidate could be CarrierWave). The reason for this is that we are adepts of keeping things up to date, and Rails 5.2 came with ActiveStorage. Paperclip had already been deprecated for some time and we wanted to move on with our lives ✈️.
We came to the conclusion that we would migrate to ActiveStorage because we don't actually need the full processing power and configuration options of Shrine. Even though (at the time of writing this article) ActiveStorage is not as mature as Shrine or CarrierWave, it has the Rails community behind it, so we were happy with that.
After reading a few articles about this on the web, the process seemed pretty straightforward, but the problem was that all the tutorials we found were dedicated to small Ruby on Rails applications with a limited number of attachments. In these cases, the migration is very fast with no downtime whatsoever. Some examples: GoRails and a RailsConf 2019 video.
Since at Sortlist we have hundreds of thousands of attachments, we can't afford to wait hours or possibly even days to migrate all of our attachments while all attachment actions are unavailable, so we had to come up with a different approach, and this is where things get interesting. I'll start explaining below, but first let's refresh our knowledge of how Paperclip and ActiveStorage work.
As you may know, Paperclip works by attaching file data to the model by changing the schema of the model. In addition to that, you can also write normal Rails validations for attachments, for example on the content type or size of a file (or some other variation).
If you need more attachment types on a specific model, then your model schema can get out of hand.
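To make the schema growth concrete, here is a minimal plain-Ruby sketch. It assumes Paperclip's conventional four columns per attachment; `paperclip_columns` is a hypothetical helper written for this illustration and is not part of Paperclip.

```ruby
# Hypothetical helper (not a Paperclip API): lists the columns that a
# Paperclip attachment conventionally stores on the model's own table.
PAPERCLIP_SUFFIXES = %w[file_name content_type file_size updated_at].freeze

def paperclip_columns(attachment_name)
  PAPERCLIP_SUFFIXES.map { |suffix| "#{attachment_name}_#{suffix}" }
end

# Every extra attachment type widens the model's table by four more columns.
columns = %w[logo picture].flat_map { |name| paperclip_columns(name) }
puts columns
```

With two attachment types the model already carries eight extra columns, which is the growth the paragraph above warns about.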
On the other hand, ActiveStorage creates 2 new database tables: ActiveStorageBlobs, which is the table holding the attachment data, and ActiveStorageAttachments, which is a polymorphic join table between the blobs table and your Rails models. This means you can also have as many attachment types as you want on your model, without ever changing the schema.
So now that we know how they work, and because we didn't want to have any unavailable attachments during the whole migration process, we decided to split the migration into 2 big steps or Pull Requests: Hidden ActiveStorage and ActiveStorage Rollout.
Hidden ActiveStorage (A)
Keep everything from Paperclip as is, but also add ActiveStorage. We are going to use both of them at the same time. This means that while the attachments are being migrated from Paperclip to ActiveStorage, if someone decides to upload an attachment, the user would still use the working Paperclip implementation (the same seamless flow the user is used to), but in the background we would also duplicate the new attachment into ActiveStorage by making use of Observers. (The user doesn't need to know that.)
ActiveStorage Rollout (B)
After the migration finishes, remove everything related to Paperclip and only use the new ActiveStorage implementation.
This logic makes sense to me, so let's do it! (Easier said than done.)
Let's move on to some coding.
Step A.1 Install ActiveStorage
We will start by installing ActiveStorage. Normally, Rails 5.2 already comes with it, so all we need to do is run:
rails active_storage:install
in the terminal.
This will generate the migrations to create the two tables mentioned above: ActiveStorageBlobs and ActiveStorageAttachments. Before migrating the tables, we add a new column called :storage_url of type string to the ActiveStorageBlobs table in the generated migration file.
The additional :storage_url column is used to store the direct URL of the attachment directly in the database. We will use this new column as a getter for the direct URL of the attachment.
Why? Because we can easily clone the database to any environment and still have working attachments. This means we don't care about different storage configurations between environments: we know that the attachments work everywhere. New attachments will only be uploaded on the specific environment, but they also work on other environments if we clone the database where the attachment was uploaded. (Great if you are using multiple environments in your development workflow.)
The second reason is that we make use of direct SQL queries for some of our pages, and this makes them easier to write, since the table contains the direct URL of the image to be used on the frontend. For example, we can write a query (or a view) that reads the URL straight from this column, or other variations. Discussing (materialized) views is out of the scope of this article, but we may discuss it in more detail in the future if you are interested.
Step A.2 Install ActiveStorage validations
Paperclip offers validations (as mentioned above) and we wanted something similar. Out of the box, ActiveStorage does not come with validations, but we found an alternative: https://github.com/igorkasyanchuk/active_storage_validations
From the gem's README: "If you are using active_storage gem and you want to add simple validations for it, like presence or content_type you need to write a custom validation method. This gems doing it for you. Just use attached: true or content_type: 'image/png' validation."
Even though we will not be using the gem yet, we can move on by just installing it: add it to our Gemfile and then run bundle install:
gem "active_storage_validations", "~> 0.6.1"
Step A.3 Configure Cloud Storage Provider & ActiveStorage
In this step we will configure our cloud storage provider. Going into details for this specific task is beyond the scope of this blog post, so to sum it up, we can just configure a new bucket and permissions on AWS S3 (we are using AWS, but the process is similar for other storage providers) and add the necessary environment variables to the Rails project to allow access to the new AWS bucket.
For example, at Sortlist, for Paperclip we were using AWS_S3_BUCKET_NAME, AWS_S3_REGION etc. In addition to these, for ActiveStorage we created AWS_AS_S3_BUCKET_NAME, AWS_AS_S3_REGION etc. As you might guess, AS refers to ActiveStorage, so we know which keys are for which service.
In Rails 5.2 you should already have a config/storage.yml file. We want ActiveStorage to make use of the newly created AWS credentials, and this is the place to do it. After editing, the file should define an amazon service that reads the new environment variables.
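As a rough illustration, the settings that the amazon entry carries can be written out as a Ruby hash. This is a sketch, not the article's actual file: AWS_AS_S3_BUCKET_NAME and AWS_AS_S3_REGION come from the text above, while the two credential variable names and the helper `active_storage_s3_settings` are assumptions made for this example.

```ruby
# Builds the settings that the `amazon` entry in config/storage.yml would hold.
# AWS_AS_S3_BUCKET_NAME and AWS_AS_S3_REGION are named in the article; the two
# credential variable names below are assumptions made for this sketch.
def active_storage_s3_settings(env = ENV)
  {
    service: "S3",
    access_key_id: env.fetch("AWS_AS_ACCESS_KEY_ID"),
    secret_access_key: env.fetch("AWS_AS_SECRET_ACCESS_KEY"),
    region: env.fetch("AWS_AS_S3_REGION"),
    bucket: env.fetch("AWS_AS_S3_BUCKET_NAME")
  }
end

# Example values only; a real project would read them from the environment.
sample_env = {
  "AWS_AS_ACCESS_KEY_ID" => "example-key-id",
  "AWS_AS_SECRET_ACCESS_KEY" => "example-secret",
  "AWS_AS_S3_REGION" => "eu-west-1",
  "AWS_AS_S3_BUCKET_NAME" => "example-active-storage-bucket"
}
puts active_storage_s3_settings(sample_env)
```

Using fetch instead of [] means a missing variable raises a KeyError immediately, rather than surfacing later as a failed upload.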
We are not done yet. We must tell Rails to use ActiveStorage, so we need to open the production.rb file and edit/add this line:
config.active_storage.service = :amazon
And finally we are finished with all the configurations involving ActiveStorage.
Step A.4 Migration Rake Task
Most of the information/tutorials that we've found on the web do this directly in a Rails migration, but as discussed, this can be a long-running action, so we moved it into a rake task that works as follows:
- There are multiple ways to execute this rake task, some of which include running it through background jobs (with Sidekiq, for example) or Ruby threads. At Sortlist, an unwritten internal rule we try to abide by is that we don't really want to fill Redis with these kinds of long-running tasks, so we went with the second option: Ruby threads. As you may know, there are certain situations where Ruby threads can be used, and this is one of them. Because going into detail on the above options is beyond the scope of this article, and for the sake of simplicity, we will keep a simple version of the rake task. We can discuss it in a future article if you are interested.
- Firstly, we gather all the models (with the exception of abstract classes) that are used in our application into an array.
- Secondly, we iterate over the array and check if the schema of the model contains a column_name matching the Regex containing file_name, and if it does, we save the matching columns into an array. For example, a model can have a logo_file_name, picture_file_name etc.
- The next step is skipped if the model doesn't have any columns that match the above example.
- For the found columns, we iterate over them and create an ActiveStorage record only if ActiveStorage does not already contain that record. The reason for this is that if we cancel the rake task or it crashes, we can restart it and it will continue from where it left off, saving us time in the end (if you have hundreds of thousands of attachments).
- The code for :duplicate_active_storage is covered in the next step, where I will also explain the use of the @is_migration instance variable.
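The steps above can be sketched in plain Ruby, without Rails, to show the shape of the logic. This is an illustration under stated assumptions, not the article's rake task: `attachment_names` and `migrate_in_threads` are hypothetical names, the record ids and columns are sample data, and the `already_migrated` set stands in for the check against existing ActiveStorage records.

```ruby
require "set"

# Derive attachment names from Paperclip's *_file_name columns.
def attachment_names(column_names)
  column_names.grep(/_file_name\z/).map { |column| column.sub(/_file_name\z/, "") }
end

# Walk every (record id, attachment name) pair on a few Ruby threads, skipping
# pairs that already exist on the ActiveStorage side, so that an interrupted
# run can simply be started again.
def migrate_in_threads(work_items, already_migrated, thread_count: 4)
  queue = Queue.new
  work_items.each { |item| queue << item }
  queue.close # once the queue is drained, pop returns nil and workers stop

  lock = Mutex.new
  migrated = []
  workers = Array.new(thread_count) do
    Thread.new do
      while (item = queue.pop)
        next if lock.synchronize { already_migrated.include?(item) }

        # A real task would call the model's duplicate_active_storage here.
        lock.synchronize do
          already_migrated << item
          migrated << item
        end
      end
    end
  end
  workers.each(&:join)
  migrated
end

names = attachment_names(%w[id name logo_file_name logo_content_type picture_file_name])
items = [1, 2, 3].product(names)   # sample record ids paired with attachments
done = Set[[1, "logo"]]            # pretend this pair survived an earlier run
migrated = migrate_in_threads(items, done)
puts "migrated #{migrated.size} of #{items.size} attachments"
```

Because completed pairs are recorded and checked before any work is done, running the same call a second time migrates nothing, which is the restart behaviour described in the list.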
Step A.5 ActiveSupport Concern
- The duplicate_active_storage.rb concern is the one being used by the previous rake task to create the ActiveStorage attachment. This concern will also be included (include DuplicateActiveStorage) in the models that contain attachments (those models which are configured to use Paperclip).
- In the previous rake task, we set the instance variable @is_migration to make sure that we don't trigger an after_commit callback after creating the ActiveStorage attachment (which would result in an infinite loop).
- The duplicate_active_storage method gathers the attachment columns of the models for which we want to duplicate the Paperclip attachments into ActiveStorage, and for each one it creates the ActiveStorage attachment based on some conditions.
Conditions for creation:
- If the record was updated, we check if the Paperclip attachment was updated. If it was updated, then we also update the ActiveStorage attachment. If the attachment wasn't updated, we can skip all actions related to ActiveStorage. This happens in the case of an after_commit callback.
- In the case of an after_destroy callback, we check that the instance of the model was not deleted (we use the acts_as_paranoid gem for soft deletion of records, hence the try(:deleted?) check, because we are paranoid). If the record was deleted, we will also remove it from ActiveStorage.
- The rest of the logic of creating an ActiveStorage attachment is standard and is very similar to all the other tutorials on the web regarding this topic, but for completeness, I'll briefly sum it up. We construct the path for the direct URL of the attachment using the key method. This is actually the direct URL of the Paperclip attachment (the one from the old S3 bucket). We then pass this direct URL to ActiveStorage, which will first download it and then upload it into the newly configured S3 bucket. The downloading is necessary because ActiveStorage does not know how to represent the attachment, so it needs to analyse it first. We then create the associated polymorphic ActiveStorage attachment record and we are good to go ☀️.
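The conditions above can be read as a small decision table. The plain-Ruby sketch below is one interpretation of them, not the article's concern: the function name `active_storage_action`, its keyword arguments and the returned symbols are all hypothetical.

```ruby
# One reading of the conditions as a pure function (all names are illustrative).
# trigger            - what invoked the duplication (:rake_task, :after_commit, :after_destroy)
# attachment_changed - whether the Paperclip attachment changed in this update
# deleted            - result of the soft-delete check, i.e. try(:deleted?)
# is_migration       - stands in for the @is_migration instance variable
def active_storage_action(trigger:, attachment_changed: false, deleted: false, is_migration: false)
  case trigger
  when :rake_task
    :attach # the rake task calls the duplication directly
  when :after_commit
    # During the migration the callback must not fire again for the same record.
    return :skip if is_migration

    attachment_changed ? :attach : :skip
  when :after_destroy
    deleted ? :purge : :skip
  else
    :skip
  end
end

puts active_storage_action(trigger: :after_commit, attachment_changed: true)
puts active_storage_action(trigger: :after_commit, attachment_changed: true, is_migration: true)
puts active_storage_action(trigger: :after_destroy, deleted: true)
```

Keeping the decision separate from the side effects (download, upload, purge) makes each branch easy to check on its own.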
Step A.6 Update the storage_url of the ActiveStorage blob
After the attachment has been attached, we have to figure out what to do with the custom column that we created earlier (storage_url). The most intuitive way to update that column is to create a callback on the ActiveStorage::Blob model. We tried that first, but it seemed that nothing happened. After some research into the problem, we realised it was not easily achievable and we would have to monkey-patch ActiveStorage.
Monkey-patching is generally a bad idea and is recommended only if you really know what you are doing. ActiveStorage is quite new and we are expecting a lot of improvements in the near future, so we decided against it.
But there is some good news ahead. We were already making use of the Observer pattern quite a lot in our app, so we gave it a try, and we have a winner!
The observer pattern is implemented in our project by making use of the Rails Observer gem (rails-observers).
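To show the idea without depending on Rails, here is a minimal hand-rolled observer in plain Ruby. It is a sketch of the pattern only, not the rails-observers API and not the article's observer: `Blob`, `BlobStore` and `BlobStorageUrlObserver` are hypothetical, and building the URL as base URL plus key is an assumption.

```ruby
# Stand-in for an ActiveStorage blob row with the custom storage_url column.
Blob = Struct.new(:key, :storage_url)

# Reacts to blob creation from the outside, so the blob class itself stays untouched.
class BlobStorageUrlObserver
  def initialize(base_url)
    @base_url = base_url
  end

  # Mirrors an after_create hook: fill the custom column once the blob exists.
  # Assumption: the direct URL is the bucket's base URL followed by the key.
  def after_create(blob)
    blob.storage_url = "#{@base_url}/#{blob.key}"
  end
end

# Tiny subject that notifies its observers, playing the role of the framework.
class BlobStore
  def initialize
    @observers = []
  end

  def add_observer(observer)
    @observers << observer
  end

  def create(key)
    blob = Blob.new(key, nil)
    @observers.each { |observer| observer.after_create(blob) }
    blob
  end
end

store = BlobStore.new
store.add_observer(BlobStorageUrlObserver.new("https://example-bucket.s3.amazonaws.com"))
blob = store.create("abc123")
puts blob.storage_url
```

The point of the pattern here is that the column gets filled without reopening or patching the class that owns the record.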
Hidden ActiveStorage (A) conclusion:
With the above code in place, thanks to the concern that we've created, no matter which action is taken (create/update/destroy) through Paperclip, it will also be handled by ActiveStorage in the background. In addition, the ActiveSupport::Concern is used by the rake task to migrate the old attachments (in our case it took about 2 to 3 days, which includes some minor fixes we had to do along the way). So basically we achieved DRY code and no downtime (our initial goal for this step of the process ⚽️).
Let's move on to ActiveStorage Rollout (B).
Step B.1 Remove Paperclip
Now that the data is ready, thanks to the first step, we can completely remove Paperclip and roll out ActiveStorage as the main and only way of storing attachments. This step is actually quite short and straightforward. The steps that we need to take here are:
- Remove the Paperclip gem from the Gemfile and run bundle install.
- Remove any Paperclip configurations from production.rb / development.rb or other environments you may have.
- Change any references of has_attached_file in your models to the ActiveStorage equivalent (e.g. has_attached_file becomes has_one_attached etc.).
- Make sure you update the model validations to make use of the new version from the active_storage_validations gem we installed earlier.
- Create a migration to drop all the Paperclip columns from your models (for example logo_file_name and the other columns Paperclip added for that attachment).
- At this point we can also remove the rake task, the duplicate_active_storage.rb file and the include DuplicateActiveStorage from the models in which it was being used.
And this will conclude our journey regarding the migration from Paperclip to ActiveStorage.
Hopefully this made sense to you, but feel free to leave any comments if you think that some points may need further explanation or if anything is unclear.
TL;DR
Interested in joining the product & engineering team of a fast-growing tech start-up, and you want to use top-notch technology? Sortlist is always looking for talented people to join us in Belgium or Romania. Check out our open positions.
Source: https://medium.com/sortlist-engineering/migrating-from-paperclip-to-activestorage-a-different-approach-4bffa2260e17