Lately we’ve had to change how several documents in a project’s MongoDB database were related to each other. Here’s how we did it.
About embedded documents
One of MongoDB’s features is the ability to have documents embedded inside other documents, which makes sense when modelling certain problems in the domain of the application. However, this has the disadvantage of not being able to query the embedded document by itself. An example:
class Student
  include Mongoid::Document
  #...
  embeds_one :referee
  #...
end

class Referee
  include Mongoid::Document
  #...
  embedded_in :student
  #...
end
This allows you to have an instance of Student on which you can do student.referee, but won’t let you do a Referee.find('1233456'). That is, to get to any referee, you need to go through the parent student.
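To picture why, here is a rough sketch in plain Ruby, with no Mongoid or database involved, of how such a student might be stored. The hash layout, the sample data and the find_referee helper are all made up for illustration and are not part of Mongoid. The point is that the referee lives inside the student document, so the only way to reach it is to walk the students.

```ruby
# Illustrative only: a plain-Ruby stand-in for the students collection.
STUDENTS = [
  { "_id" => "s1", "name" => "Alice",
    "referee" => { "_id" => "r1", "email" => "ref@example.com" } },
  { "_id" => "s2", "name" => "Bob" } # no referee embedded
].freeze

# Hypothetical helper: there is no referees collection to query, so we
# have to scan every student and look inside it.
def find_referee(students, referee_id)
  students.each do |student|
    referee = student["referee"]
    return referee if referee && referee["_id"] == referee_id
  end
  nil
end
```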
What we needed
In our application, we needed a Referee to be tied to several Students at the same time. That meant switching to something like this:
class Student
  include Mongoid::Document
  #...
  belongs_to :referee
  #...
end

class Referee
  include Mongoid::Document
  #...
  field :email
  #...
  has_many :students
  #...
end
Trying to have both relationships at the same time on both models would not work, so we needed to “migrate” our existing referees into this new relationship. After some research and trial and error, we devised the following process, which could easily be adapted to other relationship changes:
NOTE This was written for a Rails 3.2 application running Mongoid v3. It might not work out of the box for newer or (gasp!) older versions.
- Rename the embedded model to embedded_model_name.
- Create a copy of that embedded model, with the original model_name, and modify it so it is not embedded_in, but instead has_many.
- Modify the parent model, so it embeds_one :embedded_model_name and also belongs_to :model_name. Remember to modify the names also in clauses like accepts_nested_attributes_for and validates_associated.
So far, they would look like this:
class Student
  include Mongoid::Document
  #...
  belongs_to :referee
  embeds_one :embedded_referee
  #...
end

class Referee
  include Mongoid::Document
  #...
  field :email
  #...
  has_many :students
  #...
  validates :email, uniqueness: true
end

class EmbeddedReferee
  include Mongoid::Document
  #...
  field :email
  #...
  embedded_in :student
  #...
end
On the new Referee model, we add a validation for the uniqueness of the email address, as we want it to identify each Referee in the system.
Now we would need a Mongoid migration or a Rake task to perform the migration. Basically, we need to make the Student model aware of the change of the embedded model name, and populate our new Referee collection with the data from the old embedded referees.
namespace :change_relationships do
  desc "Creates referees from the existing embedded ones"
  task create_referees: :environment do
    # First, rename the embedded referee in the student
    Student.all.each { |s| s.rename :referee, :embedded_referee }
    # Now, doing student.embedded_referee will give us the embedded document
    # Then, create the collection of Referees for the Students
    Student.all.each do |student|
      if student.embedded_referee.present?
        old_referee = student.embedded_referee
        # Now we create a new Referee based on the email
        # or retrieve an already existing one
        new_referee = Referee.find_or_create_by(email: old_referee.email)
        # Now you need to copy over the embedded_referee fields
        new_referee.set(:first_name, old_referee.first_name)
        new_referee.set(:last_name, old_referee.last_name)
        # ...
        # Then set the belongs_to association
        student.set(:referee_id, new_referee.id)
      end
    end
  end
end
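To make the effect on the stored data easier to see, the same two steps can be simulated on plain Ruby hashes, without Mongoid or a database. This is only a sketch: the migrate_referees method, the hash keys and the generated "r1"-style ids are assumptions for illustration, and the referees hash merely stands in for the new collection.

```ruby
# Plain-Ruby simulation of the task's logic; no Mongoid involved.
# Returns a hash of referees keyed by email and mutates the students.
def migrate_referees(students)
  referees = {} # stands in for the new referees collection

  students.each do |student|
    # Step 1: rename the embedded key, like the rename call in the task.
    student["embedded_referee"] = student.delete("referee") if student.key?("referee")

    old = student["embedded_referee"]
    next if old.nil? # students without a referee are left alone

    # Step 2: find or create one referee per email...
    referee = (referees[old["email"]] ||= {
      "_id" => "r#{referees.size + 1}", "email" => old["email"]
    })
    # ...copy the remaining fields over, and link the student to it.
    referee["first_name"] = old["first_name"]
    student["referee_id"] = referee["_id"]
  end

  referees
end
```

In this sketch, two students whose embedded referees share an email end up pointing at the same referee, which is the behaviour we want from find_or_create_by in the real task.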
After running this task, we will have a brand new collection of Referees with the relationship we wanted with the Students. However, a few things are worth noting:
- For the student.embedded_referee.present? check to work properly, make sure that on your Student model you have disabled or removed the autobuild option on the embeds_one clause. This way we avoid getting empty embedded_referees and creating empty referees (or getting validation errors on the new ones because they have no data, if we have introduced validations).
- When copying over the data from the embedded_referees to the referees, you might want to follow a strategy that overwrites the data of an existing referee only if the one you are migrating is newer. It’s up to you and what you need.
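One possible shape for that “only if newer” strategy, sketched in plain Ruby on hashes. It assumes both records carry an updated_at timestamp (for example because the models include Mongoid::Timestamps, which the original models may or may not do); merge_if_newer and the default field list are hypothetical names for this sketch.

```ruby
# Hypothetical helper: returns the attributes the shared referee should
# end up with. The incoming (embedded) data only wins when it is newer
# than what we already have, or when we have no timestamp at all.
def merge_if_newer(existing, incoming, fields = %w[first_name last_name])
  existing_time = existing["updated_at"]
  incoming_time = incoming["updated_at"]

  newer = existing_time.nil? ||
          (!incoming_time.nil? && incoming_time > existing_time)
  return existing unless newer

  # Copy only the chosen fields (plus the timestamp) over the existing ones.
  existing.merge(incoming.slice(*fields, "updated_at"))
end
```

In the Rake task, the result of such a helper would decide which values get passed to the set calls.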
Next steps
Obviously, check that the relationship is working properly from both sides. You’ll need to fix lots of tests now, by the way.
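As a rough idea of what such a check could look for, here is a plain-Ruby sketch over hashes, not Mongoid code. The unmigrated_students helper and the hash keys are assumptions for illustration: it lists the students that still carry an embedded referee but do not point at any known referee.

```ruby
# Hypothetical sanity check: which students were missed by the migration?
# students    - array of hashes standing in for student documents
# referee_ids - ids of the referees that exist in the new collection
def unmigrated_students(students, referee_ids)
  students.select do |student|
    student["embedded_referee"] && !referee_ids.include?(student["referee_id"])
  end
end
```

An empty result would suggest that every embedded referee has a counterpart in the new collection, which is the condition the cleanup step relies on.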
Once you’re done and you’re sure everything is a-ok, you can have a cleanup task like this:
desc "Remove old embedded referees from the students"
task cleanup_embedded_referees: :environment do
  Student.all.each do |student|
    # Make sure the new referee is there before removing
    if student.embedded_referee.present? && student.referee.present?
      # Finally, remove the embedded model
      student.embedded_referee.remove # or delete
    end
  end
end
And finally, remove any mention of embedded_referee from the Student model, and delete the EmbeddedReferee model altogether.
Conclusion
I hope this post is helpful to you, that the method is explained well enough for you to adapt it to other relationships (embeds_many, has_one, …), and that it doesn’t need much tinkering to work on newer versions of Mongoid.
Picture ‘Sliced Chioggia Beets, Mitosis’ by Ano Lobb flickr, used under CC BY 2.0 license.