I have a daemon written in Ruby that is part of a Rails project and uses some of its models (and so ActiveRecord). It sits waiting for files to arrive; when they do, it extracts data from them and creates plots, lots of plots, so I create them in parallel with the parallel gem, essentially like this:
ActiveRecord::Base.connection.close
Parallel.each(items, in_processes: num_procs) do |item|
  ActiveRecord::Base.connection_pool.with_connection do
    # create plot and save details to DB
  end
end
ActiveRecord::Base.connection.reconnect!
This all worked fine with Rails 5 and Ruby 2.7; after updating to Rails 6 and Ruby 3.1, I get segfaults at the end of the parallel run:
#0 0x00007fa1eb461818 in ?? () from /lib/x86_64-linux-gnu/libruby-3.1.so.3.1
#1 0x00007fa1eb448333 in ?? () from /lib/x86_64-linux-gnu/libruby-3.1.so.3.1
#2 0x00007fa1eb448789 in rb_source_location_cstr () from /lib/x86_64-linux-gnu/libruby-3.1.so.3.1
#3 0x00007fa1eb2b227d in ?? () from /lib/x86_64-linux-gnu/libruby-3.1.so.3.1
#4 0x00007fa1eb3d836b in ?? () from /lib/x86_64-linux-gnu/libruby-3.1.so.3.1
#5 <signal handler called>
#6 0x00007fa1eb2d6666 in ruby_sized_xfree () from /lib/x86_64-linux-gnu/libruby-3.1.so.3.1
#7 0x00007fa1e6460e4a in xmlCleanupEncodingAliases () from /home/svdv/svdv/vendor/bundle/ruby/3.1.0/gems/nokogiri-1.16.5-x86_64-linux/lib/nokogiri/3.1/nokogiri.so
#8 0x00007fa1e6461866 in xmlCleanupCharEncodingHandlers () from /home/svdv/svdv/vendor/bundle/ruby/3.1.0/gems/nokogiri-1.16.5-x86_64-linux/lib/nokogiri/3.1/nokogiri.so
#9 0x00007fa1e64855a9 in xmlCleanupParser () from /home/svdv/svdv/vendor/bundle/ruby/3.1.0/gems/nokogiri-1.16.5-x86_64-linux/lib/nokogiri/3.1/nokogiri.so
#10 0x00007fa1e4e4c4c3 in NCDISPATCH_finalize () from /lib/x86_64-linux-gnu/libnetcdf.so.19
#11 0x00007fa1e4e43fed in nc_finalize () from /lib/x86_64-linux-gnu/libnetcdf.so.19
#12 0x00007fa1eaf7e55d in __run_exit_handlers (status=0, listp=0x7fa1eb112820 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at ./stdlib/exit.c:116
#13 0x00007fa1eaf7e69a in __GI_exit (status=<optimized out>) at ./stdlib/exit.c:146
#14 0x00007fa1eb2bb9c0 in ruby_stop () from /lib/x86_64-linux-gnu/libruby-3.1.so.3.1
So this is happening in glibc’s __run_exit_handlers in the subprocesses, after all my plots are created and their details saved; it does not affect the plotting.
Now I’ve been around the block and know that segfaults in __run_exit_handlers are a bit of a swine to debug and fix, so if I could skip it altogether by calling POSIX’s _exit instead of C’s exit, that could give me a quick fix. I guessed that Ruby’s exit! would do that, and I find that
at_exit do
  exit!
end
does indeed stop the segfaults.
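For reference, here is a minimal sketch of the behaviour I am relying on, using plain fork rather than the parallel gem. The helper name child_output is made up for illustration, and I pass an explicit status of 0 to both calls so the comparison does not depend on either method's default. It only shows that Ruby-level at_exit handlers are skipped by exit!; I am assuming the C-level handlers in the backtrace are skipped in the same way.

```ruby
# Run the given block in a forked child that has an at_exit handler
# registered, and return whatever that handler wrote to a pipe.
# (child_output is a hypothetical helper, not part of any library.)
def child_output
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    # This handler only runs if the child goes through the normal exit path.
    at_exit { writer.write("handler ran") }
    yield
  end
  writer.close
  Process.wait(pid)
  output = reader.read
  reader.close
  output
end

with_exit = child_output { exit(0) }   # normal exit: handlers run
with_bang = child_output { exit!(0) }  # immediate exit: handlers skipped

puts "exit:  #{with_exit.inspect}"
puts "exit!: #{with_bang.inspect}"
```

In this sketch the child that calls exit reports that its handler ran, while the child that calls exit! writes nothing.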
But this “fix” gives me a deep sense of unease, so my question: Is this a bad thing to do? Could there be consequences I’ve not appreciated? Is there a better way to do the same thing?