
hydra-derivatives's People

Contributors

aploshay, atz, awead, barmintor, bbpennel, bess, botimer, carolyncole, cbeer, cjcolvar, dchandekstark, dlpierce, flyingzumwalt, grosscol, gwiedeman, jcoyne, jechols, jenlindner, jeremyf, jrgriffiniii, jrochkind, kevinreiss, lbiedinger, mbklein, mjgiarlo, rodyoukai, stkenny, tampakis, tpendragon, val99erie

hydra-derivatives's Issues

Unlinking a tempfile?

     No such file or directory @ unlink_internal - /tmp/20516b3c-9efa-4b2e-b471-33fbdd822da7-120150827-22031-3paka9.pdf

    /opt/goldenseal/shared/bundle/ruby/2.2.0/gems/hydra-derivatives-2.0.0/lib/hydra/derivatives/document.rb:20:in `unlink'
    /opt/goldenseal/shared/bundle/ruby/2.2.0/gems/hydra-derivatives-2.0.0/lib/hydra/derivatives/document.rb:20:in `block in encode_file'
    /opt/goldenseal/shared/bundle/ruby/2.2.0/gems/hydra-derivatives-2.0.0/lib/hydra/derivatives/services/tempfile_service.rb:34:in `block in default_tempfile'
    /usr/local/lib/ruby/2.2.0/tempfile.rb:319:in `open'
    /opt/goldenseal/shared/bundle/ruby/2.2.0/gems/hydra-derivatives-2.0.0/lib/hydra/derivatives/services/tempfile_service.rb:25:in `default_tempfile'
    /opt/goldenseal/shared/bundle/ruby/2.2.0/gems/hydra-derivatives-2.0.0/lib/hydra/derivatives/services/tempfile_service.rb:20:in `tempfile'
    /opt/goldenseal/shared/bundle/ruby/2.2.0/gems/hydra-derivatives-2.0.0/lib/hydra/derivatives/services/tempfile_service.rb:7:in `create'
    /opt/goldenseal/shared/bundle/ruby/2.2.0/gems/hydra-derivatives-2.0.0/lib/hydra/derivatives/document.rb:14:in `encode_file'
    /opt/goldenseal/shared/bundle/ruby/2.2.0/gems/hydra-derivatives-2.0.0/lib/hydra/derivatives/shell_based_processor.rb:22:in `block in process'
    /opt/goldenseal/shared/bundle/ruby/2.2.0/gems/hydra-derivatives-2.0.0/lib/hydra/derivatives/shell_based_processor.rb:17:in `each'
    /opt/goldenseal/shared/bundle/ruby/2.2.0/gems/hydra-derivatives-2.0.0/lib/hydra/derivatives/shell_based_processor.rb:17:in `process'
    /opt/goldenseal/shared/bundle/ruby/2.2.0/gems/hydra-derivatives-2.0.0/lib/hydra/derivatives.rb:104:in `transform_file'
    /opt/goldenseal/shared/bundle/ruby/2.2.0/bundler/gems/hydra-works-dc0bfd08eb24/lib/hydra/works/models/concerns/generic_file/derivatives.rb:17:in `block (2 levels) in <module:Derivatives>'
    /opt/goldenseal/shared/bundle/ruby/2.2.0/gems/hydra-derivatives-2.0.0/lib/hydra/derivatives.rb:66:in `call'
    /opt/goldenseal/shared/bundle/ruby/2.2.0/gems/hydra-derivatives-2.0.0/lib/hydra/derivatives.rb:66:in `block in create_derivatives'
    /opt/goldenseal/shared/bundle/ruby/2.2.0/gems/hydra-derivatives-2.0.0/lib/hydra/derivatives.rb:64:in `each'
    /opt/goldenseal/shared/bundle/ruby/2.2.0/gems/hydra-derivatives-2.0.0/lib/hydra/derivatives.rb:64:in `create_derivatives'
    /opt/goldenseal/shared/bundle/ruby/2.2.0/bundler/gems/curation_concerns-9b49d4d66ed5/curation_concerns-models/app/jobs/create_derivatives_job.rb:10:in `run'
    /opt/goldenseal/shared/bundle/ruby/2.2.0/bundler/gems/curation_concerns-9b49d4d66ed5/curation_concerns-models/lib/curation_concerns/models/resque.rb:32:in `perform'
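
The trace suggests the temporary file was already gone by the time unlink ran. A minimal sketch of a tolerant cleanup, assuming the right response is simply to ignore a missing file; safe_unlink is a hypothetical helper, not part of hydra-derivatives:

```ruby
# Hypothetical helper: remove a file, treating "already gone" as a non-error.
# Returns true if this call removed the file, false if it was already missing.
def safe_unlink(path)
  File.unlink(path)
  true
rescue Errno::ENOENT
  # The file was removed elsewhere (for example by another cleanup step).
  false
end
```

Rescuing Errno::ENOENT rather than checking File.exist? first avoids a race between the check and the delete.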


Update rspec

There are a few instances of its, and potentially other constructs, that were removed in RSpec 3.

Warning because mime_type is nil before characterization

WARN: Unable to find a registered mime type for nil on http://127.0.0.1:8080/fedora/rest/prod/0r/96/73/73/0r967373m
Which is logged here:
https://github.com/projecthydra/hydra-derivatives/blob/master/lib/hydra/derivatives/extract_metadata.rb#L19

In curation_concerns we're delegating mime_type to the output of characterization here:
https://github.com/projecthydra-labs/curation_concerns/blob/6cd47b5173305e37ba7563850c3deaea09b60f45/curation_concerns-models/app/models/concerns/curation_concerns/generic_file/characterization.rb#L7

Use name of processing directive as original_name attr of emitted file.

For the default processors: image, audio, video, document

Pass the name of the directive responsible for the derivative (e.g. thumbnail) as the original_name on the derivative file that gets emitted to output_file_service.call. This can be a convenience for downstream consumers. The original_name is currently set to "derivative" and was previously nil.

The original_name attribute of the file passed to output_file_service.call should be the name of the directive that caused the file to be generated. For example:
transform_file :original_file, thumbnail: { format: 'jpg', size: '338x493' }
should eventually pass a file to the output_file_service that has the attribute original_name set as "thumbnail".

Related to #70.

Image derivative creation should not fail on ImageMagick warnings

Some TIFF files contain metadata that causes ImageMagick to exit with non-zero status. This causes the derivative job to fail.

This is my output from Sufia:

`mogrify -resize 200x150> /tmp/mini_magick20160829-32286-19a488t.tiff` failed with error: mogrify: /tmp/mini_magick20160829-32286-19a488t.tiff: unknown field with tag 37724 (0x935c) encountered. `TIFFReadDirectory' @ warning/tiff.c/TIFFWarnings/715. mogrify: /tmp/mini_magick20160829-32286-19a488t.tiff: Unknown tag 37724. `TIFFSetField' @ error/tiff.c/TIFFErrors/499.

    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/mini_magick-4.5.1/lib/mini_magick/shell.rb:18:in `run'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/mini_magick-4.5.1/lib/mini_magick/tool.rb:92:in `call'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/mini_magick-4.5.1/lib/mini_magick/tool.rb:40:in `new'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/mini_magick-4.5.1/lib/mini_magick/image.rb:504:in `mogrify'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/mini_magick-4.5.1/lib/mini_magick/image.rb:395:in `method_missing'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/hydra-derivatives-3.1.1/lib/hydra/derivatives/processors/image.rb:32:in `block in create_resized_image'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/hydra-derivatives-3.1.1/lib/hydra/derivatives/processors/image.rb:38:in `create_image'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/hydra-derivatives-3.1.1/lib/hydra/derivatives/processors/image.rb:31:in `create_resized_image'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/hydra-derivatives-3.1.1/lib/hydra/derivatives/processors/image.rb:25:in `process_without_timeout'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/hydra-derivatives-3.1.1/lib/hydra/derivatives/processors/image.rb:8:in `process'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/hydra-derivatives-3.1.1/lib/hydra/derivatives/runners/runner.rb:32:in `block (2 levels) in create'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/hydra-derivatives-3.1.1/lib/hydra/derivatives/runners/runner.rb:29:in `each'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/hydra-derivatives-3.1.1/lib/hydra/derivatives/runners/runner.rb:29:in `block in create'
    /data/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/curation_concerns-1.2.0/app/services/curation_concerns/local_file_service.rb:7:in `call'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/hydra-derivatives-3.1.1/lib/hydra/derivatives/runners/runner.rb:43:in `source_file'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/hydra-derivatives-3.1.1/lib/hydra/derivatives/runners/runner.rb:28:in `create'
    /data/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/curation_concerns-1.2.0/app/models/concerns/curation_concerns/file_set/derivatives.rb:40:in `create_derivatives'
    /data/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/curation_concerns-1.2.0/app/jobs/create_derivatives_job.rb:10:in `perform'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/activejob-4.2.7/lib/active_job/execution.rb:32:in `block in perform_now'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/activesupport-4.2.7/lib/active_support/callbacks.rb:117:in `call'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/activesupport-4.2.7/lib/active_support/callbacks.rb:555:in `block (2 levels) in compile'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/activesupport-4.2.7/lib/active_support/callbacks.rb:505:in `call'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/activesupport-4.2.7/lib/active_support/callbacks.rb:498:in `block (2 levels) in around'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/activesupport-4.2.7/lib/active_support/callbacks.rb:343:in `block (2 levels) in simple'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/i18n-0.7.0/lib/i18n.rb:257:in `with_locale'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/activejob-4.2.7/lib/active_job/translation.rb:7:in `block (2 levels) in <module:Translation>'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/activesupport-4.2.7/lib/active_support/callbacks.rb:441:in `instance_exec'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/activesupport-4.2.7/lib/active_support/callbacks.rb:441:in `block in make_lambda'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/activesupport-4.2.7/lib/active_support/callbacks.rb:342:in `block in simple'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/activesupport-4.2.7/lib/active_support/callbacks.rb:497:in `block in around'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/activesupport-4.2.7/lib/active_support/callbacks.rb:505:in `call'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/activesupport-4.2.7/lib/active_support/callbacks.rb:498:in `block (2 levels) in around'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/activesupport-4.2.7/lib/active_support/callbacks.rb:343:in `block (2 levels) in simple'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/activejob-4.2.7/lib/active_job/logging.rb:23:in `block (4 levels) in <module:Logging>'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/activesupport-4.2.7/lib/active_support/notifications.rb:164:in `block in instrument'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/activesupport-4.2.7/lib/active_support/notifications/instrumenter.rb:20:in `instrument'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/activesupport-4.2.7/lib/active_support/notifications.rb:164:in `instrument'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/activejob-4.2.7/lib/active_job/logging.rb:22:in `block (3 levels) in <module:Logging>'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/activejob-4.2.7/lib/active_job/logging.rb:43:in `block in tag_logger'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/activesupport-4.2.7/lib/active_support/tagged_logging.rb:68:in `block in tagged'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/activesupport-4.2.7/lib/active_support/tagged_logging.rb:26:in `tagged'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/activesupport-4.2.7/lib/active_support/tagged_logging.rb:68:in `tagged'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/activejob-4.2.7/lib/active_job/logging.rb:43:in `tag_logger'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/activejob-4.2.7/lib/active_job/logging.rb:19:in `block (2 levels) in <module:Logging>'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/activesupport-4.2.7/lib/active_support/callbacks.rb:441:in `instance_exec'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/activesupport-4.2.7/lib/active_support/callbacks.rb:441:in `block in make_lambda'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/activesupport-4.2.7/lib/active_support/callbacks.rb:342:in `block in simple'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/activesupport-4.2.7/lib/active_support/callbacks.rb:497:in `block in around'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/activesupport-4.2.7/lib/active_support/callbacks.rb:505:in `call'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/activesupport-4.2.7/lib/active_support/callbacks.rb:92:in `__run_callbacks__'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/activesupport-4.2.7/lib/active_support/callbacks.rb:778:in `_run_perform_callbacks'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/activesupport-4.2.7/lib/active_support/callbacks.rb:81:in `run_callbacks'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/activejob-4.2.7/lib/active_job/execution.rb:31:in `perform_now'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/activejob-4.2.7/lib/active_job/execution.rb:21:in `execute'
    /usr/local/hydra/lakeshore/shared/bundle/ruby/2.3.0/gems/activejob-4.2.7/lib/active_job/queue_adapters/resque_adapter.rb:46:in `perform'

The mogrify command actually creates the correct derivative.

Per the ImageMagick documentation [1], missing tags like the ones in the example above produce a status code between 300 and 399. Codes below 400 are considered warnings that still produce a valid image. Therefore, the function calling the ImageMagick wrapper should not raise an exception if the return code is less than 400.

[1] http://www.imagemagick.org/script/exception.php
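
The proposed rule can be sketched as a pair of predicates. This only encodes the ranges stated above (300 to 399 are warnings, 400 and up are failures). How the code would be obtained from MiniMagick is not shown and is left as an assumption; the method names are hypothetical.

```ruby
# Severity range as described in the issue text: codes from 300 through 399
# are warnings that still yield a valid image.
IM_WARNING_CODES = (300..399).freeze

# True when the code is a warning that should not abort derivative creation.
def imagemagick_warning?(code)
  IM_WARNING_CODES.cover?(code)
end

# True when the code is 400 or higher and the derivative should be treated as failed.
def imagemagick_failure?(code)
  code >= 400
end
```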

Large File Support: Use I/O, not filepath wherever possible

It is costly to pull down files and write them to disk unnecessarily. For sufficiently large files, this will break the ingest/derivative pipeline. This is made worse by attempts at job parallelization, where each job (potentially serviced on a different worker box) incurs this cost. But it is possible to avoid this problem.

Even though we are forking to shell for many of the non-ruby derivative processors, we should avoid forcing the input (and ideally output) to be literal filesystem files, when there is no such legitimate need:

This also allows optimizations for processors that don't use the bulk of a large file (e.g., only the metadata and first 2 minutes of, say, a 6 hour video). They can read until satisfied and then reset/close the IO. Most of the GBs are never pulled down, never put in memory, and never written to disk.

With a cloud-based platform like Hyku, it is very conceivable that this derivatives code is the tightest bottleneck in supporting large files.
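
As an illustration of the "read until satisfied" idea, here is a minimal sketch that pulls only the first bytes from any IO-like object. read_head and its parameters are hypothetical names; the sketch assumes the source responds to read(length), as File, StringIO and most HTTP body streams do.

```ruby
require 'stringio'

# Hypothetical helper: read at most `limit` bytes from an IO-like object in
# fixed-size chunks, without writing anything to disk.
def read_head(io, limit, chunk_size: 16 * 1024)
  head = String.new
  while head.bytesize < limit
    chunk = io.read([chunk_size, limit - head.bytesize].min)
    break if chunk.nil? # end of stream reached before the limit
    head << chunk
  end
  head
end
```

A processor that only needs a file header could call read_head(io, 4096) and then close the stream, so the rest of a multi-gigabyte source is never transferred.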

Errors when converting using Processors::Document

Converting documents to pdf using libreoffice produces this error:

No such file or directory @ unlink_internal - /var/folders/yb/pbl7568x7073w02tyghtk3140000gp/T/thumbnail.pdf

Libreoffice converts the document to pdf successfully, but using the name of the source file. So, if your source file is example.odt the resulting file is named example.pdf. The encoding methods are expecting a file named thumbnail.pdf. See:

https://github.com/projecthydra/hydra-derivatives/blob/master/lib/hydra/derivatives/processors/document.rb
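
A minimal sketch of how the expected output path could be derived from the source file name instead of the directive name, assuming the naming behaviour described above. libreoffice_output_path is a hypothetical helper, not the gem's API.

```ruby
# Hypothetical helper: the path LibreOffice is expected to write, given that
# it names the output after the source file (example.odt -> example.pdf).
def libreoffice_output_path(source_path, outdir, format)
  File.join(outdir, "#{File.basename(source_path, '.*')}.#{format}")
end
```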

JPEG2k Derivative Runner Broken

Returns this:

Unable to execute command "kdu_compress -i /tmp/sufia20151020-74582-10wnpfx.tif -o /tmp/sufia20151020-74582-dpm4u5.jp2 -rate 2.4,1.48331273,0.91675694,0.56659885,0.3501847,0.21643059,0.13376427,0.0826726 -jp2_space sRGB -double_buffering 10 -num_threads 4 -no_weights Clevels=1 "Stiles={1024,1024}" "Cblk={64,64}" Cuse_sop=yes Cuse_eph=yes Corder=RPCL ORGgen_plt=yes ORGtparts=R". Exit code: pid 74792 SIGPIPE (signal 13)

The command runs fine on its own, so I assume this is a problem with the new code around popen3?
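
The cause is not confirmed in this report. One general hazard with popen3 is that a child process receives SIGPIPE if it writes to a pipe whose reading end has already been closed. The sketch below, which is an assumption about a possible fix and not the gem's code, keeps both pipes open and drained until the child exits; run_and_drain is a hypothetical name.

```ruby
require 'open3'

# Hypothetical helper: run a command, reading stdout and stderr concurrently
# so the child never blocks on, or writes into, a closed pipe.
def run_and_drain(*command)
  Open3.popen3(*command) do |stdin, stdout, stderr, wait_thr|
    stdin.close
    out_reader = Thread.new { stdout.read }
    err_reader = Thread.new { stderr.read }
    # Collect both streams before asking for the exit status.
    [out_reader.value, err_reader.value, wait_thr.value]
  end
end
```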

Add Support for JPEG2000 Derivatives with Kakadu

Many partners use Kakadu for making JPEG2000 service files and have established profiles or "recipes" for doing so. As these recipes are complex and varied, the recipe and scenario should be configurable in a few different ways: via a config file, by passing a string, or else letting the application make a best guess.
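
The three configuration routes could be resolved along these lines. This is a sketch under stated assumptions: resolve_recipe, the recipes hash (standing in for values loaded from a config file) and the fallback argument are all hypothetical.

```ruby
# Hypothetical resolver for a Kakadu recipe.
#   String -> used verbatim as the option string
#   Symbol -> looked up in recipes loaded from a config file
#   other  -> the application's best guess (fallback)
def resolve_recipe(recipe, recipes: {}, fallback: nil)
  case recipe
  when String then recipe
  when Symbol then recipes.fetch(recipe) { fallback }
  else fallback
  end
end
```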

Bad error 'No such file or directory' when convert fails to run

When processing office-categorized documents, if the conversion to PDF fails you get a misleading error message, No such file or directory, instead of some indication that the PDF conversion failed.

The same failure surfaces as bad URI(is not URI?) when the original filename contains a space or another character that is not valid in a URI.

Here is where the conversion happens that is returning no output:
https://github.com/projecthydra/hydra-derivatives/blob/master/lib/hydra/derivatives/processors/document.rb#L38

It would be better to capture the error and/or verify that the file exists before passing it on to the ImageProcessor: https://github.com/projecthydra/hydra-derivatives/blob/master/lib/hydra/derivatives/processors/document.rb#L24
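
A minimal sketch of the suggested existence check, raising a descriptive error in place of the bare ENOENT. DocumentConversionError and ensure_converted! are hypothetical names used only for illustration.

```ruby
# Hypothetical error class for a conversion that produced no output.
class DocumentConversionError < StandardError; end

# Return the output path if the converter actually produced it; otherwise
# raise an error that names both the source and the missing output.
def ensure_converted!(output_path, source_path)
  return output_path if File.exist?(output_path)

  raise DocumentConversionError,
        "Conversion of #{source_path} produced no file at #{output_path}"
end
```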

Error generated by hydra/derivatives/railtie.rb

When I point my Gemfile to the master branch of projecthydra/hydra-derivatives, I get the following error starting rails console: https://gist.github.com/coblej/9209242 .

@jeremyf The error seems to be coming from https://github.com/projecthydra/hydra-derivatives/blob/master/lib/hydra/derivatives/railtie.rb#L3 . I think you made the commit that added this. Any insights on why it might be causing a problem? Should it be just "initializer" rather than "config.initializer" for Rails 4?

Creating thumbnails of PDF files results in all-black images

When using Processors::Image to create thumbnails from PDF files, the resulting image is completely black. This is a common issue in ImageMagick and there are available fixes:

http://stackoverflow.com/questions/10934456/imagemagick-pdf-to-jpgs-sometimes-results-in-black-background

Additional options need to be passed to the convert command if pdf is the source file.

To reproduce, execute convert -resize '200x150>' test.pdf test.jpg on the attached file.

test.pdf

Passing any one of the following options to the convert command fixes the issue:

  • -flatten
  • -alpha flatten
  • -alpha remove
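
A sketch of how the extra option might be added only for PDF sources. It builds the argument list and does not run anything; convert_command is a hypothetical helper, and detecting PDFs by file extension is an assumption made for brevity.

```ruby
# Hypothetical helper: build a convert argument list, adding -flatten when
# the source is a PDF so transparent areas are not rendered as black.
def convert_command(source, destination, size)
  args = ['convert', source, '-resize', size]
  args << '-flatten' if File.extname(source).casecmp?('.pdf')
  args << destination
end
```

Passing the result to system(*args) avoids shell quoting problems with the '>' in a geometry such as '200x150>'.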

Non-Backwards-Compatible Change between 3.1.3 and 3.1.4

I think there is a non-backwards-compatible change introduced between 3.1.3 and 3.1.4. Starting with 3.1.4, I see this error when trying to generate a JP2:

      Failure/Error:
        Hydra::Derivatives::Jpeg2kImageDerivatives.create(
          filename,
          outputs: [
            label: 'intermediate_file',
            service: {
              datastream: 'intermediate_file',
              recipe: :default
            },
            url: derivative_url('intermediate_file')
          ]
      
      NoMethodError:
        protected method `long_dim' called for Hydra::Derivatives::Processors::Jpeg2kImage:Class

Unable to set config_ffmpeg to false

Steps to reproduce the problem:

  • cd to hydra-derivatives directory
  • irb -I./lib
  • In IRB, run the following ruby code:
2.3.3 :001 > require 'hydra/derivatives'
 => true 
2.3.3 :002 > Hydra::Derivatives.enable_ffmpeg
 => true 
2.3.3 :003 > Hydra::Derivatives.enable_ffmpeg = false
 => false 
2.3.3 :004 > Hydra::Derivatives.enable_ffmpeg
 => true 
  • Notice that enable_ffmpeg always returns true, even when you set it to false
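
One common way this symptom arises in Ruby is a reader that defaults with ||=, which treats an explicit false the same as "unset". Whether that is the actual cause here is an assumption; the two classes below are hypothetical and only reproduce the behaviour and one possible fix.

```ruby
# Reproduces the symptom: `||=` overwrites false because false is falsy.
class BrokenConfig
  attr_writer :enable_ffmpeg

  def enable_ffmpeg
    @enable_ffmpeg ||= true
  end
end

# One possible fix: apply the default only when the value was never set.
class FixedConfig
  attr_writer :enable_ffmpeg

  def enable_ffmpeg
    @enable_ffmpeg = true if @enable_ffmpeg.nil?
    @enable_ffmpeg
  end
end
```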

Specify custom processor

If I want to create my own processor class, I have to create an instance of Hydra::Derivatives::MyProcessor. It would be easier to create:

class MyProcessor < Hydra::Derivatives::Image
end

and pass that class in the makes_derivatives block. From what I can tell, I can't do that.

libreoffice silently fails

Libreoffice returns zero status when it errors:

awead@pooh T $ soffice --invisible --headless --convert-to doc --outdir /. non-existent-file.txt
Error: source file could not be loaded
awead@pooh T $ echo $?
0

Output is directed to STDERR, so maybe we should parse that and raise something?
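
A minimal sketch of that idea, assuming the failure message always starts a line with "Error:" as in the transcript above. LibreOfficeError and check_libreoffice_output! are hypothetical names; the stderr text would come from whatever captures the process output.

```ruby
# Hypothetical error class for a conversion that exited 0 but reported an error.
class LibreOfficeError < StandardError; end

# Raise if any stderr line starts with "Error:"; otherwise return true.
def check_libreoffice_output!(stderr_text)
  error_line = stderr_text.each_line.find { |line| line.start_with?('Error:') }
  raise LibreOfficeError, error_line.strip if error_line

  true
end
```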

output_file_service should pass a file that responds to .mime_type

The output file service is configured to pass around mime_type as an option and use that when attaching files. Rather than passing around the mime_type option, the file object should respond to .mime_type. Also the file sent to the output file service should respond to .read. This is blocking samvera/hydra-works#183

Anything that calls .output_file_service (as well as uses it such as the PersistBasicContainedOutputFile service) should be modified to remove the mime_type option and to make sure the file passed around responds to .mime_type.
These are mainly found in the derivative processors:

Bump version?

After #29 and #30 are merged, could the version be bumped up? I should have proposed this with the first JP2 support PR, but, er, now that it actually works as advertised (mea culpa) it's probably even more appropriate.

Hydra::Derivatives::IoDecorator objects cannot be compared

> file = File.open('/etc/passwd')
 => #<File:/etc/passwd> 
> io1 = Hydra::Derivatives::IoDecorator.new(file, 'image/png', "foobar.png")
 => #<File:/etc/passwd> 
> io2 = Hydra::Derivatives::IoDecorator.new(file, 'image/png', "foobar.png")
 => #<File:/etc/passwd> 
> file == file
 => true 
> io1.original_name == io2.original_name
 => true 
> io1.mime_type == io2.mime_type
 => true
> io1.__getobj__ == io2.__getobj__
 => true 
> io1 == io2
 => false 

Final line should be true. Literally all aspects of the two objects are identical, including the internal SimpleDelegator delegate object! Failing to provide adequate comparison makes the class untestable.

Indeed, in rspec testing, I get unhelpful failures like:

         expected: (#<Hydra::Derivatives::IoDecorator(#<File:/Users/atz/repos/hyrax/spec/fixtures/world.png>)>)
              got: (#<Hydra::Derivatives::IoDecorator(#<File:/Users/atz/repos/hyrax/spec/fixtures/world.png>)>)
       Diff:

Literally, "I got exactly what I expected, there is no discernable difference, but I failed anyway."

Note that this seems to be a limitation of SimpleDelegator with IO objects that is not part of the objects themselves:

> file == file
 => true 
> SimpleDelegator.new(1) == SimpleDelegator.new(1)
 => true 
> SimpleDelegator.new(file) == SimpleDelegator.new(file)
 => false 

That makes it a particularly questionable choice for Hydra::Derivatives::IoDecorator, since all it does is wrap IO.
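
A sketch of a decorator that compares by wrapped object and attributes. ComparableIoDecorator is a hypothetical stand-in with the same three constructor arguments used above, not the gem's class. Both == and != are defined because, as far as I can tell, Delegator defines each of them separately.

```ruby
require 'delegate'
require 'stringio'

# Hypothetical stand-in for an IO decorator with value-style equality.
class ComparableIoDecorator < SimpleDelegator
  attr_accessor :mime_type, :original_name

  def initialize(io, mime_type = nil, original_name = nil)
    super(io)
    @mime_type = mime_type
    @original_name = original_name
  end

  # Equal when the other decorator wraps an equal object and carries the
  # same attributes; anything else falls back to Delegator's behaviour.
  def ==(other)
    return super unless other.is_a?(self.class)

    __getobj__ == other.__getobj__ &&
      mime_type == other.mime_type &&
      original_name == other.original_name
  end

  # Keep != consistent with the overridden ==.
  def !=(other)
    !(self == other)
  end
end
```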

MiniMagick Filling up /tmp/ directory.

I was running derivative creation on the ~75k objects in our repository. However, suddenly, all of my derivatives started erroring and the AVI Processing server reported it had no space left on its hard drive. Looking at /tmp/, there are tons of files like mini_magick20140730-1251-4a7xzf.jpg that ate up all of the available space (~60 GB of them).

I recalled that when I used MiniMagick in the past, I relied on a specific unmerged pull request from someone else for that exact issue (though I stuck with RMagick in the end due to other image processing bugs MiniMagick has). Regardless, the relevant unmerged pull request is: minimagick/minimagick#188

Once the current backlog of ~2k remaining failed objects finishes, I'll attempt to use that old pull request to confirm it fixes the issue as it did in the past. Assuming it does, I'm not sure whether the best course of action is to pressure MiniMagick into accepting it to fix their bug or to just use that unmerged version of the code.

To duplicate the jpg spam, just create a thumbnail derivative from a tif and check your /tmp/ directory. These files will get deleted upon a restart (by default) but MiniMagick leaving them behind when done with them just doesn't seem like correct behavior.
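
For comparison, this is the cleanup behaviour one would expect from scratch files, sketched with Ruby's standard library. It does not change how MiniMagick manages its own files; with_scratch_file is a hypothetical helper.

```ruby
require 'tempfile'

# Hypothetical helper: yield a scratch file that is closed and removed as
# soon as the block returns, even if the block raises.
def with_scratch_file(extension)
  Tempfile.create(['scratch', extension]) do |file|
    yield file
  end
end
```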

transform_datastream is not passing options to transform_file

The deprecated method transform_datastream is passing an empty options hash to transform_file. This means that only image derivatives get created as the default processor.

def transform_datastream(file_name, transform_directives, opts={})
  transform_file(file_name, transform_directives, opts={})
end
deprecation_deprecate :transform_datastream

It should just pass opts.
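
A self-contained sketch of the corrected forwarding. The transform_file stub here is hypothetical and only echoes its arguments so the difference is visible; the real method does the processing.

```ruby
module DerivativeForwarding
  # Hypothetical stub standing in for the real transform_file.
  def transform_file(file_name, transform_directives, opts = {})
    [file_name, transform_directives, opts]
  end

  # Forward the caller's opts unchanged; writing `opts={}` in the call
  # would reassign opts to an empty hash and discard them.
  def transform_datastream(file_name, transform_directives, opts = {})
    transform_file(file_name, transform_directives, opts)
  end
end
```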

AAC encoding fails with prebuilt ffmpeg

hydra-derivatives currently uses libfdk_aac for encoding audio. If you just install ffmpeg from a Linux distro's package repos, it's not going to have it, and you'll need to compile ffmpeg to support it.
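
A small sketch for detecting the missing encoder up front. It scans text such as the output of ffmpeg -encoders for the encoder name as a whole word; encoder_listed? is a hypothetical helper and the exact layout of that listing is not assumed beyond whitespace-separated columns.

```ruby
# Hypothetical helper: true if `name` appears as a whitespace-delimited
# token on any line of the given encoder listing.
def encoder_listed?(encoders_output, name)
  encoders_output.each_line.any? { |line| line.split.include?(name) }
end
```

The listing could be captured with Open3.capture2('ffmpeg', '-encoders') and checked for 'libfdk_aac' before an audio derivative is attempted.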

original_name is never set in PersistBasicContainedOutputFileService

All of the processors wrap files/IO streams in an IoDecorator class, which responds to .original_name. It isn't set in the processors, so currently it is up to the output_file_service to determine its value. The PersistBasicContainedOutputFileService uses the determine_original_name method to set it with a value. However, since all of the IoDecorator objects respond to .original_name and that value is nil, the function returns nil.

I'm not really sure how original_name is used outside of Derivatives, but maybe we should have it set in the processors to ensure it returns a non-nil value.
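
A sketch of a nil-safe lookup, assuming a fallback name is acceptable. The signature and the default are hypothetical and are not taken from the service's source.

```ruby
# Hypothetical nil-safe version: use the file's original_name when it is
# present, otherwise fall back to a default.
def determine_original_name(file, default = 'derivative')
  name = file.original_name if file.respond_to?(:original_name)
  name || default
end
```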

Support FITSservlet

Under load, it is terribly inefficient to fork to system to spin up a new JVM for each FITS call. It is bad enough to crash several increasingly large AWS VMs that we have been testing out. It imposes a huge scaling cost on attempting to get parallelism.

A more appropriate architecture would put the edu.harvard.hul.ois.fits.Fits class in memory once and service multiple requests from the same JVM, exactly what https://github.com/harvard-lts/FITSservlet does. Hydra::Derivatives should support (and likely recommend) the servlet architecture.

Requires #124
