box-c's Issues

Display FITS generated fields

Display fields generated by the FITS extract when applicable. This would include field related information such as original image dimensions, audio quality/bit rate, and number of pages in pdf documents.

move menu properties file from application to external file

To add or modify content in the CDR UI menu, you must edit a properties file in the application and redeploy the application.

Propose moving the properties file /access/src/main/resources/externalContent.properties to a file external to the application so that it can be updated without requiring a redeploy.
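
A minimal sketch of what reading the menu from an external file could look like. The class name and the `cdr.menu.properties` system property are assumptions made for illustration, not existing CDR configuration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

final class MenuProperties {
    /** Reads the properties from a file outside the deployed application. */
    static Properties load(Path externalFile) throws IOException {
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(externalFile)) {
            props.load(in);
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        // "cdr.menu.properties" is a hypothetical system property naming the external file.
        Path file = Paths.get(System.getProperty("cdr.menu.properties", "externalContent.properties"));
        if (!Files.exists(file)) {
            System.out.println("No external menu file at " + file.toAbsolutePath());
            return;
        }
        load(file).forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Because `load` re-reads the file on every call, an edit to the file would be picked up the next time the menu is built, with no redeploy.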

Replace Newly added with New Collections

Replace the newly added panel in the front page of the Access UI with a list of newly added Collections, since they are much more likely to have informative, non-filename titles.

Java OOM on large downloads and Fedora performance

If you try to download a large file, say a 500 MB WAV, you will have a long wait and then eventually see a server error. The problem may be the way an array is being copied in Fedora.

This was seen in the Tomcat log as a Java OOM:

SEVERE: Servlet.service() for servlet RestServlet threw exception
java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:2786)
        at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:94)
        at java.io.DataOutputStream.write(DataOutputStream.java:90)
        at org.fcrepo.server.security.xacml.pep.rest.filters.DataServletOutputStream.write(DataServletOutputStream.java:71)
        at com.sun.jersey.spi.container.servlet.WebComponent$Writer.write(WebComponent.java:230)
        at com.sun.jersey.spi.container.ContainerResponse$CommittingOutputStream.write(ContainerResponse.java:114)
        at com.sun.jersey.core.provider.AbstractMessageReaderWriterProvider.writeTo(AbstractMessageReaderWriterProvider.java:73)
        at com.sun.jersey.core.impl.provider.entity.InputStreamProvider.writeTo(InputStreamProvider.java:95)
        at com.sun.jersey.core.impl.provider.entity.InputStreamProvider.writeTo(InputStreamProvider.java:58)
        at com.sun.jersey.spi.container.ContainerResponse.write(ContainerResponse.java:254)
        at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:724)
        at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:647)
        at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:638)
        at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:309)
        at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:425)
        at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:590)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.fcrepo.server.security.servletfilters.FilterRestApiFlash.doFilter(FilterRestApiFlash.java:79)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.fcrepo.server.security.xacml.pep.rest.PEP.doFilter(PEP.java:154)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.fcrepo.server.security.jaas.AuthFilterJAAS.doFilter(AuthFilterJAAS.java:295)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:563)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
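
The trace shows the response body being written into a ByteArrayOutputStream, so heap use grows with the size of the download. For comparison, here is a sketch of copying through a fixed-size buffer, which keeps memory use constant regardless of file size. This is not Fedora's code; the class name and buffer size are assumptions.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class StreamCopy {
    /** Copies in to out through a fixed 8 KiB buffer and returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[100_000];
        // The in-memory sink only stands in for a servlet response stream in this demo.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        System.out.println("copied " + copy(new ByteArrayInputStream(data), sink) + " bytes");
    }
}
```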

Admin function to alphabetize a folder

Folders are ingested in a certain order. The order in the METS file is followed in a very specific way to allow users to dictate an exact order. Another option is to create an alphabetized folder, which will be automatically re-ordered whenever new children are added (disregarding METS order).

However, we have a gap. When a folder is created as a normal folder, we sometimes want to alphabetize it after several ingests. This is natural as collections may grow over time. Sometimes things just get ingested or moved into an unnatural order.

I propose:

  • We add a function that lets the user reorder a folder (alphabetically)
  • We add an option to this function which will make this sticky, such that it will continue to be alphabetized as new children are added via ingest or move.

Most of the functionality for this is already there in the code to support our alphabetized folders.
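
The two proposed behaviours can be sketched as follows. The `Folder` class here is a hypothetical stand-in that keeps child labels in memory; it does not reflect how the CDR stores folder order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class Folder {
    private final List<String> children = new ArrayList<>();
    private boolean sticky; // when true, every add re-sorts the folder

    /** One-off alphabetical reorder; pass true to keep the folder alphabetized afterwards. */
    void alphabetize(boolean makeSticky) {
        children.sort(String.CASE_INSENSITIVE_ORDER);
        this.sticky = makeSticky;
    }

    /** Appends a child, as an ingest or move would, re-sorting only if the folder is sticky. */
    void addChild(String label) {
        children.add(label);
        if (sticky) {
            children.sort(String.CASE_INSENSITIVE_ORDER);
        }
    }

    List<String> children() {
        return Collections.unmodifiableList(children);
    }

    public static void main(String[] args) {
        Folder folder = new Folder();
        folder.addChild("zebra");
        folder.addChild("apple");
        folder.alphabetize(true);
        folder.addChild("mango");
        System.out.println(folder.children()); // [apple, mango, zebra]
    }
}
```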

orphaned objects in collection browse

on Tuesday 11/29 at 9:07 AM, Erin wrote:

I don't know how it got there, but there's a floater object posed as a collection in the CDR. You need to be logged in, but once you are you can see C-0343 on the facet tab on the left here, https://cdr.lib.unc.edu/search?types=Collection

This is a folder within the SOHP collection so I'm not sure why it's there. Could you have someone remove it and make sure it's not an indexing issue (I was going to just delete it, but then I realized it might be a larger issue).

an update to this issue at 5:00 pm:

It looks like there is another orphaned folder in the collection facet. You have to be logged in to see it. L-0297 is orphaned now.

Refactor Container Sorting

There is way too much code in the CDR management stack that relates to maintaining ordered lists of contents. In general the need for a certain order breaks down into two use cases:

  • alphabetical - this is locale specific and better done in access (The order of the alphabet is different around the world and cannot be recorded in MD_CONTENTS.)
  • explicitly ordered elements - chapters in a book, for instance. This order must be maintained in MD_CONTENTS.

We can add a default sort setting to the container model, specifying how the access layer should generally present the contents. The two settings above would make sense, with one of them being the default. I think these sorting features already exist, so this is just a matter of detecting the default.

This does away with alphabetizing in MD_CONTENTS, an overwrought function that pulls in lots of data for little gain. Instead we can ask access and Solr to sort alphabetically for the client's locale at access time by setting a default sort flag.
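
To illustrate why alphabetical order belongs at access time, here is a small sketch that sorts the same titles with a locale-specific `java.text.Collator`. It only stands in for whatever sorting access and Solr would actually do; the class name and sample titles are made up.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class LocaleSort {
    /** Returns a copy of the titles sorted by the collation rules of the given locale. */
    static List<String> sortForLocale(List<String> titles, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        List<String> sorted = new ArrayList<>(titles);
        sorted.sort(collator);
        return sorted;
    }

    public static void main(String[] args) {
        List<String> titles = Arrays.asList("zebra", "öl", "apple");
        // The same list may come back in a different order for each locale.
        System.out.println(sortForLocale(titles, Locale.ENGLISH));
        System.out.println(sortForLocale(titles, new Locale("sv")));
    }
}
```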

I'd like to fold this change into the scalable-ingest branch. Comments?

Deleting folders in the repository

The current Admin UI won't delete folders that have been ingested into the repository (example: https://cdr.lib.unc.edu/search?action=setFacet%3apath%2c%223%2cuuid%3afe639a29-a51a-4acd-9560-52e540696cd6%2c4%22|resetNav%3asearch)

The Alice Gerrard Collection 20006 folder (containing zero objects) needs to be deleted from the Alice Gerrard Collection contents.

Once this functionality is operable, the same thing needs to happen to the Bill Ferris collection (contents need to move out of Archival Audio Files and up into Bill Ferris and then the Archival Audio Files needs to be deleted).

iRODS client socket timeouts on writes (jargon)

Lately several ingests have failed due to socket timeouts. The jargon client inside of Fedora is waiting for a response from the server, but fails to get one within the normal 2-minute timeout. There are any number of reasons why the server might be slow, from replication to tape to other kinds of IO blocking. I am fairly certain that the delay is not due to load/CPU on the iRODS server itself.

I am testing a patch to low-level storage which will increase this timeout to 5 minutes. I'd expect iRODS to finish long before then in most cases and reply. I'm also adding a special catch clause for socket timeout on iRODS output stream close. This is a particular operation that seems to take a long time. If it times out, there is still a chance that we can proceed. Since we do checksum comparison against iRODS for all write operations, any problems will still be detected.
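
A rough sketch of the special catch clause, written against plain java.io rather than the Jargon API. It assumes the timeout may arrive wrapped inside another IOException, so it walks the cause chain; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.SocketTimeoutException;

final class TolerantClose {
    /** True if the throwable or any of its causes is a SocketTimeoutException. */
    static boolean causedByTimeout(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof SocketTimeoutException) {
                return true;
            }
        }
        return false;
    }

    /**
     * Closes the stream. Returns true on a clean close and false if the close timed out,
     * in which case the caller still has to rely on the checksum comparison.
     */
    static boolean closeToleratingTimeout(OutputStream out) throws IOException {
        try {
            out.close();
            return true;
        } catch (IOException e) {
            if (causedByTimeout(e)) {
                return false; // the write may still have completed on the server side
            }
            throw e;
        }
    }
}
```

Any other IOException is rethrown unchanged, so only the timeout case is treated as recoverable.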

This is in testing this afternoon on DEV.

anti-virus service

Let's think about how we would want to implement virus checking as an ingest service. We'll need to have a solution in place before we can accept submissions via a web form.

Jargon "Stale NFS file handle" error upon ingest

Error: org.fcrepo.server.errors.HttpServiceNotFoundException: [IrodsExternalContentManager] returned an error. The underlying error was a org.fcrepo.server.errors.HttpServiceNotFoundException The message was "[FileExternalContentManager] returned an error. The underlying error was a java.io.IOException The message was "Stale NFS file handle" . " . ; nested exception is org.springframework.ws.soap.client.SoapFaultClientException: org.fcrepo.server.errors.HttpServiceNotFoundException: [IrodsExternalContentManager] returned an error. The underlying error was a org.fcrepo.server.errors.HttpServiceNotFoundException The message was "[FileExternalContentManager] returned an error. The underlying error was a java.io.IOException The message was "Stale NFS file handle" . " .

This issue has been relayed to sysadmins for their consideration.

Update metadata form does not report MODS validation errors

This problem might lie in the admin project or deeper in the persistence project. The persistence project is supposed to return an IngestException to the caller whenever there is a validation error. We should figure out just where this message is being ignored and try to forward it back to the user. We should at least report success or failure.

Periodically call irmtrash for Fedora LLS and Services

Trash is only cleaned up manually now and has started to accumulate. Instead of adding a cron to the system configuration, we can add a call to irmtrash within the relevant modules.

Fedora LLS might incorporate a timer and a task that takes out the trash once a day, for instance. CDR Services could have a similar timer or incorporate the task into the queue system.
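
One possible shape for such a timer, sketched with a ScheduledExecutorService. The class name is hypothetical, and `runIrmtrash` assumes the `irmtrash` icommand is on the PATH with a configured iRODS environment.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class TrashCollector {
    /** Runs the iRODS irmtrash command and fails loudly on a non-zero exit code. */
    static void runIrmtrash() {
        try {
            int exit = new ProcessBuilder("irmtrash").inheritIO().start().waitFor();
            if (exit != 0) {
                throw new IllegalStateException("irmtrash exited with " + exit);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while running irmtrash", e);
        }
    }

    /** Runs the task immediately and then once per period. */
    static ScheduledExecutorService schedule(Runnable task, long period, TimeUnit unit) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleAtFixedRate(() -> {
            try {
                task.run();
            } catch (RuntimeException e) {
                // An exception escaping here would cancel all future runs, so report and continue.
                System.err.println("trash cleanup failed: " + e);
            }
        }, 0, period, unit);
        return timer;
    }

    public static void main(String[] args) throws InterruptedException {
        // Demo with a stand-in task and a short period; a service would pass
        // TrashCollector::runIrmtrash with a period of 1 and TimeUnit.DAYS.
        ScheduledExecutorService timer =
                schedule(() -> System.out.println("taking out the trash"), 100, TimeUnit.MILLISECONDS);
        Thread.sleep(350);
        timer.shutdownNow();
    }
}
```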

Improve Solr indexing queuing

Improve the queuing and processing methodology of requests to index items or containers automatically as a part of ingest.

This will likely involve implementing a fast tracking mechanism for a specific subset of JMS messages to be immediately added to the Solr queue rather than waiting to be processed by the central services queue.

This will also likely involve adjusting when Solr indexing occurs in general. For example, reindexing could be triggered automatically once the other services at the end of the services stack have run successfully, and the Solr update service could be limited to a small subset of messages that are difficult to detect otherwise, such as Deletes, to cut down on Mulgara querying.
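
As a very rough sketch of the fast-tracking idea, the router below sends a chosen subset of messages straight to the Solr queue and everything else to the central services queue. The action names and in-memory queues are assumptions for illustration; the real JMS message vocabulary is not given here.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

final class MessageRouter {
    // Hypothetical action names that skip the central services queue.
    private final Set<String> fastTrack = new HashSet<>(Arrays.asList("INGEST", "DELETE"));

    final Queue<String> solrQueue = new ArrayDeque<>();
    final Queue<String> servicesQueue = new ArrayDeque<>();

    /** Places the PID on the Solr queue for fast-tracked actions, otherwise on the services queue. */
    void route(String action, String pid) {
        if (fastTrack.contains(action)) {
            solrQueue.add(pid);
        } else {
            servicesQueue.add(pid);
        }
    }

    public static void main(String[] args) {
        MessageRouter router = new MessageRouter();
        router.route("INGEST", "uuid:example-1");
        router.route("MODIFY", "uuid:example-2");
        System.out.println("solr=" + router.solrQueue + " services=" + router.servicesQueue);
    }
}
```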

Indexing Issue (SFC as example)

The Raymond Pulley, Robert Coles and Robert H. Moore collections were ingested on November 11 and are still not being indexed in the UI. I don't think new content has been automatically ingested in several weeks. This is an example of the larger indexing issues. Everything that has been ingested in the last several weeks has had to be manually reindexed.
Side note - metadata updates don't seem to be triggering reindexing, either.

Log out button issues

The log out button is not signing people out. When I am prompted with the Shibboleth button, I agree, close my browser window, and re-open it, but I am still logged in.

Scalable, Multi-User Ingest

A number of features are needed to support increasing ingest activity across the libraries:

  • We need a queue for SIPs
  • Ability to interrupt the queue for maintenance, etc.
  • Ability to interrupt/continue individual SIP ingest transactions
  • Fedora service polling on ingest, recover from 503 responses
  • Ability to inspect the SIP queue
  • Split initial SIP validation/processing from Fedora ingest
  • Provide immediate feedback from SIP validation
  • Investigate feasibility of running services during ingest.
  • Discuss splitting solr indexing from mid-tier service stack.
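
For the polling item in the list above, recovery from 503 responses could be sketched like this. The status check is passed in as a function so that the sketch makes no assumption about how Fedora is actually queried; the class name, attempt limit, and delay are all hypothetical.

```java
import java.util.function.IntSupplier;

final class FedoraPoller {
    /**
     * Calls statusCheck until it returns something other than 503 or the attempts run out.
     * Returns the last status seen, so the caller can decide whether to continue the ingest.
     */
    static int awaitAvailable(IntSupplier statusCheck, int maxAttempts, long delayMillis)
            throws InterruptedException {
        int status = statusCheck.getAsInt();
        for (int attempt = 1; status == 503 && attempt < maxAttempts; attempt++) {
            Thread.sleep(delayMillis);
            status = statusCheck.getAsInt();
        }
        return status;
    }

    public static void main(String[] args) throws InterruptedException {
        int[] calls = {0};
        // Stand-in check: unavailable twice, then OK.
        int status = awaitAvailable(() -> ++calls[0] < 3 ? 503 : 200, 5, 10);
        System.out.println("status " + status + " after " + calls[0] + " checks");
    }
}
```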

postProcForPUT does not replicate after a resource is forced (msiSetDefaultResc)

Okay, I've just finished switching out the vaults. This ended up requiring a quick restart of tomcat, but service was restored within a minute or two. I will explain.

First I followed the exact instructions below and tried testing it on the command line. I put a test file and it ended up on all 3 resources. This is because the client's default resource is still honored when the msiSetDefaultResc is used with the "preferred" keyword. My client was still configured with cdrResc as the default.

The normal approach to overriding the client setting is to use the "forced" keyword, so I tried that. I substituted "forced" for "preferred" below. This resulted in a put to cdrResc2, overriding the client's default setting. However, the put never replicated to tape. Why it didn't replicate is still a mystery and is something to explore in the DEV environment.

ISSUE: postProcForPUT does not replicate after a resource is forced (msiSetDefaultResc)

I adjusted the rules back to the "preferred" resource mode. Then I set up my client to also default to cdrResc2. That put worked, leaving a copy on cdrResc2 and tape.

Object update unsuccessful

Hi,
The last three issues of CPJ in the CDR were quite large, so the depositor asked if we could size them down and reingest them (see https://cdr.lib.unc.edu/search?action=setStartRow%3a60&sort=default&sortOrder=&facets=path:1,uuid:cbbc2cc1-c538-4e28-b567-55db61b7942e&rows=20). I uploaded the new files the week before Thanksgiving break. The new files never showed up. I even tried downloading the files to see if it was just the UI masking the new files, but they were still the large file sizes when I downloaded them.

I just attempted to replace them again through the update function of the Admin UI. It doesn't look like it was successful.

Can someone help me troubleshoot this issue?

Thanks,

Erin

Persistence of services queues

While the JMS queues should persist through server restarts, this only works if they haven't been actively read. It would be helpful during maintenance periods for the Services and Solr queues to serialize to disk. The same would also apply to the failure list in the services queue.

Services would also of course need to know to reload these queues during startup.
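
A minimal sketch of the save-and-reload idea, assuming each queue entry can be represented as one line of text (a PID, for instance). The class name and file format are made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class QueueSnapshot {
    /** Writes the pending entries to disk, one per line, replacing any earlier snapshot. */
    static void save(List<String> pending, Path file) throws IOException {
        Files.write(file, pending, StandardCharsets.UTF_8);
    }

    /** Reloads the entries at startup; a missing snapshot yields an empty queue. */
    static List<String> load(Path file) throws IOException {
        if (!Files.exists(file)) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Files.readAllLines(file, StandardCharsets.UTF_8));
    }
}
```

The same pair of calls could cover the failure list, using a second file.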

renumbering of single layer collection

When attempting to re-order the CPJ collection, it won't rearrange in the UI. Even with a manual PID reorder, the results aren't displaying in the web UI.

cdr ingest failed

After several successful ingests this morning, this came in as a failure (it had been working on ingest for over an hour before it failed).
Erin
From: [email protected] [[email protected]]
Sent: Tuesday, November 22, 2011 3:16 PM
To: O'Meara, Erin
Subject: CDR ingest failed

Carolina Digital Repository

Your submission was not ingested into the repository.

Error: org.fcrepo.server.errors.LowlevelStorageException: IRODSFedoraFileSystem.write(): couldn't make directories for [/cdrZone/home/fedora/datastreams/2011/1122/15/09/uuid_c5e1b836-acad-4505-a320-60ec70379281+DATA_FILE+DATA_FILE.0] JargonException caught in constructor, rethrow as IOException; nested exception is org.springframework.ws.soap.client.SoapFaultClientException: org.fcrepo.server.errors.LowlevelStorageException: IRODSFedoraFileSystem.write(): couldn't make directories for [/cdrZone/home/fedora/datastreams/2011/1122/15/09/uuid_c5e1b836-acad-4505-a320-60ec70379281+DATA_FILE+DATA_FILE.0] JargonException caught in constructor, rethrow as IOException

Thank you for contributing to the Carolina Digital Repository, a service of the University of North Carolina at Chapel Hill Libraries.

Display issue with Folder facet

The Folder facet in normal search results is displaying incorrectly when expanding a container in place. The first tier of children render properly, but for the second tier the line is slightly out of place, and for the third tier and below only up to two tiers of lines are drawn, disrupting the indentation.

monitor queue report

monitor the queue report page during ingest processing and while catchup services are running

Mulgara reports "too many open files"

This has been observed at various points in the Mulgara logs. It probably reflects the aggregate number of file handles that all our Java processes are using. As such it is bound to be intermittent and hard to replicate.

stack trace from mulgara log:
2011-11-02 20:37:10,140 WARN [qtp939602315-17731 - /sparql/?format=json] xa.HybridTuples - Failed to obtain tmpdir
java.io.IOException: Too many open files
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.checkAndCreate(File.java:1704)
at java.io.File.createTempFile(File.java:1792)
at org.mulgara.util.TempDir.createTempFile(TempDir.java:101)

FedoraDataService getObjectViewXML failing silently

Some of the FedoraDataService runnables used for getObjectViewXML are failing silently, resulting in incomplete result objects. This is currently known to affect GetPathInfo, which is a critical field in the Access UI. The runnables must log when there is a problem, and preferably provide a means of detecting and recovering from it.
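
One way to get both logging and detection is to wrap each runnable, as in the sketch below. The wrapper and its names are hypothetical and not part of FedoraDataService; the shared flag is just one simple way for the caller to notice an incomplete result.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.logging.Level;
import java.util.logging.Logger;

final class LoggedTask {
    private static final Logger LOG = Logger.getLogger(LoggedTask.class.getName());

    /** Wraps a task so that a failure is logged and recorded in the flag instead of vanishing. */
    static Runnable logging(String name, Runnable task, AtomicBoolean failed) {
        return () -> {
            try {
                task.run();
            } catch (RuntimeException e) {
                failed.set(true);
                LOG.log(Level.SEVERE, "Task " + name + " failed", e);
            }
        };
    }

    public static void main(String[] args) {
        AtomicBoolean failed = new AtomicBoolean(false);
        logging("GetPathInfo", () -> { throw new IllegalStateException("no path"); }, failed).run();
        System.out.println("result incomplete: " + failed.get());
    }
}
```

The caller can check the flag once all runnables have finished and retry or report an error rather than returning a partial object.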

Indexing issue null folder

IRODSFedoraFileSystem.write(): couldn't make directories

This happened while ingesting the Ralph Epperson 04 SIP within the SFC. The ingest contains one especially large data file, at 4.9 GB. This is the file that is referenced in the stack trace below:

Your submission was not ingested into the repository.

Error: org.fcrepo.server.errors.LowlevelStorageException: IRODSFedoraFileSystem.write(): couldn't make directories for [/cdrZone/home/fedora/datastreams/2011/1201/10/02/uuid_5fa6361f-b7b1-490b-b2a2-4b8fc8babdf4+DATA_FILE+DATA_FILE.0] JargonException caught in constructor, rethrow as IOException; nested exception is org.springframework.ws.soap.client.SoapFaultClientException: org.fcrepo.server.errors.LowlevelStorageException: IRODSFedoraFileSystem.write(): couldn't make directories for [/cdrZone/home/fedora/datastreams/2011/1201/10/02/uuid_5fa6361f-b7b1-490b-b2a2-4b8fc8babdf4+DATA_FILE+DATA_FILE.0] JargonException caught in constructor, rethrow as IOException

Digging deeper, we find this in the Fedora log (in reference to the same path as above):

Caused by: java.io.IOException: JargonException caught in constructor, rethrow as IOException
at org.irods.jargon.core.pub.io.IRODSFileOutputStream.close(IRODSFileOutputStream.java:140) [jargon-core-3.0.1-20111012.172326-5.jar:na]
at java.io.FilterOutputStream.close(FilterOutputStream.java:143) [na:1.6.0_29]
at fedorax.server.module.storage.lowlevel.irods.IrodsIFileSystem.stream2streamCopy(IrodsIFileSystem.java:110) [fcrepo-irods-storage-2.1.jar:na]
at fedorax.server.module.storage.lowlevel.irods.IrodsIFileSystem.write(IrodsIFileSystem.java:356) [fcrepo-irods-storage-2.1.jar:na]
... 62 common frames omitted
Caused by: org.irods.jargon.core.exception.JargonException: java.net.SocketTimeoutException: Read timed out
at org.irods.jargon.core.connection.IRODSCommands.readHeaderLength(IRODSCommands.java:954) [jargon-core-3.0.1-20111012.172326-5.jar:na]
at org.irods.jargon.core.connection.IRODSCommands.readHeader(IRODSCommands.java:825) [jargon-core-3.0.1-20111012.172326-5.jar:na]
at org.irods.jargon.core.connection.IRODSCommands.readMessage(IRODSCommands.java:636) [jargon-core-3.0.1-20111012.172326-5.jar:na]
at org.irods.jargon.core.connection.IRODSCommands.readMessage(IRODSCommands.java:624) [jargon-core-3.0.1-20111012.172326-5.jar:na]
at org.irods.jargon.core.connection.IRODSCommands.irodsFunction(IRODSCommands.java:258) [jargon-core-3.0.1-20111012.172326-5.jar:na]
at org.irods.jargon.core.connection.IRODSCommands.irodsFunction(IRODSCommands.java:192) [jargon-core-3.0.1-20111012.172326-5.jar:na]
at org.irods.jargon.core.pub.IRODSFileSystemAOImpl.fileClose(IRODSFileSystemAOImpl.java:1414) [jargon-core-3.0.1-20111012.172326-5.jar:na]
at org.irods.jargon.core.pub.io.IRODSFileImpl.close(IRODSFileImpl.java:1337) [jargon-core-3.0.1-20111012.172326-5.jar:na]
at org.irods.jargon.core.pub.io.IRODSFileOutputStream.close(IRODSFileOutputStream.java:136) [jargon-core-3.0.1-20111012.172326-5.jar:na]
... 65 common frames omitted
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method) [na:1.6.0_29]
at java.net.SocketInputStream.read(SocketInputStream.java:129) [na:1.6.0_29]
at java.io.BufferedInputStream.fill(BufferedInputStream.java:218) [na:1.6.0_29]
at java.io.BufferedInputStream.read1(BufferedInputStream.java:258) [na:1.6.0_29]
at java.io.BufferedInputStream.read(BufferedInputStream.java:317) [na:1.6.0_29]

New collection cover

New collection cover image needed for SHC Digital Files Collection, uuid:c59291a6-ad7a-4ad4-b89d-e2fe8acac744. See email for image.
