Comments (15)

anjackson commented on September 23, 2024

As @min2ha noticed, the embargo appears to be in milliseconds. See DateEmbargoFilter.

from waybacks.

min2ha commented on September 23, 2024

There are many pre-defined values in org.archive.wayback.util.partition.PartitionSize.
Example (org.archive.wayback.util.partition.PartitionSize.MS_IN_YEAR)
http://iipc.github.io/openwayback/2.0.0/apidocs/org/archive/wayback/util/partition/PartitionSize.html

To save time, just use the value that matches the embargo period you need.

Here is the output for the most important values:

MS_IN_DAY = 86400000
MS_IN_WEEK = 604800000
MS_IN_MONTH = 2592000000
MS_IN_TWO_MONTH = 5184000000
MS_IN_YEAR = 31536000000
MS_IN_TWO_YEAR = 63072000000
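
These figures are plain arithmetic: the numbers imply a 30-day month and a 365-day year. As a rough sketch, they can be reproduced with java.util.concurrent.TimeUnit. The class name EmbargoDurations is hypothetical and is not part of OpenWayback; it only mirrors the constant names listed above.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of OpenWayback: it only reproduces the
// arithmetic behind the values listed above.
final class EmbargoDurations {
    static final long MS_IN_DAY = TimeUnit.DAYS.toMillis(1);
    static final long MS_IN_WEEK = 7 * MS_IN_DAY;
    // The listed values correspond to a 30-day month and a 365-day year.
    static final long MS_IN_MONTH = 30 * MS_IN_DAY;
    static final long MS_IN_TWO_MONTH = 2 * MS_IN_MONTH;
    static final long MS_IN_YEAR = 365 * MS_IN_DAY;
    static final long MS_IN_TWO_YEAR = 2 * MS_IN_YEAR;

    public static void main(String[] args) {
        System.out.println("MS_IN_DAY = " + MS_IN_DAY);
        System.out.println("MS_IN_WEEK = " + MS_IN_WEEK);
        System.out.println("MS_IN_MONTH = " + MS_IN_MONTH);
        System.out.println("MS_IN_TWO_MONTH = " + MS_IN_TWO_MONTH);
        System.out.println("MS_IN_YEAR = " + MS_IN_YEAR);
        System.out.println("MS_IN_TWO_YEAR = " + MS_IN_TWO_YEAR);
    }
}
```

Everything from MS_IN_MONTH upward exceeds Integer.MAX_VALUE (2147483647), so an embargo value of a month or more has to be held in a long.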

anjackson commented on September 23, 2024

Hm, okay, so I think I know what happened to the embargo. When we set it up, we were using plain CDX files as the back-end, which uses this LocalResourceIndex class. This bakes in a number of standard filters, some of which pick up configuration from the parent AccessPoint.

https://github.com/iipc/openwayback/blob/master/wayback-core/src/main/java/org/archive/wayback/resourceindex/LocalResourceIndex.java#L128-L137

In particular, it's the AccessPointCaptureFilterGroupFactory which implements the embargo.

https://github.com/iipc/openwayback/blob/6475121cef79240b5e18a5f2c224ff9ba933b43d/wayback-core/src/main/java/org/archive/wayback/resourceindex/filterfactory/AccessPointCaptureFilterGroup.java#L68-L71

However, we've switched to RemoteResourceIndex, which expects that filtering to be done 'upstream' and does very little filtering itself:

https://github.com/iipc/openwayback/blob/6475121cef79240b5e18a5f2c224ff9ba933b43d/wayback-core/src/main/java/org/archive/wayback/resourceindex/RemoteResourceIndex.java#L208-L226

So, we need to add the embargo support back in. Given that we already use our own SURTFilteringRemoteResourceIndex, the simplest thing is probably just to add the embargo code...

    // Only add the embargo filter when a positive embargo period is configured.
    long embargoMS = accessPoint.getEmbargoMS();
    if (embargoMS > 0) {
        chain.addFilter(new DateEmbargoFilter(embargoMS));
    }

to our own getSearchResultFilters method.
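
To illustrate what such a filter is meant to achieve, here is a minimal, self-contained sketch. It assumes the embargo hides any capture younger than the configured period. The class EmbargoSketch and its methods are hypothetical stand-ins; they do not reproduce OpenWayback's DateEmbargoFilter or its filter-chain API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration only; not OpenWayback code.
final class EmbargoSketch {
    /** Returns true when a capture is still inside the embargo window. */
    static boolean isEmbargoed(long captureTimeMs, long embargoMs, long nowMs) {
        // Assumption: a capture is embargoed if it is newer than (now - embargo period).
        // A non-positive embargo period means no embargo at all.
        return embargoMs > 0 && captureTimeMs > nowMs - embargoMs;
    }

    /** Keeps only the capture timestamps that fall outside the embargo window. */
    static List<Long> visibleCaptures(List<Long> captureTimesMs, long embargoMs, long nowMs) {
        List<Long> visible = new ArrayList<>();
        for (long t : captureTimesMs) {
            if (!isEmbargoed(t, embargoMs, nowMs)) {
                visible.add(t);
            }
        }
        return visible;
    }
}
```

Passing the current time in as a parameter, rather than reading the clock inside the method, keeps the check easy to test against fixed timestamps.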

anjackson commented on September 23, 2024

Implemented in a90732f, thanks @min2ha.

@GilHoggarth the embargo should work now.

anjackson commented on September 23, 2024

Okay, so there was some real clunky stuff in the locking code. For reasons that made sense at one point, the locking was hard-coded against the behaviour of a particular browser (the version of Firefox that runs in Ericom). This was done by looking for a very specific Accept header. Other browsers don't send that, and so the locking wasn't being applied.

This rule was always brittle, so I've taken it out (as of a7f52a0). We should be able to observe the locking working now.

I've also attempted to clean up the logging a bit.

GilHoggarth commented on September 23, 2024

In the current deployment of the NPLD wayback service, each of the LDL versions is slightly tailored to its LDL. Examples of this spotted so far are:

  • The locking page in the new wayback states "The British Library Legal Deposit Web Archive" effectively as the page footer. To be consistent with the /ukdomain Drupal page footer, this should say:
    • dls-{bsp,lon}-wb01 "The British Library Legal Deposit Web Archive"
    • dls-{bsp,lon}-wb02 "Cambridge University Library Legal Deposit Web Archive"
    • dls-{bsp,lon}-wb03 "Bodleian Library Legal Deposit Web Archive"
    • dls-{bsp,lon}-wb04 "Trinity College Dublin Library Legal Deposit Web Archive"
    • dls-nls-wb01 "The National Library of Scotland Legal Deposit Web Archive"
    • dls-nlw-wb01 "The National Library of Wales Legal Deposit Web Archive"

GilHoggarth commented on September 23, 2024

Plus, wayback-ldwa still has a big, wrong exclude.txt in WEB-INF/classes/.

anjackson commented on September 23, 2024

Okay, I think I fixed the exclude.txt override, and created a new environment variable WEB_ARCHIVE_NAME that should be set appropriately for each deployment. Needs testing!
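
As a rough sketch of how a per-deployment name could be picked up from the environment: only the variable name WEB_ARCHIVE_NAME comes from this thread. The class ArchiveName and the fallback text are hypothetical, and this is not necessarily how the waybacks code reads the variable.

```java
import java.util.Map;

// Hypothetical helper for illustration; only the variable name is from the thread.
final class ArchiveName {
    // Assumed fallback; the real default (if there is one) is not stated here.
    static final String DEFAULT_NAME = "Legal Deposit Web Archive";

    /** Returns the configured archive name, or the fallback if unset or blank. */
    static String resolve(Map<String, String> env) {
        String name = env.get("WEB_ARCHIVE_NAME");
        if (name == null || name.trim().isEmpty()) {
            return DEFAULT_NAME;
        }
        return name.trim();
    }

    public static void main(String[] args) {
        // System.getenv() exposes the process environment as a Map.
        System.out.println(resolve(System.getenv()));
    }
}
```

Taking the environment as a Map argument makes the lookup testable without changing the real process environment.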

anjackson commented on September 23, 2024

Okay, I think this should render the WEB_ARCHIVE_NAME in the right place now.

GilHoggarth commented on September 23, 2024

Footer and lock now working; embargo still not restricting access. Requesting any date still shows that date's content; the embargo only seems to stop the content from being listed in the wayback calendar.

min2ha commented on September 23, 2024

Sorry, the year range is hard-coded in BubbleCalendar.jsp:

wayback-year

for(int i = 1991; i < 2013; i++) {
    String curClass = "inactiveHighlight";
    if(data.yearNum == i) {
        curClass = "activeHighlight";
    }

anjackson commented on September 23, 2024

Yeah, that's a really old version. The one on the newer branch does it right:

for(int i = startYear; i <= Calendar.getInstance().get(Calendar.YEAR); i++) {
    String curClass = "inactiveHighlight";
    if(data.yearNum == i) {
        curClass = "activeHighlight";
    }
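
The difference between the two loops can be seen in isolation. With the hard-coded bound `i < 2013`, the loop stops at 2012, so a later selected year never gets a highlight class. A minimal sketch, where the class YearHighlight and its method are hypothetical and the start year 1991 is simply borrowed from the older snippet:

```java
import java.util.Calendar;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; it mirrors the loop logic, not the JSP itself.
final class YearHighlight {
    /** Maps each year in [startYear, endYear] to its highlight CSS class. */
    static Map<Integer, String> classesFor(int startYear, int endYear, int selectedYear) {
        Map<Integer, String> classes = new LinkedHashMap<>();
        for (int i = startYear; i <= endYear; i++) {
            classes.put(i, i == selectedYear ? "activeHighlight" : "inactiveHighlight");
        }
        return classes;
    }

    public static void main(String[] args) {
        // Using the current year as the upper bound, as the newer loop does.
        int currentYear = Calendar.getInstance().get(Calendar.YEAR);
        System.out.println(classesFor(1991, currentYear, currentYear));
    }
}
```

Calling `classesFor(1991, 2012, 2016)` reproduces the old behaviour: the map has no entry for 2016 at all, whereas an upper bound of the current year includes it.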

anjackson commented on September 23, 2024

That said, the BubbleCalendar on the 2017-style-reset branch doesn't really seem to be working, I think. Anyway, we're using BubbleCalendar as part of #1, so we should probably move this discussion there!

anjackson commented on September 23, 2024

Oh lawks. The issue with the calendar view not remembering the year was extremely awkward/subtle. The logic that says which capture is closest to the requested date was disabled for Remote Resource Indexes (no idea why), and without that the calendar page couldn't tell which was the current year. Testing the fix now.
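
To make the dependency concrete, here is a minimal sketch of a "closest capture" lookup and how a calendar page could derive the current year from it. The class ClosestCapture and its methods are hypothetical; OpenWayback's actual closest-capture logic is not shown in this thread and may differ.

```java
import java.util.Calendar;
import java.util.List;
import java.util.TimeZone;

// Hypothetical illustration only; not OpenWayback's implementation.
final class ClosestCapture {
    /** Returns the capture time (ms since epoch) closest to the requested time. */
    static long closest(List<Long> captureTimesMs, long requestedMs) {
        if (captureTimesMs.isEmpty()) {
            throw new IllegalArgumentException("no captures");
        }
        long best = captureTimesMs.get(0);
        for (long t : captureTimesMs) {
            // Keep whichever capture has the smallest distance to the request.
            if (Math.abs(t - requestedMs) < Math.abs(best - requestedMs)) {
                best = t;
            }
        }
        return best;
    }

    /** Returns the UTC calendar year of a timestamp. */
    static int yearOf(long timeMs) {
        Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
        cal.setTimeInMillis(timeMs);
        return cal.get(Calendar.YEAR);
    }
}
```

Without a step like `closest`, there is no single capture to take the year from, which matches the symptom of the calendar not knowing which year to show as current.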

anjackson commented on September 23, 2024

But that belongs in a different issue! I believe this is deployed and working.
