Comments (15)
As @min2ha noticed, the embargo appears to be in milliseconds. See DateEmbargoFilter.
from waybacks.
There are many pre-defined values in org.archive.wayback.util.partition.PartitionSize, for example org.archive.wayback.util.partition.PartitionSize.MS_IN_YEAR:
http://iipc.github.io/openwayback/2.0.0/apidocs/org/archive/wayback/util/partition/PartitionSize.html
To save time, just use the value that matches the embargo period you need.
I've printed out the most important values:
MS_IN_DAY = 86400000
MS_IN_WEEK = 604800000
MS_IN_MONTH = 2592000000
MS_IN_TWO_MONTH = 5184000000
MS_IN_YEAR = 31536000000
MS_IN_TWO_YEAR = 63072000000
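The values listed above are consistent with a 30-day month and a 365-day year. A minimal sketch that recomputes them, assuming those conventions (the class `EmbargoDurations` is hypothetical and is not part of OpenWayback):

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper, not part of OpenWayback: recomputes the values listed above. */
final class EmbargoDurations {
    // Everything is a long: MS_IN_MONTH and larger do not fit in a 32-bit int.
    static final long MS_IN_DAY = TimeUnit.DAYS.toMillis(1);
    static final long MS_IN_WEEK = 7 * MS_IN_DAY;
    static final long MS_IN_MONTH = 30 * MS_IN_DAY;       // assumes a 30-day month
    static final long MS_IN_TWO_MONTH = 2 * MS_IN_MONTH;
    static final long MS_IN_YEAR = 365 * MS_IN_DAY;       // assumes a 365-day year
    static final long MS_IN_TWO_YEAR = 2 * MS_IN_YEAR;

    public static void main(String[] args) {
        System.out.println("MS_IN_DAY = " + MS_IN_DAY);
        System.out.println("MS_IN_WEEK = " + MS_IN_WEEK);
        System.out.println("MS_IN_MONTH = " + MS_IN_MONTH);
        System.out.println("MS_IN_TWO_MONTH = " + MS_IN_TWO_MONTH);
        System.out.println("MS_IN_YEAR = " + MS_IN_YEAR);
        System.out.println("MS_IN_TWO_YEAR = " + MS_IN_TWO_YEAR);
    }
}
```

If an embargo value is written as a literal in Java code, it needs the `L` suffix (for example `2592000000L`), because anything from one month upwards overflows an `int`.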
from waybacks.
Hm, okay, so I think I know what happened to the embargo. When we set it up, we were using plain CDX files as the back-end, which uses this LocalResourceIndex class. This bakes in a number of standard filters, some of which pick up configuration from the parent AccessGroup.
In particular, it's the AccessPointCaptureFilterGroupFactory which implements the embargo.
However, we've switched to RemoteResourceIndex, which expects that filtering to be done 'upstream' and does very little filtering itself:
So, we need to add the embargo support back in. Given we already use our own SURTFilteringRemoteResourceIndex the simplest thing is probably just to add the embargo code...
    long embargoMS = accessPoint.getEmbargoMS();
    if (embargoMS > 0) {
        chain.addFilter(new DateEmbargoFilter(embargoMS));
    }
to our own getSearchResultFilters method.
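For anyone unfamiliar with what the embargo is meant to do, here is a rough, self-contained sketch of the idea: captures newer than "now minus the embargo period" are dropped from the results. This is not the OpenWayback DateEmbargoFilter; the class `EmbargoSketch`, its methods, and the exact boundary behaviour are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an embargo filter; not the OpenWayback implementation. */
final class EmbargoSketch {

    /** True when the capture is at least embargoMs old (or no embargo is configured). */
    static boolean isVisible(long captureEpochMs, long nowEpochMs, long embargoMs) {
        if (embargoMs <= 0) {
            return true; // mirrors the "embargoMS > 0" guard: no embargo means no filtering
        }
        return captureEpochMs <= nowEpochMs - embargoMs;
    }

    /** Returns only the capture timestamps that fall outside the embargo window. */
    static List<Long> applyEmbargo(List<Long> captureEpochMs, long nowEpochMs, long embargoMs) {
        List<Long> visible = new ArrayList<>();
        for (long capture : captureEpochMs) {
            if (isVisible(capture, nowEpochMs, embargoMs)) {
                visible.add(capture);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long oneWeekMs = 604800000L;
        List<Long> captures = new ArrayList<>();
        captures.add(now - 2 * oneWeekMs); // old enough to show
        captures.add(now - 1000L);         // too recent, embargoed
        System.out.println(applyEmbargo(captures, now, oneWeekMs).size());
    }
}
```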
from waybacks.
Implemented in a90732f, thanks @min2ha.
@GilHoggarth the embargo should work now.
from waybacks.
Okay, so there was some real clunky stuff in the locking code. For reasons that made sense at one point, the locking was hard-coded against the behaviour of a particular browser (the version of Firefox that runs in Ericom). This was done by looking for a very specific Accept header. Other browsers don't send that, and so the locking wasn't being applied.
This rule was always brittle, so I've taken it out (as of a7f52a0). We should be able to observe the locking working now.
I've also attempted to clean up the logging a bit.
from waybacks.
In the current deployment of the NPLD wayback service, each of the LDL versions is slightly tailored to the LDL. Examples of this spotted so far are:
- The locking page in the new wayback states "The British Library Legal Deposit Web Archive" effectively as the page footer. To be consistent with the /ukdomain Drupal page footer, this should say:
  - dls-{bsp,lon}-wb01 "The British Library Legal Deposit Web Archive"
  - dls-{bsp,lon}-wb02 "Cambridge University Library Legal Deposit Web Archive"
  - dls-{bsp,lon}-wb03 "Bodleian Library Legal Deposit Web Archive"
  - dls-{bsp,lon}-wb04 "Trinity College Dublin Library Legal Deposit Web Archive"
  - dls-nls-wb01 "The National Library of Scotland Legal Deposit Web Archive"
  - dls-nlw-wb01 "The National Library of Wales Legal Deposit Web Archive"
from waybacks.
Plus, wayback-ldwa still has a big, wrong exclude.txt in WEB-INF/classes/.
from waybacks.
Okay, I think I fixed the exclude.txt override, and created a new environment variable WEB_ARCHIVE_NAME that should be set appropriately for each deployment. Needs testing!
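As a rough illustration of how a per-deployment name could be picked up from the environment, here is a minimal sketch. Only the variable name WEB_ARCHIVE_NAME comes from this thread; the class `ArchiveNameConfig`, the method, and the choice of fallback text are assumptions, not what the actual commit does.

```java
import java.util.Map;

/** Hypothetical helper: resolves the archive name shown in the page footer. */
final class ArchiveNameConfig {

    // Assumed fallback for illustration; the real default (if any) may differ.
    static final String DEFAULT_NAME = "The British Library Legal Deposit Web Archive";

    /** Takes the environment as a map so the lookup can be tested without real env vars. */
    static String resolveArchiveName(Map<String, String> env) {
        String name = env.get("WEB_ARCHIVE_NAME");
        if (name == null || name.trim().isEmpty()) {
            return DEFAULT_NAME;
        }
        return name.trim();
    }

    public static void main(String[] args) {
        // In a real deployment the map would be the process environment.
        System.out.println(resolveArchiveName(System.getenv()));
    }
}
```

Falling back to a default rather than failing keeps the page rendering if the variable is unset, at the cost of possibly showing the wrong library name, so each deployment should still set it explicitly.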
from waybacks.
Okay, I think this should render the WEB_ARCHIVE_NAME in the right place now.
from waybacks.
Footer and lock now working; embargo still not restricting access. Requesting any date still shows that date's content; the embargo only seems to stop the content from being listed in the wayback calendar.
from waybacks.
Sorry, the year is hardcoded in BubbleCalendar.jsp:
waybacks/wayback-ukwa/src/main/webapp/WEB-INF/query/BubbleCalendar.jsp
Lines 257 to 261 in c8cd56d
from waybacks.
Yeah, that's a really old version. The one on the newer branch does it right:
waybacks/wayback-ukwa/src/main/webapp/WEB-INF/query/BubbleCalendar.jsp
Lines 271 to 275 in b963711
from waybacks.
That said, I don't think the BubbleCalendar on the 2017-style-reset branch is actually working. Anyway, we're using BubbleCalendar as part of #1, so we should probably move this discussion there!
from waybacks.
Oh lawks. The issue with the calendar view not remembering the year was extremely awkward/subtle. The logic that says which capture is closest to the requested date was disabled for Remote Resource Indexes (no idea why), and without that the calendar page couldn't tell which was the current year. Testing the fix now.
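For context, the "closest capture" idea can be sketched as below: pick the capture whose timestamp is nearest to the requested date, then take its year for the calendar. This is only an illustration under assumed names (`ClosestCaptureSketch`, `closestIndex`, `yearOf`); it is not the OpenWayback code, and the tie-breaking rule here (the earlier entry in the list wins) is an arbitrary choice.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.util.List;

/** Hypothetical sketch of closest-capture selection; not the OpenWayback implementation. */
final class ClosestCaptureSketch {

    /** Index of the capture nearest to the requested time, or -1 for an empty list. */
    static int closestIndex(List<Long> captureEpochMs, long requestedEpochMs) {
        int best = -1;
        long bestDistance = Long.MAX_VALUE;
        for (int i = 0; i < captureEpochMs.size(); i++) {
            long distance = Math.abs(captureEpochMs.get(i) - requestedEpochMs);
            if (distance < bestDistance) { // strict "<": on a tie the earlier entry is kept
                bestDistance = distance;
                best = i;
            }
        }
        return best;
    }

    /** Calendar year (UTC) of a timestamp, e.g. for choosing which year to display. */
    static int yearOf(long epochMs) {
        return Instant.ofEpochMilli(epochMs).atZone(ZoneOffset.UTC).getYear();
    }
}
```

Without a closest capture there is no timestamp to pass to something like `yearOf`, which matches the symptom described above: the calendar page could not tell which year was current.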
from waybacks.
But that belongs in a different issue! I believe this is deployed and working.
from waybacks.