Comments (2)
rss2text relies on the Last-Modified response header when sending an If-Modified-Since request header to a server. This makes sense, but it assumes that servers hosting the same content will reliably send identical Last-Modified headers.
For what it's worth, the RFC defining Last-Modified backs me up:
To get best results when sending an If-Modified-Since header field for cache validation, clients are advised to use the exact date string received in a previous Last-Modified header field whenever possible.
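As a rough illustration of that advice, here is a minimal Python sketch (not rss2text's actual Perl code) of a client that keeps the Last-Modified string exactly as it was received and sends it back unchanged. The function name `conditional_headers` and the idea of passing the cached string in directly are assumptions made for this example.

```python
from typing import Dict, Optional


def conditional_headers(cached_last_modified: Optional[str]) -> Dict[str, str]:
    """Build headers for a conditional GET from a previously seen Last-Modified.

    The cached value is sent back byte-for-byte. It is deliberately not parsed
    and re-formatted, since a re-serialized date may not match what the server
    originally sent.
    """
    if not cached_last_modified:
        # Nothing cached yet: make a plain, unconditional request.
        return {}
    return {"If-Modified-Since": cached_last_modified}


# Hypothetical usage with a value taken from a previous response.
cached = "Sun, 16 Feb 2014 02:11:17 GMT"
print(conditional_headers(cached))
```

The point is only that the client side is doing the recommended thing; whether a 304 comes back still depends on every server behind the hostname agreeing on that same string.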
I was trying to figure out what you meant by split-brain originally, which led me to discover exactly what you meant. Is this custom code flipping seconds or Nginx trying to actually make me sad?
➜ rss2text git:(master) for i in {1..5}; do curl -sI https://www.trevorparker.com/rss.xml | grep 'Last-Modified'; done
Last-Modified: Sun, 16 Feb 2014 02:11:17 GMT
Last-Modified: Sun, 16 Feb 2014 02:11:16 GMT
Last-Modified: Sun, 16 Feb 2014 02:11:17 GMT
Last-Modified: Sun, 16 Feb 2014 02:11:17 GMT
Last-Modified: Sun, 16 Feb 2014 02:11:16 GMT
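To put a number on the flipping, here is a small Python sketch that takes the five values above and reports how far apart they are, and how many of them a strict string comparison against a cached value would accept. Whether a server compares exactly is configuration-dependent (nginx's `if_modified_since` directive defaults to `exact`, but I'm assuming rather than confirming that this site uses the default). The helper names are made up for this example.

```python
from email.utils import parsedate_to_datetime
from typing import List


def last_modified_spread_seconds(values: List[str]) -> float:
    """Return the gap in seconds between the oldest and newest header value."""
    times = [parsedate_to_datetime(v) for v in values]
    return (max(times) - min(times)).total_seconds()


def exact_match_hits(sent: str, observed: List[str]) -> int:
    """Count observed values that are string-identical to the one we would send."""
    return sum(1 for v in observed if v == sent)


# The five Last-Modified values from the curl loop above.
observed = [
    "Sun, 16 Feb 2014 02:11:17 GMT",
    "Sun, 16 Feb 2014 02:11:16 GMT",
    "Sun, 16 Feb 2014 02:11:17 GMT",
    "Sun, 16 Feb 2014 02:11:17 GMT",
    "Sun, 16 Feb 2014 02:11:16 GMT",
]

print(last_modified_spread_seconds(observed))
print(exact_match_hits(observed[0], observed))
```

If the comparison really is exact, a client that cached the `:17` value would only get a match from the servers reporting `:17`, and a full 200 response from the rest.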
Actually I just went on a hunt to figure out who else is doing this and why. It looks like feedburner feeds flip even more wildly. I re-ran the command above after hunting and now it's completely changed and travelled backwards:
➜ rss2text git:(master) ✗ for i in {1..5}; do curl -sI https://www.trevorparker.com/rss.xml | grep 'Last-Modified'; done
Last-Modified: Sat, 15 Feb 2014 19:31:13 GMT
Last-Modified: Sat, 15 Feb 2014 19:31:13 GMT
Last-Modified: Sat, 15 Feb 2014 19:31:13 GMT
Last-Modified: Sat, 15 Feb 2014 19:31:13 GMT
Last-Modified: Sat, 15 Feb 2014 19:31:13 GMT
Last_pulled_dt is based on the data inside the feed, usually the pubDate of the first item in the feed. On your blog, my cached last pull date is "2013-12-23T14:18:09Z", which would make every pull return a 200.
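For anyone following along, the idea of deriving the cached date from the feed itself can be sketched roughly like this in Python. This is not the real rss2text code (which is Perl); the function names and the sample feed are invented for illustration, and only the cached timestamp comes from this thread.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional

# Invented sample feed; the pubDate is illustrative, not real data.
SAMPLE_FEED = """<rss version="2.0"><channel>
  <item><title>Example</title><pubDate>Wed, 01 Jan 2014 00:00:00 GMT</pubDate></item>
</channel></rss>"""


def first_item_pubdate(rss_xml: str) -> Optional[datetime]:
    """Return the pubDate of the first item in an RSS 2.0 feed, if present."""
    root = ET.fromstring(rss_xml)
    node = root.find("./channel/item/pubDate")
    if node is None or not node.text:
        return None
    parsed = parsedate_to_datetime(node.text.strip())
    # Treat dates without a usable zone as UTC so comparisons don't raise.
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def has_new_entries(rss_xml: str, last_pulled_dt: str) -> bool:
    """Compare the feed's first pubDate with a cached W3CDTF-style timestamp."""
    cached = datetime.fromisoformat(last_pulled_dt.replace("Z", "+00:00"))
    newest = first_item_pubdate(rss_xml)
    return newest is not None and newest > cached


print(has_new_entries(SAMPLE_FEED, "2013-12-23T14:18:09Z"))
```

Note that a date taken from the feed body like this is a different thing from the Last-Modified header, which is why the two can disagree.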
I'm going on a "stop the 200s" hunt now though.
Last_pulled_dt is based on the data inside the feed
Ah, I incorrectly determined that last_pulled_dt is the last time you requested a feed.
Is this custom code flipping seconds or Nginx trying to actually make me sad?
This is a result of launching a Jekyll build at the same time across multiple containers, and the build taking slightly longer on one or more containers and crossing a seconds boundary. These are the newest in a set of containers, which also means:
and now it's completely changed and travelled backwards:
is going to happen when you get bounced back to an older container.
The Right Thing for me to do is to build once, then deploy -- which is what I was doing until yesterday. I might end up just moving the balancing out of DNS and letting nginx balance based on ip_hash.