
feed-extractor

To read & normalize RSS/ATOM/JSON feed data.


(This library is derived from feed-reader, which has been renamed.)

Demo

Install & Usage

Node.js

npm i @extractus/feed-extractor
import { extract } from '@extractus/feed-extractor'

// extract an RSS feed
const result = await extract('https://news.google.com/rss')
console.log(result)

Deno

import { extract } from 'npm:@extractus/feed-extractor'

Browser

import { extract } from 'https://esm.sh/@extractus/feed-extractor'

Please check the examples for reference.

Automate RSS feed extraction with GitHub Actions

RSS Feed Fetch Action is a GitHub Action designed to automate the fetching of RSS feeds. It fetches an RSS feed from a given URL and saves it to a specified file in your GitHub repository. This action is particularly useful for populating content on GitHub Pages websites or other static site generators.

CJS Deprecated

CJS is deprecated for this package. Calling require('@extractus/feed-extractor') now logs a deprecation warning. You should update your code to use the ESM export; a suppression sketch for legacy code follows the list below.

  • You can silence this warning via the environment variable FEED_EXTRACTOR_CJS_IGNORE_WARNING=true
  • To see where the warning is coming from, set the environment variable FEED_EXTRACTOR_CJS_TRACE_WARNING=true
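
For legacy CJS codebases that cannot migrate yet, here is a minimal sketch of silencing the warning; it assumes the environment variable is read at require time and that the CJS build exposes the same extract function:

// a minimal CJS sketch; assumes the env var is checked at require time
process.env.FEED_EXTRACTOR_CJS_IGNORE_WARNING = 'true'
const { extract } = require('@extractus/feed-extractor')

extract('https://news.google.com/rss').then(console.log)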

APIs

Note:

  • The old method read() has been deprecated and will be removed in the next major release.

extract()

Loads and extracts feed data from a given RSS/ATOM/JSON source. Returns a Promise.

Syntax

extract(String url)
extract(String url, Object parserOptions)
extract(String url, Object parserOptions, Object fetchOptions)

Example:

import { extract } from '@extractus/feed-extractor'

const result = await extract('https://news.google.com/atom')
console.log(result)

Without any options, the result should have the following structure:

{
  title: String,
  link: String,
  description: String,
  generator: String,
  language: String,
  published: ISO Datetime String,
  entries: Array[
    {
      id: String,
      title: String,
      link: String,
      description: String,
      published: ISO Datetime String
    },
    // ...
  ]
}
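
As a quick sketch of consuming that structure, assuming a reachable feed URL:

import { extract } from '@extractus/feed-extractor'

// walk the normalized result shown above
const feed = await extract('https://news.google.com/rss')
console.log(feed.title, feed.link)
for (const entry of feed.entries) {
  console.log(entry.published, '-', entry.title)
}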

Parameters

url required

URL of a valid feed source

Feed content must be accessible and conform to one of the following standards: RSS, ATOM, or JSON Feed.

parserOptions optional

Object with all or several of the following properties:

  • normalization: Boolean, normalize feed data or keep the original. Default true.
  • useISODateFormat: Boolean, convert datetime to ISO format. Default true.
  • descriptionMaxLen: Number, max length to truncate the description to. Default 250 characters. Set to 0 for no truncation.
  • xmlParserOptions: Object, passed to the XML parser; see fast-xml-parser's docs
  • getExtraFeedFields: Function, to get more fields from the feed data
  • getExtraEntryFields: Function, to get more fields from each feed entry
  • baseUrl: URL string, used to resolve relative links within the feed content

For example:

import { extract } from '@extractus/feed-extractor'

await extract('https://news.google.com/atom', {
  useISODateFormat: false
})

await extract('https://news.google.com/rss', {
  useISODateFormat: false,
  getExtraFeedFields: (feedData) => {
    return {
      subtitle: feedData.subtitle || ''
    }
  },
  getExtraEntryFields: (feedEntry) => {
    const {
      enclosure,
      category
    } = feedEntry
    return {
      enclosure: {
        url: enclosure['@_url'],
        type: enclosure['@_type'],
        length: enclosure['@_length']
      },
      category: typeof category === 'string' ? category : {
        text: category['@_text'],
        domain: category['@_domain']
      }
    }
  }
})

fetchOptions optional

fetchOptions is an object that can have the following properties:

  • headers: to set request headers
  • proxy: another endpoint to forward the request to
  • agent: an HTTP proxy agent
  • signal: AbortController signal or AbortSignal timeout to terminate the request

For example, you can use this parameter to set request headers as follows:

import { extract } from '@extractus/feed-extractor'

const url = 'https://news.google.com/rss'
await extract(url, null, {
  headers: {
    'user-agent': 'Opera/9.60 (Windows NT 6.0; U; en) Presto/2.1.1'
  }
})

You can also specify a proxy endpoint to load remote content through, instead of fetching directly.

For example:

import { extract } from '@extractus/feed-extractor'

const url = 'https://news.google.com/rss'

await extract(url, null, {
  headers: {
    'user-agent': 'Opera/9.60 (Windows NT 6.0; U; en) Presto/2.1.1'
  },
  proxy: {
    target: 'https://your-secret-proxy.io/loadXml?url=',
    headers: {
      'Proxy-Authorization': 'Bearer YWxhZGRpbjpvcGVuc2VzYW1l...'
    }
  }
})

Passing requests through a proxy is useful when running @extractus/feed-extractor in the browser. See examples/browser-feed-reader for a reference example.

Another way to work with a proxy is to use the agent option instead of proxy, as below:

import { extract } from '@extractus/feed-extractor'

import { HttpsProxyAgent } from 'https-proxy-agent'

const proxy = 'http://username:password@your-proxy-host:31113'

const url = 'https://news.google.com/rss'

const feed = await extract(url, null, {
  agent: new HttpsProxyAgent(proxy),
})
console.log('Run feed-extractor with proxy:', proxy)
console.log(feed)

For more info about https-proxy-agent, check its repo.

By default, there is no request timeout. You can use the signal option to cancel the request when needed.

The common way is to use AbortController:

const controller = new AbortController()

// stop after 5 seconds
setTimeout(() => {
  controller.abort()
}, 5000)

const data = await extract(url, null, {
  signal: controller.signal,
})

A newer solution is AbortSignal's timeout() static method:

// stop after 5 seconds
const data = await extract(url, null, {
  signal: AbortSignal.timeout(5000),
})

For more info, see the MDN documentation on AbortController and AbortSignal.

extractFromJson()

Extracts feed data from a JSON string. Returns an object containing the feed data.

Syntax

extractFromJson(String json)
extractFromJson(String json, Object parserOptions)

Example:

import { extractFromJson } from '@extractus/feed-extractor'

const url = 'https://www.jsonfeed.org/feed.json'
// this resource provides data in JSON feed format
// so we fetch remote content as json
// then pass to feed-extractor
const res = await fetch(url)
const json = await res.json()

const feed = extractFromJson(json)
console.log(feed)

Parameters

json required

JSON string loaded from a JSON feed resource.

parserOptions optional

See parserOptions above.

extractFromXml()

Extracts feed data from an XML string. Returns an object containing the feed data.

Syntax

extractFromXml(String xml)
extractFromXml(String xml, Object parserOptions)

Example:

import { extractFromXml } from '@extractus/feed-extractor'

const url = 'https://news.google.com/atom'
// this resource provides data in ATOM feed format
// so we fetch remote content as text
// then pass to feed-extractor
const res = await fetch(url)
const xml = await res.text()

const feed = extractFromXml(xml)
console.log(feed)

Parameters

xml required

XML string loaded from an RSS/ATOM feed resource.

parserOptions optional

See parserOptions above.

Test

git clone https://github.com/extractus/feed-extractor.git
cd feed-extractor
pnpm i
pnpm test


Quick evaluation

git clone https://github.com/extractus/feed-extractor.git
cd feed-extractor
pnpm i
pnpm eval https://news.google.com/rss

License

The MIT License (MIT)

Support the project

If you find value in this open source project, please consider supporting it.

Thank you.


feed-extractor's People

Contributors

almis90, ekoeryanto, eviltik, gouz, kahosan, m4rc3l05, ndaidong, neizod, olsonpm, turt2live


feed-extractor's Issues

Add options to get specific fields

Hi,

Your library is really simple and great, but I have a problem: I want to get the guid field of RSS items, but the library doesn't return it.

Would it be possible to add a parser option to specify which fields to return?

Thanks

Hardcoded attributeNamePrefix value in xmlParserOptions

Hi!

First and foremost, thanks for your work!

I've been using the library in my GitHub Action and I tried to change the attributeNamePrefix property in xmlParserOptions, but it didn't work. I had a look at the code and noticed it's hardcoded and thus impossible to change:

attributeNamePrefix: '@_',

Is there any reasoning behind this decision that I'm not aware of? Would it be possible to make it modifiable via xmlParserOptions like the rest of the properties?

I can provide a pull request for this if you don't mind.

Thanks a lot!

Empty description when content is wrapped in CDATA

Hi!

When I pass a feed whose content is wrapped in CDATA tags, the normalized feed entry contains an empty description.

Sample feeds:

For now I use a dirty workaround using getExtraEntryFields and some custom code to process HTML:

getExtraEntryFields: (feedEntry) => {
  // stripAndTruncateHTML and siteConfig are my own helpers
  const cdataDescription = feedEntry.description.includes("<![CDATA[")
    ? stripAndTruncateHTML(
        feedEntry.description
          .replaceAll("<![CDATA[", "")
          .replaceAll("]]>", ""),
        siteConfig.maxPostLength
      )
    : "";

  return { cdataDescription };
}

Also, do you have a donation link or something? I'd love to buy you a coffee because this project ROCKS. ❤️

Minor regression in v7.0.3

Hi!

After fixing #105, I noticed a small regression in RSS feeds that serve both content:encoded and description in their items.

content:encoded is used first (even though a human-friendly description is available), and that results in a pile of HTML/CSS code being served.

Feed that shows this problem: https://turystyka-niecodzienna.pl/rss

I suspect it may be possible to fix by switching the order from content || description into description || content (and perhaps htmlContent || description into description || htmlContent?) in 3e1d612#diff-79bdb3bf907b1dc8f0ca3b16390b8e93716d86d536837a4cbda4d9b0b2b19ee7

fetch error: cert invalid

Type:     FetchError
Message:  request to https://www.logseqtimes.com/rss/ failed, reason: Hostname/IP does not match certificate's altnames: Host: www.logseqtimes.com. is not in the cert's altnames: DNS:fallback.tls.fastly.net

ref: avelino/bots.clj.social#103


There should be an option to bypass the certificate check.
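
There is no documented option for this, but a possible workaround is the agent fetch option described above, combined with Node's built-in https.Agent. A minimal sketch, assuming the underlying fetch honors the agent option (do not use this in production, as it disables TLS verification):

import https from 'node:https'
import { extract } from '@extractus/feed-extractor'

const feed = await extract('https://www.logseqtimes.com/rss/', null, {
  // skip certificate validation; unsafe outside of debugging
  agent: new https.Agent({ rejectUnauthorized: false }),
})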

Missing optional entry fields?

Hi again :)

Below is a raw feed entry:

<entry>
    <author>
        <name>/u/0xdea</name>
        <uri>https://www.reddit.com/user/0xdea</uri>
    </author>
    <category term="netsec" label="r/netsec"/>
    <content type="html">&amp;#32; submitted by &amp;#32; &lt;a href=&quot;https://www.reddit.com/user/0xdea&quot;&gt; /u/0xdea &lt;/a&gt; &lt;br/&gt; &lt;span&gt;&lt;a href=&quot;https://security.humanativaspa.it/automating-binary-vulnerability-discovery-with-ghidra-and-semgrep/&quot;&gt;[link]&lt;/a&gt;&lt;/span&gt; &amp;#32; &lt;span&gt;&lt;a href=&quot;https://www.reddit.com/r/netsec/comments/vtcsdv/automating_binary_vulnerability_discovery_with/&quot;&gt;[comments]&lt;/a&gt;&lt;/span&gt;</content>
    <id>t3_vtcsdv</id>
    <link href="https://www.reddit.com/r/netsec/comments/vtcsdv/automating_binary_vulnerability_discovery_with/" />
    <updated>2022-07-07T07:27:52+00:00</updated>
    <published>2022-07-07T07:27:52+00:00</published>
    <title>Automating binary vulnerability discovery with Ghidra and Semgrep</title>
</entry>

Below are the attributes returned by feed-reader; some fields are missing:

{
  title: 'Automating binary vulnerability discovery with Ghidra and Semgrep',
  link: 'https://www.reddit.com/r/netsec/comments/vtcsdv/automating_binary_vulnerability_discovery_with/',
  description: 'submitted by /u/0xdea [link] [comments]',
  published: '2022-07-07T07:27:52.000Z',
}

We should expect something like this:

{
  id:'t3_vtcsdv',
  author: {
    name:'/u/0xdea',
    uri:'https://www.reddit.com/user/0xdea'
  },
  category: {
      term:'netsec',
      label:'r/netsec'
  },
  content:{
      type: 'html',
      rawValue:'&amp;#32; submitted by &amp;#32; &lt;a href=&quot;https://www.reddit.com/user/0xdea&quot;&gt; /u/0xdea &lt;/a&gt; &lt;br/&gt; &lt;span&gt;&lt;a href=&quot;https://security.humanativaspa.it/automating-binary-vulnerability-discovery-with-ghidra-and-semgrep/&quot;&gt[link]&lt;/a&gt;&lt;/span&gt;&amp;#32;&lt;span&gt;&lt;ahref=&quot;https://www.reddit.com/r/netsec/comments/vtcsdv/automating_binary_vulnerability_discovery_with/&quot;&gt;[comments]&lt;/a&gt;&lt;/span&gt;'
  },
  title: 'Automating binary vulnerability discovery with Ghidra and Semgrep',
  link: 'https://www.reddit.com/r/netsec/comments/vtcsdv/automating_binary_vulnerability_discovery_with/',
  description: 'submitted by /u/0xdea [link] [comments]',
  published: '2022-07-07T07:27:52.000Z',
  updated: '2022-07-07T07:27:52.000Z',
}

see #36
see #13

So, before I start coding on my side, I'd like to know why you didn't implement all the fields. A missed opportunity? Lack of time? Or a deliberate choice for good reasons?

Your module could be a good one, because many of the others rely on the request module, which has been deprecated for a long time now. That's a good opportunity. But if we cannot access all the other fields, your module will stay invisible.

What do you think? Thank you!
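
For what it's worth, the getExtraEntryFields parser option documented above can surface these fields today. A hedged sketch, where the feed URL is illustrative and the raw property names (author, category, content, updated, the '@_' attribute prefix, and '#text') depend on the feed's XML and the parser defaults:

import { extract } from '@extractus/feed-extractor'

const feed = await extract('https://www.reddit.com/r/netsec/.rss', {
  getExtraEntryFields: (feedEntry) => {
    const { author, category, content, updated } = feedEntry
    return {
      author,
      updated,
      // attribute names assume the default '@_' prefix
      category: category ? {
        term: category['@_term'],
        label: category['@_label'],
      } : undefined,
      content: content ? {
        type: content['@_type'],
        rawValue: content['#text'],
      } : undefined,
    }
  },
})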

RSS Results Structure Changes Depending on Normalization

Hello!

I am glad to have found your module, it looks like it will make handling feeds easy.


The structure of the results from fetching an RSS feed depends on whether the normalization option is set.

If it is false, the object containing the feed items is called item, and if it is true, it is called entries.

I don't know if this is intended behaviour; the documentation doesn't mention it either way.
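
A small sketch of the difference the issue describes, assuming an RSS source:

import { extract } from '@extractus/feed-extractor'

const url = 'https://news.google.com/rss'

// normalized (default): items live under `entries`
const normalized = await extract(url)
console.log(normalized.entries.length)

// raw: the shape mirrors the XML, so RSS items live under `item`
const raw = await extract(url, { normalization: false })
console.log(raw.item.length)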

Some items being ignored due to hardcoded limits

Can I ask what the purpose behind the length checks at the beginning of normalize() are? Specifically this:

  if (!link || !title ||
    !isString(link) || !isString(title) ||
    link.length < 10 || title.length < 10) {
    return false;
  }

It took me ages to track down why some items in a feed were coming back as undefined, and it turned out to be because the title was short.

Cannot define extra entry fields to fetch

Hi,
The RSS feed I fetch includes, for each item, an illustration image for the published article. As this field is not provided by default, I tried to define it in the parser options as explained in the documentation, but I get the following error when executing my script:

TypeError: Cannot read properties of undefined (reading '@_url')

Here is the definition of my options (I use typescript):

const options = {
    getExtraEntryFields: (entryData: any /* What is the expected type ?? */) => {
        const { enclosure } = entryData
        return {
            enclosure: {
                url: enclosure['@_url'], // enclosure is undefined ...
                type: enclosure['@_type'], // enclosure is undefined ...
            }
        }
    }
}

const rss = await extract(url, options)

I'm new to using RSS feeds; can you help me understand the error and resolve it?
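
The error means some entries have no enclosure element at all, so the attribute lookup on undefined throws. A defensive sketch of the same options with a guard (the feed URL is hypothetical):

import { extract } from '@extractus/feed-extractor'

const url = 'https://example.com/feed'

const options = {
  getExtraEntryFields: (entryData) => {
    const { enclosure } = entryData
    // not every item carries an <enclosure>; skip those that don't
    if (!enclosure) {
      return {}
    }
    return {
      enclosure: {
        url: enclosure['@_url'],
        type: enclosure['@_type'],
      },
    }
  },
}

const rss = await extract(url, options)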

CDATA in description not parsed as desired

Hi Team,

Thanks for building this open-source tool. I'm new to dealing with RSS feeds and wanted an easy way to parse the data into typed objects. I'm having an issue with one feed that embeds a lot of CDATA in the description, containing a lot of HTML with styles, links to images, etc.

Here is an example:
(NOTE: some of this is hidden by the browser; opening this issue in Edit view should show all the data. If there is a way to prevent it from rendering as HTML in this issue, I don't know it.)

<description><![CDATA[<a href="https://someorg.org/blog/meeting-the-obligations-of-the-german-supply-chain-due-diligence-act-faqs/" title="Meeting the Obligations of the German Supply Chain Due Diligence Act: FAQs" rel="nofollow"><img width="300" height="157" src="https://someorg.org/wp-content/uploads/2022/11/Blog-German-DD-FI-300x157.jpg" class="webfeedsFeaturedVisual wp-post-image" alt="German Flag over building" decoding="async" style="float: left; margin-right: 5px;" link_thumbnail="1" loading="lazy" srcset="https://someorg.org/wp-content/uploads/2022/11/Blog-German-DD-FI-300x157.jpg 300w, https://someorg.org/wp-content/uploads/2022/11/Blog-German-DD-FI-1024x536.jpg 1024w, https://someorg.org/wp-content/uploads/2022/11/Blog-German-DD-FI-768x402.jpg 768w, https://someorg.org/wp-content/uploads/2022/11/Blog-German-DD-FI.jpg 1200w" sizes="(max-width: 300px) 100vw, 300px" /></a><p>The German Supply Chain <span class="glossaryLink"  aria-describedby="tt"  data-cmtooltip="&#38;lt;!-- wp:paragraph --&#38;gt;Often the second stage in the third-party risk management life cycle. Due diligence involves conducting a review of a potential third party prior to signing a contract. This review should involve developing a deeper understanding of the third party&#8217;s ownership, operations, resources, financial status, relevant employees, risk and control framework, business continuity program, third-party risk management program, and other factors important to the third-party relationship. Due diligence helps ensure the organization selects an appropriate third party to partner with, and that the organization understands both the inherent and residual risks posed by the relationship. These residual risks should be within the organization&#8217;s risk appetite.&#38;lt;br/&#38;gt;&#38;lt;!-- /wp:paragraph --&#38;gt;"  data-gt-translate-attributes='[{"attribute":"data-cmtooltip", "format":"html"}]'>Due Diligence</span> Act goes into effect January 2023 and is already making waves within supply chain, risk management, and compliance communities. [&#8230;]</p>
<p>The post <a rel="nofollow" href="https://someorg.org/blog/meeting-the-obligations-of-the-german-supply-chain-due-diligence-act-faqs/">Meeting the Obligations of the German Supply Chain Due Diligence Act: FAQs</a> appeared first on <a rel="nofollow" href="https://someorg.org">Aravo</a>.</p>
]]></description>

Options: { descriptionMaxLen: 20000, xmlParserOptions: { /* I've tried a bunch... nothing "worked" */ } }

Output:

description:  "The German Supply Chain Due Diligence Act goes into effect January 2023 and is already making waves within supply chain, risk management, and compliance communities. [&#8230;] The post Meeting the Obligations of the German Supply Chain Due Diligence Act: FAQs appeared first on Aravo."
link:  "https://aravo.com/blog/meeting-the-obligations-of-the-german-supply-chain-due-diligence-act-faqs/"
published:  "2022-12-01T14:33:07.000Z"
title:  "Meeting the Obligations of the German Supply Chain Due Diligence Act: FAQs"

Desired output: all contents of the description CDATA.

Questions:

  • Is this something that can be supported?
  • How unusual (to you) is this use of the description field (all CDATA of HTML)?

fast-xml-parser regex vulnerability patch could be improved from a safety perspective

Summary

This is a comment on GHSA-6w63-h3fj-q4vw and the patches fixing it.

ref GHSA-gpv5-7x3g-ghjv

Details

The code which validates a name calls the validator:
https://github.com/NaturalIntelligence/fast-xml-parser/blob/ecf6016f9b48aec1a921e673158be0773d07283e/src/xmlparser/DocTypeReader.js#L145-L153
This checks for the presence of an invalid character. Such an approach is always risky, as it is easy to forget to include an invalid character in the list. A safer approach is to validate entity names against the XML specification (https://www.w3.org/TR/xml11/#sec-common-syn); an ENTITY name is a Name:

[4]   NameStartChar ::= ":" | [A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] |
                        [#x37F-#x1FFF] | [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] |
                        [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]
[4a]  NameChar ::= NameStartChar | "-" | "." | [0-9] | #xB7 | [#x0300-#x036F] | [#x203F-#x2040]
[5]   Name ::= NameStartChar (NameChar)*

so the safest way to validate an entity name is to build a regex to represent this expression and check whether the name given matches the regex. (Something along the lines of /^[name start char class][name char class]*$/.) There's probably a nice way to simplify the explicit list rather than typing it out verbatim using Unicode character properties, but I don't know enough to do so.

Disable item description trimming?

Is it possible to skip truncating the description and returning full contents in any way?

Right now I pass descriptionMaxLen with some impossibly large value (999999), but it's a bit of a hacky workaround.

It would be great if I could pass -1 or false to skip description truncation altogether.
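
Per the parserOptions section above, descriptionMaxLen: 0 now disables truncation, so the workaround reduces to:

import { extract } from '@extractus/feed-extractor'

// 0 means no truncation (see parserOptions above)
const feed = await extract('https://news.google.com/rss', {
  descriptionMaxLen: 0,
})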

Add `id` property to entries

Nice library!

It would be helpful to have an 'id' property added to each entry. This allows an entry to be uniquely tracked, and ensures that if the URL of a feed item updates, it's still considered the same entry.

  • JSON Feed has a required id property.
  • RSS has guid, but it is optional. If it's not set, the general recommendation is to use the URL as the unique identifier instead.
  • Atom has the id field for each entry, which is also required.

So it should be pretty easy to normalize this into an id field and make it non-optional in the type definition.
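
A minimal sketch of that fallback, where entry is a hypothetical raw entry object rather than this library's API:

// prefer an explicit id, then RSS guid, then the link as a last resort
const resolveId = (entry) => entry.id ?? entry.guid ?? entry.link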

IDE Type Error when adding optional FeedEntries like category

In the index.d.ts file:

export interface FeedEntry {
  /**
   * id, guid, or generated identifier for the entry
   */
  id: string;
  link?: string;
  title?: string;
  description?: string;
  published?: Date;
}

Since a feed entry allows for custom extra keys like category and enclosure, adding an optional parameter would stop the error that pops up in VSCode.


In my case, since I am using an array of strings for tags (fetching a Medium RSS feed), I added a line like:

interface FeedEntry {
  ...
  category?: Array<string>;
}

But I believe this would fail again if we had a custom object for categories, with text and domain, as mentioned in the examples on npm.

Add support for fetch options

I needed to be able to pass options to the underlying fetch to adjust the timeout, etc. Here's a patch to enable that, in case it's of use to anyone else:

@@ -15,9 +15,9 @@
 var isArray = bella.isArray;
 var isObject = bella.isObject;

-var toJSON = (source) => {
+var toJSON = (source, opts) => {
   return new Promise((resolve, reject) => {
-    fetch(source).then((res) => {
+    fetch(source, opts).then((res) => {
       if (res.ok && res.status === 200) {
         return res.text();
       }
@@ -174,9 +174,9 @@
 };


-var parse = (url) => {
+var parse = (url, opts = {}) => {
   return new Promise((resolve, reject) => {
-    toJSON(url).then((o) => {
+    toJSON(url, opts).then((o) => {
       let result;
       if (o.rss && o.rss.channel) {
         let t = o.rss.channel;

The link cannot be resolved when the hostname is not included

example:

<channel>
  <link>/</link>
  <language>en</language>
  <atom:link href="/index.xml" rel="self" type="application/rss+xml" />
  <item>
    <link>/posts/2023/06/piem/</link>
    <guid>/posts/2023/06/piem/</guid>
  </item>
</channel>

When the link is in the above format, it is resolved as null:

{
  "link": null,
  "language": "en",
  "atom:link": {
    "@_href": "/index.xml",
    "@_rel": "self",
    "@_type": "application/rss+xml"
  },
  "item": [
    {
      "link": null,
      "guid": "/posts/2023/06/piem/"
    }
  ]
}
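
The baseUrl parser option documented above targets exactly this case. A sketch, with hypothetical URLs:

import { extract } from '@extractus/feed-extractor'

// resolve relative links in the feed against the site origin
const feed = await extract('https://example.com/index.xml', {
  baseUrl: 'https://example.com',
})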

Support Get favicon

Hi, can feed-extractor and oEmbed Extractor support crawling the favicon URL, like article-extractor does?

CORS

Hi,
thanks for this awesome library.

Unfortunately, I can't fetch 90% of the RSS sources because of CORS issues.
Do you have any suggestions on how to solve it?
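
One approach is the proxy fetch option documented above: route requests through your own endpoint, which can add the necessary CORS headers. A sketch with a hypothetical proxy endpoint:

import { extract } from '@extractus/feed-extractor'

const feed = await extract('https://news.google.com/rss', null, {
  proxy: {
    // your own server fetches the feed and replies with CORS headers
    target: 'https://your-cors-proxy.example/loadXml?url=',
  },
})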

[Feature Request] Need more fields; want the result to be customizable via options

I use this tool to parse RSS feeds, but some fields I need are not in the result, such as image and owner.

I'd like two options, extraFeedFields and extraEntryFields, used as functions whose return values get merged into the feed and entry fields, so everyone can customize the result.

const feedData = await read('https://some-rss-feed-xml/', {
  extraFeedFields: (channel) => {
    return {
      image: channel['itunes:image'],
      owner: channel['itunes:owner']
    }
  },
})

result:

{
  "title": "xxx",
  "link": "xxx",
  "description": "xxx",
  "language": "",
  "generator": "",
  "published": "",
  "entries": [...],
  "image": {...},
  "owner": {...},
}

Atom is working, but rss2 and rss get an error message

Site: https://abikw.nvii-dev.de
When trying to fetch an Atom feed it works, but when trying to fetch an rss2 or rss feed I get the following error:

TypeError: item.map is not a function
parseRSS webpack-internal:///./node_modules/feed-reader/src/utils/parser.js:89
read webpack-internal:///./node_modules/feed-reader/src/main.js:39
getFeedFile webpack-internal:///./node_modules/cache-loader/dist/cjs.js?!./node_modules/babel-loader/lib/index.js!./node_modules/cache-loader/dist/cjs.js?!./node_modules/vue-loader-v16/dist/index.js?!./src/pages/News.vue?vue&type=script&lang=js:28
created webpack-internal:///./node_modules/cache-loader/dist/cjs.js?!./node_modules/babel-loader/lib/index.js!./node_modules/cache-loader/dist/cjs.js?!./node_modules/vue-loader-v16/dist/index.js?!./src/pages/News.vue?vue&type=script&lang=js:38
callWithErrorHandling webpack-internal:///./node_modules/@vue/runtime-core/dist/runtime-core.esm-bundler.js:6824
callWithAsyncErrorHandling webpack-internal:///./node_modules/@vue/runtime-core/dist/runtime-core.esm-bundler.js:6833
callHook webpack-internal:///./node_modules/@vue/runtime-core/dist/runtime-core.esm-bundler.js:2419
applyOptions webpack-internal:///./node_modules/@vue/runtime-core/dist/runtime-core.esm-bundler.js:2321
finishComponentSetup webpack-internal:///./node_modules/@vue/runtime-core/dist/runtime-core.esm-bundler.js:6561
setupStatefulComponent webpack-internal:///./node_modules/@vue/runtime-core/dist/runtime-core.esm-bundler.js:6473
setupComponent webpack-internal:///./node_modules/@vue/runtime-core/dist/runtime-core.esm-bundler.js:6403
mountComponent webpack-internal:///./node_modules/@vue/runtime-core/dist/runtime-core.esm-bundler.js:4258
processComponent webpack-internal:///./node_modules/@vue/runtime-core/dist/runtime-core.esm-bundler.js:4233
patch webpack-internal:///./node_modules/@vue/runtime-core/dist/runtime-core.esm-bundler.js:3837
patchKeyedChildren webpack-internal:///./node_modules/@vue/runtime-core/dist/runtime-core.esm-bundler.js:4722
patchChildren webpack-internal:///./node_modules/@vue/runtime-core/dist/runtime-core.esm-bundler.js:4541
patchElement webpack-internal:///./node_modules/@vue/runtime-core/dist/runtime-core.esm-bundler.js:4057
processElement webpack-internal:///./node_modules/@vue/runtime-core/dist/runtime-core.esm-bundler.js:3917
patch webpack-internal:///./node_modules/@vue/runtime-core/dist/runtime-core.esm-bundler.js:3834
componentUpdateFn webpack-internal:///./node_modules/@vue/runtime-core/dist/runtime-core.esm-bundler.js:4443
run webpack-internal:///./node_modules/@vue/reactivity/dist/reactivity.esm-bundler.js:195
callWithErrorHandling webpack-internal:///./node_modules/@vue/runtime-core/dist/runtime-core.esm-bundler.js:6824
flushJobs webpack-internal:///./node_modules/@vue/runtime-core/dist/runtime-core.esm-bundler.js:7060
cjs.js:31:17

Code I'm using:

getFeedFile() {
  const url = 'https://abikw.nvii-dev.de/feed/rss';

  this.read(url)
    .then((feed) => {
      console.log('News - getFeedFile - feed', feed);
    })
    .catch((err) => {
      console.log('News - getFeedFile - error: ', err);
    });
},

Issue with package types

Hi,

There seems to be a problem with the package types, with an error along the lines of:

There are types at '.../node_modules/@extractus/feed-extractor/index.d.ts', but this result could not be resolved when respecting package.json "exports". The '@extractus/feed-extractor' library may need to update its package.json or typings.

I was able to fix this locally by adding "types": "./index.d.ts" to the exports section of package.json.
I can make a PR for this.

Add content:encoded to FeedEntry

Thanks for a great tool. So far I've been using feed-extractor to get feed items and then passing each item's link to article-extractor to get the full article. However, I've noticed that in most of my feeds the full text of the article is included in the RSS feed under the content:encoded tag. Is there already a way to get this data using feed-extractor, so I wouldn't need to make a second call to article-extractor? It would be nice if encoded were added as a property on FeedEntry, so that when it exists we have access to it after parsing the feed. Is there a better way to do this?
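
Until such a property lands, the getExtraEntryFields parser option shown above can expose it. A hedged sketch; the raw key name 'content:encoded' depends on the feed's namespaces, and the feed URL is hypothetical:

import { extract } from '@extractus/feed-extractor'

const feed = await extract('https://example.com/feed', {
  getExtraEntryFields: (feedEntry) => {
    return {
      // pass the full article body through when the feed provides it
      content: feedEntry['content:encoded'],
    }
  },
})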

CERT_HAS_EXPIRED

Hey bro.
I am using your npm package in a project.
I am getting the following error:

Error
at Function.createFromInputFallback (/Users/wellington/Developer/zuntaz-bots/node_modules/moment/moment.js:320:98)
at configFromString (/Users/wellington/Developer/zuntaz-bots/node_modules/moment/moment.js:2385:15)
at configFromInput (/Users/wellington/Developer/zuntaz-bots/node_modules/moment/moment.js:2611:13)
at prepareConfig (/Users/wellington/Developer/zuntaz-bots/node_modules/moment/moment.js:2594:13)
at createFromConfig (/Users/wellington/Developer/zuntaz-bots/node_modules/moment/moment.js:2561:44)
at createLocalOrUTC (/Users/wellington/Developer/zuntaz-bots/node_modules/moment/moment.js:2648:16)
at createLocal (/Users/wellington/Developer/zuntaz-bots/node_modules/moment/moment.js:2652:16)
at hooks (/Users/wellington/Developer/zuntaz-bots/node_modules/moment/moment.js:12:29)
at normalize (/Users/wellington/Developer/zuntaz-bots/node_modules/feed-reader/src/main.js:62:16)
at modify (/Users/wellington/Developer/zuntaz-bots/node_modules/feed-reader/src/main.js:139:14)
at Array.map ()
at toRSS (/Users/wellington/Developer/zuntaz-bots/node_modules/feed-reader/src/main.js:142:20)
at /Users/wellington/Developer/zuntaz-bots/node_modules/feed-reader/src/main.js:224:18
at runMicrotasks ()
at processTicksAndRejections (internal/process/task_queues.js:85:5)
FetchError: request to https://www.muywindows.com/feed failed, reason: certificate has expired
at ClientRequest. (/Users/wellington/Developer/zuntaz-bots/node_modules/node-fetch/index.js:133:11)
at ClientRequest.emit (events.js:209:13)
at TLSSocket.socketErrorListener (_http_client.js:406:9)
at TLSSocket.emit (events.js:209:13)
at emitErrorNT (internal/streams/destroy.js:91:8)
at emitErrorAndCloseNT (internal/streams/destroy.js:59:3)
at processTicksAndRejections (internal/process/task_queues.js:77:11) {
name: 'FetchError',
message: 'request to https://www.muywindows.com/feed failed, reason: certificate has expired',
type: 'system',
errno: 'CERT_HAS_EXPIRED',
code: 'CERT_HAS_EXPIRED'
}

better axios error handler

Hi!
hi @ndaidong

Thank you for your work.

I've forked your project; I'd like to improve error handling. Currently you return null on every axios error.

What do you think about that?

Making a PR right now

Thank you.
