express-sitemap-xml

Express middleware to serve sitemap.xml from a list of URLs

Create an Express middleware that serves sitemap.xml from a list of URLs.

This package automatically handles sitemaps with more than 50,000 URLs. In these cases, multiple sitemap files will be generated along with a "sitemap index" to comply with the sitemap spec and requirements from search engines like Google.

If only one sitemap file is needed (i.e. there are 50,000 URLs or fewer), it is served directly at /sitemap.xml. Otherwise, a sitemap index is served at /sitemap.xml and the individual sitemaps are served at /sitemap-0.xml, /sitemap-1.xml, etc.
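
The splitting rule described above can be sketched roughly as follows. This is only an illustration of the rule, not the package's actual source; the helper name `planSitemapFiles` and the constant `MAX_URLS_PER_SITEMAP` are hypothetical.

```javascript
// Hypothetical sketch of the splitting rule; not the package's actual source.
const MAX_URLS_PER_SITEMAP = 50000

// Returns an object mapping each sitemap path to the entries it would hold.
function planSitemapFiles (urls, limit = MAX_URLS_PER_SITEMAP) {
  // Everything fits in one file, so it is served directly at /sitemap.xml
  if (urls.length <= limit) {
    return { '/sitemap.xml': urls }
  }

  // Otherwise /sitemap.xml becomes the index and the URLs are split into
  // numbered files: /sitemap-0.xml, /sitemap-1.xml, ...
  const plan = { '/sitemap.xml': [] }
  for (let i = 0; i * limit < urls.length; i++) {
    const path = `/sitemap-${i}.xml`
    plan[path] = urls.slice(i * limit, (i + 1) * limit)
    plan['/sitemap.xml'].push(path) // the index lists each child sitemap
  }
  return plan
}

// A limit of 2 is used here only to keep the example small
console.log(Object.keys(planSitemapFiles(['/1', '/2', '/3'])))
console.log(Object.keys(planSitemapFiles(['/1', '/2', '/3'], 2)))
```

With the default limit the first call yields a single /sitemap.xml key; with the artificially small limit the second call yields /sitemap.xml plus /sitemap-0.xml and /sitemap-1.xml.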

Install

npm install express-sitemap-xml

Demo

You can see this package in action on BitMidi, a site for listening to your favorite MIDI files.

Usage (with Express)

The easiest way to use this package is with the Express middleware.

const express = require('express')
const expressSitemapXml = require('express-sitemap-xml')

const app = express()

app.use(expressSitemapXml(getUrls, 'https://bitmidi.com'))

// getUrlsFromDatabase is a placeholder for your own data-access code
async function getUrls () {
  return await getUrlsFromDatabase()
}

Remember to add a Sitemap line to robots.txt like this:

Sitemap: https://bitmidi.com/sitemap.xml

Usage (without Express)

The package can also be used without the Express middleware.

const { buildSitemaps } = require('express-sitemap-xml')

async function run () {
  const urls = ['/1', '/2', '/3']
  const sitemaps = await buildSitemaps(urls, 'https://bitmidi.com')

  console.log(Object.keys(sitemaps))
  // ['/sitemap.xml']

  console.log(sitemaps['/sitemap.xml'])
  // `<?xml version="1.0" encoding="utf-8"?>
  //  <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  //    <url>
  //      <loc>https://bitmidi.com/1</loc>
  //      <lastmod>${getTodayStr()}</lastmod>
  //    </url>
  //    <url>
  //      <loc>https://bitmidi.com/2</loc>
  //      <lastmod>${getTodayStr()}</lastmod>
  //    </url>
  //    <url>
  //      <loc>https://bitmidi.com/3</loc>
  //      <lastmod>${getTodayStr()}</lastmod>
  //    </url>
  //  </urlset>`
}

run()

API

middleware = expressSitemapXml(getUrls, base)

Create a sitemap.xml middleware. Both arguments are required.

The getUrls argument specifies an async function that resolves to an array of URLs to be included in the sitemap. Each URL in the array can be either an absolute or relative URL string like '/1', or an object specifying additional options about the URL:

{
  url: '/1',
  lastMod: new Date('2000-02-02'), // optional (specify `true` for today's date)
  changeFreq: 'weekly' // optional
}

For more information about these options, see the sitemap spec. Note that the priority option is not supported because Google ignores it.
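
A getUrls function can mix plain strings and option objects in the same array. The sketch below uses only the fields documented above; the paths themselves are made-up placeholders.

```javascript
// Illustrative getUrls function; every path here is a placeholder.
async function getUrls () {
  return [
    '/', // relative URL string
    'https://bitmidi.com/about', // absolute URL string
    { url: '/random', changeFreq: 'daily' },
    { url: '/1', lastMod: new Date('2000-02-02'), changeFreq: 'weekly' },
    { url: '/2', lastMod: true } // `true` stands for today's date
  ]
}
```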

The getUrls function is called at most once per 24 hours. The resulting sitemap(s) are cached to make repeated HTTP requests faster.
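
The "at most once per 24 hours" behavior amounts to time-based caching. A minimal sketch of that idea follows, assuming a hypothetical helper named `cacheFor`; it is not the package's actual implementation, and the injectable `now` parameter exists only so the behavior can be checked without waiting a day.

```javascript
// Hypothetical sketch of time-based caching; not the package's actual source.
const DAY_MS = 24 * 60 * 60 * 1000

// Wraps an async `load` function so it runs at most once per `maxAgeMs`.
// Callers within that window share the same promise.
function cacheFor (load, maxAgeMs = DAY_MS, now = Date.now) {
  let pending = null
  let loadedAt = 0

  return function cached () {
    if (pending === null || now() - loadedAt >= maxAgeMs) {
      loadedAt = now()
      const attempt = Promise.resolve().then(load)
      // Do not keep a failed load around for the whole window
      attempt.catch(() => { if (pending === attempt) pending = null })
      pending = attempt
    }
    return pending
  }
}
```

Caching the promise rather than the resolved value means that concurrent requests arriving while the first load is still in flight do not trigger a second load.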

The base argument specifies the base URL to be used in case any URLs are specified as relative URLs. The argument is also used if a sitemap index needs to be generated and sitemap locations need to be specified, e.g. ${base}/sitemap-0.xml becomes https://bitmidi.com/sitemap-0.xml.
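
How a base URL combines with relative and absolute URLs can be illustrated with the WHATWG URL class built into Node.js. The helper name `resolveAgainstBase` is hypothetical, and this is a sketch of the general technique rather than a copy of the package's code.

```javascript
// Sketch using Node's built-in URL class; the helper name is hypothetical.
function resolveAgainstBase (url, base) {
  return new URL(url, base).href
}

const base = 'https://bitmidi.com'

// Relative URLs are resolved against the base
console.log(resolveAgainstBase('/1', base))
// 'https://bitmidi.com/1'

// Absolute URLs are left pointing where they already point
console.log(resolveAgainstBase('https://example.com/2', base))
// 'https://example.com/2'

// Sitemap locations for a sitemap index are built the same way
console.log(resolveAgainstBase('/sitemap-0.xml', base))
// 'https://bitmidi.com/sitemap-0.xml'
```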

sitemaps = await expressSitemapXml.buildSitemaps(urls, base)

Returns a promise that resolves to an object whose keys are the sitemap URLs to be served and whose values are strings of sitemap XML content. (This function does no caching.)

The urls argument is an array of URLs to be included in the sitemap. Each URL in the array can either be an absolute or relative URL string like '/1', or an object specifying additional options about the URL. See above for more info about the options.

The base argument is the same as above.

The return value is an object that looks like this:

{
  '/sitemap.xml': '<?xml version="1.0" encoding="utf-8"?>...'
}

Or if multiple sitemaps are needed, then the return object looks like this:

{
  '/sitemap.xml': '<?xml version="1.0" encoding="utf-8"?>...',
  '/sitemap-0.xml': '<?xml version="1.0" encoding="utf-8"?>...',
  '/sitemap-1.xml': '<?xml version="1.0" encoding="utf-8"?>...'
}
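
Because the keys are request paths and the values are response bodies, an object of this shape can be served with a plain lookup. The following is a minimal sketch using Node's built-in http module; the helper names `respondWithSitemap` and `createSitemapServer` are hypothetical and not part of this package.

```javascript
const http = require('http')

// Hypothetical helper: maps a request path to a plain response description.
// `sitemaps` is an object shaped like the return values shown above.
function respondWithSitemap (sitemaps, pathname) {
  if (!Object.prototype.hasOwnProperty.call(sitemaps, pathname)) {
    return {
      status: 404,
      headers: { 'Content-Type': 'text/plain; charset=utf-8' },
      body: 'Not found'
    }
  }
  return {
    status: 200,
    headers: { 'Content-Type': 'application/xml; charset=utf-8' },
    body: sitemaps[pathname]
  }
}

// Wraps the helper in a server; call .listen(port) on the result to start it.
function createSitemapServer (sitemaps) {
  return http.createServer((req, res) => {
    // Only the path matters here, so any placeholder origin works as the base
    const { pathname } = new URL(req.url, 'http://localhost')
    const { status, headers, body } = respondWithSitemap(sitemaps, pathname)
    res.writeHead(status, headers)
    res.end(body)
  })
}
```

Keeping the lookup separate from the server makes it easy to check the path handling without opening a socket.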

License

MIT. Copyright (c) Feross Aboukhadijeh.

express-sitemap-xml's Issues

An in-range update of standard is breaking the build 🚨

The devDependency standard was updated from 13.0.0 to 13.0.1.

🚨 View failing branch.

This version is covered by your current version range and after updating it in your project the build failed.

standard is a devDependency of this project. It might not break your production code or affect downstream projects, but probably breaks your build or test tools, which may prevent deploying or publishing.

Status Details
  • ❌ continuous-integration/travis-ci/push: The Travis CI build failed (Details).

Commits

The new version differs by 4 commits.

See the full diff

FAQ and help

There is a collection of frequently asked questions. If those don’t help, you can always ask the humans behind Greenkeeper.


Your Greenkeeper Bot 🌴

Ability to exclude <lastmod>

I'm intentionally generating a sitemap without specifying lastMod, since that property is optional, but the library adds <lastmod> regardless. Can we add an option to omit <lastmod> unless it is explicitly set?

No trailing slash for index URL

If I try to add an index URL to my sitemap, it always comes out in the sitemap with a trailing slash, which is not the canonical URL.

For example, if I give it a URL which is just an empty string, it returns one with a trailing slash:

const { buildSitemaps } = require('express-sitemap-xml')

const urls = ['']

async function run () {
  const sitemaps = await buildSitemaps(urls, 'https://event1.io')

  console.log(sitemaps['/sitemap.xml'])
  // `<?xml version="1.0" encoding="utf-8"?>
  //  <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  //    <url>
  //      <loc>https://event1.io/</loc>
  //      <lastmod>${getTodayStr()}</lastmod>
  //    </url>
  //  </urlset>`
}

I would expect it to look like this:

<?xml version="1.0" encoding="utf-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://event1.io</loc>
    <lastmod>${getTodayStr()}</lastmod>
  </url>
</urlset>

This appears to happen because the toAbsolute function creates a new URL object and returns its href, which always includes a trailing slash for the index URL.
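
The trailing slash can be reproduced with Node's built-in URL class alone, independent of this package. The second half of the sketch shows one possible way to drop the slash for a bare site root; the helper name `toAbsoluteNoRootSlash` is hypothetical and is not an option the library provides.

```javascript
// The URL class always serializes an empty path as "/"
console.log(new URL('', 'https://event1.io').href)
// 'https://event1.io/'

// Hypothetical workaround: return the bare origin when the URL is just the
// site root (no path, query string, or fragment), otherwise the usual href.
function toAbsoluteNoRootSlash (url, base) {
  const parsed = new URL(url, base)
  const isBareRoot =
    parsed.pathname === '/' && parsed.search === '' && parsed.hash === ''
  return isBareRoot ? parsed.origin : parsed.href
}

console.log(toAbsoluteNoRootSlash('', 'https://event1.io'))
// 'https://event1.io'

console.log(toAbsoluteNoRootSlash('/1', 'https://event1.io'))
// 'https://event1.io/1'
```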

Wrong content-type

The HTTP content-type header is set to "text/html" rather than "application/xml".

You can try this on bitmidi.com.

❯ curl -i https://bitmidi.com/sitemap.xml
HTTP/2 200 
server: nginx
date: Fri, 09 Nov 2018 14:18:21 GMT
content-type: text/html; charset=utf-8
content-length: 445
vary: Accept-Encoding
x-content-type-options: nosniff
referrer-policy: strict-origin-when-cross-origin
strict-transport-security: max-age=31536000; includeSubDomains; preload
etag: W/"1bd-2bBiOMomgckrx3C9S+h6fksNNNc"

<?xml version="1.0" encoding="utf-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://bitmidi.com/sitemap-0.xml</loc>
    <lastmod>2018-11-08</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://bitmidi.com/sitemap-1.xml</loc>
    <lastmod>2018-11-08</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://bitmidi.com/sitemap-2.xml</loc>
    <lastmod>2018-11-08</lastmod>
  </sitemap>
</sitemapindex>
