ckan-solr's Introduction

CKAN: The Open Source Data Portal Software

CKAN is the world’s leading open-source data portal platform. CKAN makes it easy to publish, share and work with data. It's a data management system that provides a powerful platform for cataloging, storing and accessing datasets with a rich front-end, full API (for both data and catalog), visualization tools and more. Read more at ckan.org.

Installation

See the CKAN Documentation for installation instructions.

Support

If you need help with CKAN or want to ask a question, use either the ckan-dev mailing list, the CKAN chat on Gitter, or the CKAN tag on Stack Overflow (try searching the Stack Overflow and ckan-dev archives for an answer to your question first).

If you've found a bug in CKAN, open a new issue on CKAN's GitHub Issues (try searching first to see if there's already an issue for your bug).

If you find a potential security vulnerability, please email [email protected] rather than creating a public issue on GitHub.

Contributing to CKAN

For contributing to CKAN or its documentation, see CONTRIBUTING.

Mailing List

Subscribe to the ckan-dev mailing list to receive news about upcoming releases and future plans as well as questions and discussions about CKAN development, deployment, etc.

Community Chat

If you want to talk about CKAN development, say hi to the CKAN developers and members of the CKAN community on the public CKAN chat on Gitter. Gitter is free and open-source; you can sign in with your GitHub, GitLab, or Twitter account.

The logs for the old #ckan IRC channel (2014 to 2018) can be found here: https://github.com/ckan/irc-logs.

Wiki

If you've figured out how to do something with CKAN and want to document it for others, make a new page on the CKAN wiki and tell us about it on the ckan-dev mailing list or on Gitter.

Copying and License

This material is copyright (c) 2006-2023 Open Knowledge Foundation and contributors.

It is open and licensed under the GNU Affero General Public License (AGPL) v3.0 whose full text may be found at:

http://www.fsf.org/licensing/licenses/agpl-3.0.html

ckan-solr's People

Contributors

amercader, avdata99, pdelboca

ckan-solr's Issues

CKAN 2.9 images embed CKAN 2.10 schema

I've noticed a discrepancy between the Makefiles and the Dockerfiles in how the CKAN version is handled. As a result, the CKAN 2.9 versions of the images end up built with the CKAN 2.10 schema.

The solrN/Makefile supports the CKAN_VERSION variable. It expects something of the form X.Y, and creates the ckan/ckan-solr:X.Y-solrN Docker image.

The solrN/Dockerfile and solrN/Dockerfile.spatial use a Docker ARG called CKAN_BRANCH that is hardcoded with the default value dev-v2.10.

Importantly, the build argument in the Dockerfiles is not wired up to the corresponding Makefile. As a result, regardless of the value of CKAN_VERSION, the Solr schema is always pulled from the dev-v2.10 branch.
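One quick way to see which branch a Dockerfile falls back to is to read the default off its ARG line. The following is a minimal sketch: arg_default is a hypothetical helper, and the sample Dockerfile contents (including the FROM line) are assumed for illustration rather than copied from the repository.

```shell
# Hypothetical helper: print the default value of a Dockerfile ARG.
# Reads Dockerfile text on stdin; $1 is the ARG name.
# Only handles the simple unquoted form "ARG NAME=value".
arg_default() {
  sed -n "s/^ARG[[:space:]]\{1,\}$1=\(.*\)\$/\1/p" | head -n 1
}

# Sample input standing in for solrN/Dockerfile (contents assumed).
arg_default CKAN_BRANCH <<'EOF'
FROM solr:9
ARG CKAN_BRANCH=dev-v2.10
EOF
```

Against a real checkout, the same helper could be fed a file instead, for example arg_default CKAN_BRANCH < solr9/Dockerfile.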

Demonstration

To demonstrate, I ran these steps:

# Build against CKAN 2.9
$ make build CKAN_VERSION=2.9
(... output truncated ...)

# Create a container called solr using the image that was built
$ docker run -d --rm --name solr ckan/ckan-solr:2.9-solr9
(... output truncated ...)

# Print out the schema and look for the "schema name" block
$ docker exec solr cat /opt/solr/server/solr/configsets/ckan/conf/managed-schema | grep 'schema name'
<schema name="ckan-2.10" version="1.6">

As the output of the final command shows, the schema is the ckan-2.10 schema, even though I supposedly built a Solr image for CKAN 2.9. As further confirmation, here is the raw schema from the ckan-2.9.9 tag:

https://raw.githubusercontent.com/ckan/ckan/ckan-2.9.9/ckan/config/solr/schema.xml

The relevant line (below) does not match what is installed in the ckan-solr:2.9-solr9 image:

<schema name="ckan" version="2.9">
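The comparison above can be scripted so that a mismatch is easy to spot after a build. This is only a sketch: schema_name is a hypothetical helper, and it is fed the two schema lines quoted in this issue rather than a live container.

```shell
# Hypothetical helper: extract the name attribute of the <schema ...>
# element from schema XML read on stdin.
schema_name() {
  sed -n 's/.*<schema name="\([^"]*\)".*/\1/p' | head -n 1
}

# The line found in the ckan/ckan-solr:2.9-solr9 image:
echo '<schema name="ckan-2.10" version="1.6">' | schema_name

# The line from the ckan-2.9.9 tag:
echo '<schema name="ckan" version="2.9">' | schema_name
```

With the container from the demonstration still running, the output of the docker exec ... cat ... managed-schema command could be piped into schema_name in the same way.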

Possible Fix 1

As an easy solution, you could introduce CKAN_BRANCH in the Makefile and pass it via --build-arg when you invoke docker build, so that users can explicitly select the branch (or tag) from which the Solr schema will be pulled:

CKAN_VERSION="2.10"
CKAN_BRANCH="dev-v2.10"

(...)

build:
    docker build --build-arg CKAN_BRANCH=$(CKAN_BRANCH) -t $(TAG_NAME) -f Dockerfile .
    docker build --build-arg CKAN_BRANCH=$(CKAN_BRANCH) -t $(TAG_NAME) -f Dockerfile.spatial .

I think allowing the CKAN_BRANCH override is important, because it lets people build from a specific tag. For my team's use case, we need to be able to pin a specific tagged version of CKAN to build from, for reproducibility.

# Build based on the ckan-2.10.1 tag in ckan/ckan
make build CKAN_BRANCH=ckan-2.10.1

That said, the CKAN_BRANCH option is still independent of the CKAN_VERSION argument, so users would need to specify both values to correctly build a CKAN 2.9-compatible image. For example:

# Build based on the ckan-2.9.9 tag in ckan/ckan
make build CKAN_VERSION=2.9 CKAN_BRANCH=ckan-2.9.9

This is somewhat confusing, but with proper documentation / examples it is probably not a huge deal.
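One way to reduce that confusion would be to derive a default branch from the version and only require CKAN_BRANCH when pinning a tag. The sketch below is an assumption layered on top of this proposal, not existing behaviour: default_branch is a hypothetical helper, and it assumes the dev-vX.Y branch naming seen in dev-v2.10 applies to other versions too.

```shell
# Hypothetical helper: pick the branch or tag to pull the schema from.
# $1 is the CKAN version (X.Y); $2 is an optional explicit branch or tag.
default_branch() {
  version="$1"
  branch="${2:-}"
  if [ -n "$branch" ]; then
    printf '%s\n' "$branch"        # explicit override wins
  else
    printf 'dev-v%s\n' "$version"  # assumed naming convention
  fi
}

default_branch 2.9              # prints dev-v2.9
default_branch 2.9 ckan-2.9.9   # prints ckan-2.9.9
```

The resulting value would then be what gets passed to docker build via --build-arg CKAN_BRANCH=...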

Possible Fix 2

As an alternative, I am wondering if a better strategy would be something closer to what ckan/ckan-docker-base does, where directories map to explicit versions of CKAN:

ckan-solr/
  ckan-2.10/
    solr8/
    solr9/
  ckan-2.9/
    solr8/
    solr9/

There is more code duplication this way, but it's much clearer what you're getting with each Makefile, and in theory you won't ever need to update these once they're created. This would also give you better control over which version combinations of Solr and CKAN are supported.

If this approach were to be implemented, I would still want the ability to override the CKAN_BRANCH value at build time so that I can guarantee that I'm building something compatible with CKAN 2.10.1 or CKAN 2.9.8 (for example). The advantage of this approach over the other one is that we would not need to worry about CKAN_VERSION, because that would be set correctly for the Makefile based on the directory we are in.
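To illustrate how the directory could determine the version, here is a rough sketch that maps a directory in the proposed layout to the image tag format used today (ckan/ckan-solr:X.Y-solrN). tag_for_dir is a hypothetical helper and the layout is the one proposed above, not something that exists in the repository.

```shell
# Hypothetical helper: derive the image tag from a directory such as
# ckan-2.9/solr9 in the proposed layout.
tag_for_dir() {
  dir="${1%/}"           # drop a trailing slash, if any
  solr="${dir##*/}"      # last path component, e.g. solr9
  parent="${dir%/*}"     # everything before it, e.g. ckan-2.9
  ckan="${parent##*/}"   # e.g. ckan-2.9
  printf 'ckan/ckan-solr:%s-%s\n' "${ckan#ckan-}" "$solr"
}

for d in ckan-2.10/solr8 ckan-2.10/solr9 ckan-2.9/solr8 ckan-2.9/solr9; do
  tag_for_dir "$d"
done
```

Each per-directory Makefile could hardcode the equivalent values instead; the point is only that CKAN_VERSION no longer has to be supplied by the user.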

Handling of Solr Security Issues

Hi,

Note: These images are not vulnerable to CVE-2021-44228 / Log4J2 as are built on top of patched upstream Solr images.

That is correct, but there won't be any further updates: the linked repository is archived and no longer maintained.
The new repository is here: https://github.com/apache/solr-docker
So, for example, the /sql handler issue does not look like it will be patched; see https://solr.apache.org/security.html

So I would like to know: why are these old Solr versions in use?
Doesn't CKAN support newer Solr APIs?
