summa's Introduction

Summa

Summa is a full-text IPFS-friendly search engine that may be launched on both large servers and inside your browser.

Summa can be launched entirely inside your browser, enabling you to search network-published indices without ever executing search queries on remote servers.

If you're ready to start, be sure to check out our docs:

Key Features

  • Full-text search engine written in Rust with a wide range of supported queries and ranking functions
  • Server with a gRPC API for using the search engine
  • Asynchronous Python client library and CLI for the API
  • JS bindings to launch a subset of Summa in browsers
  • Optional Kafka support for indexing

Online-documentation

Distribution

Server

⚠️ The project is under active development, and we do not publish the latest images yet. The best option now is testing

Clients

Donate

You may support us at OpenCollective or with cryptocurrency:

  • monero: 464Wws65yssHdqGKGkFsHmbqNhBJ7zoPrbPTGAJma4VmTngtrJmQEaG9i739CUJJak3esALHpbWGXdVwMghzpFToLD6Q7Ne
  • btc: 3HooXUqJnZ4Ad8AGeqfSZ5QZQE72ZaZgY6
  • eth: 0x009AeabF4aeDe417d324077E7858956e6d0962D6

summa's People

Contributors

dependabot[bot], jarviscraft, pasha-pplx, ppodolsky

summa's Issues

Removal in Iroh Store

Iroh Store should support deleting CIDs. The lack of removal is a blocker because there is no safe way to evict unused data.

Quick start guide fails at create-index-from-file step

The schema specified in the quick start guide has a default_fields element which blows up in the current version of aiosumma. It looks like this was removed in 164310a, so the docs should probably be updated to match.

$ summa-cli localhost:8082 - create-index-from-file ~/summa/schema.yaml
SERVER_RESPONDED:
Traceback (most recent call last):
  File "/nix/store/l3xw4yisgv61a1lrg2rcxm2xjsjs9srx-python3-3.11.4-env/lib/python3.11/site-packages/google/protobuf/internal/python_message.py", line 577, in _GetFieldByName
    return message_descriptor.fields_by_name[field_name]
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^
KeyError: 'default_fields'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/nix/store/7hxdkn32a2qqvacpi4fh7sr73yigv75j-python3.11-aiosumma-2.44.1/bin/.summa-cli-wrapped", line 9, in <module>
    sys.exit(main())
             ^^^^^^
  File "/nix/store/l3xw4yisgv61a1lrg2rcxm2xjsjs9srx-python3-3.11.4-env/lib/python3.11/site-packages/aiosumma/cli.py", line 25, in main
    fire.Fire(client_cli, name='summa-client')
  File "/nix/store/l3xw4yisgv61a1lrg2rcxm2xjsjs9srx-python3-3.11.4-env/lib/python3.11/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/l3xw4yisgv61a1lrg2rcxm2xjsjs9srx-python3-3.11.4-env/lib/python3.11/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
                                ^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/l3xw4yisgv61a1lrg2rcxm2xjsjs9srx-python3-3.11.4-env/lib/python3.11/site-packages/fire/core.py", line 689, in _CallAndUpdateTrace
    component = loop.run_until_complete(fn(*varargs, **kwargs))
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/ng1c2jqy48p1x33j1qyg0n5anhfv31g0-python3-3.11.4/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/nix/store/l3xw4yisgv61a1lrg2rcxm2xjsjs9srx-python3-3.11.4-env/lib/python3.11/site-packages/aiogrpcclient/base.py", line 94, in exposing_wrapper
    result = await f
             ^^^^^^^
  File "/nix/store/l3xw4yisgv61a1lrg2rcxm2xjsjs9srx-python3-3.11.4-env/lib/python3.11/site-packages/aiogrpcclient/base.py", line 39, in inner
    return await method(**data)
           ^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/l3xw4yisgv61a1lrg2rcxm2xjsjs9srx-python3-3.11.4-env/lib/python3.11/site-packages/aiosumma/client.py", line 242, in create_index
    index_service_pb.CreateIndexRequest(
  File "/nix/store/l3xw4yisgv61a1lrg2rcxm2xjsjs9srx-python3-3.11.4-env/lib/python3.11/site-packages/google/protobuf/internal/python_message.py", line 548, in init
    new_val = field.message_type._concrete_class(**field_value)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/l3xw4yisgv61a1lrg2rcxm2xjsjs9srx-python3-3.11.4-env/lib/python3.11/site-packages/google/protobuf/internal/python_message.py", line 516, in init
    field = _GetFieldByName(message_descriptor, field_name)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/l3xw4yisgv61a1lrg2rcxm2xjsjs9srx-python3-3.11.4-env/lib/python3.11/site-packages/google/protobuf/internal/python_message.py", line 579, in _GetFieldByName
    raise ValueError('Protocol message %s has no "%s" field.' %
ValueError: Protocol message IndexAttributes has no "default_fields" field.
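
As a stopgap until the docs are updated, the offending key can be stripped from the schema before it reaches the client. The sketch below is only an illustration: the `index_attributes` nesting is inferred from the error message, and the helper name and sample schema are hypothetical, not part of aiosumma.

```python
def drop_unsupported_keys(schema, unsupported=("default_fields",)):
    """Return a copy of `schema` without keys the server no longer accepts.

    Assumption: the rejected keys live under `index_attributes`, as the
    "IndexAttributes has no default_fields field" error suggests.
    """
    cleaned = dict(schema)  # shallow copy so the caller's dict is untouched
    if "index_attributes" in cleaned:
        attrs = dict(cleaned["index_attributes"] or {})
        for key in unsupported:
            attrs.pop(key, None)  # ignore keys that are already absent
        cleaned["index_attributes"] = attrs
    return cleaned


# Hypothetical schema contents, for illustration only.
schema = {
    "index_name": "books",
    "index_attributes": {"default_fields": ["title"], "other_setting": "demo"},
}
print(drop_unsupported_keys(schema))
```

The original mapping is left unchanged, so the same schema file can still be used with older client versions that accept `default_fields`.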

schema in document missing index_engine

Following the documentation to create an index, I ran
summa-cli localhost:8082 - create-index-from-file schema.yaml and got the following error:

Traceback (most recent call last):
  File "/usr/local/bin/summa-cli", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/site-packages/aiosumma/cli.py", line 24, in main
    fire.Fire(client_cli, name='summa-client')
  File "/usr/local/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/usr/local/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/usr/local/lib/python3.10/site-packages/fire/core.py", line 689, in _CallAndUpdateTrace
    component = loop.run_until_complete(fn(*varargs, **kwargs))
  File "/usr/local/Cellar/python@3.10/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.10/site-packages/aiogrpcclient/base.py", line 86, in exposing_wrapper
    result = await fn(*bound.args, **bound.kwargs)
  File "/usr/local/lib/python3.10/site-packages/aiogrpcclient/base.py", line 39, in inner
    return await method(**data)
TypeError: SummaClient.create_index() missing 1 required positional argument: 'index_engine'
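
The traceback above shows that the client method requires an argument the documented schema does not supply. A mismatch like this can be spotted before the call with `inspect.signature`. In the sketch below, `create_index` is a hypothetical stand-in with an assumed parameter list, not the real aiosumma signature, and the schema contents are made up.

```python
import inspect


def missing_required_args(func, provided):
    """List required parameters of `func` that are absent from `provided`."""
    missing = []
    for name, param in inspect.signature(func).parameters.items():
        if name == "self":
            continue
        if param.kind in (param.VAR_POSITIONAL, param.VAR_KEYWORD):
            continue  # *args / **kwargs are never "missing"
        if param.default is param.empty and name not in provided:
            missing.append(name)
    return missing


# Hypothetical stand-in; the real method's parameters may differ.
async def create_index(index_name, schema, index_engine, compression=None):
    raise NotImplementedError


# Keys as they might come from a schema file that predates `index_engine`.
schema_from_docs = {"index_name": "books", "schema": "fields: []"}
print(missing_required_args(create_index, schema_from_docs))
```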

feat: Use `wasm-futures-executor` for thread pool in WASM

The code in Summa should be slightly refactored to support the wasm-futures-executor ThreadPool.

Specifically, we need a way to offload requests to a particular index into a separate Web Worker. Within a query to an index we may still use async execution, but that is also open for discussion.

Elastic search compatibility

We have an application where we would like to run search in the browser when the query cannot be passed to Elasticsearch (offline, etc.). Are there any plans to provide an API compatible with Elasticsearch? Because of its ubiquity, other search engines (like Quickwit, which is based on Tantivy) also provide one. It would be great if Summa provided such an API too.

Failing to attach an IPFS index

I followed the quick start guide and the server is running fine. But I have a bug when trying to attach an IPFS index:

summa-cli 0.0.0.0:8082 attach-index my_lib --ipfs '{"cid": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"}'

Error:

ERROR: The function received no value for the required argument: index_engine
Usage: summa-client 0.0.0.0:8082 attach-index INDEX_NAME INDEX_ENGINE <flags>
  optional flags:        --merge_policy | --request_id | --session_id | --format

For detailed information on this command, run:
  summa-client 0.0.0.0:8082 attach-index --help

I guess the CLI API has changed, so I tried:

summa-cli 0.0.0.0:8082 attach-index \
  nexus_science \
  '{"ipfs": {"chunked_cache_config": {"chunk_size": 10000 "cache_size": 10000}}}' \
  --ipfs '{"cid": "xxxxxxxxxxxxxxxxxxxx"}'

and now the error is:

File "PYTHON_ENV/lib/python3.10/site-packages/aiosumma/client.py", line 72, in attach_index
    index_service_pb.AttachIndexRequest(
TypeError: index_service_pb2.AttachIndexRequest() argument after ** must be a mapping, not str
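
Two things in the second attempt are worth checking in isolation. The JSON passed as INDEX_ENGINE is missing a comma between "chunk_size" and "cache_size", so it does not parse, and unpacking a plain string with `**` raises the same kind of TypeError as above. Whether the CLI left the argument as a string because of that missing comma is an assumption. The sketch below only demonstrates the two Python behaviours, using a hypothetical `takes_kwargs` function in place of the protobuf constructor.

```python
import json

# The argument exactly as typed in the command above (comma missing).
broken = '{"ipfs": {"chunked_cache_config": {"chunk_size": 10000 "cache_size": 10000}}}'
# The same argument with the comma restored.
fixed = '{"ipfs": {"chunked_cache_config": {"chunk_size": 10000, "cache_size": 10000}}}'


def parse_or_none(text):
    """Return the parsed JSON object, or None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


def takes_kwargs(**kwargs):
    """Hypothetical stand-in for a constructor that expects keyword arguments."""
    return kwargs


engine = parse_or_none(fixed)
print(takes_kwargs(**engine))  # a mapping unpacks without error

try:
    takes_kwargs(**broken)  # a str is not a mapping
except TypeError as exc:
    print(exc)
```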

Error when using `attach-index`

To make it work, I had to do the following:

  • install aiosumma from the master branch of this repo (not from PyPI)
  • then, in aiosumma/client.py line 77, replace **index_engine, with ipfs=ipfs

Warming up index not working

I tried to use the latest version of summa to index a large IPFS dataset (bafyb4iemblftubydyhfo6xhw56zrhudy2xexqb25f7awrahe3qfplse5g4).

I am using:

  • docker: izihawa/summa-server:0.13.0
  • python client: aiosumma==2.30.1

The indexing works and I can perform searches.

But when I run summa-cli 0.0.0.0:8082 warmup-index "xxxxx" --is-full, nothing happens, even after waiting 10-20 hours. The data folder size stays constant (about 20 MB), and the logs show little besides the following:

2023-02-27T21:59:50.535400Z  INFO tokio-runtime-workers-17 summa_core::components::index_holder: action="warming_up"
2023-02-27T22:06:10.069516Z  WARN  tokio-runtime-workers-2 rustls::conn: Sending fatal alert BadCertificate    
2023-02-27T22:07:38.764484Z  WARN tokio-runtime-workers-11 rustls::conn: Sending fatal alert BadCertificate    
2023-02-27T22:09:28.935775Z  WARN tokio-runtime-workers-18 rustls::conn: Sending fatal alert BadCertificate    
2023-02-27T22:41:42.821623Z  WARN tokio-runtime-workers-25 rustls::conn: Sending fatal alert BadCertificate    
2023-02-27T22:46:17.843424Z  WARN  tokio-runtime-workers-8 rustls::conn: Sending fatal alert BadCertificate    
2023-02-27T22:47:53.240032Z  WARN tokio-runtime-workers-20 rustls::conn: Sending fatal alert BadCertificate    
2023-02-27T22:48:27.795842Z  WARN tokio-runtime-workers-26 rustls::conn: Sending fatal alert BadCertificate    
2023-02-27T22:50:18.449929Z  WARN tokio-runtime-workers-18 rustls::conn: Sending fatal alert BadCertificate    
2023-02-27T22:53:19.389835Z  WARN tokio-runtime-workers-12 rustls::conn: Sending fatal alert BadCertificate    
2023-02-27T22:56:08.556449Z  WARN tokio-runtime-workers-15 rustls::conn: Sending fatal alert BadCertificate    
2023-02-27T22:56:28.528263Z  WARN tokio-runtime-workers-12 rustls::conn: Sending fatal alert BadCertificate    
2023-02-27T22:59:43.933639Z  WARN tokio-runtime-workers-15 rustls::conn: Sending fatal alert BadCertificate    
2023-02-27T23:02:04.264890Z  WARN tokio-runtime-workers-26 rustls::conn: Sending fatal alert BadCertificate    
2023-02-27T23:19:00.988748Z  WARN tokio-runtime-workers-10 rustls::conn: Sending fatal alert BadCertificate    
2023-02-27T23:25:39.076722Z  WARN  tokio-runtime-workers-5 rustls::conn: Sending fatal alert BadCertificate    
2023-02-27T23:30:49.920279Z  WARN tokio-runtime-workers-26 rustls::conn: Sending fatal alert BadCertificate    
2023-02-27T23:31:35.280767Z  WARN tokio-runtime-workers-23 rustls::conn: Sending fatal alert BadCertificate    
2023-02-27T23:32:08.273081Z  WARN  tokio-runtime-workers-1 rustls::conn: Sending fatal alert BadCertificate    
2023-02-27T23:35:19.405701Z  WARN tokio-runtime-workers-10 rustls::conn: Sending fatal alert BadCertificate    
2023-02-27T23:37:57.243325Z  WARN  tokio-runtime-workers-2 rustls::conn: Sending fatal alert BadCertificate    
2023-02-27T23:38:23.882930Z  WARN  tokio-runtime-workers-0 rustls::conn: Sending fatal alert BadCertificate    
2023-02-27T23:39:45.993799Z  WARN tokio-runtime-workers-26 rustls::conn: Sending fatal alert BadCertificate    
2023-02-27T23:40:36.507857Z  WARN  tokio-runtime-workers-2 rustls::conn: Sending fatal alert BadCertificate    
2023-02-27T23:44:43.510741Z  WARN  tokio-runtime-workers-5 rustls::conn: Sending fatal alert BadCertificate    
2023-02-27T23:46:14.269627Z  WARN  tokio-runtime-workers-8 rustls::conn: Sending fatal alert BadCertificate    
2023-02-27T23:55:48.495018Z  WARN tokio-runtime-workers-15 rustls::conn: Sending fatal alert BadCertificate    
2023-02-27T23:59:13.761939Z  WARN tokio-runtime-workers-11 rustls::conn: Sending fatal alert BadCertificate    
2023-02-28T00:01:01.027181Z  WARN tokio-runtime-workers-23 rustls::conn: Sending fatal alert BadCertificate    
2023-02-28T00:02:27.219973Z  WARN tokio-runtime-workers-28 rustls::conn: Sending fatal alert BadCertificate    
2023-02-28T00:07:29.976689Z  WARN  tokio-runtime-workers-8 rustls::conn: Sending fatal alert BadCertificate    
2023-02-28T00:11:00.711415Z  WARN  tokio-runtime-workers-2 rustls::conn: Sending fatal alert BadCertificate    
2023-02-28T00:11:23.335158Z  WARN tokio-runtime-workers-25 rustls::conn: Sending fatal alert BadCertificate    
2023-02-28T00:15:48.766198Z  WARN  tokio-runtime-workers-5 rustls::conn: Sending fatal alert BadCertificate    
2023-02-28T00:16:41.485513Z  WARN  tokio-runtime-workers-5 rustls::conn: Sending fatal alert BadCertificate    
2023-02-28T00:16:58.379112Z  WARN tokio-runtime-workers-14 rustls::conn: Sending fatal alert BadCertificate    
2023-02-28T00:23:24.459728Z  WARN tokio-runtime-workers-12 rustls::conn: Sending fatal alert BadCertificate    
2023-02-28T00:26:11.919729Z  WARN tokio-runtime-workers-14 rustls::conn: Sending fatal alert BadCertificate    
2023-02-28T00:32:17.039008Z  WARN tokio-runtime-workers-26 rustls::conn: Sending fatal alert BadCertificate 

Below are the ports set with docker run:

    ports:
      - 8082:8082  # GRPC API
      - 8080:8080  # Iroh Gateway HTTP
      - 4444:4444  # P2P - libp2p connection port. peers will dial your node here
      - 4445:4445  # P2P

I also tried to use network_mode: host but it didn't fix it.

add docker image supported armv7/arm64

Hi developers, could you please add a Docker image that supports armv7/arm64?

I have found that many TV boxes ship with armv7/arm64 CPUs, 2 GB of RAM, 16 GB of ROM, and a Linux system. They draw very little power, roughly 3 kWh per month, and cost only about $10.

If you would kindly consider this request, it would help advance the adoption of IPFS.

At least in my region, many people could buy a TV box, install Linux on it, and then deploy Summa via Docker with metadata of books and papers. Nearly every family here has a TV box capable of running Docker.

Thank you!
