ftserver's Introduction

Full Text Search Engine Server for Java

User Guide

Setup

  1. Install Java 8+

  2. Install Maven 3+

  3. Download this Project.

  4. Run

$ cd FTServer
$ mvn package cargo:run
  5. Open http://127.0.0.1:8088/

  6. Press [Ctrl-C] to stop the container

Enter a full URL to index that page, then search.

To move a page forward in the results, re-index it.

Search Format

[Word1 Word2 Word3] => text has Word1 and Word2 and Word3

["Word1 Word2 Word3"] => text has "Word1 Word2 Word3" as an exact phrase

Search [https] or [http] => returns almost all pages
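The two query forms above can be sketched as follows. This is a minimal illustration that assumes a plain, case-insensitive substring match over in-memory text; the class QueryFormatSketch and its methods are hypothetical names and are not FTServer's actual parser or index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; FTServer's real query handling may differ.
final class QueryFormatSketch {

    // Splits a query into terms: a quoted run becomes one phrase term,
    // everything else is split on whitespace into single-word terms.
    static List<String> parseTerms(String query) {
        List<String> terms = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (char c : query.toCharArray()) {
            if (c == '"') {
                flush(current, terms);
                inQuotes = !inQuotes;
            } else if (Character.isWhitespace(c) && !inQuotes) {
                flush(current, terms);
            } else {
                current.append(c);
            }
        }
        flush(current, terms);
        return terms;
    }

    // Moves the buffered characters into the term list, skipping empty terms.
    private static void flush(StringBuilder current, List<String> terms) {
        String term = current.toString().trim();
        if (!term.isEmpty()) {
            terms.add(term);
        }
        current.setLength(0);
    }

    // True when the text contains every term (AND semantics), ignoring case.
    // Substring matching is a simplification: "Word1" also matches "Word10".
    static boolean matches(String text, String query) {
        String haystack = text.toLowerCase(Locale.ROOT);
        for (String term : parseTerms(query)) {
            if (!haystack.contains(term.toLowerCase(Locale.ROOT))) {
                return false;
            }
        }
        return true;
    }
}
```

With this sketch, the text "Word3 then Word1 and Word2" matches the query Word1 Word2 Word3 but not the quoted query "Word1 Word2 Word3", because the words do not appear together in that order.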

Developer Guide

Download NetBeans

Dependencies

iBoxDB

Semantic-UI

Jsoup

The Results Order

Results are ordered by the id() number in class PageText, in descending order.

A Page has many PageTexts. If you don't need multiple texts, modify Html.getDefaultTexts(Page) to return only one PageText (the page description text only, Config.DescriptionOnly=true).

The Page.GetRandomContent() method keeps the content shown on the search page changing; it doesn't affect the real PageText order.

Use the ID number to control the order instead of loading all pages to memory.

Search Method

search (... String keywords, long startId, long count)

startId => the ID (the id assigned when the PageText was created) to start from; use startId=Long.MAX_VALUE to read from the top, in descending order

count => the number of records to read. This is the important parameter: search speed depends on it, not on how big the data is.

Next Page

Set startId to the last id in the search results minus one.

startId = search( "keywords", startId, count);
nextpage_startId = startId - 1 // this 'minus one' is already done inside search()
...
//read next page
search("keywords", nextpage_startId, count)
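The paging rule above can be sketched with an in-memory sorted map. This is only an illustration under stated assumptions: PagingSketch, its TreeMap storage, and the -1 "no more pages" return value are hypothetical and are not FTServer's real index or search signature. It assumes ids are non-negative and that search() itself returns the last id read minus one.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for the index; not FTServer's real storage.
final class PagingSketch {

    // id -> text, kept sorted by id so a page can be read from any start id.
    private final TreeMap<Long, String> texts = new TreeMap<>();

    void add(long id, String text) {
        texts.put(id, text);
    }

    // Appends up to 'count' matching ids at or below startId to 'out', highest id first.
    // Returns the startId for the next page (last id read minus one),
    // or -1 when the lowest id was reached before 'count' matches were found.
    long search(List<Long> out, String keyword, long startId, long count) {
        if (startId < 0 || count <= 0) {
            return -1;
        }
        long remaining = count;
        for (Map.Entry<Long, String> e : texts.headMap(startId, true).descendingMap().entrySet()) {
            if (e.getValue().contains(keyword)) {
                out.add(e.getKey());
                remaining--;
                if (remaining == 0) {
                    return e.getKey() - 1; // the "minus one" happens here
                }
            }
        }
        return -1;
    }
}
```

A caller starts with startId = Long.MAX_VALUE and feeds each returned value into the next call until it gets -1. Unlike the real engine, this sketch scans past non-matching entries, so it says nothing about performance.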

In most cases, nextpage_startId is posted from the client browser when the user reaches the end of the web page, with the default nextpage_startId=Long.MAX_VALUE. In JavaScript, a number this large has to be written as a String ("'" + nextpage_startId + "'").
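The reason for sending the id as a string is that JavaScript numbers are IEEE 754 doubles, which cannot represent every 64-bit integer exactly. The sketch below mimics that on the Java side by casting through double; StartIdTransportSketch and its method names are hypothetical and exist only for this illustration.

```java
// Hypothetical illustration of why a large startId should travel as text.
final class StartIdTransportSketch {

    // Mimics passing the id through a JavaScript number (a 64-bit double).
    static long viaDouble(long id) {
        return (long) (double) id;
    }

    // Mimics passing the id as a string and parsing it back on the server.
    static long viaString(long id) {
        return Long.parseLong(Long.toString(id));
    }
}
```

For example, 9007199254740993 (2^53 + 1) comes back from viaDouble as 9007199254740992, while viaString returns it unchanged.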

Private Server

Open

public Page Html.get(String url);

Set your private website's text

Page page = new Page();
page.url = url;
page.title = title;
page.text = replace(doc.body().text());
page... = ...
return page;
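As a rough sketch of the same idea without a web fetch, the page can be built from your own content store. Everything here is an assumption for illustration: PageSketch mirrors only the url, title and text fields shown above, PrivateSiteSketch and its map are hypothetical, and the whitespace cleanup merely stands in for whatever replace(...) does in the project.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in that mirrors only the fields shown above.
final class PageSketch {
    String url;
    String title;
    String text;
}

// Hypothetical private content source; in practice this could be a database or files.
final class PrivateSiteSketch {

    // url -> {title, body}
    private final Map<String, String[]> documents = new HashMap<>();

    void put(String url, String title, String body) {
        documents.put(url, new String[] {title, body});
    }

    // Returns a page built from private content, or null when the url is unknown.
    PageSketch get(String url) {
        String[] doc = documents.get(url);
        if (doc == null) {
            return null;
        }
        PageSketch page = new PageSketch();
        page.url = url;
        page.title = doc[0];
        // Collapse runs of whitespace so the indexed text is one clean line.
        page.text = doc[1].replaceAll("\\s+", " ").trim();
        return page;
    }
}
```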

Configure Cache

Set the JVM memory in FTServer/.mvn/jvm.config; the default is 4GB.

Set the index read-only cache (Readonly_MaxDBCount) in FTServer/src/main/java/ftserver/Config.java.

Stop Tracker daemon

Why does Tracker consume resources on my PC?

[user@localhost ~]$ tracker daemon -k

[user@localhost ~]$ rm -rf .cache/tracker/

Increase the Maximum Number of Open Files

[user@localhost ~]$ cat /proc/sys/fs/file-max
803882
[user@localhost ~]$ ulimit -a | grep files
open files                      (-n) 500000
[user@localhost ~]$ ulimit -Hn
500000
[user@localhost ~]$ ulimit -Sn
500000
[user@localhost ~]$ 


$ vi /etc/security/limits.conf
*         hard    nofile      500000
*         soft    nofile      500000
root      hard    nofile      500000
root      soft    nofile      500000

Lower the File Readahead (RA)

[user@localhost ~]$ sudo blockdev --report
# If readahead (RA) is larger than the hardware can keep up with, set it lower.
# The right value depends on the hardware.
[user@localhost ~]$ sudo blockdev --setra 128 /dev/sda
[user@localhost ~]$ sudo blockdev --setra 128 /dev/dm-0
[user@localhost ~]$ sudo blockdev --setra 128 /dev/dm-1
[user@localhost ~]$ lsblk -o NAME,RA

[user@localhost ~]$ free -m
[user@localhost ~]$ sudo sysctl vm.drop_caches=3

Add a Firewall Port for Remote Access

[user@localhost ~]$ firewall-cmd --add-port=8088/tcp --permanent

Set Java Version

# Java 11
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk

# Java 18
export JAVA_HOME=/home/user/Downloads/jdk-18.0.1.1

# Java 21
export JAVA_HOME=/usr/lib/jvm/java-21-openjdk-21.0.2.0.13-1.el9.x86_64

$ alternatives --config java

More

C# ASP.NET Core Version

FTServer for Android with APK

