minimalparts / pears
Archive repository for the PeARS project. Please head over to https://github.com/PeARSearch/PeARS-orchard for the latest version.
License: MIT License
In the cosine_distance function, NumPy sometimes rounds the return value of the dot function down to zero, which results in this warning:
RuntimeWarning: invalid value encountered in double_scalars
We might have to switch to a high-precision mathematics library.
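Before reaching for a high-precision library, it may be worth checking whether the warning comes from a zero denominator: NumPy emits "invalid value encountered" when a float64 scalar division evaluates 0/0, which happens when one of the vectors has zero norm. The sketch below guards against that case. The name `safe_cosine_similarity` and the `eps` threshold are assumptions for illustration, not the project's actual cosine_distance code.

```python
import numpy as np


def safe_cosine_similarity(u, v, eps=1e-12):
    """Cosine similarity that returns 0.0 instead of NaN for zero-norm input.

    Hypothetical helper: `eps` is an assumed threshold, not a project setting.
    """
    u = np.asarray(u, dtype=np.float64)
    v = np.asarray(v, dtype=np.float64)
    # Product of the two Euclidean norms; zero if either vector is all zeros.
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    if denom < eps:
        # Dividing here would be 0/0 and trigger the RuntimeWarning.
        return 0.0
    return float(np.dot(u, v) / denom)
```

Returning 0.0 for an all-zero vector is a design choice: it treats a word or document with no usable vector as unrelated to everything, rather than propagating NaN into the ranking.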
Adding this mostly as a reminder for myself.
Ok, I'm totally puzzled. I'm playing with the scoring function in scorePages.py, and the following is happening.
I typed in the following three queries, in this order:
"python string replace" -- got results
"python unicode" -- no result, redirected to duckduckgo
"video games with animals" -- got results page but with the results of the first query -- 'python string replace'
What gets me is that if I modify the scoring function (change the threshold, etc), I can stop this effect. Regardless... it seems some variable is holding search results longer than it should...
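One common way for results to outlive a query in a long-running Python process is a mutable default argument (or a module-level list) that is never reset between requests. Whether this is what happens in scorePages.py is an assumption; the function names below are hypothetical and only illustrate the pattern.

```python
def score_pages_buggy(query, results=[]):
    # The default list is created once, when the function is defined,
    # so every call that omits `results` appends to the same list.
    results.append(query)
    return results


def score_pages_fixed(query, results=None):
    # A fresh list is created on each call unless the caller passes one in.
    if results is None:
        results = []
    results.append(query)
    return results
```

If the stale results come from a variable like this, restarting the server clears them, which would match the behaviour described above.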
Right now 'scoreDocs' and 'runScript' are taking around 13 seconds and 24 seconds.
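Per-function timings like these can be collected with a small decorator. This is a minimal sketch of one way to do it, not the project's actual timing code; `timed` and `score_docs_demo` are hypothetical names.

```python
import functools
import time


def timed(func):
    """Print how long each call to `func` takes, in milliseconds."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        print("%s took %.3f ms" % (func.__name__, elapsed_ms))
        return result
    return wrapper


@timed
def score_docs_demo(n):
    # Hypothetical stand-in for an expensive scoring step.
    return sum(range(n))
```

Wrapping the inner steps as well as the outer call shows which step dominates the total.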
Thanks for this project, it looks very promising.
I'm trying to understand the purpose of the semantic space. What is it? How is it built?
Thanks!
Based mainly on the article http://aurelieherbelot.net/pears/, I suspect PeARS is quite prone to attacks such as poisoning of search results across the distributed network.
Also forgery (e.g. "let's promote my web page") seems to be a big topic in this area of distributed "value offerings". Imagine that thanks to IoT anyone can be running thousands of virtual machines visiting only certain web pages to highly influence queries of others.
Do you have any plans to cope with security and attacks?
So far the solution seems totally open and therefore highly prone to basically any attack, which in turn makes it unusable in the real world 😢. I really don't want anyone to be able to influence our society (think of e.g. politics) just by such trivial technical means.
Let's use a proper logging module for the logging.
This happened again. I searched for 'Harry Potter', then 'video games', and got the Harry Potter results for the video games (see system output below). I then restarted PeARS and got okay results.
/home/aurelie/PeARS-old/space0915/demo2pear/ 0.306965005991
/home/aurelie/PeARS-old/space0915/demo1pear/ 0.28238008532
/home/aurelie/PeARS-old/space0915/coding_pear/ 0.256275264182
runScript in scorePages took 19.637 ms
['/home/aurelie/PeARS-old/space0915/demo2pear/', 'Glad to help you with your search!', './static/pi-pic.png']
['/home/aurelie/PeARS-old/space0915/demo1pear/', 'Glad to help you with your search!', './static/pi-pic.png']
['/home/aurelie/PeARS-old/space0915/coding_pear/', 'Glad to help you with your search!', './static/pi-pic.png']
Loading URL dictionary for /home/aurelie/PeARS-old/space0915/demo2pear/
Loading word clouds...
scoreDS in scorePages took 62.301 ms
scoreURL in scorePages took 44.462 ms
scoreDocs in scorePages took 119.813 ms
Loading URL dictionary for /home/aurelie/PeARS-old/space0915/demo1pear/
Loading word clouds...
scoreDS in scorePages took 41.331 ms
scoreURL in scorePages took 15.022 ms
http://en.wikipedia.org/wiki/Angband_(video_game) 0.402860741276 0.727272727273
http://en.wikipedia.org/wiki/0_A.D._(video_game) 0.402860741276 0.727272727273
scoreDocs in scorePages took 58.678 ms
Loading URL dictionary for /home/aurelie/PeARS-old/space0915/coding_pear/
Loading word clouds...
scoreDS in scorePages took 365.648 ms
scoreURL in scorePages took 14.297 ms
scoreDocs in scorePages took 381.863 ms
https://en.wikipedia.org/wiki/Academy_Award_for_Best_Directing 0.767015190846
https://en.wikipedia.org/wiki/Academy_Award_for_Best_Director 0.766009656823
https://en.wikipedia.org/wiki/Geraldine_Somerville 0.754053608067
https://en.wikipedia.org/wiki/List_of_fictional_dogs 0.745350930536
https://en.wikipedia.org/wiki/Harry_Potter 0.712330202675
https://en.wikipedia.org/wiki/Helena_Bonham_Carter 0.665556768013
https://en.wikipedia.org/wiki/Richard_Harris 0.657491004169
http://www.bbc.co.uk/iplayer/episode/p02h1mvr/the-casual-vacancy-episode-1 0.647969058168
https://ind.ie/summit/ 0.642431113644
https://en.wikipedia.org/wiki/Film_director 0.621951494845
runScript in scorePages took 581.633 ms
127.0.0.1 - - [14/Sep/2015 13:34:09] "GET /index?q=video+games HTTP/1.1" 200 -
Now that the app has been rewritten in Flask, the current installation instructions are obsolete. We should update them (using virtualenvs, etc.).
Note to myself... The semantic space has words in both capitalised and non-capitalised forms, for instance 'potter/Potter'. So when the user types in 'harry potter', the 'wrong' vector is returned and results suffer.
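One possible workaround, sketched here under the assumption that the semantic space can be treated as a mapping from word to NumPy vector, is to look up every case variant of the query word and average the vectors that exist. The names `case_variants` and `lookup_vector` are hypothetical, and averaging is only one of several ways to merge the variants.

```python
import numpy as np


def case_variants(word):
    """Return the distinct casings of `word` that are worth trying."""
    return {word, word.lower(), word.capitalize(), word.upper()}


def lookup_vector(word, space):
    """Average the vectors of all case variants of `word` found in `space`.

    `space` is assumed to be a dict mapping words to 1-D NumPy arrays.
    Returns None when no variant is present.
    """
    found = [space[v] for v in sorted(case_variants(word)) if v in space]
    if not found:
        return None
    return np.mean(np.asarray(found, dtype=np.float64), axis=0)
```

For example, with both 'potter' and 'Potter' in the space, `lookup_vector("potter", space)` blends the two vectors instead of silently picking the lower-case one.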