Comments (12)
This seems to be a regression that came with the new HTML popup. @fstachura, do you think you could have a look?
from elixir.
@mwalle Sorry to hear that. As a temporary workaround, middle-click or Ctrl-click will open the identifier in a new tab, bypassing the popup. The results seem to be correct there. Alternatively, disabling JavaScript for elixir.bootlin.com with a browser extension should revert to the old behavior.
@tpetazzoni
Sure. It seems that the API incorrectly returns empty results for this identifier:
https://elixir.bootlin.com/api/ident/u-boot/srand?version=latest&family=C
The same URL without the family parameter returns correct results (note: try it with curl; for some reason the link still returns empty results when opened in a browser):
https://elixir.bootlin.com/api/ident/u-boot/srand?version=latest
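The comparison between the two URLs can be scripted. This is a minimal sketch that assumes only the endpoint shape visible in the two links and the three JSON keys seen in the API responses in this thread; the helper names `ident_url` and `is_empty` are made up for illustration and are not part of Elixir.

```python
from urllib.parse import urlencode

BASE = "https://elixir.bootlin.com/api/ident"

def ident_url(project, ident, version="latest", family=None):
    """Build an ident API URL; the family parameter is omitted when None."""
    params = {"version": version}
    if family is not None:
        params["family"] = family
    return f"{BASE}/{project}/{ident}?{urlencode(params)}"

def is_empty(payload):
    """True when every result list in a decoded JSON response is empty."""
    keys = ("definitions", "references", "documentations")
    return not any(payload.get(key) for key in keys)

with_family = ident_url("u-boot", "srand", family="C")
without_family = ident_url("u-boot", "srand")
```

Fetching each URL (for example with `urllib.request.urlopen` and `json.load`) and passing the decoded body to `is_empty` would show which variant comes back empty.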
from elixir.
> @mwalle Sorry to hear that. As a temporary workaround, middle-click or Ctrl-click will open the identifier in a new tab, bypassing the popup. The results seem to be correct there. Alternatively, disabling JavaScript for elixir.bootlin.com with a browser extension should revert to the old behavior.
Thanks for the productivity boost ;)
from elixir.
Thanks a lot @fstachura for the initial investigation! Indeed, looks like a server-side issue then.
from elixir.
@tpetazzoni
I downloaded and indexed u-boot sources on my local machine, and the results are correct there, even with the family parameter. I have a suspicion that the cache is somehow involved.
from elixir.
I noticed another strange thing. I added a URL parameter to force Varnish to return uncached content. On some requests that bypass cache, the backend returns correct data.
% for n in `seq 4000 4100`; do echo https://elixir.bootlin.com/api/ident/u-boot/srand\?version\=latest\&family\=C\&test\=$n; curl https://elixir.bootlin.com/api/ident/u-boot/srand\?version\=latest\&family\=C\&test\=$n; echo; done
https://elixir.bootlin.com/api/ident/u-boot/srand?version=latest&family=C&test=4000
{"definitions": [], "references": [], "documentations": []}
https://elixir.bootlin.com/api/ident/u-boot/srand?version=latest&family=C&test=4001
{"definitions": [], "references": [], "documentations": []}
https://elixir.bootlin.com/api/ident/u-boot/srand?version=latest&family=C&test=4002
{"definitions": [], "references": [], "documentations": []}
https://elixir.bootlin.com/api/ident/u-boot/srand?version=latest&family=C&test=4003
{"definitions": [], "references": [], "documentations": []}
https://elixir.bootlin.com/api/ident/u-boot/srand?version=latest&family=C&test=4004
{"definitions": [], "references": [], "documentations": []}
https://elixir.bootlin.com/api/ident/u-boot/srand?version=latest&family=C&test=4005
{"definitions": [], "references": [], "documentations": []}
https://elixir.bootlin.com/api/ident/u-boot/srand?version=latest&family=C&test=4006
{"definitions": [], "references": [], "documentations": []}
https://elixir.bootlin.com/api/ident/u-boot/srand?version=latest&family=C&test=4007
{"definitions": [], "references": [], "documentations": []}
https://elixir.bootlin.com/api/ident/u-boot/srand?version=latest&family=C&test=4008
{"definitions": [], "references": [], "documentations": []}
https://elixir.bootlin.com/api/ident/u-boot/srand?version=latest&family=C&test=4009
{"definitions": [{"path": "include/rand.h", "line": 20, "type": "prototype"}, {"path": "drivers/crypto/ace_sha.c", "line": 122, "type": "function"}, {"path": "drivers/rng/npcm_rng.c", "line": 72, "type": "function"}, {"path": "lib/rand.c", "line": 29, "type": "function"}], "references": [{"path": "cmd/mem.c", "line": "1297", "type": null}, {"path": "drivers/crypto/ace_sha.c", "line": "152,181", "type": null}, {"path": "drivers/ram/stm32mp1/stm32mp1_tests.c", "line": "616,619", "type": null}, {"path": "drivers/rng/sandbox_rng.c", "line": "24", "type": null}, {"path": "lib/uuid.c", "line": "477,479", "type": null}, {"path": "net/dhcpv6.c", "line": "618", "type": null}, {"path": "net/net_rand.h", "line": "55,57", "type": null}, {"path": "scripts/kconfig/conf.c", "line": "537", "type": null}, {"path": "test/dm/mux-cmd.c", "line": "129,163", "type": null}, {"path": "tools/gen_eth_addr.c", "line": "17", "type": null}], "documentations": [{"path": "include/rand.h", "line": "12", "type": null}]}
https://elixir.bootlin.com/api/ident/u-boot/srand?version=latest&family=C&test=4010
{"definitions": [], "references": [], "documentations": []}
https://elixir.bootlin.com/api/ident/u-boot/srand?version=latest&family=C&test=4011
{"definitions": [{"path": "include/rand.h", "line": 20, "type": "prototype"}, {"path": "drivers/crypto/ace_sha.c", "line": 122, "type": "function"}, {"path": "drivers/rng/npcm_rng.c", "line": 72, "type": "function"}, {"path": "lib/rand.c", "line": 29, "type": "function"}], "references": [{"path": "cmd/mem.c", "line": "1297", "type": null}, {"path": "drivers/crypto/ace_sha.c", "line": "152,181", "type": null}, {"path": "drivers/ram/stm32mp1/stm32mp1_tests.c", "line": "616,619", "type": null}, {"path": "drivers/rng/sandbox_rng.c", "line": "24", "type": null}, {"path": "lib/uuid.c", "line": "477,479", "type": null}, {"path": "net/dhcpv6.c", "line": "618", "type": null}, {"path": "net/net_rand.h", "line": "55,57", "type": null}, {"path": "scripts/kconfig/conf.c", "line": "537", "type": null}, {"path": "test/dm/mux-cmd.c", "line": "129,163", "type": null}, {"path": "tools/gen_eth_addr.c", "line": "17", "type": null}], "documentations": [{"path": "include/rand.h", "line": "12", "type": null}]}
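The same probe can be written in Python to tally how many cache-bypassing requests come back empty. This is a sketch, not a tool from the Elixir repository: the function names `fetch` and `tally` are hypothetical, and the fetcher is passed in as a parameter so the counting logic can be exercised without touching the network.

```python
import json
from urllib.request import urlopen

URL = ("https://elixir.bootlin.com/api/ident/u-boot/srand"
       "?version=latest&family=C&test={n}")

def fetch(url):
    """Default fetcher: perform a real HTTP GET and decode the JSON body."""
    with urlopen(url) as resp:
        return json.load(resp)

def tally(numbers, fetch=fetch):
    """Return (empty, non_empty): the `test` values grouped by outcome."""
    empty, non_empty = [], []
    keys = ("definitions", "references", "documentations")
    for n in numbers:
        payload = fetch(URL.format(n=n))
        has_data = any(payload[key] for key in keys)
        (non_empty if has_data else empty).append(n)
    return empty, non_empty
```

Calling `tally(range(4000, 4101))` would repeat the shell loop above; the proportion of non-empty answers gives a rough idea of how many backend processes hold the right database.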
from elixir.
Ok, I think I know what the issue is, or at least why separate requests sometimes return different results for the same URL. I have managed to reproduce something similar on my local machine. Cache is not at fault (obviously).
api.py is a WSGI script. mod_wsgi starts multiple processes, each running the same WSGI script.
https://github.com/bootlin/elixir/blob/master/api/api.py#L40
build_query imports query (local import, executed on each call)
https://github.com/bootlin/elixir/blob/master/query.py#L30
query.db is a module-global variable that is initialized on first import. From what I understand, it is an interface to project data.
https://github.com/bootlin/elixir/blob/master/lib.py#L187
lib.getDataDir() simply returns $LXR_DATA_DIR which is set to the data directory of a project on each build_query call. This value is used to initialize query.db.
I think the assumption made in build_query was that the local import statement re-imports the module on each call, recreating all of that module's global variables. That does not seem to be true: imports are cached globally.
https://docs.python.org/3/reference/import.html#the-module-cache
So there is a separate query.db for each process, and it is initialized on the first build_query call in that process.
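The caching behavior is easy to reproduce outside Elixir. The sketch below writes a throwaway module named `fake_query` (a stand-in, not the real query.py) whose global is computed from `$LXR_DATA_DIR` at import time, then imports it locally from a function twice. The data directory paths are placeholders.

```python
import os
import sys
import tempfile

# Create a throwaway module whose global is computed once, at import time.
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "fake_query.py"), "w") as f:
    f.write('import os\ndb = os.environ["LXR_DATA_DIR"]\n')
sys.path.insert(0, tmpdir)

def build_query(data_dir):
    os.environ["LXR_DATA_DIR"] = data_dir
    import fake_query  # local import: served from sys.modules after the first call
    return fake_query.db

first = build_query("/data/linux")
second = build_query("/data/musl")
print(first, second)  # prints "/data/linux /data/linux"
```

The second call changes the environment variable, but the module body never runs again, so `db` keeps the value from the first call.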
On a fresh Apache start, if the first request is for something from the Linux codebase and the second is for something from the musl codebase, it's likely that each request will go to a different process. In each process, query.db will then be set to a different database. Subsequent requests might unexpectedly return empty data (which then gets cached, and so on), because some requests will go to the process that has the Linux db and some will go to the process that has the musl db.
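A toy simulation of this scenario, under loudly stated assumptions: two worker objects stand in for two WSGI processes, requests are dispatched round-robin (the real mod_wsgi dispatch policy may differ), and the `DATA` contents are invented placeholders rather than real index data.

```python
from itertools import cycle

# Invented placeholder data: one identifier per project.
DATA = {
    "linux": {"linux_only_ident": ["a.c"]},
    "musl": {"musl_only_ident": ["b.c"]},
}

class Worker:
    """Stands in for one WSGI process with its own module-global db."""

    def __init__(self):
        self.db = None  # like query.db before the first import

    def handle(self, project, ident):
        if self.db is None:  # the first request pins the database for good
            self.db = DATA[project]
        return self.db.get(ident, [])

workers = [Worker(), Worker()]
dispatch = cycle(workers)  # assumption: simple round-robin dispatch

requests = [
    ("linux", "linux_only_ident"),  # worker 0 pins the linux db
    ("musl", "musl_only_ident"),    # worker 1 pins the musl db
    ("musl", "musl_only_ident"),    # lands on worker 0: wrong db
    ("linux", "linux_only_ident"),  # lands on worker 1: wrong db
]
results = [next(dispatch).handle(project, ident) for project, ident in requests]
```

The first two requests succeed and the last two return empty lists, even though all four ask for identifiers that exist in their own project.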
This was never a problem in the original cgi script, because it's re-executed from scratch on each HTTP request.
I think that there is still something missing in my reasoning. The bug would imply that the number of projects working correctly is limited by the number of WSGI processes started by Apache. I guess that most people use Elixir for browsing Linux sources, which would mean that on most processes query.db points to the Linux project database. Meanwhile, all projects from elixir.bootlin.com that I tried to browse seemed to be working fine. But I wasn't looking very closely. Maybe the cache or the algorithm that picks the process that will handle a request somehow helped to maintain the illusion that everything is OK.
I see two possible fixes:
- Quick and dirty - call importlib.reload in build_query https://docs.python.org/3/library/importlib.html#importlib.reload
- Refactor query.py so that it can hold references to multiple databases and load them dynamically.
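The first option can be illustrated with a stand-in module. This is only a sketch of the `importlib.reload` technique, not the content of the actual pull request: `toy_query` is a throwaway module created for the demonstration, and the data directory paths are placeholders.

```python
import importlib
import os
import sys
import tempfile

# Create a throwaway module whose global is computed at import time.
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "toy_query.py"), "w") as f:
    f.write('import os\ndb = os.environ["LXR_DATA_DIR"]\n')
sys.path.insert(0, tmpdir)

def build_query(data_dir):
    os.environ["LXR_DATA_DIR"] = data_dir
    import toy_query
    importlib.reload(toy_query)  # re-executes the module body, rebinding db
    return toy_query.db
```

With the reload in place, `build_query("/data/linux")` followed by `build_query("/data/musl")` returns a different value each time. The cost is that the whole module body runs again on every call, and the module's globals are replaced under any other code that is using them at that moment.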
I'm gonna go ahead and open a PR with the first fix, since this probably impacts a sizeable chunk of the userbase and therefore is more or less urgent.
from elixir.
Thanks a lot @fstachura for the investigation and quick fix!
I am personally not familiar with the Elixir code base, but having to force a module re-import does not seem great to me. And it seems like you agree: this is the quick and dirty fix, and we need a longer-term fix that reworks the code so that query.py is better structured to handle multiple projects.
I'll let @michaelopdenacker handle your pull request with the quick fix.
from elixir.
The quick fix is not enough. mod_wsgi uses both multithreading and multiprocessing, and this solution does not seem to be thread-safe (I'm actually not sure whether the DB is thread-safe at all).
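One possible shape for a longer-term fix is a per-project registry guarded by a lock, instead of a single module-global. This is a sketch under assumptions and not the fix that was deployed: `ProjectRegistry` and its `opener` callable are hypothetical names. The lock only protects the registry itself; whether an individual database handle can safely be shared between threads remains the open question raised above.

```python
import threading

class ProjectRegistry:
    """Cache one handle per data directory instead of one module-global."""

    def __init__(self, opener):
        self._opener = opener  # hypothetical: callable(data_dir) -> handle
        self._handles = {}
        self._lock = threading.Lock()

    def get(self, data_dir):
        with self._lock:  # serialize lookup and creation of handles
            if data_dir not in self._handles:
                self._handles[data_dir] = self._opener(data_dir)
            return self._handles[data_dir]
```

Each request would then call `registry.get(data_dir)` with its own project's directory, so no request depends on which project happened to be served first in that process.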
from elixir.
@mwalle A fix for this was just deployed on elixir.bootlin.com. Could you try to reproduce the bug now?
from elixir.
@fstachura thanks, seems to work now!
from elixir.
Thanks for confirming. Closing this bug.
from elixir.