src-d / engine-deprecated
[DISCONTINUED] Go to https://github.com/src-d/sourced-ce/
Home Page: https://docs.sourced.tech/engine
License: Apache License 2.0
Right now the tool handles errors in several different ways and prints output to the user in several different ways, so we should unify both to make them consistent for the user and for the developer.
A few of them don't work anymore, like the ones using ARRAY_LENGTH, which fail with:
rpc error: code = Unknown desc = SQL query failed: Error 1105: unknown error: invalid type: BLOB
Also, the queries should end with ; to make it easier to copy and paste them into the engine sql console.
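A minimal sketch of how the example queries could be normalized so that each one ends with a single `;`. The function name `ensure_terminated` is hypothetical and is not part of the engine; the sketch only illustrates the rule requested above.

```python
def ensure_terminated(query: str) -> str:
    """Return the query with exactly one trailing semicolon.

    Hypothetical helper: it strips trailing whitespace and any existing
    semicolons, then appends a single ";" so the result can be pasted
    into an interactive SQL console as-is.
    """
    body = query.rstrip("; \t\r\n")
    if not body:
        # Nothing left to terminate; return an empty string unchanged.
        return ""
    return body + ";"


if __name__ == "__main__":
    print(ensure_terminated("SELECT COUNT(*) FROM repositories"))
```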
@ajnavarro says: Babelfish image is too old.
I think the same, because the SupportedLanguages() API was added ages ago.
According to our guidelines, we should have a MAINTAINERS file with, well, a list of maintainers.
From the docs:
Parses a file and returns the resulting native AST. This command installs any missing drivers.
The name has been changed, so please update the import paths from engine to engine-cli.
When clicking "Please refer to our contribution guide" in the Contribution section, the link is broken.
Link: https://github.com/src-d/engine/blob/master/CONTRIBUTING.md
When running srcd web sql for the "first" time (Docker clean, fresh repo) I get a fatal error:
FATA[0194] could not start gitbase web client at port 8080: rpc error: code = Unknown desc = could not create srcd-cli-bblfshd: rpc error: code = DeadlineExceeded desc = context deadline exceeded
Running it a second time it works fine, though.
Maybe this behavior is known, but the error message was not very obvious if that's the case, or something can be fixed. Not everyone gives a second chance. :)
engine/build on master > ./srcd web sql
INFO[0000] installing "srcd/cli-daemon:latest"
INFO[0013] installed "srcd/cli-daemon:latest"
INFO[0015] couldn't find network srcd-cli-network: Error: No such network: srcd-cli-network
INFO[0015] creating it now
INFO[0019] this is taking a while, if this is the first time you launch this web client, it might take a few more minutes while we install all the required images
FATA[0194] could not start gitbase web client at port 8080: rpc error: code = Unknown desc = could not create srcd-cli-bblfshd: rpc error: code = DeadlineExceeded desc = context deadline exceeded
engine/build on master took 3m 15s > ./srcd web sql
INFO[0003] this is taking a while, if this is the first time you launch this web client, it might take a few more minutes while we install all the required images
Go to http://localhost:8080 for the gitbase web client. Press Ctrl-C to stop it.
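The transcript above shows the first attempt hitting a deadline and the second attempt succeeding. One possible mitigation, sketched here under assumptions, is to retry the start operation when it times out. The `retry` helper is hypothetical, and Python's built-in `TimeoutError` stands in for the gRPC DeadlineExceeded status; this is not how the engine itself handles the error.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(operation: Callable[[], T], attempts: int = 3,
          delay_seconds: float = 0.0) -> T:
    """Call operation until it succeeds or attempts are used up.

    Only TimeoutError is retried; any other exception propagates
    immediately. The last TimeoutError is re-raised on final failure.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TimeoutError as err:
            last_error = err
            if attempt < attempts:
                # Wait before trying again, e.g. while images download.
                time.sleep(delay_seconds)
    raise last_error
```

A longer deadline for the first run, when images still have to be pulled, would be an alternative to retrying.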
Command help shows a flag for an undocumented config file.
--config string config file (default is $HOME/.srcd.yaml)
I may want to stop working on this right now, but I don't want to wait for the download process next time I continue. Maybe a srcd stop?
From the docs:
All of the subcommands of srcd parse drivers provide management for the language drivers installed on bblfsh.
but it's marked as TBD.
We need to test it and update docs if everything works.
When you try to parse anything with bblfsh the container hangs and cannot be used or deleted
srcd web sql opens http://localhost:8080/ but nothing is being served there.
The first time I run srcd web sql it will install the recommended drivers for bblfsh. I'd like the log to reflect which drivers are being downloaded and installed (as we do on srcd init with the cli-daemon).
Current:
srcd web sql --port=8088
INFO[0003] this is taking a while, if this is the first time you launch this web client, it might take a few more minutes while we install all the required images
Desired:
srcd web sql --port=8088
INFO[0003] this is taking a while, if this is the first time you launch this web client, it might take a few more minutes while we install all the required images
INFO[0004] installing "bblfsh/python-driver:latest"
INFO[0005] installed "bblfsh/python-driver:latest"
etc...
We still pull pilosa image when you run gitbase from engine.
We don't need it anymore.
With the current bblfsh v2.90 a bblfshctl driver install fails for already installed drivers (context here).
This means we can get errors like this:
$ srcd parse uast ~/repos/gitbase-web/cmd/gitbase-web/main.go
FATA[0000] could not stream: rpc error: code = AlreadyExists desc = driver already installed: go (image reference: docker://bblfsh/go-driver:latest): %s
We should add the new --force flag to bblfshctl (which actually means ignore, not force), or use the new latest-drivers images.
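The "ignore, not force" behavior described above can be sketched as follows: an already installed driver is treated as a skip rather than a fatal error. Everything here is hypothetical, including the `DriverAlreadyInstalled` exception (a stand-in for the gRPC AlreadyExists status) and the injected `install` callable; it does not reflect the actual bblfshctl implementation.

```python
from typing import Callable, Iterable, List, Tuple


class DriverAlreadyInstalled(Exception):
    """Hypothetical stand-in for the gRPC AlreadyExists status."""


def install_drivers(languages: Iterable[str],
                    install: Callable[[str], None]) -> Tuple[List[str], List[str]]:
    """Install each driver, skipping the ones that are already present.

    Returns (installed, skipped). Any other error still propagates,
    so real failures are not hidden.
    """
    installed: List[str] = []
    skipped: List[str] = []
    for language in languages:
        try:
            install(language)
        except DriverAlreadyInstalled:
            skipped.append(language)
        else:
            installed.append(language)
    return installed, skipped
```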
I run srcd web sql, do some stuff and then Ctrl-C it. Then docker ps shows:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
e5b4e1f45aa9 srcd/gitbase "/tini -- /bin/sh -c…" 2 hours ago Up 2 hours 0.0.0.0:3306->3306/tcp srcd-cli-gitbase
1a8417b883f1 pilosa/pilosa:v0.9.0 "/pilosa server --da…" 2 hours ago Up 2 hours 10101/tcp srcd-cli-pilosa
25b70c99ed8c srcd/cli-daemon "/srcd-server --work…" 2 hours ago Up 2 hours 0.0.0.0:4242->4242/tcp srcd-cli-daemon
The last one is OK, and the Pilosa one is removed in the most recent gitbase, so it can be ignored. However, gitbase is left running when it should not be.
$ srcd sql
gitbase> SELECT cf.file_path, f.blob_content
-> FROM ref_commits r
-> NATURAL JOIN commit_files cf
-> NATURAL JOIN files f
-> WHERE r.ref_name = 'HEAD'
-> AND r.history_index = 0
-> ;
rpc error: code = ResourceExhausted desc = grpc: received message larger than max (15214220 vs. 4194304)
Engine master, gitbase v0.17.1.
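For reference, the limit in the error above, 4194304 bytes, is 4 MiB, and the rejected message of 15214220 bytes is roughly 14.5 MiB. The sketch below only does that arithmetic and builds the channel argument list that the Python gRPC library accepts for raising the limit; the engine itself is written in Go, where the corresponding client option is `grpc.MaxCallRecvMsgSize`. The helper names are hypothetical, and whether raising the limit or streaming rows in smaller messages is the better fix is left open.

```python
# 4 MiB, the receive limit reported in the error message.
DEFAULT_MAX_MESSAGE_BYTES = 4 * 1024 * 1024


def exceeds_limit(message_bytes: int,
                  limit: int = DEFAULT_MAX_MESSAGE_BYTES) -> bool:
    """Return True when a message would be rejected at the given limit."""
    return message_bytes > limit


def channel_options(max_message_bytes: int):
    """Build gRPC channel arguments that raise both message size limits.

    The keys are standard gRPC channel argument names; the list would be
    passed as the `options` parameter when creating a channel.
    """
    return [
        ("grpc.max_receive_message_length", max_message_bytes),
        ("grpc.max_send_message_length", max_message_bytes),
    ]


if __name__ == "__main__":
    print(exceeds_limit(15214220))
    print(channel_options(16 * 1024 * 1024))
```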
As requested by @eiso.
This comes with a series of issues as exposing the ports to the host could create conflicts.
I think the best way to solve it is to provide some expose mechanism, either as srcd sql expose -p 3306 or srcd expose sql -p 3306.
In any case, I'd argue this could be a candidate for a second release and not be a blocking issue.
I am running the bblfshd container on port 9432, so srcd web sql fails with
FATA[0000] could not start gitbase web client at port 8080: rpc error: code = Unknown desc = could not create srcd-cli-bblfshd: could not start container: srcd-cli-bblfshd: Error response from daemon: driver failed programming external connectivity on endpoint srcd-cli-bblfshd (ea6cb5c1288e1708be363cce4662528a2045d269c2d4501d6c1350d19442f77e): Bind for 0.0.0.0:9432 failed: port is already allocated
because port 9432 is hardcoded. It would be nice to add a --bblfshd-port CLI argument so that I don't have to stop the already running container every time.
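A minimal sketch of what the requested flag could look like, written with Python's argparse purely for illustration (the real CLI is a Go program). The `--bblfshd-port` flag is the proposal from this issue and does not exist yet; `--port` and the defaults 8080 and 9432 are taken from the logs above. The `port_binding` helper is hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser for a hypothetical `srcd web sql` with a configurable bblfshd port."""
    parser = argparse.ArgumentParser(prog="srcd web sql")
    parser.add_argument("--port", type=int, default=8080,
                        help="port for the gitbase web client")
    # Proposed flag: lets the user avoid the hardcoded 9432 host port.
    parser.add_argument("--bblfshd-port", type=int, default=9432,
                        help="host port to publish bblfshd on (proposed)")
    return parser


def port_binding(host_port: int, container_port: int = 9432) -> str:
    """Format a host-to-container mapping in the style shown by docker ps."""
    return f"0.0.0.0:{host_port}->{container_port}/tcp"


if __name__ == "__main__":
    args = build_parser().parse_args(["--bblfshd-port", "9433"])
    print(port_binding(args.bblfshd_port))
```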
We do break backward compatibility from time to time.
Gitbase should have the correct version of bblfshd, gitbase-web should have the correct version of gitbase, and so on.
Cons: We will have to manually update versions from time to time
Pros: Engine wouldn't get broken when all latests aren't compatible with each other and avoid problems like #58 #67 #65
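The pinning idea can be sketched as a small table that maps each image to a fixed tag and refuses latest. Only the gitbase tag below appears in these issues; the other tags are placeholders, not recommended versions, and the names `PINNED_TAGS` and `image_reference` are hypothetical.

```python
# Hypothetical pin table. "v0.17.1" is the gitbase version mentioned in
# these issues; PLACEHOLDER_TAG marks values that would need to be chosen.
PINNED_TAGS = {
    "srcd/gitbase": "v0.17.1",
    "bblfsh/bblfshd": "PLACEHOLDER_TAG",
    "srcd/gitbase-web": "PLACEHOLDER_TAG",
}


def image_reference(image: str, pins: dict = PINNED_TAGS) -> str:
    """Return "image:tag" for a pinned image, rejecting floating tags."""
    tag = pins.get(image)
    if tag is None:
        raise KeyError(f"no pinned tag for {image}")
    if tag == "latest":
        raise ValueError(f"{image} must be pinned to a fixed tag, not latest")
    return f"{image}:{tag}"
```

The cost listed above still applies: someone has to bump the table whenever a compatible set of versions changes.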
I'm trying to use the interface of gitbase, after installing the engine according to "quickstart" on documentation.
After getting an error, I tried cleaning up and restarting it. I followed the commands below and got the error on the image that's attached.
$ srcd kill
INFO[0000] removing containers...
INFO[0000] removing container srcd-cli-bblfshd
INFO[0000] removing container srcd-cli-daemon
INFO[0001] removing volumes...
INFO[0001] removing volume srcd-cli-bblfsh-storage
INFO[0002] removing images...
INFO[0002] removing image srcd/cli-daemon:latest
INFO[0002] removing image bblfsh/bblfshd:latest
$ srcd init
INFO[0000] starting daemon with working directory: /Users/fernandagomes
INFO[0000] installing "srcd/cli-daemon:latest"
INFO[0004] installed "srcd/cli-daemon:latest"
$ srcd web sql
INFO[0003] this is taking a while, if this is the first time you launch this web client, it might take a few more minutes while we install all the required images
Go to http://localhost:8080 for the gitbase web client. Press Ctrl-C to stop it.
I run srcd web sql and get:
Also:
The images are missing. When I hit Ctrl-R, I get straight to the "Gitbase Web" interface. No more "sign in" buttons.
Tagging @bzz
As Go 1.11 and the corresponding Docker images have been released, the vendor folder is not needed anymore, since one can use Go modules both on the host system and in the Docker image.
Gitbase indexes should be persisted in a volume somewhere.
We could have a shared folder, like ~/.srcd, where we can store drivers and indexes, so they can be mounted in the containers.
Ideally, ~/.srcd/gitbase/WORKDIR/PATH, so every time you start gitbase with a different path the indexes are not mixed up.
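A minimal sketch of the proposed layout, assuming POSIX-style absolute paths: the working directory is nested under ~/.srcd/gitbase so that each workdir gets its own index folder. The function name `gitbase_index_dir` is hypothetical, and the exact layout is only what this issue suggests, not an implemented behavior.

```python
from pathlib import PurePosixPath


def gitbase_index_dir(home: str, workdir: str) -> PurePosixPath:
    """Map a working directory to its own index folder under ~/.srcd/gitbase.

    Example layout: home "/home/me" and workdir "/data/repos" give
    /home/me/.srcd/gitbase/data/repos.
    """
    work = PurePosixPath(workdir)
    if not work.is_absolute():
        raise ValueError("workdir must be an absolute path")
    # Drop the leading "/" so the workdir nests below the base folder.
    return PurePosixPath(home) / ".srcd" / "gitbase" / work.relative_to("/")


if __name__ == "__main__":
    print(gitbase_index_dir("/home/me", "/data/repos"))
```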
Should we also start a pilosa server? We have pilosalib now, which doesn't require it, but don't know how stable that is or if it's on master. Gotta check it
Driver installation should have some persistence as a mounted volume somewhere
Since the source{d} engine is the packaging of our projects as a product, it should have a logo.
I run srcd kill and this is what it does:
INFO[0000] removing containers...
INFO[0000] removing container srcd-cli-gitbase
INFO[0000] removing container srcd-cli-pilosa
INFO[0001] removing container srcd-cli-bblfshd
INFO[0002] removing container srcd-cli-daemon
INFO[0003] removing volumes...
INFO[0003] removing volume srcd-cli-bblfsh-storage
INFO[0006] removing images...
INFO[0006] removing image srcd/gitbase:latest
INFO[0006] removing image bblfsh/bblfshd:latest
INFO[0007] removing image srcd/style-analyzer:latest
INFO[0007] removing image srcd/cli-daemon:latest
INFO[0007] removing image srcd/gitbase-web:latest
INFO[0007] removing image srcd/spark:2.2.1
INFO[0007] removing image srcd/codelab:latest
INFO[0009] removing image pilosa/pilosa:v0.9.0
INFO[0009] removing image srcd/spark:2.2.0_v2
Removing containers is fine, but removing images o_O It erases bblfshd which I use frequently, spark (btw: two versions??) and style-analyzer which is innocent.
My suggestion is to not remove the images by default - it is very cruel.
If I run srcd init on a new directory, it does not restart the daemon container, so the new path is not applied and e.g. gitbase queries execute on the old git repositories.
It's a minor usability thing, but in this log:
$ ./srcd init
INFO[0000] starting daemon with working directory: /home/cmartin/engine/engine_linux_amd64
INFO[0000] installing "srcd/cli-daemon:latest"
INFO[0005] installed "srcd/cli-daemon:latest"
INFO[0007] couldn't find network srcd-cli-network: Error: No such network: srcd-cli-network
INFO[0007] creating it now
Is "it" the missing network, or is "it" the daemon mentioned in the first log line?
We could make the last message more explicit to assure that we are taking care of the missing network error.
./srcd kill fails if one of the containers is not running. For instance, I stopped the pilosa container (to prove that we don't use it). After that I ran kill and it failed because the engine could not stop all containers.
I think in this case we should move on and stop as many containers as we can.
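The "stop as many containers as we can" behavior can be sketched as a loop that records failures instead of aborting on the first one. The `remove` callable is injected and hypothetical; in the real tool it would be whatever call removes a container through the Docker API.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def remove_all(containers: Iterable[str],
               remove: Callable[[str], None]) -> Tuple[List[str], Dict[str, str]]:
    """Try to remove every container, continuing past individual failures.

    Returns (removed, failures), where failures maps a container name to
    the error message, so the caller can report them all at the end.
    """
    removed: List[str] = []
    failures: Dict[str, str] = {}
    for name in containers:
        try:
            remove(name)
        except Exception as err:  # keep going: one failure must not stop the rest
            failures[name] = str(err)
        else:
            removed.append(name)
    return removed, failures
```

The caller can still exit with a non-zero status when `failures` is not empty, so errors are reported without leaving the remaining containers behind.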
Should we move all output from logrus to a human-readable output? After all, I don't think the logs of this will be processed.
Only for the srcd command, not the daemon.
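To illustrate the difference, here is a small sketch that turns lines in the logrus text style seen throughout these issues (for example `INFO[0007] creating it now`) into plainer messages. The `humanize` function and the level names it prints are assumptions for illustration only; in the CLI itself this would more likely be done by changing the logger's formatter than by post-processing lines.

```python
import re

# Matches the "LEVL[seconds] message" prefix used in the logs above.
LOGRUS_LINE = re.compile(r"^(?P<level>[A-Z]{4})\[(?P<seconds>\d{4})\] (?P<message>.*)$")

# logrus truncates level names to four characters in this format.
LEVEL_NAMES = {"WARN": "warning", "ERRO": "error", "FATA": "fatal", "DEBU": "debug"}


def humanize(line: str) -> str:
    """Drop the level/timestamp prefix; keep a readable label for non-info levels."""
    match = LOGRUS_LINE.match(line)
    if match is None:
        return line  # not a log line, pass it through untouched
    level = match.group("level")
    message = match.group("message")
    if level == "INFO":
        return message
    return f"{LEVEL_NAMES.get(level, level.lower())}: {message}"


if __name__ == "__main__":
    print(humanize("INFO[0007] creating it now"))
```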
Examples readme contains "Get all LICENSE blobs using pilosa index".
It should either provide more context (What is pilosa, why should I care, and how do I use it?), or just skip any mention of it.
$ srcd sql "SELECT repository_id, num_files FROM (
SELECT COUNT(f.*) num_files, f.repository_id
FROM ref_commits r
NATURAL JOIN commit_files cf
NATURAL JOIN files f
WHERE r.ref_name = 'HEAD' GROUP BY f.repository_id
) AS t
ORDER BY num_files DESC LIMIT 10"
2018/08/28 18:28:42 rpc error: code = DeadlineExceeded desc = context deadline exceeded
Currently on Windows, when running srcd init C:\path\etc the engine hangs on installing the cli daemon.
Create an animated svg (suggested tool) to include right below the Introduction title in the new README.
@campoy I am happy to take care of this but I expect you'll want to do it yourself.
The README says:
Code Retrieval: retrieve and store git repositories as a dataset.
However, borges is not currently integrated and the current commands do not deliver this promise.
I request the borges integration.
CI is giving trouble uploading the artifacts to GitHub, so we should fix that.
We need to explain that you need to run init again if you want to process new repositories, otherwise it might be confusing.
From Slack by @kuba--: https://src-d.slack.com/archives/C7UDG0GP6/p1539289002000100
we still pull pilosa-0.9 image which is not needed (I even checked if we still use it in engine and stopped pilosa container).
Bug faced by one of our users and reported via drift:
Command srcd web sql
FATA[0075] could not start gitbase web client at port 8080: rpc error: code = Unknown desc = could not create srcd-cli-gitbase: could not start container: srcd-cli-gitbase: Error response from daemon: driver failed programming external connectivity on endpoint srcd-cli-gitbase (b70dc2431912a12797b3bef7f324df3902e35543af79e5642d5ff457c293e33b): Error starting userland proxy: Bind for 0.0.0.0:3306 failed: port is already allocated
User checked docker ps -a and doesn't see port 3306 in use
On his os nothing is on that port: lsof -i :3306
Docker daemon is not hung
In the gitbase> prompt he tries typing show tables; and SELECT * FROM repositories;
gitbase> SELECT COUNT(*) FROM repositories;
rpc error: code = Unknown desc = could not create srcd-cli-gitbase: could not start container: srcd-cli-gitbase: Error response from daemon: driver failed programming external connectivity on endpoint srcd-cli-gitbase (22454bed8c5fdbf21a681030e71ea76de0aaaa59fcbb79f444c92514d99c7863): Error starting userland proxy: Bind for 0.0.0.0:3306 failed: port is already allocated
It seems there is no flag to change the gitbase port. Can someone make it a flag and tag a new release for this?
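Alongside a port flag, the CLI could probe the host port before starting the container and print a clearer message when it is taken. The sketch below is only an illustration of such a probe using a plain TCP bind on the loopback interface; the function name is hypothetical. It would not explain the case reported here, where docker ps -a and lsof showed nothing on 3306.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can currently be bound to host:port.

    This is a best-effort check: the port can still be taken by another
    process between the probe and the moment the container starts.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False
        return True


if __name__ == "__main__":
    print(port_is_free(3306))
```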
Expectation for the srcd kill command:
Removes all containers, docker images and docker volumes used by the source{d} engine.
Reality:
From Slack by @kuba--:
After I stopped the pilosa container I tried srcd kill and it failed because we could not stop all containers (I still had bblfsh and daemon running). I think we should stop as many containers as we can.
I'm running this request:
SELECT lang, SUM(size) as total_size
FROM (
SELECT LANGUAGE(t.tree_entry_name, b.blob_content) AS lang, b.blob_size as size
FROM refs r
JOIN commits c ON r.commit_hash = c.commit_hash
JOIN commit_trees ct ON c.commit_hash = ct.commit_hash
JOIN tree_entries t ON ct.tree_hash = t.tree_hash
JOIN blobs b ON t.blob_hash = b.blob_hash
WHERE r.ref_name = 'HEAD' and r.repository_id = 'go'
) AS sizes
GROUP BY lang
ORDER BY total_size DESC;
When I run it from mysql I get the following:
mysql> SELECT lang, SUM(size) as total_size
-> FROM (
-> SELECT LANGUAGE(t.tree_entry_name, b.blob_content) AS lang, b.blob_size as size
-> FROM refs r
-> JOIN commits c ON r.commit_hash = c.commit_hash
-> JOIN commit_trees ct ON c.commit_hash = ct.commit_hash
-> JOIN tree_entries t ON ct.tree_hash = t.tree_hash
-> JOIN blobs b ON t.blob_hash = b.blob_hash
-> WHERE r.ref_name = 'HEAD' and r.repository_id = 'go'
-> ) AS sizes
-> GROUP BY lang
-> ORDER BY total_size DESC;
+------------------+------------+
| lang             | total_size |
+------------------+------------+
| Go               |   52602209 |
|                  |   14894002 |
| Text             |    8770161 |
| Unix Assembly    |    4224188 |
| HTML             |    4001098 |
| JSON             |     276793 |
| C                |     253893 |
| Shell            |      87609 |
| Wavefront Object |      74392 |
| Perl             |      61454 |
| Markdown         |      44898 |
| XML              |      32890 |
| JavaScript       |      25392 |
| SVG              |      17826 |
| Graphviz (DOT)   |      17286 |
| Python           |      14529 |
| Raw token data   |      12673 |
| Assembly         |      10349 |
| Batchfile        |       7421 |
| CSS              |       3120 |
| Makefile         |       2534 |
| Logos            |       2411 |
| Protocol Buffer  |       2138 |
| C++              |       1370 |
| Diff             |        623 |
| Awk              |        450 |
| Fortran          |        394 |
| Objective-C      |        184 |
+------------------+------------+
28 rows in set (1 min 14.31 sec)
Same for gitbase-web, but somehow the result on srcd sql is empty:
gitbase> SELECT lang, SUM(size) as total_size
-> FROM (
-> SELECT LANGUAGE(t.tree_entry_name, b.blob_content) AS lang, b.blob_size as size
-> FROM refs r
-> JOIN commits c ON r.commit_hash = c.commit_hash
-> JOIN commit_trees ct ON c.commit_hash = ct.commit_hash
-> JOIN tree_entries t ON ct.tree_hash = t.tree_hash
-> JOIN blobs b ON t.blob_hash = b.blob_hash
-> WHERE r.ref_name = 'HEAD' and r.repository_id = 'go'
-> ) AS sizes
-> GROUP BY lang
-> ORDER BY total_size DESC;
+------+------------+
| LANG | TOTAL SIZE |
+------+------------+
+------+------------+
Regardless of whether srcd kill is supposed to nuke everything or just the container instances, it fails to remove the srcd-cli-network network.