tabbyml / tabby
Self-hosted AI coding assistant
Home Page: https://tabby.tabbyml.com/
License: Other
It would be nice if we could support completion in Vim.
Maybe start with an omnifunc: https://vim.fandom.com/wiki/Omni_completion
Please describe the feature you want
Support Ruby in Tabby.
I believe that the LanguagePresets should look like this:
Language.RUBY: LanguagePreset(
max_length=128,
stop_words=[
"\n\n",
"\ndef",
"\n#",
"\nrequire",
"\ninclude",
"\nclass",
"\nmodule",
],
),
However, I don't know how to implement the extension support.
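For illustration, here is a self-contained sketch of how such a preset registry might be wired up. Language, LanguagePreset, and the LANGUAGE_PRESETS mapping are modeled on the snippet above and are assumptions about Tabby's internals, not its actual code:

from dataclasses import dataclass, field
from enum import Enum

class Language(Enum):
    PYTHON = "python"
    RUBY = "ruby"

@dataclass
class LanguagePreset:
    max_length: int
    stop_words: list[str] = field(default_factory=list)

# Hypothetical registry; the RUBY entry mirrors the preset proposed above.
LANGUAGE_PRESETS: dict[Language, LanguagePreset] = {
    Language.RUBY: LanguagePreset(
        max_length=128,
        stop_words=["\n\n", "\ndef", "\n#", "\nrequire", "\ninclude", "\nclass", "\nmodule"],
    ),
}

def preset_for(language_id: str) -> LanguagePreset | None:
    # Look up a preset by the IDE-reported languageId, e.g. "ruby".
    try:
        return LANGUAGE_PRESETS[Language(language_id)]
    except ValueError:
        return None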
Please reply with a 👍 if you want this feature.
I tried Tabby in a browser and in VSCode, and I'm not getting any completions.
No requests are sent from VSCode.
Server logs for that time:
2023-04-07 09:59:43,953 DEBG 'server' stdout output:
INFO: 10.0.2.100:0 - "POST /v1/completions HTTP/1.1" 200 OK
2023-04-07 10:00:00,060 DEBG 'dagu_scheduler' stdout output:
2023/04/07 10:00:00 [2023-04-07 10:00:00] start analytic
2023-04-07 10:00:00,066 DEBG 'dagu_scheduler' stdout output:
2023/04/07 10:00:00 server is running at "/tmp/@dagu-analytic-af09eed2a725067a7f323f1fc0f93c29.sock"
2023-04-07 10:00:00,067 DEBG 'dagu_scheduler' stdout output:
2023/04/07 10:00:00 start running: Collect Tabby
2023-04-07 10:00:00,161 DEBG 'dagu_scheduler' stdout output:
2023/04/07 10:00:00 Collect Tabby failed
2023-04-07 10:00:00,167 DEBG 'dagu_scheduler' stdout output:
2023/04/07 10:00:00 schedule finished.
2023-04-07 10:00:00,167 DEBG 'dagu_scheduler' stdout output:
2023/04/07 10:00:00
Summary ->
+--------------------------------------+----------+---------------------+---------------------+--------+--------+---------------+
| REQUESTID | NAME | STARTED AT | FINISHED AT | STATUS | PARAMS | ERROR |
+--------------------------------------+----------+---------------------+---------------------+--------+--------+---------------+
| 1a9305ea-f32c-4865-995b-ed5c809c6b1f | analytic | 2023-04-07 10:00:00 | 2023-04-07 10:00:00 | failed | | exit status 1 |
+--------------------------------------+----------+---------------------+---------------------+--------+--------+---------------+
Details ->
+---+---------------+---------------------+---------------------+--------+----------------------------------------------------------+---------------+
| # | STEP | STARTED AT | FINISHED AT | STATUS | COMMAND | ERROR |
+---+---------------+---------------------+---------------------+--------+----------------------------------------------------------+---------------+
| 1 | Collect Tabby | 2023-04-07 10:00:00 | 2023-04-07 10:00:00 | failed | ./tabby/tools/analytic/main.sh collect_tabby_server_logs | exit status 1 |
+---+---------------+---------------------+---------------------+--------+----------------------------------------------------------+---------------+
2023-04-07 10:00:00,168 DEBG 'dagu_scheduler' stdout output:
2023/04/07 10:00:00 Failed to start DAG: exit status 1
2023-04-07 10:00:00,168 DEBG 'dagu_scheduler' stdout output:
2023/04/07 10:00:00 runner: entry failed analytic: exit status 1
I can see CPU utilisation going up to 550% for 3-6 seconds, then dropping with no results.
There are currently plugins for Vim and VS Code; it would be great to have one for PyCharm as well.
Please reply with a 👍 if you want this feature.
To be used as a page in the admin server.
The client should filter out useless completions before displaying them.
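A hedged sketch of what such client-side filtering could look like; the two heuristics here (dropping empty completions and completions that duplicate the text after the cursor) are illustrative assumptions, not Tabby's actual rules:

def filter_completions(choices: list[str], suffix: str) -> list[str]:
    # choices: completion texts from the server; suffix: text after the cursor.
    kept = []
    for text in choices:
        if not text.strip():
            continue  # drop empty or whitespace-only completions
        if suffix.lstrip().startswith(text.strip()):
            continue  # drop completions that repeat what already follows the cursor
        kept.append(text)
    return kept

print(filter_completions(["", "  \n", "foo()", "bar"], suffix="foo()\n"))  # -> ['bar']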
2023-04-09 00:05:43,013 DEBG 'triton' stderr output:
terminate called after throwing an instance of 'std::runtime_error'
what(): [FT][ERROR] CUDA runtime error: the provided PTX was compiled with an unsupported toolchain. /workspace/build/fastertransformer_backend/build/_deps/repo-ft-src/src/fastertransformer/utils/cuda_utils.h:274
[e79652de0dc1:01747] *** Process received signal ***
[e79652de0dc1:01747] Signal: Aborted (6)
[e79652de0dc1:01747] Signal code: (-6)
[e79652de0dc1:01747] [ 0] /usr/lib/x86_64-linux-gnu/libpthread.so.0(+0x14420)[0x7fbedac13420]
[e79652de0dc1:01747] [ 1] /usr/lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcb)[0x7fbed949e00b]
[e79652de0dc1:01747] [ 2] /usr/lib/x86_64-linux-gnu/libc.so.6(abort+0x12b)[0x7fbed947d859]
[e79652de0dc1:01747] [ 3] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x9e911)[0x7fbed9857911]
[e79652de0dc1:01747] [ 4]
2023-04-09 00:05:43,014 DEBG 'triton' stderr output:
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xaa38c)[0x7fbed986338c]
[e79652de0dc1:01747] [ 5] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xaa3f7)[0x7fbed98633f7]
[e79652de0dc1:01747] [ 6] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xaa6a9)[0x7fbed98636a9]
[e79652de0dc1:01747] [ 7] /opt/tritonserver/backends/fastertransformer/libtriton_fastertransformer.so(+0x3b949)[0x7fbebc686949]
[e79652de0dc1:01747] [ 8] /opt/tritonserver/backends/fastertransformer/libtriton_fastertransformer.so(+0x20f65)[0x7fbebc66bf65]
[e79652de0dc1:01747] [ 9] /opt/tritonserver/backends/fastertransformer/libtriton_fastertransformer.so(+0x2d794)[0x7fbebc678794]
[e79652de0dc1:01747] [10] /opt/tritonserver/backends/fastertransformer/libtriton_fastertransformer.so(TRITONBACKEND_ModelInitialize+0x38d)[0x7fbebc678e0d]
[e79652de0dc1:01747] [11] /opt/tritonserver/bin/../lib/libtritonserver.so(+0x10689b)[0x7fbed9d4889b]
[e79652de0dc1:01747] [12] /opt/tritonserver/bin/../lib/libtritonserver.so(+0x1c4f5d)[0x7fbed9e06f5d]
[e79652de0dc1:01747] [13] /opt/tritonserver/bin/../lib/libtritonserver.so(+0x1caccd)[0x7fbed9e0cccd]
[e79652de0dc1:01747] [14] /opt/tritonserver/bin/../lib/libtritonserver.so(+0x3083a0)[0x7fbed9f4a3a0]
[e79652de0dc1:01747] [15] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xd6de4)[0x7fbed988fde4]
[e79652de0dc1:01747] [16] /usr/lib/x86_64-linux-gnu/libpthread.so.0(+0x8609)[0x7fbedac07609]
[e79652de0dc1:01747] [17] /usr/lib/x86_64-linux-gnu/libc.so.6(clone+0x43)[0x7fbed957a133]
[e79652de0dc1:01747] *** End of error message ***
tabby/tabby/server/backend/python.py
Line 90 in be7894a
Maybe use a prefix tree (trie)?
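If the referenced code is about locating stop words in generated text, a trie could find the earliest hit in a single scan. A minimal sketch of the idea (not code from python.py):

class Trie:
    def __init__(self, words):
        self.root = {}
        for word in words:
            node = self.root
            for ch in word:
                node = node.setdefault(ch, {})
            node["$"] = word  # marks the end of a stop word

    def earliest_match(self, text: str) -> int:
        # Return the index of the first stop-word occurrence, or -1.
        for start in range(len(text)):
            node = self.root
            for ch in text[start:]:
                if ch not in node:
                    break
                node = node[ch]
                if "$" in node:
                    return start
        return -1

trie = Trie(["\n\n", "\ndef"])
text = "x = 1\ndef f():"
cut = trie.earliest_match(text)
print(text[:cut] if cut >= 0 else text)  # -> 'x = 1'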
According to https://github.com/TabbyML/tabby#get-started-server, I started a Docker container, but it is not running.
The Nvidia GPU is a Tesla P40; the GPU driver is 470.57.02.
When a languageId is not present in the OpenAPI spec (http://localhost:5000/openai.json), the server should drop the request.
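A minimal sketch of that validation, assuming a FastAPI handler and a hypothetical SUPPORTED_LANGUAGES set (the request shape and 400 response are illustrative, not Tabby's actual API):

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI()
SUPPORTED_LANGUAGES = {"python", "javascript", "typescript"}  # assumed list

class CompletionRequest(BaseModel):
    language: str
    prompt: str

@app.post("/v1/completions")
def completions(req: CompletionRequest):
    if req.language not in SUPPORTED_LANGUAGES:
        # Drop requests for languages not advertised in the spec.
        raise HTTPException(status_code=400, detail=f"unsupported language: {req.language}")
    return {"choices": []}  # real inference elided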
Please describe the feature you want
It would be nice if we could have a list of supported languages in the docs.
Please reply with a 👍 if you want this feature.
Meilisearch has now been integrated in #85, enabling us to experiment with including relevant code snippets (a sketch of the rewriting idea follows the flags below).
Related flags:
* [ ] FLAGS_enable_meilisearch
* [ ] FLAGS_rewrite_prompt_with_search_snippet
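A minimal sketch of what the snippet-rewriting flag might do, assuming a local Meilisearch instance, an index named 'code', and documents with a 'body' field (all assumptions for illustration):

import meilisearch

client = meilisearch.Client("http://127.0.0.1:7700")

def rewrite_prompt(prompt: str, query: str, limit: int = 3) -> str:
    # Fetch related snippets and prepend them to the prompt as comments.
    hits = client.index("code").search(query, {"limit": limit})["hits"]
    snippets = "\n".join(f"# {hit['body']}" for hit in hits)
    return f"{snippets}\n{prompt}" if snippets else prompt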
It would be nice if Tabby could be deployed with a single docker run, just for a quick trial.
Thank you for releasing Tabby, very excited to try this!
This is admittedly small potatoes, but it would be nice to include a name flag in the docker command in the README: --name=tabby
(it beats 'priceless_agnesi')
tabby/tabby/server/backend/python.py
Line 34 in be7894a
When an inline suggestion has shown up and I do not use <Tab> to accept the full suggestion, but instead type characters matching the suggestion, or use Accept Word (<Ctrl+→>) or Accept Line in VSCode, the remaining part of the inline suggestion should stay unchanged.
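A minimal sketch of the desired behavior, modeling only the string logic (not the actual VSCode extension code):

def remaining_suggestion(suggestion: str, typed: str) -> str | None:
    # Return what is left of the suggestion after the user typed `typed`,
    # or None if the typed text diverged and the suggestion should be dismissed.
    if suggestion.startswith(typed):
        return suggestion[len(typed):]
    return None

print(remaining_suggestion("return value + 1", "return va"))  # -> 'lue + 1'
print(remaining_suggestion("return value + 1", "retX"))       # -> None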
Please reply with a 👍 if you want this feature.
Is there any direction on how to use the tabby.toml file to add additional projects, specifically if they're in a private repo?
I would love for this to be documented in the README.md.
Please describe the feature you want
Support for the PyTorch Vulkan backend, so that older Nvidia GPUs, as well as Intel, AMD, and some phone GPUs, can be supported.
https://pytorch.org/tutorials/prototype/vulkan_workflow.html
Additional context
Personally, I ran into difficulties testing this project because my laptop is too old to support Nvidia, and my cloud accounts aren't authorized to deploy GPU compute. I imagine I am not the only one limited by these kinds of limiting factors.
Please reply with a 👍 if you want this feature.
The Vim plugin currently invokes a completion request every time the cursor moves in INSERT mode. This can cause confusion when attempting to simply move the cursor through a word or code block without any intention of editing it.
Like other completion engines, the Vim plugin should not invoke a completion request under these circumstances.
The Visual Studio Code Marketplace is not available where the OSS build of VSCode is used, due to licensing issues. As an alternative, open-vsx.org is heavily used. Please publish the Tabby extension to open-vsx.org as well.
'HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /TabbyML/J-350M/resolve/main/tokenizer_config.json (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f8ac76d3430>, 'Connection to huggingface.co timed out. (connect timeout=10)'))' thrown while requesting HEAD https://huggingface.co/TabbyML/J-350M/resolve/main/tokenizer_config.json
^CTraceback (most recent call last):
File "/home/app/.pyenv/versions/3.10.10/lib/python3.10/site-packages/transformers/utils/hub.py", line 409, in cached_file
resolved_file = hf_hub_download(
File "/home/app/.pyenv/versions/3.10.10/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 120, in _inner_fn
return fn(*args, **kwargs)
File "/home/app/.pyenv/versions/3.10.10/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1247, in hf_hub_download
raise LocalEntryNotFoundError(
huggingface_hub.utils._errors.LocalEntryNotFoundError: Cannot find the requested files in the disk cache and outgoing traffic has been disabled. To enable hf.co look-ups and downloads online, set 'local_files_only' to False.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/app/tabby/tools/download_models.py", line 41, in
preload(local_files_only=args.prefer_local_files)
File "/home/app/tabby/tools/download_models.py", line 26, in preload
AutoTokenizer.from_pretrained(args.repo_id, local_files_only=local_files_only)
File "/home/app/.pyenv/versions/3.10.10/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 634, in from_pretrained
config = AutoConfig.from_pretrained(
File "/home/app/.pyenv/versions/3.10.10/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 896, in from_pretrained
config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
File "/home/app/.pyenv/versions/3.10.10/lib/python3.10/site-packages/transformers/configuration_utils.py", line 573, in get_config_dict
config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
File "/home/app/.pyenv/versions/3.10.10/lib/python3.10/site-packages/transformers/configuration_utils.py", line 628, in _get_config_dict
resolved_config_file = cached_file(
File "/home/app/.pyenv/versions/3.10.10/lib/python3.10/site-packages/transformers/utils/hub.py", line 443, in cached_file
raise EnvironmentError(
OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like TabbyML/J-350M is not the path to a directory containing a file named config.json.
Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.
The agent library should hold the HTTP client and common client-side logic, and provide a JSON I/O interface to the native part of each IDE extension.
This should start with extracting node_scripts from the current Vim plugin.
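A minimal sketch of such a JSON I/O loop, one JSON message per line over stdin/stdout; the request and response shapes are illustrative assumptions:

import json
import sys

def serve(handle):
    # The native plugin writes one JSON request per line on stdin
    # and reads one JSON response per line from stdout.
    for line in sys.stdin:
        response = handle(json.loads(line))
        sys.stdout.write(json.dumps(response) + "\n")
        sys.stdout.flush()

def handle(request):
    if request.get("method") == "completion":
        # The shared HTTP client would call the Tabby server here.
        return {"id": request.get("id"), "choices": []}
    return {"id": request.get("id"), "error": "unknown method"}

if __name__ == "__main__":
    serve(handle)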
DAG config:
https://github.com/TabbyML/tabby/blob/main/tabby/tasks/trainer.yaml
Training:
https://github.com/TabbyML/tabby/blob/main/tabby/tools/trainer
I tried to use the script in this repo to convert a GPT-NeoX model from a Hugging Face model file to a FasterTransformer model file.
It worked when I converted files for single-GPU inference. However, when I converted a 2-GPU version of the FasterTransformer model that should work with tensor parallelism, it generated nonsensical results.
Model: https://huggingface.co/TabbyML/NeoX-70M
Convert command:
# 1-gpu
python huggingface_gptneox_convert.py \
-i /input/huggingface/model/path -o /output/fastertransformer/model/path -i_g 1 -m_n gptneox
# 2-gpu
python huggingface_gptneox_convert.py \
-i /input/huggingface/model/path -o /output/fastertransformer/model/path -i_g 2 -m_n gptneox
1-gpu FT model result:
[WARNING] gemm_config.in is not found; using default GEMM algo
[FT][WARNING] Skip NCCL initialization since requested tensor/pipeline parallel sizes are equals to 1.
[WARNING] gemm_config.in is not found; using default GEMM algo
[FT][WARNING] Skip NCCL initialization since requested tensor/pipeline parallel sizes are equals to 1.
====================
latency: 0.011725187301635742
--------------------
prompt:
--------------------
Game start,
--------------------
output:
--------------------
The first thing you notice is that the first thing you notice is that
2-gpu FT model result:
My script did not check the rank number before printing, so the result was printed twice (a sketch of a rank-0 check follows after these logs).
[WARNING] gemm_config.in is not found; using default GEMM algo
[WARNING] gemm_config.in is not found; using default GEMM algo
[FT][INFO] NCCL initialized rank=0 world_size=2 tensor_para=NcclParam[rank=0, world_size=2, nccl_comm=0x55b3a743e150] pipeline_para=NcclParam[rank=0, world_size=1, nccl_comm=0x55b3a6d71000]
[FT][INFO] NCCL initialized rank=1 world_size=2 tensor_para=NcclParam[rank=1, world_size=2, nccl_comm=0x557fa70e5f20] pipeline_para=NcclParam[rank=0, world_size=1, nccl_comm=0x557fa6aca340]
[WARNING] gemm_config.in is not found; using default GEMM algo
[WARNING] gemm_config.in is not found; using default GEMM algo
[FT][INFO] NCCL initialized rank=1 world_size=2 tensor_para=NcclParam[rank=1, world_size=2, nccl_comm=0x557fa70e5f20] pipeline_para=NcclParam[rank=0, world_size=1, nccl_comm=0x557fa6aca340]
[FT][INFO] NCCL initialized rank=0 world_size=2 tensor_para=NcclParam[rank=0, world_size=2, nccl_comm=0x55b3a743e150] pipeline_para=NcclParam[rank=0, world_size=1, nccl_comm=0x55b3a6d71000]
====================
latency: 0.011738300323486328
--------------------
prompt:
--------------------
Game start,
--------------------
output:
--------------------
,,,,,,,,,,,,,,,,
====================
latency: 0.011530399322509766
--------------------
prompt:
--------------------
Game start,
--------------------
output:
--------------------
,,,,,,,,,,,,,,,,
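As a side note on the duplicated output above, a rank check before printing could look like this sketch; reading the rank from torchrun/OpenMPI environment variables is an assumption about the launcher:

import os

def is_rank_zero() -> bool:
    # RANK is set by torchrun; OMPI_COMM_WORLD_RANK by Open MPI.
    rank = os.environ.get("RANK") or os.environ.get("OMPI_COMM_WORLD_RANK") or "0"
    return int(rank) == 0

if is_rank_zero():
    print("output: ...")  # only rank 0 reports results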
Please describe the feature you want
Support TypeScript in Tabby.
Please reply with a 👍 if you want this feature.
It seems that you use TensorFlow, so it should be fairly trivial, and M1/M2 should probably be powerful enough to run it.
Hi,
I tried running the docker version mentioned in the README, only to be greeted with the following startup error, in a loop. nvidia-container-toolkit 1.12.1 is correctly installed.
2023-04-07 19:15:35,308 DEBG 'triton' stderr output:
terminate called after throwing an instance of 'std::runtime_error'
what(): [FT][ERROR] CUDA runtime error: operation not supported /workspace/build/fastertransformer_backend/build/_deps/repo-ft-src/src/fastertransformer/utils/allocator.h:160
[c8d62a9a73a2:00460] *** Process received signal ***
[c8d62a9a73a2:00460] Signal: Aborted (6)
[c8d62a9a73a2:00460] Signal code: (-6)
2023-04-07 19:15:35,308 DEBG 'triton' stderr output:
[c8d62a9a73a2:00460] [ 0] /usr/lib/x86_64-linux-gnu/libpthread.so.0(+0x14420)[0x7f61d6d24420]
[c8d62a9a73a2:00460] [ 1] /usr/lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcb)[0x7f61d55af00b]
[c8d62a9a73a2:00460] [ 2] /usr/lib/x86_64-linux-gnu/libc.so.6(abort+0x12b)[0x7f61d558e859]
[c8d62a9a73a2:00460] [ 3] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x9e911)[0x7f61d5968911]
[c8d62a9a73a2:00460] [ 4] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xaa38c)[0x7f61d597438c]
[c8d62a9a73a2:00460] [ 5] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xaa3f7)[0x7f61d59743f7]
[c8d62a9a73a2:00460] [ 6] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xaa6a9)[0x7f61d59746a9]
[c8d62a9a73a2:00460] [ 7]
2023-04-07 19:15:35,308 DEBG 'triton' stderr output:
/opt/tritonserver/backends/fastertransformer/libtransformer-shared.so(_ZN17fastertransformer5checkI9cudaErrorEEvT_PKcS4_i+0x219)[0x7f617b00c0f9]
[c8d62a9a73a2:00460] [ 8] /opt/tritonserver/backends/fastertransformer/libtransformer-shared.so(_ZN17fastertransformer9AllocatorILNS_13AllocatorTypeE0EEC1Ei+0x123)[0x7f617b048d73]
[c8d62a9a73a2:00460] [ 9] /opt/tritonserver/backends/fastertransformer/libtransformer-shared.so(_ZN15GptJTritonModelI13__nv_bfloat16E19createModelInstanceEiiP11CUstream_stSt4pairISt6vectorIN17fastertransformer9NcclParamESaIS7_EES9_ESt10shared_ptrINS6_18AbstractCustomCommEE+0xa7)[0x7f617b108967]
[c8d62a9a73a2:00460] [10] /opt/tritonserver/backends/fastertransformer/libtriton_fastertransformer.so(+0x19b88)[0x7f61d0632b88]
[c8d62a9a73a2:00460] [11] /opt/tritonserver/backends/fastertransformer/libtriton_fastertransformer.so(+0x1a473)[0x7f61d0633473]
[c8d62a9a73a2:00460] [12] /opt/tritonserver/backends/fastertransformer/libtriton_fastertransformer.so(+0x3c29e)[0x7f61d065529e]
[c8d62a9a73a2:00460] [13] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xd6de4)[0x7f61d59a0de4]
[c8d62a9a73a2:00460] [14] /usr/lib/x86_64-linux-gnu/libpthread.so.0(+0x8609)[0x7f61d6d18609]
[c8d62a9a73a2:00460] [15]
2023-04-07 19:15:35,309 DEBG 'triton' stderr output:
/usr/lib/x86_64-linux-gnu/libc.so.6(clone+0x43)[0x7f61d568b133]
[c8d62a9a73a2:00460] *** End of error message ***
Please describe the feature you want
How would one secure the admin panel as well as the server? How could we set up the VSCode extension to provide such authentication?
Following the README.md and using Docker version 20.10.21, build baeda1f
on Mac M1 with the following commands:
docker run \
-it --rm \
-v ./data:/data \
-v ./data/hf_cache:/home/app/.cache/huggingface \
-p 5000:5000 \
-e MODEL_NAME=TabbyML/J-350M \
tabbyml/tabby
leads to the following error:
docker: Error response from daemon: create ./data: "./data" includes invalid characters for a local volume name, only "[a-zA-Z0-9][a-zA-Z0-9_.-]" are allowed. If you intended to pass a host directory, use absolute path.
SOLUTION
docker run \
-it --rm \
-v "/$(pwd)/data:/data" \
-v "/$(pwd)/data/hf_cache:/home/app/.cache/huggingface" \
-p 5000:5000 \
--platform linux/amd64 \
-e MODEL_NAME=TabbyML/J-350M \
tabbyml/tabby
for explanation, see:
https://docs.docker.com/storage/bind-mounts/
https://stackoverflow.com/questions/46526165/docker-invalid-characters-for-volume-when-using-relative-paths
Please describe the feature you want
Are there any instructions on how to install the Tabby VSCode extension?
I searched the repository and the Tabby site but didn't find anything.
Related: https://huggingface.co/replit/replit-code-v1-3b
Looking at https://huggingface.co/replit/replit-code-v1-3b/blob/main/replit_lm.py, it seems to be a standard GPT model adapted from https://github.com/mosaicml/examples/blob/52cd4fef69497f225a034fcd10692f8613732d10/examples/llm/src/models/mosaic_gpt/mosaic_gpt.py
Currently, we only support inference on CUDA. However, it would be advantageous to expand our support to include M1/M2, as many developers work on Mac laptops.
We'd like to revisit this feature once ggml becomes stable.
Please reply with a 👍 if you want this feature.
It's nice to have access to such a great open-source project. I tried running it on my Tesla P40 but failed. Can you please provide the compilation method for the model so that I can try to compile it myself for compute capability 6.1?
By the way, is there a model with more parameters to try?
2023-04-11 16:42:16,242 DEBG 'admin' stderr output:
<jemalloc>: MADV_DONTNEED does not work (memset will be used instead)
<jemalloc>: (This is the expected behaviour if you are running under QEMU)
2023-04-11 16:42:20,620 DEBG 'admin' stdout output:
Collecting usage statistics. To deactivate, set browser.gatherUsageStats to False.
2023-04-11 16:42:20,628 DEBG 'admin' stderr output:
Traceback (most recent call last):
2023-04-11 16:42:20,631 DEBG 'admin' stderr output:
File "/home/app/.pyenv/versions/3.10.10/bin/streamlit", line 8, in <module>
sys.exit(main())
File "/home/app/.pyenv/versions/3.10.10/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
2023-04-11 16:42:20,633 DEBG 'admin' stderr output:
return self.main(*args, **kwargs)
File "/home/app/.pyenv/versions/3.10.10/lib/python3.10/site-packages/click/core.py", line 1055, in main
2023-04-11 16:42:20,637 DEBG 'admin' stderr output:
rv = self.invoke(ctx)
File "/home/app/.pyenv/versions/3.10.10/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
2023-04-11 16:42:20,641 DEBG 'admin' stderr output:
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/app/.pyenv/versions/3.10.10/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
2023-04-11 16:42:20,644 DEBG 'admin' stderr output:
return ctx.invoke(self.callback, **ctx.params)
File "/home/app/.pyenv/versions/3.10.10/lib/python3.10/site-packages/click/core.py", line 760, in invoke
2023-04-11 16:42:20,649 DEBG 'admin' stderr output:
return __callback(*args, **kwargs)
File "/home/app/.pyenv/versions/3.10.10/lib/python3.10/site-packages/streamlit/web/cli.py", line 209, in main_run
_main_run(target, args, flag_options=kwargs)
File "/home/app/.pyenv/versions/3.10.10/lib/python3.10/site-packages/streamlit/web/cli.py", line 245, in _main_run
2023-04-11 16:42:20,656 DEBG 'admin' stderr output:
bootstrap.run(file, command_line, args, flag_options)
File "/home/app/.pyenv/versions/3.10.10/lib/python3.10/site-packages/streamlit/web/bootstrap.py", line 397, in run
_install_pages_watcher(main_script_path)
File "/home/app/.pyenv/versions/3.10.10/lib/python3.10/site-packages/streamlit/web/bootstrap.py", line 373, in _install_pages_watcher
watch_dir(
File "/home/app/.pyenv/versions/3.10.10/lib/python3.10/site-packages/streamlit/watcher/path_watcher.py", line 153, in watch_dir
return _watch_path(
File "/home/app/.pyenv/versions/3.10.10/lib/python3.10/site-packages/streamlit/watcher/path_watcher.py", line 128, in _watch_path
watcher_class(
File "/home/app/.pyenv/versions/3.10.10/lib/python3.10/site-packages/streamlit/watcher/event_based_path_watcher.py", line 92, in __init__
2023-04-11 16:42:20,667 DEBG 'admin' stderr output:
path_watcher.watch_path(
File "/home/app/.pyenv/versions/3.10.10/lib/python3.10/site-packages/streamlit/watcher/event_based_path_watcher.py", line 170, in watch_path
folder_handler.watch = self._observer.schedule(
File "/home/app/.pyenv/versions/3.10.10/lib/python3.10/site-packages/watchdog/observers/api.py", line 301, in schedule
emitter.start()
File "/home/app/.pyenv/versions/3.10.10/lib/python3.10/site-packages/watchdog/utils/__init__.py", line 92, in start
self.on_thread_start()
File "/home/app/.pyenv/versions/3.10.10/lib/python3.10/site-packages/watchdog/observers/inotify.py", line 119, in on_thread_start
self._inotify = InotifyBuffer(path, self.watch.is_recursive)
File "/home/app/.pyenv/versions/3.10.10/lib/python3.10/site-packages/watchdog/observers/inotify_buffer.py", line 37, in __init__
self._inotify = Inotify(path, recursive)
File "/home/app/.pyenv/versions/3.10.10/lib/python3.10/site-packages/watchdog/observers/inotify_c.py", line 167, in __init__
Inotify._raise_error()
File "/home/app/.pyenv/versions/3.10.10/lib/python3.10/site-packages/watchdog/observers/inotify_c.py", line 432, in _raise_error
raise OSError(err, os.strerror(err))
OSError: [Errno 38] Function not implemented
2023-04-11 16:42:20,997 DEBG fd 6 closed, stopped monitoring <POutputDispatcher at 274928023968 for <Subprocess at 274928023872 with name admin in state RUNNING> (stdout)>
2023-04-11 16:42:20,999 DEBG fd 8 closed, stopped monitoring <POutputDispatcher at 274928472448 for <Subprocess at 274928023872 with name admin in state RUNNING> (stderr)>
2023-04-11 16:42:21,001 WARN exited: admin (exit status 1; not expected)
2023-04-11 16:42:21,003 DEBG received SIGCHLD indicating a child quit
2023-04-11 16:42:22,009 INFO spawned: 'admin' with pid 930
2023-04-11 16:42:23,018 INFO success: admin entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
The following command is used to launch Docker, after chown -R $USER data:
docker run \
-it --rm \
-v "/$(pwd)/data:/data" \
-v "/$(pwd)/data/hf_cache:/home/app/.cache/huggingface" \
-p 5001:5001 \
--platform linux/amd64 \
-e MODEL_NAME=TabbyML/J-350M \
tabbyml/tabby
Describe the bug
Got "attention mask and the pad token id were not set" when trying to use tabby with VSCode extension. When executing API requests or using playground everything working correct.
Console log when got error:
2023-04-12 11:44:42,190 DEBG 'server' stderr output:
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
2023-04-12 11:44:45,657 DEBG 'server' stdout output:
INFO: 172.17.0.1:0 - "POST /v1/completions HTTP/1.1" 200 OK
I also got the same result when using the CPU.
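For reference, the warning comes from Hugging Face transformers and can be silenced by passing an explicit attention_mask and pad_token_id to generate(). A standalone sketch of the library API (not Tabby's server code):

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("TabbyML/J-350M")
model = AutoModelForCausalLM.from_pretrained("TabbyML/J-350M")

inputs = tokenizer("def fib(n):", return_tensors="pt")
outputs = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],  # avoids the attention-mask warning
    pad_token_id=tokenizer.eos_token_id,      # avoids the pad_token_id message
    max_new_tokens=32,
)
print(tokenizer.decode(outputs[0]))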
Information about your GPU
Wed Apr 12 14:29:45 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.30.02 Driver Version: 531.18 CUDA Version: 12.1 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce RTX 3060 L... On | 00000000:01:00.0 Off | N/A |
| N/A 46C P8 12W / N/A| 179MiB / 6144MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| 0 N/A N/A 22 G /Xwayland N/A |
+---------------------------------------------------------------------------------------+
Additional context
Run command:
sudo docker run \
-it --rm \
--gpus all \
-v "/$(pwd)/data:/data" \
-v "/$(pwd)/data/hf_cache:/home/app/.cache/huggingface" \
-p 5000:5000 \
-e MODEL_NAME=TabbyML/J-350M \
--name=tabby \
tabbyml/tabby
Using WSL2, Ubuntu 22.04 distribution and Docker Desktop. GPU: Nvidia GeForce RTX 3060 Laptop.
I would like clearer instructions on how to run this locally on an Ubuntu machine without using Docker.
Additional context
NVIDIA/nvidia-container-toolkit#229
https://forums.docker.com/t/nvidia-cuda-doesnt-work-on-docker-desktop-but-works-on-docker-engine/130668/5
Please reply with a 👍 if you want this feature.
I followed the quick start, then visited http://127.0.0.1:5000/_admin/.
It shows:
triton | down server,vector,dagu | live
Congrats, your server is live!
To get started with Tabby, you can either install the extensions below or use the Editor.
But when I go to the Editor, it can't complete my code.
I went to the log and got this:
2023-04-07 09:45:20,415 DEBG 'server' stderr output:
As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
2023-04-07 09:45:20,415 DEBG 'server' stderr output:
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
What should I do?
Sorry, I'm not a professional programmer, but the README is really unhelpful. It needs clearer installation details.
Adding an API-level cache (key: hash(completion request) -> value: completion response) on the client side should improve response speed and reduce API requests and model invocations.
This should be useful when the user types backspace.
Related to #130.
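A minimal sketch of such a cache, keyed on a hash of the canonicalized request; the LRU eviction policy and key fields are illustrative assumptions:

import hashlib
import json
from collections import OrderedDict

class CompletionCache:
    def __init__(self, capacity: int = 128):
        self.capacity = capacity
        self.entries: OrderedDict[str, dict] = OrderedDict()

    @staticmethod
    def key(request: dict) -> str:
        # Canonicalize so that equivalent requests hash identically.
        canonical = json.dumps(request, sort_keys=True)
        return hashlib.sha256(canonical.encode()).hexdigest()

    def get(self, request: dict):
        k = self.key(request)
        if k in self.entries:
            self.entries.move_to_end(k)  # mark as recently used
            return self.entries[k]
        return None  # cache miss: the client would call the server

    def put(self, request: dict, response: dict) -> None:
        k = self.key(request)
        self.entries[k] = response
        self.entries.move_to_end(k)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used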
On running the docker script
mkdir -p data/hf_cache && chown -R 1000 data
docker run \
--gpus all \
-it --rm \
-v "/$(pwd)/data:/data" \
-v "/$(pwd)/data/hf_cache:/home/app/.cache/huggingface" \
-p 5000:5000 \
-e MODEL_NAME=TabbyML/J-350M \
-e MODEL_BACKEND=triton \
--name=tabby \
tabbyml/tabby
I get the error
mkdir: cannot create directory '/data/config': Permission denied
Similar to issue #58, but I didn't use sudo, and I don't see anything in the script pointing to an absolute path /data/config.
I'm running NixOS with an Nvidia 1060.
Hey guys,
Is there a specific reason why you removed the CPU mode from the project README's description? If a GPU is not available, is there an alternative way to run the project?
Using Mac M1 and the following for running the Docker container:
# Create the data dir and grant ownership to UID 1000 (Tabby runs as uid 1000 in the container)
sudo mkdir -p data/hf_cache && chown -R 1000 data
docker run \
-it --rm \
-v "/$(pwd)/data:/data" \
-v "/$(pwd)/data/hf_cache:/home/app/.cache/huggingface" \
-p 5000:5000 \
--platform linux/amd64 \
-e MODEL_NAME=TabbyML/J-350M \
tabbyml/tabby
I get this error output:
WARNING: The NVIDIA Driver was not detected. GPU functionality will not be available.
Use the NVIDIA Container Toolkit to start this container with GPU support; see
https://docs.nvidia.com/datacenter/cloud-native/ .
ERROR: This container was built for CPUs supporting at least the AVX instruction set, but
the CPU detected was , which does not report support for AVX. An Illegal Instruction
exception at runtime is likely to result.
See https://en.wikipedia.org/wiki/Advanced_Vector_Extensions#CPUs_with_AVX .
mkdir: cannot create directory '/data/config': Operation not permitted
Is this error related to incompatibility with Mac M1, the docker run command, or something entirely different?
Hello,
I tried it on macOS and I got this error:
https://app.warp.dev/block/0irl2mLPBzNRJCWZIXUtFI
Any help?
How do I deploy locally on Linux?
When an inline suggestion appears in the UI, Tabby clients should send a view event to the server.
We currently need a proposed VSCode API to accomplish this, and are waiting for an update from VSCode to implement it.
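A sketch of what reporting such an event could look like once the API allows it; the /v1/events endpoint and the payload shape are assumptions, not a documented API:

import requests

def report_view_event(server: str, completion_id: str, choice_index: int) -> None:
    # Fire-and-forget telemetry: the suggestion identified by completion_id
    # was shown (viewed) in the editor.
    requests.post(
        f"{server}/v1/events",
        json={"type": "view", "completion_id": completion_id, "choice_index": choice_index},
        timeout=2,
    )

report_view_event("http://127.0.0.1:5000", completion_id="cmpl-123", choice_index=0)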
Describe the bug
The Tabby server is running, but the client can't get any generated output.
Additional context
2023-05-09 12:27:00,172 DEBG 'dagu_scheduler' stdout output:
2023/05/09 12:27:00 Failed to start DAG: exit status 1
2023-05-09 12:27:00,172 DEBG 'dagu_scheduler' stdout output:
2023/05/09 12:27:00 runner: entry failed analytic: exit status 1
2023-05-09 12:30:00,059 DEBG 'dagu_scheduler' stdout output:
2023/05/09 12:30:00 [2023-05-09 12:30:00] start analytic
2023-05-09 12:30:00,063 DEBG 'dagu_scheduler' stdout output:
2023/05/09 12:30:00 server is running at "/tmp/@dagu-analytic-af09eed2a725067a7f323f1fc0f93c29.sock"
2023-05-09 12:30:00,063 DEBG 'dagu_scheduler' stdout output:
2023/05/09 12:30:00 start running: Collect Tabby
2023-05-09 12:30:00,137 DEBG 'dagu_scheduler' stdout output:
2023/05/09 12:30:00 Collect Tabby failed
2023-05-09 12:30:00,164 DEBG 'dagu_scheduler' stdout output:
2023/05/09 12:30:00 schedule finished.
2023-05-09 12:30:00,168 DEBG 'dagu_scheduler' stdout output: