Comments (5)
Hey @letto4135, once #692 is merged, Khoj will be able to index all text files, including your code files. This should work both when indexing local folders and GitHub repositories.
So maybe it just doesn't index because it doesn't go into sub folders to look for things to index?
The desktop app should index recursively, so it should include subfolders as well if you index ~/gh.
from khoj.
Hey @sabaimran, continuing from discord.
Just for context, I'm running locally on an Apple silicon Mac, installed with pip.
I'm surprised you said there isn't much usage of this integration. I wonder if most people just index the repos locally? I'm not entirely sure that indexing local folders works though, so I don't like it myself. If I add a folder from the desktop app, nothing happens in the server logs when I click save, so I've got no way of knowing if it is actually indexing anything I asked it to. And checking the "Files" in settings on the web UI only shows documents from Obsidian, so it makes me think it didn't index correctly.
I have 2 gh accounts, one personal and one business, so I'd like to be able to set up PATs for both. I'm not sure what would happen if I removed one PAT to index repos in the other account, i.e. whether it would at some point affect the repos indexed with the PAT I removed...
The gh integration seems to only allow adding one repo at a time. If you try to add multiple and then save, it throws errors in the server logs but still acts like it saved them all in the UI with the "save successful" message. That's why I suggested a check mark or something to say yes, they've been indexed, because you really can't tell it failed from the UI.
KeyError: 'tree'
[11:38:05.870103] ERROR 🚨 Failed to update server via API: Failed to update content index api.py:188
╭──────────────────────────────────────── Traceback (most recent call last) ────────────────────────────────────────╮
│ /opt/homebrew/lib/python3.11/site-packages/khoj/routers/api.py:185 in update │
│ │
│ 182 │ │ logger.warning(error_msg) │
│ 183 │ │ raise HTTPException(status_code=500, detail=error_msg) │
│ 184 │ try: │
│ ❱ 185 │ │ initialize_content(regenerate=force, search_type=t, init=False, user=user) │
│ 186 │ except Exception as e: │
│ 187 │ │ error_msg = f"🚨 Failed to update server via API: {e}" │
│ 188 │ │ logger.error(error_msg, exc_info=True) │
│ │
│ /opt/homebrew/lib/python3.11/site-packages/khoj/configure.py:265 in initialize_content │
│ │
│ 262 │ │ │ │ if not status: │
│ 263 │ │ │ │ │ raise RuntimeError("Failed to update content index") │
│ 264 │ │ except Exception as e: │
│ ❱ 265 │ │ │ raise e │
│ 266 │
│ 267 │
│ 268 def configure_routes(app): │
│ │
│ /opt/homebrew/lib/python3.11/site-packages/khoj/configure.py:263 in initialize_content │
│ │
│ 260 │ │ │ │ │ user=user, │
│ 261 │ │ │ │ ) │
│ 262 │ │ │ │ if not status: │
│ ❱ 263 │ │ │ │ │ raise RuntimeError("Failed to update content index") │
│ 264 │ │ except Exception as e: │
│ 265 │ │ │ raise e │
│ 266 │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
RuntimeError: Failed to update content index
[11:38:05.897153] INFO 127.0.0.1:62902 - "GET /api/update?t=github HTTP/1.1" 500
And on that, being able to index entire orgs or entire users all at once would be nice. You can find out all the repos and the default branch of each easy enough with the gh cli. It could potentially create an entry for every repo it finds under a user/org with the info filled in so you can adjust it if needed.
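As a rough sketch of that idea (not how Khoj does anything today), the snippet below shells out to `gh repo list` with its `--json name,defaultBranchRef` output and turns each repo into a pre-filled entry. The entry keys (`owner`, `name`, `branch`) and the helper names are hypothetical, picked only for illustration, and it assumes the gh cli is installed and already authenticated.

```python
import json
import subprocess


def list_repos_json(owner: str, limit: int = 200) -> str:
    """Return the raw JSON printed by `gh repo list` for a user or org."""
    result = subprocess.run(
        ["gh", "repo", "list", owner, "--limit", str(limit),
         "--json", "name,defaultBranchRef"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def to_entries(owner: str, gh_json: str) -> list[dict]:
    """Convert gh's JSON into one pre-filled entry per repo.

    The entry shape here is hypothetical, not Khoj's actual config format.
    """
    entries = []
    for repo in json.loads(gh_json):
        # Empty repos may have no default branch, so tolerate null or a blank name.
        ref = repo.get("defaultBranchRef") or {}
        entries.append({
            "owner": owner,
            "name": repo["name"],
            "branch": ref.get("name") or "",
        })
    return entries
```

Something like `to_entries("<owner>", list_repos_json("<owner>"))` would then give a list you could review and adjust before saving.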
A "Reindex" button, or auto reindex on a schedule or both would be nice as well. I don't have that many repos so telling it to reindex once a week or something would be fine, but if someone has a lot of repos they might not want to do that and opt to reindex manually when needed.
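Until something like that exists, a scheduled reindex could be approximated from outside the server. This is only a sketch: it calls the same `/api/update?t=github` route that shows up in the log above, the base URL is an assumption (adjust it to wherever your server listens), and it assumes the request needs no extra authentication, which may not hold for every setup.

```python
import time
import urllib.error
import urllib.request
from typing import Optional

# Assumption: address of the local Khoj server; change to match your install.
BASE_URL = "http://127.0.0.1:42110"


def update_url(base_url: str, content_type: str) -> str:
    """Build the update URL for one content type, e.g. "github"."""
    return f"{base_url}/api/update?t={content_type}"


def reindex(base_url: str = BASE_URL, content_type: str = "github") -> Optional[int]:
    """Trigger one update; return the HTTP status, or None if unreachable."""
    try:
        with urllib.request.urlopen(update_url(base_url, content_type)) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered with an error status, like the 500 in the log above.
        return err.code
    except urllib.error.URLError:
        # The server could not be reached at all.
        return None


def reindex_forever(interval_seconds: int = 7 * 24 * 3600) -> None:
    """Reindex, then sleep for the interval (one week by default), forever."""
    while True:
        print(f"reindex status: {reindex()}")
        time.sleep(interval_seconds)
```

Checking the returned status is the point here: a 500 would at least tell you the index update failed, which the UI currently doesn't.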
@letto4135 what type of files are you trying to index? It's worth mentioning that currently, only plaintext files and PDFs are supported from the desktop application. So your .txt, .md, etc. files should be picked up.
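If it helps to check what you'd expect to be picked up, here is a small sketch that walks a folder recursively and lists files with those extensions. The extension set only contains the ones named above plus `.pdf`; it is not the desktop app's real filter, and `candidate_files` is a made-up helper name.

```python
from pathlib import Path

# Assumption: only the extensions named in this thread, not the app's actual list.
SUPPORTED = {".txt", ".md", ".pdf"}


def candidate_files(root: str) -> list[Path]:
    """Recursively list files under root whose extension is in SUPPORTED."""
    base = Path(root).expanduser()
    return sorted(
        path for path in base.rglob("*")
        if path.is_file() and path.suffix.lower() in SUPPORTED
    )
```

Running `candidate_files("~/gh")` and comparing the result with the "Files" list in settings would show whether anything in the subfolders is being missed.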
Could you describe what sort of chatting/interactions you're hoping to do? Would you want to chat with documentation, or the underlying code itself?
Let me try out the integration again and let you know if I find bugs/repro. If the repo you're indexing is public, send me a link and I'll go ahead and try that directly?
I'm indexing the code instead of using the gh integration. The thought was that if I couldn't get the GH integration to work easily, I would index the folder where the code is instead. So for me I've got
ls ~/gh
- ~/gh/<repo1>
- ~/gh/<repo2>
- etc...
So maybe it just doesn't index because it doesn't go into sub folders to look for things to index? I suppose it wouldn't index much anyway, just the readmes from what you're saying, but it's not doing that either.
Could you describe what sort of chatting/interactions you're hoping to do? Would you want to chat with documentation, or the underlying code itself?
I'd like it to index the code so it has context over everything when I ask a question. Kind of like Copilot and JetBrains AI, but potentially better because it can have context over multiple repos that work together in a system instead of only the one currently open.
@debanjum Gorgeous! 🙇