
Comments (5)

virtualarchitectures commented on August 24, 2024

That was spot on. It is now reading my vault in Windows perfectly from the top level down. This is a really nice package. Great to have a bridge from Obsidian to NetworkX! Also a useful reminder for myself on explicit UTF-8 encoding. Many thanks.

from obsidiantools.

virtualarchitectures commented on August 24, 2024

I looked into this further and tried running in VSCode to see if I could spot an issue. Validating the path with os.path.exists(vault_dir) returns True. I checked over my Obsidian note .md file names for any slashes, but the only non-alphanumeric characters I have are: '_', '-' and '&'.


mfarragher commented on August 24, 2024

@virtualarchitectures thanks for raising this. Judging from the error trace, this doesn't look like a filename issue; it looks like it is happening when the content of a file is being parsed.

Are you using non-Latin text in notes, any special characters, etc.? There is more info on the 0x9d character online. It may be a character that you can't see clearly in one of your notes, so it may be a pain to find which file has it.

It's not ideal that the vault 'connection' fails because of (possibly) one note not being able to be read. Would be great to see how we can make this more robust. 🙂

The python frontmatter parsing uses utf-8 as default and that's kept in the implementation of obsidiantools. It would be difficult to get an approach that can infer the right encoding of a file. One of your notes probably doesn't align with utf-8.

Initial ideas on what could be done as a 'fix', to let the vault setup complete:

  • Robust but tricky: one way around this may be to introduce chardet as a dependency to handle files where utf-8 doesn't work
  • Quick way: if file is not utf-8, print an error and let other files in the vault continue processing so that the vault object can still be set.
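The "quick way" above could look roughly like the sketch below. It is not the obsidiantools implementation: the function name `read_notes_skipping_bad` and its return shape are made up for illustration, and it only uses the standard library.

```python
from pathlib import Path


def read_notes_skipping_bad(vault_dir):
    """Read every .md file as UTF-8; report and skip files that fail to decode.

    Hypothetical helper for illustration, not part of obsidiantools.
    """
    root = Path(vault_dir)
    notes = {}    # relative path -> note text
    skipped = []  # paths that are not valid UTF-8
    for path in sorted(root.rglob("*.md")):
        try:
            notes[path.relative_to(root).as_posix()] = path.read_text(encoding="utf-8")
        except UnicodeDecodeError as err:
            # Print the problem and carry on with the rest of the vault.
            print(f"Skipping {path}: not valid UTF-8 ({err.reason})")
            skipped.append(path)
    return notes, skipped
```

With something like this, one badly encoded note is reported by name and the remaining notes are still available to build the vault object.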


virtualarchitectures commented on August 24, 2024

Thanks for looking into this. I've isolated it to a folder of notes containing iframes for YouTube videos. I can connect to the individual .md files in the folder but not to the folder itself. Doing so raises the original Unicode Error. The notes in that folder don't contain any inlinks but they do have YAML frontmatter with tags. The tags seem pretty standard though.

I tried writing a loop to read each file with UTF-8 encoding using os.open() but was getting false 'file not found' errors. I think that was because I'm not so familiar with pathlib. If there's a loop you want me to try and run to isolate the error and get some more info I'm happy to give it a try.
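A loop of that kind could be sketched with pathlib as below. This is only a diagnostic sketch: `find_undecodable_notes` is a hypothetical name, and the `encoding` parameter is there so the same check can be run against another codec (for example `cp1252`, if that is what the machine defaults to, which is an assumption).

```python
from pathlib import Path


def find_undecodable_notes(folder, encoding="utf-8"):
    """Return (path, byte offset, offending byte) for each .md file that fails to decode.

    Hypothetical helper for illustration only.
    """
    problems = []
    for path in sorted(Path(folder).rglob("*.md")):
        raw = path.read_bytes()  # read bytes so opening the file never fails on encoding
        try:
            raw.decode(encoding)
        except UnicodeDecodeError as err:
            # err.start is the index of the first byte that could not be decoded
            problems.append((path, err.start, raw[err.start]))
    return problems
```

Calling it on the problem folder and printing each result, e.g. `print(f"{path}: byte 0x{byte:02x} at offset {offset}")`, should narrow the error down to a file and a position within it.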


mfarragher commented on August 24, 2024

Try re-installing the package to bring in the changes I've made in the dev branch, and see if the error persists:
pip install git+https://github.com/mfarragher/obsidiantools.git@dev

The main change I've done is to specify utf-8 encoding when a markdown file is opened. What might be happening is that your computer uses something different, so setting it as utf-8 explicitly might help: https://www.python.org/dev/peps/pep-0597/
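As a rough illustration of why the explicit encoding can matter, the sketch below writes a UTF-8 note containing a curly closing quote (U+201D, whose UTF-8 bytes end in 0x9d) and reads it back two ways. Using `cp1252` to stand in for the platform default is an assumption about the Windows setup, not something confirmed in this thread.

```python
import os
import tempfile

# A note containing curly quotes, saved as UTF-8.
# U+201D encodes to bytes e2 80 9d, and byte 0x9d has no mapping in cp1252.
text = "She said \u201chello\u201d"

with tempfile.TemporaryDirectory() as tmp:
    note = os.path.join(tmp, "note.md")
    with open(note, "w", encoding="utf-8") as f:
        f.write(text)

    # Simulate a locale default of cp1252 (assumed, for illustration).
    try:
        with open(note, encoding="cp1252") as f:
            f.read()
        locale_read_failed = False
    except UnicodeDecodeError:
        locale_read_failed = True

    # Explicit UTF-8 reads the file regardless of the platform default.
    with open(note, encoding="utf-8") as f:
        utf8_text = f.read()
```

Here the `cp1252` read fails on byte 0x9d while the explicit UTF-8 read returns the original text, which matches the behaviour the change in the dev branch is meant to address.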

