Comments (5)
@polm could you please chime in
from fugashi.
I'm not sure I understand your problem entirely, but it sounds like you want to download the UniDic dictionary from code instead of the command line. You can do that like this:
from unidic.download import download_version
download_version()
The command line download command is just a wrapper for that function.
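If you only want to trigger the download when the dictionary is missing, something like the following might work. This is a minimal sketch, not part of fugashi or unidic: `ensure_dictionary` and `ensure_unidic` are hypothetical helper names, and the use of `unidic.DICDIR` as the dictionary location is an assumption you should check against your installed version.

```python
import os


def ensure_dictionary(dicdir, downloader):
    """Call downloader() only if dicdir is missing or empty.

    Returns True if a download was triggered, False otherwise.
    (Hypothetical helper, not part of unidic or fugashi.)
    """
    if os.path.isdir(dicdir) and os.listdir(dicdir):
        return False
    downloader()
    return True


def ensure_unidic():
    """Download UniDic from code if it does not appear to be present."""
    # Imported lazily so this module can be loaded without unidic installed.
    import unidic
    from unidic.download import download_version

    # Assumption: unidic.DICDIR points at the dictionary directory.
    return ensure_dictionary(unidic.DICDIR, download_version)
```

Calling `ensure_unidic()` once at startup, before creating a tagger, avoids downloading the dictionary again on every run.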
Thank you @polm!
But I am still getting the error "The unidic_lite dictionary is not installed".
My model calls the class BertJapaneseTokenizer. (Please take a look at the source code here, as it will make my explanation much easier to follow: https://huggingface.co/transformers/v4.11.3/_modules/transformers/models/bert_japanese/tokenization_bert_japanese.html)
If you follow the trail of function calls in reverse, it looks like that error can occur only when the variable mecab_dic is equal to 'unidic_lite' (mecab_dic is an attribute of the class MecabTokenizer, which you will also find in the link above).
My question is: can I override this value of mecab_dic and change it to "unidic" instead of the current "unidic_lite"?
That way I will not get the error.
I looked at the code that you linked to but I'm not sure what to tell you - I didn't write that code and it's not part of this library, so I have no control over it. Your code (which you haven't shared) or the HuggingFace code is setting that value somewhere.
I'm also not sure why you think you can't use unidic-lite. Can you explain that or try it?
Closing for lack of response / because this issue isn't relevant to fugashi directly.
For the record, if you want help with this you should show the code where you're using BertJapaneseTokenizer and explain what you're actually trying to do. It sounds like you have a usage question about the HuggingFace code, but with just the information you've given there's no way to tell what's going on.
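For anyone landing here with the same question, here is a rough sketch of how the dictionary choice might be overridden on the HuggingFace side. It assumes the linked v4.11.3 source, where BertJapaneseTokenizer appears to accept a `mecab_kwargs` argument that is forwarded to MecabTokenizer. The helper names `mecab_tokenizer_kwargs` and `load_tokenizer` are hypothetical, and `model_name` is whatever checkpoint you are actually using. None of this is part of fugashi.

```python
# Values that MecabTokenizer appears to accept for mecab_dic in the linked source.
SUPPORTED_DICS = ("ipadic", "unidic_lite", "unidic")


def mecab_tokenizer_kwargs(mecab_dic="unidic"):
    """Build keyword arguments that select a MeCab dictionary.

    (Hypothetical helper; assumes BertJapaneseTokenizer forwards
    mecab_kwargs to MecabTokenizer, as in transformers v4.11.3.)
    """
    if mecab_dic not in SUPPORTED_DICS:
        raise ValueError(f"unsupported mecab_dic: {mecab_dic!r}")
    return {
        "word_tokenizer_type": "mecab",
        "mecab_kwargs": {"mecab_dic": mecab_dic},
    }


def load_tokenizer(model_name, mecab_dic="unidic"):
    """Load the tokenizer with the chosen dictionary instead of the default."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import BertJapaneseTokenizer

    return BertJapaneseTokenizer.from_pretrained(
        model_name, **mecab_tokenizer_kwargs(mecab_dic)
    )
```

Note that a model trained with unidic_lite may tokenize differently with the full UniDic, so installing unidic-lite is probably the simpler fix unless you have a specific reason to avoid it.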