dborrelli / chat-intents
Clustering sentence embeddings to extract message intent
License: MIT License
@dborrelli When I specify a value for the "random_state" parameter in "bayesian_search", I receive the following warning:
UserWarning: n_jobs value -1 overridden to 1 by setting random_state. Use no seed for parallelism.
warn(f"n_jobs value {self.n_jobs} overridden to 1 by setting random_state. Use no seed for parallelism.")
Hyperparameter tuning is taking a significant amount of time. I want to use the "random_state" parameter to ensure reproducibility while also setting "n_jobs" to -1 to enable parallel processing. What's the best way to achieve this?
Hi,
I am using chat-intents and the clustering works very well.
However, I am working with French data and the label extraction gives poor results. I assume this is because the method always uses a spaCy model trained for English.
I was wondering whether the name of the loaded spaCy model, or at least the language, could be passed as a parameter, for example to apply_and_summarize_labels?
That way, label extraction could work much better for languages other than English.
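The request above could be sketched roughly as follows. This is a minimal illustration, not the chatintents API: `resolve_spacy_model`, `load_pipeline`, and the `language` and `model_name` parameters are hypothetical names. The model names in the mapping are standard spaCy packages, which would need to be installed separately before loading.

```python
# Hypothetical helpers: NOT part of chatintents; names are illustrative only.

# Default small spaCy pipelines per language code (standard spaCy package names).
SPACY_MODELS = {
    "en": "en_core_web_sm",
    "fr": "fr_core_news_sm",
    "de": "de_core_news_sm",
    "es": "es_core_news_sm",
}


def resolve_spacy_model(language="en", model_name=None):
    """Return an explicit model name if given, else the default for `language`."""
    if model_name is not None:
        return model_name
    try:
        return SPACY_MODELS[language]
    except KeyError:
        raise ValueError(
            f"No default spaCy model known for language {language!r}; "
            "pass model_name explicitly."
        ) from None


def load_pipeline(language="en", model_name=None):
    """Load the spaCy pipeline selected by `language` or `model_name`."""
    import spacy  # imported lazily so the name lookup works without spaCy installed

    return spacy.load(resolve_spacy_model(language, model_name))
```

With something like this, a caller working on French data could request `load_pipeline(language="fr")`, or pass a specific `model_name` when a larger model is preferred.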
!pip install chatintents
leads to
ERROR: Could not find a version that satisfies the requirement chatintents (from versions: none)
ERROR: No matching distribution found for chatintents
I'm using Google Colab.
Hi, how do I set a maximum on the number of topics that are generated? And a minimum? Is there a way to do this within the hyperparameter optimization?
Thanks,
Ari
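One general way to steer a hyperparameter search toward a desired range of cluster counts is to add a fixed penalty to the loss whenever a trial falls outside that range. The sketch below shows only that idea. The function name, the `label_lower` and `label_upper` parameters, and the 0.15 penalty value are assumptions made for illustration, and this thread does not confirm whether chatintents exposes an equivalent option. Note that a penalty is a soft constraint: it discourages out-of-range results but does not forbid them.

```python
def penalized_loss(cost, label_count, label_lower, label_upper, penalty=0.15):
    """Add a fixed penalty to `cost` when the cluster count is out of range.

    All names and the default penalty are illustrative assumptions.
    """
    if label_lower > label_upper:
        raise ValueError("label_lower must not exceed label_upper")
    outside = label_count < label_lower or label_count > label_upper
    return cost + (penalty if outside else 0.0)


# Made-up trials as (cost, number of clusters) pairs, for illustration only.
trials = [(0.10, 150), (0.18, 60), (0.30, 20)]

# Pick the trial with the lowest penalized loss for a desired range of 30-100.
best = min(trials, key=lambda t: penalized_loss(t[0], t[1], 30, 100))
```

Here the first trial has the lowest raw cost but produces too many clusters, so after the penalty the second trial is selected.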
Hi, when I call apply_and_summarize_labels, it raises the error below. Please help.
df_summary, labeled_docs = model.apply_and_summarize_labels(data_sample.sentence)
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
/tmp/ipykernel_21756/2555802596.py in <module>
----> 1 df_summary, labeled_docs = model.apply_and_summarize_labels(data_sample.sentence)
/opt/conda/lib/python3.9/site-packages/chatintents/ChatIntents.py in apply_and_summarize_labels(self, df_data)
418 df_clustered[category_col] = self.best_clusters.labels_
419
--> 420 numerical_labels = df_clustered[category_col].unique()
421
422 # create dictionary mapping the numerical category to the generated
AttributeError: 'numpy.ndarray' object has no attribute 'unique'