Comments (10)
I assume you're running the server on your local machine. Could you please check the memory/CPU usage while the server is running? The FastAPI server logs you provided indicate that error code 134 might be due to insufficient resources.
from labml.
Hello, I am running this on my server over SSH. My CPU model is Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz, RAM is 128GB, and I am not running any other resource-consuming applications.
I have observed the CPU and memory usage using htop, and there appear to be no anomalies in resource usage.
from labml.
Thanks, could you please verify if MongoDB is installed and running on the default port (27017)? Additionally, could you provide details about the operating system?
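One quick way to run that check without MongoDB tooling is to test whether anything is accepting TCP connections on the port. The following is a minimal sketch using only the Python standard library; the helper name is_port_open is hypothetical, and a successful connection only shows that some process is listening on the port, not that it is a healthy MongoDB instance.

```python
import socket


def is_port_open(host: str = "localhost", port: int = 27017, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    state = "accepting connections" if is_port_open() else "not reachable"
    print(f"Port 27017 on localhost is {state}")
```

If this reports the port as not reachable even though the service appeared to start, the database process has probably exited and its own logs are the next place to look.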
from labml.
Sorry, I saw the service in a running state after installing MongoDB and assumed it was working fine, when in fact it had quit unexpectedly for some reason. I restarted MongoDB and it worked.
Thank you for your help, this is now resolved but I still have two minor issues.
1. I have set the following configuration item in .labml.yaml, based on the PyPI page and the guides:
web_api: 'http://localhost:5005/api/v1/track?'
But it shows:
LABML WARNING: Method Not Allowed 405: http://localhost:5005/api/v1/track?run_uuid=0b405056f0c811ee902521fc9abb02ad&rank=0&world_size=0&labml_version=0.4.168.
I'm guessing it might be caused by a version mismatch between the different labml components?
2. If I have completed a training session and the logs folder has not been manually modified since then, can I still view the entire training process in the browser?
from labml.
Glad that your problem is solved now.
- I think you need to update the labml pip package.
- All the data needed to view training progress in the browser is kept in a MongoDB database, so you should be able to view the training progress in the browser. It should also be saved in the logs folder.
from labml.
Thank you for your help! I have no further questions and I will close this issue.
from labml.
Thank you for your help.
I have solved my problem, but I have some suggestions. The latest version of labml-nn is currently 0.4.136, which is incompatible with the latest versions of labml (0.5.1) and labml-app (0.5.2). If a user installs with pip install labml-app labml labml-nn, it will result in an exception.
Also, different versions of labml use different configuration keys, e.g. web_api in some versions and app_url in others, and I haven't found complete documentation on which key should be used for which version.
I'm also not sure what the URL for that configuration item should be in each version. Sometimes it is http://localhost:5005/api/v1/track? and sometimes it is http://localhost:5005/api/v1/default.
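For anyone hitting the same mismatch, it helps to confirm which versions are actually installed before choosing a configuration key or URL. Here is a minimal sketch using the standard library's importlib.metadata (Python 3.8+); the helper name installed_versions is hypothetical, and the script only reports versions, it does not decide which combinations are compatible.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as used in the pip install command above.
PACKAGES = ("labml", "labml-app", "labml-nn")


def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Including this output in a bug report makes it easier to tell whether a 405 or an import error comes from mixing the 0.4.x and 0.5.x packages.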
Your work has greatly facilitated my development tasks, thank you again!
from labml.
We have updated the Readme.md with the latest configuration.
from labml.
We have updated the Readme.md with the latest configuration.
Yes, but the instructions in README.md apply to the latest versions of labml (0.5.1) and labml-app (0.5.2). However, the latest version of labml-nn (0.4.136) is not compatible with the latest version of labml.
I am trying to run the Switch Transformer, and if I use the latest version of labml-nn together with the latest version of labml, the code reports an error, so I have to use older versions of labml and labml-app, but I don't know where the documentation for these older versions is.
from labml.