
Comments (31)

t1101675 avatar t1101675 commented on September 6, 2024

You can change the specified port to a different one.

from eva.

jiangliqin avatar jiangliqin commented on September 6, 2024

The port I specified is 6088, which is not the default 6000, yet deepspeed.init_distributed() still reports Address already in use.

jiangliqin avatar jiangliqin commented on September 6, 2024

Switching to other ports also conflicts.

xwwwwww avatar xwwwwww commented on September 6, 2024

The port DeepSpeed occupies is set by --master_port in the bash script.
If DeepSpeed still reports Address already in use even though the port you specified is different, check whether the processes from your previous run have actually exited.
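
One way to act on that hint is to check from Python whether anything is still listening on the master port before launching. A minimal sketch (the ports 6000 and 6088 are the ones mentioned in this thread; adjust as needed):

```python
import socket

def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already listening on (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        # connect_ex returns 0 on a successful connection, i.e. port taken
        return s.connect_ex((host, port)) == 0

if __name__ == "__main__":
    for port in (6000, 6088):
        print(port, "in use" if port_in_use(port) else "free")
```

If the port shows as in use after a crashed run, the stale process holding it has to be killed before DeepSpeed can bind it again.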

jiangliqin avatar jiangliqin commented on September 6, 2024

What I mean is: the bash script has master_port, and the eva_server.py I run is a Flask service on some port, say 3088. After the Flask service starts, the call into the generate_samples part of main() reports Address already in use.

jiangliqin avatar jiangliqin commented on September 6, 2024

I start the Flask service so I can try out the chat through an API, which is why I use the generate_samples part. Effectively one service opens two ports.

t1101675 avatar t1101675 commented on September 6, 2024

You could first check, on the same machine and in the same environment, whether the interactive script runs directly.

jiangliqin avatar jiangliqin commented on September 6, 2024

The interactive script runs fine.

jiangliqin avatar jiangliqin commented on September 6, 2024

I want to wrap it as a Flask API service to try it out.

t1101675 avatar t1101675 commented on September 6, 2024

Check whether deepspeed.init_distributed() is called more than once; if that doesn't fix it, check whether your use of Flask is the problem.
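
One hedged way to enforce that check is to wrap the call so it can only run once per process. The wrapper below is a generic sketch and assumes nothing about DeepSpeed itself; in the setting of this thread it would be called as init_distributed_once(deepspeed.init_distributed):

```python
_distributed_ready = False

def init_distributed_once(init_fn):
    """Run init_fn (e.g. deepspeed.init_distributed) at most once per process.

    A second call becomes a no-op instead of trying to bind the master
    port again, which is what produces 'Address already in use'.
    """
    global _distributed_ready
    if not _distributed_ready:
        init_fn()
        _distributed_ready = True
```

In a real script you could instead guard with torch.distributed.is_initialized() before calling deepspeed.init_distributed(); the effect is the same.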

jiangliqin avatar jiangliqin commented on September 6, 2024

(screenshot attached)

jiangliqin avatar jiangliqin commented on September 6, 2024

I checked and there's no problem there. deepspeed.init_distributed's TCP and Flask's TCP cannot coexist in one service. Have you tried this?

t1101675 avatar t1101675 commented on September 6, 2024

No, we haven't.

jiangliqin avatar jiangliqin commented on September 6, 2024

Hi, is it possible to load the model without distributed mode?

jiangliqin avatar jiangliqin commented on September 6, 2024

> Hi, is it possible to load the model without distributed mode?

I have already converted to MP_SIZE=1 and can load the model on a single GPU. How can I load it without torch.distributed.launch?

jiangliqin avatar jiangliqin commented on September 6, 2024

@t1101675 Could you take a moment to answer this? Thanks!

t1101675 avatar t1101675 commented on September 6, 2024

We haven't tried this yet. We will provide a Hugging Face version of the code later; at that point it should be possible to load the model without distributed mode.

jiangliqin avatar jiangliqin commented on September 6, 2024

OK. Can you give a rough timeline?

t1101675 avatar t1101675 commented on September 6, 2024

We're not sure yet; it depends on our project schedule. We'll let you know as soon as the code is cleaned up.

jiangliqin avatar jiangliqin commented on September 6, 2024

@t1101675 Hi, I'm wrapping eva_interactive in a Flask layer.
Approach 1: wrap it in a class, instantiate it, then have Flask listen on a port. After the model loads successfully it gets loaded a second time, so master_port ends up occupied.
Launching with eva_inference_interactive_beam.sh reports address is in use; the symptom is that the model loads twice and deepspeed.init_distributed() runs twice.

if __name__ == '__main__':
    model = eva_model()
    app.run(host='0.0.0.0', port=xxx, debug=True)

Approach 2: Flask listens on the port first, then the class is instantiated to load the model. The port is not occupied, but calling model.generate_samples() inside a Flask route raises a "model is not defined" error.

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=xxx, debug=True)
    model = eva_model()

Neither form of wrapping works. Could you take a look and give some advice? Thanks!
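
For reference, a minimal sketch of an ordering that avoids both failure modes described above, with a DummyModel standing in for the poster's eva_model wrapper (a hypothetical name from this thread, not part of any library). The model must be created before app.run(), because app.run() blocks and any line after it never executes; and debug must be off so the module, and with it the model and DeepSpeed's master port, is not loaded twice by the reloader:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

class DummyModel:
    # Stand-in for the eva_model wrapper discussed in this thread.
    def generate_samples(self, text):
        return "echo: " + text

# Create the model at module level, BEFORE app.run(): app.run() blocks,
# so a line placed after it (approach 2 above) never runs.
model = DummyModel()

@app.route("/ask", methods=["POST"])
def ask():
    message = request.form["messageText"]
    return jsonify({"answer": model.generate_samples(message)})

if __name__ == "__main__":
    # debug=True starts Flask's reloader, which imports this module a
    # second time; in approach 1 that loads the model (and binds the
    # DeepSpeed master port) twice.
    app.run(host="0.0.0.0", port=3088, debug=False)
```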

jiangliqin avatar jiangliqin commented on September 6, 2024

@t1101675 Could you help me out here?

t1101675 avatar t1101675 commented on September 6, 2024

Hi, we have no experience with Flask and don't know why this error occurs. The distributed setup we use basically stays within what torch itself provides, so you can consult its official documentation or other projects that combine it with Flask. Also, wrapping and deploying the model with other frameworks is outside the scope of this project.

Hermes777 avatar Hermes777 commented on September 6, 2024

I've run into a similar problem; as I recall, turning off debug mode fixed it.

Hermes777 avatar Hermes777 commented on September 6, 2024

But turning off debug mode only works for single-GPU models; multi-GPU still errors out.

jiangliqin avatar jiangliqin commented on September 6, 2024

@Hermes777 By debug mode, do you mean deploying it as a server rather than interactive console mode? How exactly did you do it?

Hermes777 avatar Hermes777 commented on September 6, 2024

For now it's just this simplest version, but I've also tried using it as a server; it only needs minor changes.
All the interaction is written inside the generate_samples method.


from flask import Flask, render_template, request, jsonify
# Note: werkzeug.contrib was removed in Werkzeug 1.0; SimpleCache now
# lives in the cachelib package (pip install cachelib).
from cachelib import SimpleCache

cache = SimpleCache()

app = Flask(__name__)

@app.route("/")
def hello():
    print("Init /")
    return render_template('chat.html')

@app.route("/ask", methods=['POST', 'GET'])
def ask():
    print("Init /ask")
    message = str(request.form['messageText'])
    full_context_list = cache.get('history')
    if full_context_list is None:
        full_context_list = []

    # generate_samples comes from the EVA inference code
    bot_response, full_context_list = generate_samples(message, full_context_list)
    cache.set('history', full_context_list)

    return jsonify({'status': 'OK', 'answer': bot_response})


if __name__ == "__main__":
    print("prepared")
    app.run(debug=False, host='0.0.0.0', port=8603)

jiangliqin avatar jiangliqin commented on September 6, 2024

That's a simple Flask example; what I mean is that adding the EVA inference part into this Flask app does not work, producing the two problems I described above. @Hermes777

Hermes777 avatar Hermes777 commented on September 6, 2024

> That's a simple Flask example; what I mean is that adding the EVA inference part into this Flask app does not work, producing the two problems I described above. @Hermes777

What I mean is that, as I've personally tested, the simple example does work as long as you set debug to False, because Flask's debug mode starts two processes at once. For the multi-GPU case I have no solution yet either, but if you just want a demo, or want to set it up as a RESTful server, the simple example is entirely sufficient.
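
To make the two-process behavior concrete: with debug=True, Werkzeug's reloader launches the script twice, a watcher process plus the actual server child, so module-level model loading runs twice. A sketch (the port is a placeholder) of two ways around it, using Werkzeug's own helper:

```python
from werkzeug.serving import is_running_from_reloader

# is_running_from_reloader() is True only inside the child process the
# reloader spawns; in a plain single-process run (debug=False, or
# use_reloader=False) it is False, so setup guarded this way runs once.
if not is_running_from_reloader():
    print("single-process context: safe place for heavy one-time setup")

# Alternatively, keep the debugger but skip the second process entirely:
# app.run(host="0.0.0.0", port=8603, debug=True, use_reloader=False)
```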

jiangliqin avatar jiangliqin commented on September 6, 2024

Tried it; it works on a single GPU now.

Hermes777 avatar Hermes777 commented on September 6, 2024

> @Hermes777 Do you know of other open-source open-domain chit-chat models in the industry?

There should be quite a few in English, and there are some task-oriented Chinese bots as well. But for Chinese chit-chat, besides this work there is only CDial-GPT, from the same group but earlier (though in the end you have to add rules anyway). You could also try a general-purpose text model on dialogue data; no guarantee it will be useful (supposedly even training with CPM alone does better than CDial-GPT).

jiangliqin avatar jiangliqin commented on September 6, 2024

> @Hermes777 Do you know of other open-source open-domain chit-chat models in the industry?
>
> There should be quite a few in English, and there are some task-oriented Chinese bots as well. But for Chinese chit-chat, besides this work there is only CDial-GPT, from the same group but earlier (though in the end you have to add rules anyway). You could also try a general-purpose text model on dialogue data; no guarantee it will be useful (supposedly even training with CPM alone does better than CDial-GPT).

> What I mean is that, as I've personally tested, the simple example does work as long as you set debug to False, because Flask's debug mode starts two processes at once. For the multi-GPU case I have no solution yet either, but if you just want a demo, or want to set it up as a RESTful server, the simple example is entirely sufficient.

Do you have any experience deploying a multi-GPU server?
