Hi! Your benchmarks are functioning well with version 0.3.0 of lm-ev

0.4.0 lm-evaluation-harness about mera HOT 9 OPEN

germanjke commented on August 14, 2024

0.4.0 lm-evaluation-harness

from mera.

Comments (9)

germanjke commented on August 14, 2024

@LSinev Hi,
It looks like the vLLM engine, which is supported in 0.4.0, works faster than the HF engine.

LSinev commented on August 14, 2024

I checked the link you provided here, a

This link goes to a fork of lm-evaluation-harness. The fork contains code needed for the RuTiE task, which has been submitted as a PR to lm-evaluation-harness but is not yet approved and merged.
There are no plans yet to submit MERA tasks directly to lm-evaluation-harness.

new_harness_codebase uses 0.4.x code, but the tasks are not fully in YAML format yet (they will be eventually; for now they are like, for example, the SQUADv2 task in lm-evaluation-harness). MERA tasks are stored in https://github.com/ai-forever/MERA/tree/update/new_harness_codebase/benchmark_tasks, since the new code allows using tasks from an external directory.
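
As a rough illustration of running tasks from an external directory, the sketch below assembles an `lm_eval` command line. It assumes the upstream 0.4.x CLI flags (`--model`, `--model_args`, `--tasks`, `--include_path`, `--batch_size`); the patched submodule in this branch may behave differently. The helper `build_lm_eval_command`, the model name `my-org/my-model`, and the task name `example_task` are hypothetical placeholders, not real MERA identifiers.

```python
import shlex


def build_lm_eval_command(backend, model_args, tasks, include_path, batch_size=1):
    """Assemble an lm_eval argv list that loads tasks from an external directory.

    Hypothetical helper: it only builds the argument list and does not run anything.
    """
    return [
        "lm_eval",
        "--model", backend,               # e.g. "hf" or "vllm"
        "--model_args", model_args,       # comma-separated key=value pairs
        "--tasks", ",".join(tasks),       # task names defined in include_path
        "--include_path", include_path,   # external directory with task definitions
        "--batch_size", str(batch_size),
    ]


# Placeholder values, chosen only for illustration.
cmd = build_lm_eval_command(
    backend="hf",
    model_args="pretrained=my-org/my-model",
    tasks=["example_task"],
    include_path="benchmark_tasks",
)
print(shlex.join(cmd))
```

Passing `backend="vllm"` instead of `"hf"` would select the vLLM engine mentioned earlier in the thread, provided it is installed. The list can be handed to `subprocess.run(cmd, check=True)` once real model and task names are filled in.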

LSinev commented on August 14, 2024

Yes, there are! :) Stay tuned!

LSinev commented on August 14, 2024

Do you have any particular expectations for improvements with the upgrade to the 0.4.0+ backend?

germanjke commented on August 14, 2024

Hello guys! Are you working on this, and do you have any estimated dates?
@LSinev

LSinev commented on August 14, 2024

I will give more information next week, or maybe even a work-in-progress branch for playing with and testing.

LSinev commented on August 14, 2024

new_harness_codebase is a "work in progress" branch with a patched lm-evaluation-harness as a submodule (the patch is waiting for its PR to be merged).
All scores will change. The leaderboard will not publish these yet, but you can use them for private scoring. You will need to score the baseline models yourself. Changes to the model-running code (the lm-evaluation-harness side) should be made in their repository to be supported here.

germanjke commented on August 14, 2024

great, thank you!

germanjke commented on August 14, 2024

Hi @LSinev,

I noticed that the tasks from the branch do not include the MERA tasks in 0.4.x format. I checked the link you provided here, and it seems they are indeed missing.

Could you please confirm if the MERA tasks will be added to this branch, or if there is another location where they might be available?

Thanks!
