Comments (5)

StanGirard commented on June 8, 2024

The goal here is to evaluate LLMs in real time. We let each model plan 3-5 moves ahead of time. Larger LLMs can generate more moves, but yes, they take longer.

Inference latency is meant to be part of the evaluation, but we could add a parameter that removes it for some games.

Please feel free to open a PR to put this in place, but as an option and not by default ;)
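
A minimal sketch of how such an option might look, assuming each LLM reply yields a short list of moves that are then played one per game step. The names `MovePlanner`, `ask_llm`, and `realtime` are hypothetical and are not taken from the llm-colosseum code base; the point is only to show how a single flag could decide whether inference latency costs game time.

```python
from collections import deque
from typing import Callable, Deque, List, Optional


class MovePlanner:
    """Buffers the moves from one LLM reply and hands them out one per game step."""

    def __init__(self, ask_llm: Callable[[], List[str]], realtime: bool = True) -> None:
        self.ask_llm = ask_llm    # hypothetical: one call returns a few planned moves
        self.realtime = realtime  # hypothetical flag: True keeps latency in play
        self.pending: Deque[str] = deque()

    def on_llm_reply(self) -> None:
        """Queue every move planned by a finished LLM call."""
        self.pending.extend(self.ask_llm())

    def next_move(self) -> Optional[str]:
        """Return the move for this game step, or None if the fighter idles."""
        if not self.pending and not self.realtime:
            # Latency-free mode: block on the model so waiting costs no game time.
            self.on_llm_reply()
        # Real-time mode with an empty buffer: the fighter idles this step.
        return self.pending.popleft() if self.pending else None
```

With `realtime=True` the caller would invoke `on_llm_reply()` whenever a reply actually arrives (for example from a background thread), so a slow model simply idles in between. With `realtime=False` the game loop waits for the model instead, which removes the speed advantage of small models.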

from llm-colosseum.

taozhiyuai commented on June 8, 2024

Win rate: 44% after 50 rounds.

@oulianov

taozhiyuai commented on June 8, 2024

In my experience, yes. The small model has a high tokens-per-second rate and is always generating actions, while the big model has to wait for its tokens before it knows how to react. @_@

taozhiyuai commented on June 8, 2024

The record shows that the small model can generate more actions thanks to its higher tokens-per-second rate.

0.5b wins 3 rounds!

Player 1 using: ollama:qwen:14b-chat-v1.5-fp16
Player 2 using: ollama:qwen:0.5b-chat-v1.5-fp16

Round 1

🏟️ (0647) (0)Starting game
🏟️ (0647) (0)Waiting for fight to start
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
2024-03-30 12:20:26.448 | WARNING | agent.robot:get_moves_from_llm:317 - Many invalid moves: ['Evaluate Opponent', 'Assess Distance for Effective Attacks']
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: super attack 3
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 2 move: low kick
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: super attack 3
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 2 move: fireball
Player 2 move: super attack 2
Player 2 move: super attack 3
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: jump closer
Player 1 move: megapunch
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 2 move: low kick
Player 2 move: medium kick
Player 2 move: high kick
Player 2 move: medium kick
Player 2 move: high kick
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: move closer
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 2 move: fireball
Player 2 move: high kick
Player 2 move: low kick
Player 2 move: super attack 2
Player 2 move: super attack 3
Player 2 move: super attack 4
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: fireball
Player 1 move: move closer
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: fireball
Player 1 move: move closer
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 2 move: low kick
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: super attack 2
Player 1 move: move closer
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 2 move: low kick
Player 2 move: medium kick
Player 2 move: high kick
Player 2 move: jump away
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: move closer
Player 1 move: high punch
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: move closer
Player 1 move: high punch
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: jump away
Player 1 move: megapunch
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 2 move: high kick
Player 2 move: low punch
Player 2 move: high punch
Player 2 move: low kick
Player 2 move: low punch
Player 2 move: low punch
2024-03-30 12:21:41.329 | WARNING | agent.robot:get_moves_from_llm:317 - Many invalid moves: ['Mid Punch', 'Mid Punch']
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: move closer
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
🏟️ (0647) (0)Round won by P2
(0)Moving to next round
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: megapunch
Player 1 move: hurricane
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 2 move: low punch
Player 2 move: medium punch
Player 2 move: high punch
Player 2 move: low kick
Player 2 move: medium kick
Player 2 move: high kick
Player2 ollama:qwen:0.5b-chat-v1.5-fp16 Daddy won!

—————————

Round 2

🏟️ (2b8a) (0)Starting game
🏟️ (2b8a) (0)Waiting for fight to start
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: super attack 3
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 2 move: high kick
Player 2 move: low kick
Player 2 move: low punch
Player 2 move: medium punch
Player 2 move: high punch
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: super attack 3
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: high punch
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: high punch
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 2 move: high punch
Player 2 move: low kick
Player 2 move: medium punch
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: move closer
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: jump closer
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 2 move: fireball
Player 2 move: jump closer
Player 2 move: jump away
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: move closer
Player 1 move: jump closer
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: jump away
Player 1 move: super attack 2
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: jump closer
Player 1 move: high punch
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: megafireball
Player 1 move: move closer
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: jump closer
Player 1 move: megapunch
Player 1 move: low punch
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: megafireball
Player 1 move: super attack 2
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 2 move: high kick
Player 2 move: super attack 2
Player 2 move: super attack 3
Player 2 move: super attack 4
Player 2 move: low punch
Player 2 move: medium punch
Player 2 move: high punch
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: high punch
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: high punch
Player 1 move: jump closer
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 2 move: low punch
Player 2 move: medium punch
Player 2 move: high punch
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: high punch
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 2 move: high punch
Player 2 move: high punch
Player 2 move: high punch
Player 2 move: megapunch
Player 2 move: low punch
Player 2 move: low punch
Player 2 move: low kick
🏟️ (2b8a) (0)Round won by P2
(0)Moving to next round
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: high kick
Player 1 move: megapunch
Player2 ollama:qwen:0.5b-chat-v1.5-fp16 Daddy won!

———————

Round 3

🏟️ (b34c) (0)Starting game
🏟️ (b34c) (0)Waiting for fight to start
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: super attack 3
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: super attack 3
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: move closer
Player 1 move: high punch
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 2 move: move away
Player 2 move: medium punch
Player 2 move: super attack 2
Player 2 move: high punch
Player 2 move: low kick
Player 2 move: medium kick
Player 2 move: high kick
Player 2 move: jump closer
Player 2 move: jump away
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
2024-03-30 12:28:29.109 | WARNING | agent.robot:get_moves_from_llm:317 - Many invalid moves: ['Move Closer to get into better attacking range', 'Megafireball or Super attack 2 as a powerful offensive option while closing in']
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 2 move: fireball
Player 2 move: megapunch
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: megafireball
Player 1 move: high punch
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 2 move: megafireball
Player 2 move: super attack 2
Player 2 move: super attack 3
Player 2 move: super attack 4
Player 2 move: low punch
Player 2 move: medium punch
Player 2 move: high punch
Player 2 move: jump closer
Player 2 move: jump away
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: move closer
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 2 move: move away
Player 2 move: high punch
Player 2 move: low kick
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: megafireball
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 2 move: fireball
Player 2 move: high kick
Player 2 move: fireball
Player 2 move: high kick
Player 2 move: fireball
Player 2 move: high kick
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: jump closer
Player 1 move: megafireball
Player 1 move: medium punch
Player 1 move: fireball
2024-03-30 12:28:58.413 | WARNING | agent.robot:get_moves_from_llm:317 - Many invalid moves: ['Assess the distance to the opponent', 'If close', 'If far', 'Move Clo']
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 2 move: high kick
Player 2 move: low kick
Player 2 move: low kick
Player 2 move: medium kick
Player 2 move: high kick
Player 2 move: low kick
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: megafireball
Player 1 move: move closer
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 2 move: low kick
Player 2 move: medium kick
Player 2 move: high kick
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: super attack 2
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: jump closer
Player 1 move: megafireball
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 2 move: fireball
Player 2 move: megapunch
Player 2 move: hurricane
Player 2 move: megafireball
Player 2 move: super attack 2
Player 2 move: super attack 3
Player 2 move: super attack 4
Player 2 move: low punch
Player 2 move: medium punch
🏟️ (b34c) (0)Round won by P2
(0)Moving to next round
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 1 move: move closer
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 2 move: move away
Player 2 move: super attack 3
Player 2 move: low kick
Player 2 move: high kick
Player 2 move: jump closer
Player 2 move: jump away
Player2 ollama:qwen:0.5b-chat-v1.5-fp16 Daddy won!
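
One way to check the claim that the smaller model issues more actions is to count the `Player N move:` lines in a log like the ones above. The sketch below is an illustration only: the helper name `count_moves` is hypothetical, and the sample is a short excerpt from the start of Round 1.

```python
import re
from collections import Counter
from typing import Dict

# Matches log lines such as "Player 2 move: low kick".
MOVE_LINE = re.compile(r"^Player (\d+) move: (.+)$")


def count_moves(log_text: str) -> Dict[str, int]:
    """Count how many moves each player issued in a raw game log."""
    counts: Counter = Counter()
    for line in log_text.splitlines():
        match = MOVE_LINE.match(line.strip())
        if match:
            counts[f"Player {match.group(1)}"] += 1
    return dict(counts)


# Excerpt from the start of Round 1 (HTTP request lines are ignored).
sample = """\
Player 1 move: super attack 3
INFO:httpx:HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
Player 2 move: low kick
Player 1 move: super attack 3
Player 2 move: fireball
Player 2 move: super attack 2
Player 2 move: super attack 3
"""

print(count_moves(sample))  # {'Player 1': 2, 'Player 2': 4}
```

Running the same function over a full round's log gives a per-player action count that can be compared directly with the round result.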

oulianov commented on June 8, 2024

Very interesting results!
