Watch multiple large language models compete at Werewolf to see which one reasons best!

Latest AI Resources | Updated 5 months ago | AI Sharing Circle

General Introduction

LLM Mafia Game Competition is an online platform developed by the OpenNumbers team in which large language models (LLMs) play Werewolf-style (Mafia) social deduction matches against each other. Users can watch LLMs play different roles, see how they handle complex social reasoning, and view model performance statistics and game history. The platform suits AI enthusiasts who want to study model capabilities, and it also gives gamers something new to watch. The project is open source and matches update in real time, so it pairs a classic game with a practical demonstration of how well large models reason and generate language.

Function List

Real-time model battles: Watch large models play Werewolf roles in a live deduction game.
Model performance statistics: View data on the participating models, such as win rates and reasoning performance.
Historical game records: Details of recent matches are saved so users can review and analyze them.
Open source access: A link to the GitHub repository lets developers study or extend the project.
Multi-model competition: Different language models play in the same game, which highlights their individual characteristics.

Using Help

How to access and use the website

LLM Mafia Game Competition is an online platform that requires no installation. Open https://mafia.opennumbers.xyz/ in your browser to get started. The guide below walks through each feature so you can get up and running quickly.

1. Access the website and get familiar with the interface

Procedure:
1. Open your browser and go to https://mafia.opennumbers.xyz/
2. Once on the main page, you will see the navigation bar (containing "Model Statistics" and "Recent Games") and the main area (showing the current game or overview).
3. There is usually a GitHub link at the bottom for accessing the project source code.
Notes:
- No need to register or log in, just browse.
- When visiting for the first time, it is recommended to observe the page layout first to understand the entrances to each function.

2. Watch large models battle each other in real time

Procedure:
1. On the home page, find the area labeled "Live Game" or similar (the label may change with site updates).
2. Click through to watch a real-time match between large models playing Werewolf roles (e.g. villagers, werewolves, seers).
3. The system displays the conversation and reasoning between models, such as one model accusing another of being a werewolf or defending its own identity.
Featured Functions:
- Live updates: Match content refreshes in real time, so users can start watching at any point.
- Dialogue showcase: Each round of statements is recorded clearly, which shows off each model's language generation.
Recommendations for use:
- Pay attention to each model's reasoning, e.g., whether it picks up clues from details in the dialogue.
- If you're an AI enthusiast, you can record a model's speaking strategy for analysis or learning.

3. Viewing model performance statistics

Procedure:
1. Click on "Model Statistics" in the navigation bar.
2. On the statistics page, view a table or chart of performance data for the participating models.
3. Data may include win percentage, frequency of speeches, number of times eliminated, etc.
Featured Functions:
- Comparative analysis: Compare the strengths and weaknesses of different models at Werewolf at a glance.
- Technical insight: Gives researchers a reference for how models perform on reasoning tasks.
Recommendations for use:
- If you follow a particular model (e.g. Grok), you can focus on its win rate and strategy.
- Combine the statistics with actual match records to analyze each model's strengths and weaknesses.
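
For readers who want to reproduce this kind of table from their own notes, here is a minimal Python sketch of how win rate, elimination rate, and speaking frequency could be computed. The record layout, the model names, and the `summarize` function are all assumptions made for illustration; the site's actual data format and metrics are not documented here.

```python
from collections import defaultdict

# Hypothetical per-game results noted by hand while watching matches:
# (model name, won the game?, was eliminated?, number of statements made)
results = [
    ("model-a", True, False, 6),
    ("model-b", False, True, 3),
    ("model-a", False, True, 4),
    ("model-b", True, False, 5),
    ("model-a", True, False, 7),
]


def summarize(results):
    """Aggregate per-model totals, then turn them into per-game rates."""
    totals = defaultdict(
        lambda: {"games": 0, "wins": 0, "eliminated": 0, "speeches": 0}
    )
    for model, won, eliminated, speeches in results:
        t = totals[model]
        t["games"] += 1
        t["wins"] += int(won)
        t["eliminated"] += int(eliminated)
        t["speeches"] += speeches

    summary = {}
    for model, t in totals.items():
        summary[model] = {
            "win_rate": t["wins"] / t["games"],
            "elimination_rate": t["eliminated"] / t["games"],
            "speeches_per_game": t["speeches"] / t["games"],
        }
    return summary


summary = summarize(results)
for model, stats in sorted(summary.items()):
    print(model, stats)
```

With only a handful of games per model these rates are noisy, so they are best read alongside the match records themselves.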

4. Browse the history of matches

Procedure:
1. Click on the "Recent Games" option.
2. Browse the list of recently completed matches and select one to open.
3. View detailed records, including role assignments, each round of dialog, and final results.
Featured Functions:
- Full replay: The complete record of each match is retained.
- Research material: Provides samples of AI conversations suitable for technical analysis or teaching.
Recommendations for use:
- Pick a notable match (e.g., one the werewolves won) and analyze how the models' strategies differed.
- If you're a gamer, you can pick up reasoning approaches from the way the AI plays.
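
A match record as described above (role assignments, each round of dialogue, and the final result) could be modeled roughly as follows for offline analysis. This is only a sketch: the `GameRecord` and `Statement` classes, the field names, and the sample data are hypothetical and do not reflect the project's real schema.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    round_no: int   # round in which the statement was made
    speaker: str    # model name
    text: str


@dataclass
class GameRecord:
    roles: dict                                   # model name -> role
    statements: list = field(default_factory=list)
    winner: str = ""                              # "werewolves" or "villagers"


def werewolf_wins(records):
    """Keep only the matches that the werewolf side won."""
    return [r for r in records if r.winner == "werewolves"]


def statements_by_round(record):
    """Group statements by round, attaching each speaker's true role."""
    rounds = {}
    for s in record.statements:
        entry = (s.speaker, record.roles[s.speaker], s.text)
        rounds.setdefault(s.round_no, []).append(entry)
    return rounds


# Invented sample data, for illustration only.
records = [
    GameRecord(
        roles={"model-a": "werewolf", "model-b": "villager", "model-c": "seer"},
        statements=[
            Statement(1, "model-a", "I am a villager."),
            Statement(1, "model-b", "model-a seems evasive."),
            Statement(2, "model-c", "I checked model-a last night."),
        ],
        winner="villagers",
    ),
    GameRecord(
        roles={"model-a": "villager", "model-b": "werewolf"},
        statements=[Statement(1, "model-b", "I trust model-a.")],
        winner="werewolves",
    ),
]

wins = werewolf_wins(records)
rounds = statements_by_round(records[0])
for round_no in sorted(rounds):
    print(round_no, rounds[round_no])
```

Pairing each statement with the speaker's true role makes it easier to see when a werewolf's claim diverges from its actual identity.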

5. Deeper engagement through GitHub

Procedure:
1. Find the "GitHub" link at the bottom of the page and click on it to go to the project repository.
2. View open source code, documentation, and contribution guidelines.
3. Download the code and run it locally or modify the game logic.
Featured Functions:
- Open source support: Users have free access to the code and can set up their own game instances.
- Community collaboration: Developers can submit suggestions for new features or optimizations.
Recommendations for use:
- If you have programming skills, try tweaking model parameters or adding new roles.
- Read the GitHub README file for deployment steps and technical details.

Tips for use

Network requirements: Use a stable connection so that live matches load without interruption.
Browser compatibility: Chrome or Firefox is recommended for the best results.
Interactive exploration: If you are a technical user, use the statistics together with the match records to study how each model performs in different scenarios.

By following these steps, you can easily try out the core features of LLM Mafia Game Competition. Whether you want to watch large models play against each other or dig deeper into their reasoning capabilities, the platform covers both.