Leaderboard | BALSAM

What is this leaderboard about?

BALSAM's leaderboard shows how each model performs across all tasks in the benchmark. For every model it lists the individual task scores together with an average score computed from them.
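As a rough illustration of how an average could be derived from per-task scores, the sketch below takes an unweighted mean. This is an assumption: the page does not say whether tasks are weighted equally. The function name `average_score` and the sample scores are hypothetical and are not taken from the leaderboard.

```python
def average_score(task_scores: dict[str, float]) -> float:
    """Return the unweighted mean of per-task scores.

    Assumption: every task counts equally. The actual leaderboard
    may aggregate its task scores differently.
    """
    if not task_scores:
        raise ValueError("at least one task score is required")
    return sum(task_scores.values()) / len(task_scores)


# Hypothetical scores for a single model (illustrative values only).
example_scores = {
    "Logic": 0.5,
    "Summarization": 0.25,
    "Translation/Transliteration": 0.75,
}
print(average_score(example_scores))  # 0.5
```

An unweighted mean treats every task column the same, so one weak task lowers the average as much as any other.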

Each row in the leaderboard represents a model. Apart from the model name and the average score, each column represents a task. Use the filter menus to compare how different models perform on different tasks.
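The same row-and-column layout can be explored offline. The sketch below assumes the table has been loaded into a dictionary keyed by model name. The model names, the scores, and the helper `rank_models` are all hypothetical. It ranks models on one chosen task, roughly what filtering to a single column would show.

```python
def rank_models(rows: dict[str, dict[str, float]], task: str) -> list[tuple[str, float]]:
    """Rank models by their score on one task, best first.

    `rows` maps a model name to its per-task scores (one leaderboard row).
    Models with no score for `task` are skipped.
    """
    scored = [(model, scores[task]) for model, scores in rows.items() if task in scores]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical rows (illustrative values only).
rows = {
    "model-a": {"Logic": 0.5, "Summarization": 0.75},
    "model-b": {"Logic": 0.75, "Summarization": 0.25},
    "model-c": {"Summarization": 0.5},
}
print(rank_models(rows, "Logic"))  # [('model-b', 0.75), ('model-a', 0.5)]
```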

What does the score mean?

Each task is scored with its own metric, so the metric behind a score may differ from task to task. Every score is a number between 0 and 1, where 1 is the best possible score.

Model	Average score	Creative Writing	Entailment	Fill in the Blank	Information Extraction	Logic	Program Execution	Question Answering	Reading Comprehension	Sequence Tagging	Summarization	Text Classification	Text Manipulation	Translation/Transliteration
No results
