Benchmarking Arabic LLM Standards and Metrics

BALSAM is a collaboration among academic and governmental institutions across the Middle East. Its objective is to develop and curate domain-specific test datasets for benchmarking and evaluating the performance of LLMs on a broad variety of Arabic NLP tasks.

10+ organizations

50,000+ questions

67 tasks

1,400+ datasets

Features

Dataset curation

Benchmarking

Arabic LLM leaderboard

Ethical AI

Community