In the rapidly evolving field of artificial intelligence, testing AI models has become increasingly important. A comprehensive benchmark offers a standardized framework for comparing the capabilities of different AI models across a range of tasks. By conducting thorough side-by-side comparisons, researchers and developers can acquire valuable i