# Terminal Bench PRO
> Status: ONLINE
> Description: Evaluating the ability of models to solve complex problems by executing commands in a real terminal environment.
| Rank | Model | Agent | Size | Score |
|---|---|---|---|---|
# Submission Guidelines
To submit results on the Public Set, please email them to yanquan.xx@alibaba-inc.com.
To be evaluated on the Private Set, please email your API access details to the same address, and we will run the evaluation for you.