add 6 tasks, fix log format, multi-turn retry, grader improvements 4108ae8 rak2315 committed on Apr 13