| Start time: Sat Nov 1 14:02:07 CST 2025 | |
| Node list: SH-IDCA1404-10-140-54-69 | |
| Total processes: 8 | |
| Current job ID: 6181860 | |
| INFO: User not listed in /etc/subuid, trying root-mapped namespace | |
| INFO: No user namespaces available, using only the fakeroot command | |
| WARNING: nv files may not be bound with --writable | |
| WARNING: Skipping mount /etc/localtime [binds]: /etc/localtime doesn't exist in container | |
| WARNING: Skipping mount /usr/bin/nvidia-debugdump [files]: /usr/bin/nvidia-debugdump doesn't exist in container | |
| WARNING: Skipping mount /usr/bin/nvidia-persistenced [files]: /usr/bin/nvidia-persistenced doesn't exist in container | |
| WARNING: Skipping mount /usr/bin/nvidia-cuda-mps-control [files]: /usr/bin/nvidia-cuda-mps-control doesn't exist in container | |
| WARNING: Skipping mount /usr/bin/nvidia-cuda-mps-server [files]: /usr/bin/nvidia-cuda-mps-server doesn't exist in container | |
| Start time: Sat Nov 1 14:07:35 CST 2025 | |
| Node list: SH-IDCA1404-10-140-54-69 | |
| Total processes: 8 | |
| Current job ID: 6181860 | |
| INFO: User not listed in /etc/subuid, trying root-mapped namespace | |
| INFO: No user namespaces available, using only the fakeroot command | |
| WARNING: nv files may not be bound with --writable | |
| WARNING: Skipping mount /etc/localtime [binds]: /etc/localtime doesn't exist in container | |
| WARNING: Skipping mount /usr/bin/nvidia-debugdump [files]: /usr/bin/nvidia-debugdump doesn't exist in container | |
| WARNING: Skipping mount /usr/bin/nvidia-persistenced [files]: /usr/bin/nvidia-persistenced doesn't exist in container | |
| WARNING: Skipping mount /usr/bin/nvidia-cuda-mps-control [files]: /usr/bin/nvidia-cuda-mps-control doesn't exist in container | |
| WARNING: Skipping mount /usr/bin/nvidia-cuda-mps-server [files]: /usr/bin/nvidia-cuda-mps-server doesn't exist in container | |
| [INFO|2025-11-01 06:07:53] llamafactory.launcher:143 >> Initializing 8 distributed tasks at: 127.0.0.1:17821 | |
| W1101 06:07:57.063000 40488 site-packages/torch/distributed/run.py:792] | |
| W1101 06:07:57.063000 40488 site-packages/torch/distributed/run.py:792] ***************************************** | |
| W1101 06:07:57.063000 40488 site-packages/torch/distributed/run.py:792] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. | |
| W1101 06:07:57.063000 40488 site-packages/torch/distributed/run.py:792] ***************************************** | |
| slurmstepd: error: *** JOB 6181860 ON SH-IDCA1404-10-140-54-69 CANCELLED AT 2025-11-01T14:08:08 DUE TO PREEMPTION *** | |
| W1101 06:08:08.849000 40488 site-packages/torch/distributed/elastic/agent/server/api.py:719] Received 15 death signal, shutting down workers | |
| W1101 06:08:08.851000 40488 site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 40555 closing signal SIGTERM | |
| W1101 06:08:08.852000 40488 site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 40556 closing signal SIGTERM | |
| W1101 06:08:08.853000 40488 site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 40557 closing signal SIGTERM | |
| W1101 06:08:08.854000 40488 site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 40558 closing signal SIGTERM | |
| W1101 06:08:08.855000 40488 site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 40559 closing signal SIGTERM | |
| W1101 06:08:08.856000 40488 site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 40560 closing signal SIGTERM | |
| W1101 06:08:08.857000 40488 site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 40561 closing signal SIGTERM | |
| W1101 06:08:08.858000 40488 site-packages/torch/distributed/elastic/multiprocessing/api.py:897] Sending process 40562 closing signal SIGTERM | |
| Start time: Sat Nov 1 14:14:51 CST 2025 | |
| Node list: SH-IDCA1404-10-140-54-51 | |
| Total processes: 8 | |
| Current job ID: 6181860 | |
| INFO: User not listed in /etc/subuid, trying root-mapped namespace | |
| INFO: No user namespaces available, using only the fakeroot command | |
| WARNING: nv files may not be bound with --writable | |
| WARNING: Skipping mount /etc/localtime [binds]: /etc/localtime doesn't exist in container | |
| WARNING: Skipping mount /usr/bin/nvidia-debugdump [files]: /usr/bin/nvidia-debugdump doesn't exist in container | |
| WARNING: Skipping mount /usr/bin/nvidia-persistenced [files]: /usr/bin/nvidia-persistenced doesn't exist in container | |
| WARNING: Skipping mount /usr/bin/nvidia-cuda-mps-control [files]: /usr/bin/nvidia-cuda-mps-control doesn't exist in container | |
| WARNING: Skipping mount /usr/bin/nvidia-cuda-mps-server [files]: /usr/bin/nvidia-cuda-mps-server doesn't exist in container | |
| Start time: Sat Nov 1 14:36:14 CST 2025 | |
| Node list: SH-IDCA1404-10-140-54-69 | |
| Total processes: 8 | |
| Current job ID: 6181860 | |
| INFO: User not listed in /etc/subuid, trying root-mapped namespace | |
| INFO: No user namespaces available, using only the fakeroot command | |
| WARNING: nv files may not be bound with --writable | |
| WARNING: Skipping mount /etc/localtime [binds]: /etc/localtime doesn't exist in container | |
| WARNING: Skipping mount /usr/bin/nvidia-debugdump [files]: /usr/bin/nvidia-debugdump doesn't exist in container | |
| WARNING: Skipping mount /usr/bin/nvidia-persistenced [files]: /usr/bin/nvidia-persistenced doesn't exist in container | |
| WARNING: Skipping mount /usr/bin/nvidia-cuda-mps-control [files]: /usr/bin/nvidia-cuda-mps-control doesn't exist in container | |
| WARNING: Skipping mount /usr/bin/nvidia-cuda-mps-server [files]: /usr/bin/nvidia-cuda-mps-server doesn't exist in container | |
| [INFO|2025-11-01 06:36:31] llamafactory.launcher:143 >> Initializing 8 distributed tasks at: 127.0.0.1:17821 | |
| W1101 06:36:35.028000 101064 site-packages/torch/distributed/run.py:792] | |
| W1101 06:36:35.028000 101064 site-packages/torch/distributed/run.py:792] ***************************************** | |
| W1101 06:36:35.028000 101064 site-packages/torch/distributed/run.py:792] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. | |
| W1101 06:36:35.028000 101064 site-packages/torch/distributed/run.py:792] ***************************************** | |
| [2025-11-01 06:36:53,992] [INFO] [real_accelerator.py:254:get_accelerator] Setting ds_accelerator to cuda (auto detect) | |
| [2025-11-01 06:36:53,992] [INFO] [real_accelerator.py:254:get_accelerator] Setting ds_accelerator to cuda (auto detect) | |
| [2025-11-01 06:36:53,992] [INFO] [real_accelerator.py:254:get_accelerator] Setting ds_accelerator to cuda (auto detect) | |
| [2025-11-01 06:36:53,993] [INFO] [real_accelerator.py:254:get_accelerator] Setting ds_accelerator to cuda (auto detect) | |
| [2025-11-01 06:36:53,993] [INFO] [real_accelerator.py:254:get_accelerator] Setting ds_accelerator to cuda (auto detect) | |
| [2025-11-01 06:36:53,993] [INFO] [real_accelerator.py:254:get_accelerator] Setting ds_accelerator to cuda (auto detect) | |
| [2025-11-01 06:36:53,994] [INFO] [real_accelerator.py:254:get_accelerator] Setting ds_accelerator to cuda (auto detect) | |
| [2025-11-01 06:36:53,994] [INFO] [real_accelerator.py:254:get_accelerator] Setting ds_accelerator to cuda (auto detect) | |
| /opt/conda/lib/python3.11/site-packages/jieba/_compat.py:18: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. | |
| import pkg_resources | |
| /opt/conda/lib/python3.11/site-packages/jieba/_compat.py:18: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. | |
| import pkg_resources | |
| /opt/conda/lib/python3.11/site-packages/jieba/_compat.py:18: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. | |
| import pkg_resources | |
| /opt/conda/lib/python3.11/site-packages/jieba/_compat.py:18: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. | |
| import pkg_resources | |
| /opt/conda/lib/python3.11/site-packages/jieba/_compat.py:18: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. | |
| import pkg_resources | |
| /opt/conda/lib/python3.11/site-packages/jieba/_compat.py:18: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. | |
| import pkg_resources | |
| /opt/conda/lib/python3.11/site-packages/jieba/_compat.py:18: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. | |
| import pkg_resources | |
| /opt/conda/lib/python3.11/site-packages/jieba/_compat.py:18: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81. | |
| import pkg_resources | |