| 2024-05-09 14:01:23,067 INFO MainThread:4587 [wandb_setup.py:_flush():76] Current SDK version is 0.16.6 |
| 2024-05-09 14:01:23,068 INFO MainThread:4587 [wandb_setup.py:_flush():76] Configure stats pid to 4587 |
| 2024-05-09 14:01:23,068 INFO MainThread:4587 [wandb_setup.py:_flush():76] Loading settings from /home/users/sschmidg/.config/wandb/settings |
| 2024-05-09 14:01:23,068 INFO MainThread:4587 [wandb_setup.py:_flush():76] Loading settings from /scratch/groups/willhies/sschmidg/prismatic-vlms/wandb/settings |
| 2024-05-09 14:01:23,068 INFO MainThread:4587 [wandb_setup.py:_flush():76] Loading settings from environment variables: {} |
| 2024-05-09 14:01:23,068 INFO MainThread:4587 [wandb_setup.py:_flush():76] Applying setup settings: {'_disable_service': False} |
| 2024-05-09 14:01:23,068 INFO MainThread:4587 [wandb_setup.py:_flush():76] Inferring run settings from compute environment: {'program_relpath': 'scripts/pretrain.py', 'program_abspath': '/scratch/groups/willhies/sschmidg/prismatic-vlms/scripts/pretrain.py', 'program': '/scratch/groups/willhies/sschmidg/prismatic-vlms/scripts/pretrain.py'} |
| 2024-05-09 14:01:23,068 INFO MainThread:4587 [wandb_setup.py:_flush():76] Applying login settings: {} |
| 2024-05-09 14:01:23,068 INFO MainThread:4587 [wandb_setup.py:_flush():76] Applying login settings: {'mode': 'offline'} |
| 2024-05-09 14:01:23,068 INFO MainThread:4587 [wandb_init.py:_log_setup():521] Logging user logs to runs/med-instruct+testmodel+stage-finetune+x7/wandb/offline-run-20240509_140123-vqw25nkx/logs/debug.log |
| 2024-05-09 14:01:23,068 INFO MainThread:4587 [wandb_init.py:_log_setup():522] Logging internal logs to runs/med-instruct+testmodel+stage-finetune+x7/wandb/offline-run-20240509_140123-vqw25nkx/logs/debug-internal.log |
| 2024-05-09 14:01:23,068 INFO MainThread:4587 [wandb_init.py:init():561] calling init triggers |
| 2024-05-09 14:01:23,068 INFO MainThread:4587 [wandb_init.py:init():568] wandb.init called with sweep_config: {} |
| config: {'model': {'type': 'one-stage+7b', 'model_id': 'testmodel', 'arch_specifier': 'no-align+gelu-mlp', 'vision_backbone_id': 'clip-vit-b', 'llm_backbone_id': 'llama2-7b-chat', 'image_resize_strategy': 'letterbox', 'llm_max_length': 2048, 'align_epochs': 1, 'align_max_steps': None, 'align_global_batch_size': 256, 'align_per_device_batch_size': 16, 'align_learning_rate': 0.001, 'align_weight_decay': 0.0, 'align_max_grad_norm': 1.0, 'align_lr_scheduler_type': 'linear-warmup+cosine-decay', 'align_warmup_ratio': 0.03, 'align_train_strategy': 'fsdp-shard-grad-op', 'finetune_epochs': 1, 'finetune_max_steps': None, 'finetune_global_batch_size': 128, 'finetune_per_device_batch_size': 16, 'finetune_learning_rate': 2e-05, 'finetune_weight_decay': 0.1, 'finetune_max_grad_norm': 1.0, 'finetune_lr_scheduler_type': 'linear-warmup+cosine-decay', 'finetune_warmup_ratio': 0.03, 'finetune_train_strategy': 'fsdp-full-shard', 'enable_gradient_checkpointing': True, 'enable_mixed_precision_training': True, 'reduce_in_full_precision': False}, 'dataset': {'type': 'med-instruct', 'dataset_id': 'med-instruct', 'align_stage_components': ['download/med-v0.1-instruct/med_v0_1_mix.json', 'download/med-v0.1-instruct'], 'finetune_stage_components': ['download/med-v0.1-instruct/med_v0_1_mix.json', 'download/med-v0.1-instruct'], 'dataset_root_dir': 'data'}, 'stage': 'finetune', 'pretrained_checkpoint': None, 'run_id': 'med-instruct+testmodel+stage-finetune+x7', 'run_root_dir': 'runs', 'seed': 7, 'trackers': ['jsonl', 'wandb'], 'wandb_project': 'onyx-vlms', 'wandb_entity': 'stanford-voltron'} |
| 2024-05-09 14:01:23,068 INFO MainThread:4587 [wandb_init.py:init():611] starting backend |
| 2024-05-09 14:01:23,068 INFO MainThread:4587 [wandb_init.py:init():615] setting up manager |
| 2024-05-09 14:01:23,073 INFO MainThread:4587 [backend.py:_multiprocessing_setup():105] multiprocessing start_methods=fork,spawn,forkserver, using: spawn |
| 2024-05-09 14:01:23,075 INFO MainThread:4587 [wandb_init.py:init():623] backend started and connected |
| 2024-05-09 14:01:23,086 INFO MainThread:4587 [wandb_init.py:init():715] updated telemetry |
| 2024-05-09 14:01:23,094 INFO MainThread:4587 [wandb_init.py:init():748] communicating run to backend with 90.0 second timeout |
| 2024-05-09 14:01:23,099 INFO MainThread:4587 [wandb_init.py:init():799] starting run threads in backend |
| 2024-05-09 14:01:30,270 INFO MainThread:4587 [wandb_run.py:_console_start():2335] atexit reg |
| 2024-05-09 14:01:30,270 INFO MainThread:4587 [wandb_run.py:_redirect():2190] redirect: wrap_raw |
| 2024-05-09 14:01:30,270 INFO MainThread:4587 [wandb_run.py:_redirect():2255] Wrapping output streams. |
| 2024-05-09 14:01:30,270 INFO MainThread:4587 [wandb_run.py:_redirect():2280] Redirects installed. |
| 2024-05-09 14:01:30,271 INFO MainThread:4587 [wandb_init.py:init():842] run started, returning control to user process |
| 2024-05-09 14:08:56,984 WARNING MsgRouterThr:4587 [router.py:message_loop():77] message_loop has been closed |