[Bug]: FileNotFoundError in "001_getting_started" tutorial #2053

Closed
1 task done
baek85 opened this issue May 13, 2024 · 7 comments

baek85 commented May 13, 2024

Describe the bug

An error occurs during the PaDiM model fitting process.

# start training
engine = Engine(task=TaskType.SEGMENTATION)
engine.fit(model=model, datamodule=datamodule)

Dataset

MVTec

Model

PADiM

Steps to reproduce the behavior

Follow tutorial code in "001_getting_started.ipynb"

OS information

OS information:

  • OS: Ubuntu 20.04.5
  • Python version: 3.10
  • Anomalib version: [e.g. 0.3.6]
  • PyTorch version: 2.1.2
  • CUDA/cuDNN version: 12.0
  • GPU models and configuration: NVIDIA TITAN X
  • Any other relevant information: Nothing

Expected behavior

Estimate the multivariate Gaussian distribution of the MVTec AD dataset using a pre-trained model.

Screenshots

No response

Pip/GitHub

GitHub

What version/branch did you use?

main

Configuration YAML

No config in tutorial

Logs

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
Cell In[8], line 3
      1 # start training
      2 engine = Engine(task=TaskType.SEGMENTATION)
----> 3 engine.fit(model=model, datamodule=datamodule)

File ~/.local/lib/python3.10/site-packages/anomalib/engine/engine.py:533, in Engine.fit(self, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path)
    524     ckpt_path = Path(ckpt_path).resolve()
    526 self._setup_workspace(
    527     model=model,
    528     train_dataloaders=train_dataloaders,
   (...)
    531     versioned_dir=True,
    532 )
--> 533 self._setup_trainer(model)
    534 self._setup_dataset_task(train_dataloaders, val_dataloaders, datamodule)
    535 self._setup_transform(model, datamodule=datamodule, ckpt_path=ckpt_path)

File ~/.local/lib/python3.10/site-packages/anomalib/engine/engine.py:324, in Engine._setup_trainer(self, model)
    321     self._cache.update(model)
    323 # Setup anomalib callbacks to be used with the trainer
--> 324 self._setup_anomalib_callbacks()
    326 # Temporarily set devices to 1 to avoid issues with multiple processes
    327 self._cache.args["devices"] = 1

File ~/.local/lib/python3.10/site-packages/anomalib/engine/engine.py:415, in Engine._setup_anomalib_callbacks(self)
    412 has_checkpoint_callback = any(isinstance(c, ModelCheckpoint) for c in self._cache.args["callbacks"])
    413 if has_checkpoint_callback is False:
    414     _callbacks.append(
--> 415         ModelCheckpoint(
    416             dirpath=self._cache.args["default_root_dir"] / "weights" / "lightning",
    417             filename="model",
    418             auto_insert_metric_name=False,
    419         ),
    420     )
    422 # Add the post-processor callbacks.
    423 _callbacks.append(_PostProcessorCallback())

File ~/.local/lib/python3.10/site-packages/lightning/pytorch/callbacks/model_checkpoint.py:253, in ModelCheckpoint.__init__(self, dirpath, filename, monitor, verbose, save_last, save_top_k, save_weights_only, mode, auto_insert_metric_name, every_n_train_steps, train_time_interval, every_n_epochs, save_on_train_epoch_end, enable_version_counter)
    251 self.dirpath: Optional[_PATH]
    252 self.__init_monitor_mode(mode)
--> 253 self.__init_ckpt_dir(dirpath, filename)
    254 self.__init_triggers(every_n_train_steps, every_n_epochs, train_time_interval)
    255 self.__validate_init_configuration()

File ~/.local/lib/python3.10/site-packages/lightning/pytorch/callbacks/model_checkpoint.py:475, in ModelCheckpoint.__init_ckpt_dir(self, dirpath, filename)
    472 self._fs = get_filesystem(dirpath if dirpath else "")
    474 if dirpath and _is_local_file_protocol(dirpath if dirpath else ""):
--> 475     dirpath = os.path.realpath(os.path.expanduser(dirpath))
    477 self.dirpath = dirpath
    478 self.filename = filename

File /opt/conda/lib/python3.10/posixpath.py:396, in realpath(filename, strict)
    393     """Return the canonical path of the specified filename, eliminating any
    394 symbolic links encountered in the path."""
    395     filename = os.fspath(filename)
--> 396     path, ok = _joinrealpath(filename[:0], filename, strict, {})
    397     return abspath(path)

File /opt/conda/lib/python3.10/posixpath.py:456, in _joinrealpath(path, rest, strict, seen)
    454         return join(newpath, rest), False
    455 seen[newpath] = None # not resolved symlink
--> 456 path, ok = _joinrealpath(path, os.readlink(newpath), strict, seen)
    457 if not ok:
    458     return join(path, rest), False

FileNotFoundError: [Errno 2] No such file or directory: '/home/sunghyun.baek/home/workspace/AD/anomalib-main/results/Padim/MVTec/bottle/latest'

Code of Conduct

  • I agree to follow this project's Code of Conduct
@alexriedel1
Contributor

You must run all the cells of the notebook in the corresponding order

@baek85
Author

baek85 commented May 13, 2024

> You must run all the cells of the notebook in the corresponding order

I ran all the cells of the notebook in the corresponding order.

@cjy513203427

Have you run these?

  datamodule = MVTec(num_workers=0)
  datamodule.prepare_data()  # Downloads the dataset if it's not in the specified `root` directory
  datamodule.setup()  # Create train/val/test/prediction sets.
  
  i, data = next(enumerate(datamodule.val_dataloader()))
  print(data.keys())

This code will download the MVTec dataset.

@baek85
Author

baek85 commented May 13, 2024

@cjy513203427 The data loading code you mentioned runs fine, but when I run ‘engine.fit(model=model, datamodule=datamodule)’ afterwards, I get the same error.

The error message seems to be about a folder called 'anomalib-main/results/Padim/MVTec/bottle/latest'. Does anyone know why I am getting this error?

@alexriedel1
Contributor

Hmm, yes, it's related to the model checkpoint directories.
Did you try starting from a freshly installed Anaconda environment?

I'm not even sure how os.path.realpath() can fail, because it is not supposed to. Maybe you have some unexpected symlink?
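
One way to check for an unexpected symlink is to walk every component of the failing path and report each link together with whether its target exists. The sketch below is only an illustration using the standard library; `find_symlinks` is a hypothetical helper name and is not part of anomalib or Lightning.

```python
import os
from pathlib import Path


def find_symlinks(path):
    """Return (link, target, target_exists) for every symlink along `path`.

    Hypothetical diagnostic helper, not an anomalib or Lightning API.
    """
    # abspath() normalizes the path without following symlinks.
    full = Path(os.path.abspath(path))
    found = []
    # Visit each ancestor from the filesystem root down to the full path.
    for component in [*reversed(full.parents), full]:
        if component.is_symlink():
            target = os.readlink(component)
            # exists() follows the link, so False means the link is dangling.
            found.append((str(component), target, component.exists()))
    return found
```

Running it from the project root as `find_symlinks("results/Padim/MVTec/bottle/latest")` would list any links on the way to that directory. An entry whose last field is `False` points at a target that no longer exists.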

@cjy513203427

> @cjy513203427 The data loading code you mentioned runs fine, but when I run ‘engine.fit(model=model, datamodule=datamodule)’ afterwards, I get the same error
>
> The error message seems to be about a folder called 'anomalib-main/results/Padim/MVTec/bottle/latest', does anyone know why I am getting this error?

The easiest fix is to delete the project and the conda env, clone the repository again, and reinstall the dependencies. Maybe the path '/home/sunghyun.baek/home/workspace/AD/anomalib-main/results/Padim/MVTec/bottle/latest' is too long. Or it could be a permission problem: I once had os.makedirs fail with a permission error on Ubuntu.
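
Before reinstalling everything, the path-length and permission guesses can be checked directly. The following is a minimal sketch under the assumption that the results directory may not exist yet, so it inspects the nearest existing ancestor instead; `check_output_dir` is a hypothetical helper name. For reference, Linux typically limits a full path to 4096 bytes and a single component to 255 bytes.

```python
import os


def check_output_dir(path):
    """Collect path-length and write-permission facts for an output directory.

    Hypothetical diagnostic helper, not an anomalib or Lightning API.
    """
    path = os.path.abspath(path)
    # Walk up until we reach a directory that actually exists.
    ancestor = path
    while not os.path.exists(ancestor):
        ancestor = os.path.dirname(ancestor)
    return {
        # Lengths are measured in bytes, which is what the OS limits apply to.
        "path_length": len(os.fsencode(path)),
        "longest_component": max(len(os.fsencode(c)) for c in path.split(os.sep)),
        "existing_ancestor": ancestor,
        # Creating a subdirectory needs write and search permission on the parent.
        "ancestor_writable": os.access(ancestor, os.W_OK | os.X_OK),
    }
```

If `ancestor_writable` is `False`, creating the checkpoint directory under that ancestor would be expected to fail with a permission error rather than a `FileNotFoundError`.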

@baek85
Author

baek85 commented May 20, 2024

It was a symlink problem in my setting.

baek85 closed this as completed May 20, 2024