
Long success story


Description

Covers the full processing pipeline; a code sketch of the whole flow follows the step list.

Steps

  1. Log in.
  2. No datasets are available: upload a dataset.
  3. Select uploaded dataset.
  4. Upload an optical image.
  5. Match digital data with optical image.
  6. Apply peak alignment using the PAFFT method. *
  7. Select derived dataset with peaks aligned.
  8. Apply baseline removal. *
  9. Select derived dataset with baseline removed.
  10. Estimate and apply a Gaussian Mixture Model (GMM). *
  11. Select derived GMM-modelled dataset.
  12. Perform DiviK with a limit of 3 steps. *
  13. Select the automatically found ROI that visually best overlaps the optically identified tumor.
  14. Calculate coverage statistics.
  15. Create a classifier distinguishing the selected ROI from the rest of the preparation. *
  16. Select GMM-modelled dataset.
  17. Perform DiviK compression. *
  18. Select compressed dataset.
  19. Load ROI from step 13.
  20. Use compressed data for classifier training. *
  21. Compare classifiers from steps 15 and 20.
  22. Upload a second preparation.
  23. Repeat steps 3-9 for this preparation.
  24. Apply the GMM model estimated for the first preparation.
  25. Classify spectra using classifiers from steps 15 and 20 and score them.
  26. Mark tumor regions.
  27. Create a classification report based on the automatically found regions and manually selected ROIs.
  28. Log out.
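
Taken together, the steps above describe a linear data flow. Below is a minimal sketch of that flow, assuming a hypothetical thin `client` object; none of the names (`upload_dataset`, `schedule`, `coverage_statistics`, ...) come from this page, and they serve only to make the dependencies between steps explicit.

```python
# A hypothetical sketch of the happy path above. The `client` object and
# every method on it are assumptions -- they only make the data flow and
# the job dependencies between steps explicit.

def long_success_story(client, raw_file, optical_file, second_raw_file):
    client.log_in()                                                  # step 1

    dataset = client.upload_dataset(raw_file)                        # steps 2-3
    optical = client.upload_optical_image(dataset, optical_file)     # step 4
    client.match(dataset, optical)                                   # step 5

    # Starred steps schedule background jobs; .result() waits for completion.
    aligned = client.schedule('pafft_alignment', dataset).result()   # steps 6-7
    clean = client.schedule('baseline_removal', aligned).result()    # steps 8-9
    modelled = client.schedule('gmm', clean).result()                # steps 10-11

    regions = client.schedule('divik', modelled, max_depth=3).result()  # step 12
    roi = client.select_roi(regions)     # step 13: manual choice in the UI
    client.coverage_statistics(roi)                                  # step 14
    clf_full = client.schedule('train_classifier', modelled,
                               roi=roi).result()                     # step 15

    compressed = client.schedule('divik_compression',
                                 modelled).result()                  # steps 16-18
    clf_small = client.schedule('train_classifier', compressed,
                                roi=roi).result()                    # steps 19-20
    client.compare(clf_full, clf_small)                              # step 21

    second = client.upload_dataset(second_raw_file)                  # step 22
    second = client.schedule('pafft_alignment', second).result()     # step 23
    second = client.schedule('baseline_removal', second).result()
    second = client.schedule('apply_gmm', second,
                             model=modelled).result()                # step 24
    for clf in (clf_full, clf_small):                                # step 25
        client.schedule('classify', second, classifier=clf).result()
    client.mark_tumor_regions(second)                                # step 26
    client.classification_report(second)                             # step 27
    client.log_out()                                                 # step 28
```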

All steps marked with '*' involve scheduling a job, presenting some progress information, and running the computation in a background thread. Ideally, the user could keep working on a dataset that is still under construction, with new work queued after the current calculations (see the sketch after this list), e.g.:

  1. User wants to perform GMM modelling. This takes some time, and no one wants to wait until the calculations end.
  2. To keep the user from leaving the tool, a dataset entry is created as soon as the GMM job is scheduled (marked as a virtual dataset, not existing yet).
  3. User selects this virtual dataset.
  4. No insight into this dataset is possible yet, but the user can schedule another analysis on the same data, e.g. DiviK.
  5. User logs out and goes for a one-week party / whatever.
  6. Calculations finish.
  7. User logs in after a break.
  8. Datasets after GMM and DiviK are fully operational, so the user can select them and visualize the spectra / groups.
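
Here is a minimal sketch of this "virtual dataset" behaviour, using only the Python standard library; the `Dataset` class and `schedule` helper are assumptions, not the project's actual design. A scheduled job immediately yields a placeholder entry, further jobs can be chained onto it, and the entry becomes operable once its computation finishes, whether or not the user stays logged in:

```python
# Illustrative only: a job creates its output dataset immediately as a
# "virtual" placeholder; dependent jobs wait on it in background threads.

from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=4)  # background threads for starred steps

class Dataset:
    def __init__(self, name, future=None):
        self.name = name
        self._future = future  # pending background computation, if any

    @property
    def virtual(self):
        # Virtual while its job is scheduled but not finished yet.
        return self._future is not None and not self._future.done()

    def insight(self):
        # No insight into a virtual dataset -- matches point 4 above.
        if self.virtual:
            raise RuntimeError(self.name + ' is still virtual')
        return self._future.result() if self._future else self.name

def schedule(name, job, parent):
    """Register `job` to run after `parent` is ready; return its
    output dataset immediately, as a virtual entry."""
    def run():
        if parent._future is not None:
            parent._future.result()  # wait for the parent's job to end
        return job(parent)
    return Dataset(name, executor.submit(run))

# Usage mirroring points 1-8: GMM is scheduled, DiviK is chained onto the
# still-virtual GMM output, and both finish in the background.
raw = Dataset('uploaded preparation')
gmm = schedule('after GMM', lambda d: 'GMM model', raw)
divik = schedule('after DiviK', lambda d: 'DiviK groups', gmm)

executor.shutdown(wait=True)  # point 6: calculations finish
print(divik.virtual)          # False -- fully operational now
print(divik.insight())        # 'DiviK groups'
```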