Skip to content

Commit

Permalink
initial commit
Browse files Browse the repository at this point in the history
  • Loading branch information
fracpete committed May 26, 2024
1 parent 1eb0330 commit d7d3a20
Show file tree
Hide file tree
Showing 5 changed files with 204 additions and 1 deletion.
10 changes: 10 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
*~
*.bak
*.swp
*.mdb
*.db
*.db-journal
tmp/
.idea/
*.iml
*.props
45 changes: 44 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
@@ -1,2 +1,45 @@
# substrate-classification-organic-wastes
On-site substrate characterization in the anaerobic digestion context: a dataset of near infrared spectra acquired with four different optical systems on freeze-dried and ground organic wastes.
On-site substrate characterization in the anaerobic digestion context: a dataset of near infrared spectra acquired
with four different optical systems on freeze-dried and ground organic wastes.

The near infrared spectra of thirty-three freeze-dried and ground organic waste samples of various biochemical
composition were collected on four optical systems, including a laboratory spectrometer (Buchi FT-NIR NirFlex N-500),
a transportable spectrometer (ARCoptix FT-NIR Rocket) with two measurement configurations (an immersed probe,
and a polarized light system (PoLis)) and a micro-spectrometer (Si-ware FT-NIR NeoSpectra). The provided data
contains one file per spectroscopic system including the reflectance or absorbance spectra with the corresponding
sample name and wavelengths (nm). A reference data file displaying carbohydrates, lipid and nitrogen content,
biochemical methane potential and chemical oxygen demand for each sample is also provided. This data enables
the comparison of the optical systems for predictive model calibration based for example on Partial
Least Square Regression (PLS-R), but could be used more broadly to test new chemometrics methods. For example,
the data could be used to evaluate different transfer functions between spectroscopic systems. (2021-01-22)

## Notes

* Immersed rows 26,28 have missing amplitudes


## Downloads

* [2024.5.27](https://github.com/spectral-datasets/substrate-classification-organic-wastes/releases/tag/v2024.5.27)

* [all.zip](https://github.com/spectral-datasets/substrate-classification-organic-wastes/releases/download/v2024.5.27/all.zip)

* [raw](https://github.com/spectral-datasets/substrate-classification-organic-wastes/releases/tag/raw)

* [dataverse_files.zip](https://github.com/spectral-datasets/substrate-classification-organic-wastes/releases/download/raw/dataverse_files.zip)


## Citation

Mallet, Alexandre; Pérémé, Margaud; Charnier, Cyrille; Roger, Jean-Michel; Steyer, Jean-Philippe; Latrille, Eric; Bendoula, Ryad, 2021, "On-site substrate characterization in the anaerobic digestion context: a dataset of near infrared spectra acquired with four different optical systems on freeze-dried and ground organic wastes", https://doi.org/10.15454/SQQTUU, Recherche Data Gouv, V1


## Links

* [Dataset](https://data.inrae.fr/dataset.xhtml?persistentId=doi:10.15454/SQQTUU)


## License

* Dataset: [etalab 2.0](https://spdx.org/licenses/etalab-2.0.html)
* ADAMS flow: [MIT](https://opensource.org/licenses/MIT)
146 changes: 146 additions & 0 deletions convert.flow
Original file line number Diff line number Diff line change
@@ -0,0 +1,146 @@
# Project: adams
# Date: 2024-05-27 10:45:54
# User: fracpete
# Charset: UTF-8
# Modules: adams-bootstrapp,adams-compress,adams-core,adams-db,adams-event,adams-excel,adams-heatmap,adams-imaging,adams-imaging-boofcv,adams-json,adams-math,adams-matlab,adams-meta,adams-ml,adams-net,adams-odf,adams-pdf,adams-pyro4,adams-python,adams-r,adams-rats-core,adams-rats-net,adams-rats-redis,adams-rats-rest,adams-rats-webservice,adams-redis,adams-rest,adams-security,adams-spectral-2dim-core,adams-spectral-2dim-handheld,adams-spectral-2dim-r,adams-spectral-2dim-rats,adams-spectral-2dim-webservice,adams-spectral-3way-core,adams-spectral-app,adams-spreadsheet,adams-terminal,adams-visualstats,adams-webservice,adams-webservice-core,adams-weka-lts,adams-xml,adams-yaml
#
adams.flow.control.Flow -annotation "Expects two sub-dirs in the same dir as the flow:\\n- raw\\n- output\\nThe raw directory must contain the tab files." -error-handling ACTORS_DECIDE_TO_STOP_ON_ERROR -flow-execution-listener adams.flow.execution.NullListener -flow-restart-manager adams.flow.control.flowrestart.NullManager
adams.flow.standalone.CallableActors
adams.flow.sink.SpectrumDisplay -name Substrate -display-type adams.flow.core.displaytype.NoDisplay -writer adams.gui.print.NullWriter -color-provider adams.gui.visualization.core.DefaultColorProvider -paintlet "adams.gui.visualization.spectrum.SpectrumPaintlet -always-show-markers false -anti-aliasing-enabled false" -plot-updater "adams.flow.sink.spectrumdisplay.SimplePlotUpdater -update-interval 50"
adams.flow.control.Sequence -name "save CSV" -annotation "requires variables: name - the filename, no path/ext"
adams.flow.transformer.DeleteStorageValue -storage-name data
adams.flow.control.Tee -name "generate instances"
adams.flow.transformer.ArrayToSequence
adams.flow.control.TransformerReset -var-name name
adams.flow.transformer.InstanceGenerator -generator "adams.data.instances.SimpleInstanceGenerator -add-sample-id true -additional \"Carbohydrate content (g/gDM)[N]\" -additional \"Nitrogen content (g/g DM)[N]\" -additional \"Lipid content (g/g MS)[N]\" -additional \"BMP (NL(CH4)/g DM)[N]\" -no-additional-prefix true -field \"COD (mg COD/g DM)[N]\" -wave-number-as-suffix true"
adams.flow.control.TransformerReset -name "TransformerReset (2)" -var-name name
adams.flow.transformer.WekaInstanceBuffer
adams.flow.transformer.SetStorageValue -storage-name data
adams.flow.control.Trigger -name "save instances"
adams.flow.standalone.SetVariable -var-name output_file -var-value @{flow_dir}/output/@{name}.csv -expand-value true
adams.flow.source.StorageValue -storage-name data -conversion adams.data.conversion.UnknownToUnknown
adams.flow.sink.WekaFileWriter -output @{output_file} -use-custom true -saver "weka.core.converters.SpreadSheetSaver -writer \"adams.data.io.output.CsvSpreadSheetWriter -always-quote-text true\""
adams.flow.standalone.SetVariable -var-name wavenumber_header_row -var-value 1
adams.flow.standalone.SetVariable -name "SetVariable (2)" -var-name wavenumber_regex -var-value (.*)
adams.flow.standalone.SetVariable -name "SetVariable (3)" -var-name columns_with_wavenumbers -var-value 2-last
adams.flow.standalone.SetVariable -name "SetVariable (5)" -var-name rows_with_samples -var-value 2-last
adams.flow.standalone.SetVariable -name "SetVariable (6)" -var-name columns_with_meta -var-value ""
adams.flow.standalone.SetVariable -name "SetVariable (8)" -var-name row_sample_data_names -var-value ""
adams.flow.standalone.SetVariable -name "SetVariable (9)" -var-name id_column -var-value 1
adams.flow.source.Variable -var-name flow_dir -conversion adams.data.conversion.StringToString
adams.flow.control.Tee -name "Read sample data and put in storage"
adams.flow.transformer.AppendName -name "AppendName (2)" -suffix raw/reference_data.tab -use-forward-slashes true
adams.flow.transformer.SpreadSheetFileReader -reader "adams.data.io.input.CsvSpreadSheetReader -data-row-type adams.data.spreadsheet.DenseDataRow -spreadsheet-type adams.data.spreadsheet.DefaultSpreadSheet -separator \\t -no-header true"
adams.flow.transformer.Convert -conversion "adams.data.conversion.SpreadSheetRowsToSampleData -row-sampledata-names 1 -rows-sampledata-values 2-last -cols-sampledata 1-last -col-id 1"
adams.flow.transformer.Convert -name "Convert (2)" -conversion adams.data.conversion.SampleDataArrayToMap
adams.flow.transformer.SetStorageValue -storage-name sampledata_map
adams.flow.control.Tee -name "Read Spectra Immersed" -annotation "Remove rows 26, 28 they have empty amplitudes"
adams.flow.transformer.AppendName -suffix raw/inputs/immersed_probe_spectra_absorbance_nm.tab -use-forward-slashes true
adams.flow.transformer.SpreadSheetFileReader -reader "adams.data.io.input.CsvSpreadSheetReader -data-row-type adams.data.spreadsheet.DenseDataRow -spreadsheet-type adams.data.spreadsheet.DefaultSpreadSheet -separator \\t -no-header true"
adams.flow.transformer.SpreadSheetRemoveRow -no-copy true -position 26,28
adams.flow.transformer.Convert -conversion "adams.data.conversion.SpreadSheetRowsToSpectra -row-wave-number @{wavenumber_header_row} -wave-number-regexp @{wavenumber_regex} -cols-amplitude @{columns_with_wavenumbers} -rows-amplitude @{rows_with_samples} -row-sampledata-names @{row_sample_data_names} -cols-sampledata @{columns_with_meta} -col-id @{id_column}"
adams.flow.transformer.ArrayToSequence
adams.flow.control.TryCatch -error-post-processors adams.flow.control.errorpostprocessor.Null
adams.flow.control.SubProcess -name try
adams.flow.transformer.MergeSampleDataFromMap -storage-name sampledata_map
adams.flow.transformer.SetStorageValue -storage-name spectrum
adams.flow.control.Tee -name display
adams.flow.sink.CallableSink -callable Substrate
adams.flow.control.SubProcess -name catch
adams.flow.control.Tee -name display
adams.flow.sink.CallableSink -callable Substrate
adams.flow.transformer.PassThrough -skip true
adams.flow.control.Tee -name "Read Spectra Lab"
adams.flow.transformer.AppendName -suffix raw/inputs/lab_spectrometer_spectra_absorbance_nm.tab -use-forward-slashes true
adams.flow.transformer.SpreadSheetFileReader -reader "adams.data.io.input.CsvSpreadSheetReader -data-row-type adams.data.spreadsheet.DenseDataRow -spreadsheet-type adams.data.spreadsheet.DefaultSpreadSheet -separator \\t -no-header true"
adams.flow.transformer.Convert -conversion "adams.data.conversion.SpreadSheetRowsToSpectra -row-wave-number @{wavenumber_header_row} -wave-number-regexp @{wavenumber_regex} -cols-amplitude @{columns_with_wavenumbers} -rows-amplitude @{rows_with_samples} -row-sampledata-names @{row_sample_data_names} -cols-sampledata @{columns_with_meta} -col-id @{id_column}"
adams.flow.control.ArrayProcess
adams.flow.control.TryCatch -error-post-processors adams.flow.control.errorpostprocessor.Null
adams.flow.control.SubProcess -name try
adams.flow.transformer.MergeSampleDataFromMap -storage-name sampledata_map
adams.flow.transformer.SetStorageValue -storage-name spectrum
adams.flow.control.Tee -name display
adams.flow.sink.CallableSink -callable Substrate
adams.flow.control.SubProcess -name catch
adams.flow.control.Tee -name display
adams.flow.sink.CallableSink -callable Substrate
adams.flow.transformer.PassThrough -skip true
adams.flow.control.Tee -name save
adams.flow.transformer.SetVariable -var-name name -var-value Lab
adams.flow.sink.CallableSink -callable "save CSV"
adams.flow.control.Tee -name "Read Spectra Micro"
adams.flow.transformer.AppendName -suffix raw/inputs/microspectrometer_spectra_absorbance_nm.tab -use-forward-slashes true
adams.flow.transformer.SpreadSheetFileReader -reader "adams.data.io.input.CsvSpreadSheetReader -data-row-type adams.data.spreadsheet.DenseDataRow -spreadsheet-type adams.data.spreadsheet.DefaultSpreadSheet -separator \\t -no-header true"
adams.flow.transformer.SpreadSheetRemoveRow -skip true -no-copy true -position 26,28
adams.flow.transformer.Convert -conversion "adams.data.conversion.SpreadSheetRowsToSpectra -row-wave-number @{wavenumber_header_row} -wave-number-regexp @{wavenumber_regex} -cols-amplitude @{columns_with_wavenumbers} -rows-amplitude @{rows_with_samples} -row-sampledata-names @{row_sample_data_names} -cols-sampledata @{columns_with_meta} -col-id @{id_column}"
adams.flow.control.ArrayProcess
adams.flow.control.TryCatch -error-post-processors adams.flow.control.errorpostprocessor.Null
adams.flow.control.SubProcess -name try
adams.flow.transformer.MergeSampleDataFromMap -storage-name sampledata_map
adams.flow.transformer.SetStorageValue -storage-name spectrum
adams.flow.control.Tee -name display
adams.flow.sink.CallableSink -callable Substrate
adams.flow.control.SubProcess -name catch
adams.flow.control.Tee -name display
adams.flow.sink.CallableSink -callable Substrate
adams.flow.transformer.PassThrough -skip true
adams.flow.control.Tee -name save
adams.flow.transformer.SetVariable -var-name name -var-value Micro
adams.flow.sink.CallableSink -callable "save CSV"
adams.flow.control.Tee -name "Read Spectra Polis Rbs"
adams.flow.transformer.AppendName -suffix raw/inputs/Polis_Rbs_spectra_reflectance_nm.tab -use-forward-slashes true
adams.flow.transformer.SpreadSheetFileReader -reader "adams.data.io.input.CsvSpreadSheetReader -data-row-type adams.data.spreadsheet.DenseDataRow -spreadsheet-type adams.data.spreadsheet.DefaultSpreadSheet -separator \\t -no-header true"
adams.flow.transformer.SpreadSheetRemoveRow -skip true -no-copy true -position 26,28
adams.flow.transformer.Convert -conversion "adams.data.conversion.SpreadSheetRowsToSpectra -row-wave-number @{wavenumber_header_row} -wave-number-regexp @{wavenumber_regex} -cols-amplitude @{columns_with_wavenumbers} -rows-amplitude @{rows_with_samples} -row-sampledata-names @{row_sample_data_names} -cols-sampledata @{columns_with_meta} -col-id @{id_column}"
adams.flow.control.ArrayProcess
adams.flow.control.TryCatch -error-post-processors adams.flow.control.errorpostprocessor.Null
adams.flow.control.SubProcess -name try
adams.flow.transformer.MergeSampleDataFromMap -storage-name sampledata_map
adams.flow.transformer.SetStorageValue -storage-name spectrum
adams.flow.control.Tee -name display
adams.flow.sink.CallableSink -callable Substrate
adams.flow.control.SubProcess -name catch
adams.flow.control.Tee -name display
adams.flow.sink.CallableSink -callable Substrate
adams.flow.transformer.PassThrough -skip true
adams.flow.control.Tee -name save
adams.flow.transformer.SetVariable -var-name name -var-value "Polis Rbs"
adams.flow.sink.CallableSink -callable "save CSV"
adams.flow.control.Tee -name "Read Spectra Polis Rms"
adams.flow.transformer.AppendName -suffix raw/inputs/Polis_Rms_spectra_reflectance_nm.tab -use-forward-slashes true
adams.flow.transformer.SpreadSheetFileReader -reader "adams.data.io.input.CsvSpreadSheetReader -data-row-type adams.data.spreadsheet.DenseDataRow -spreadsheet-type adams.data.spreadsheet.DefaultSpreadSheet -separator \\t -no-header true"
adams.flow.transformer.SpreadSheetRemoveRow -skip true -no-copy true -position 26,28
adams.flow.transformer.Convert -conversion "adams.data.conversion.SpreadSheetRowsToSpectra -row-wave-number @{wavenumber_header_row} -wave-number-regexp @{wavenumber_regex} -cols-amplitude @{columns_with_wavenumbers} -rows-amplitude @{rows_with_samples} -row-sampledata-names @{row_sample_data_names} -cols-sampledata @{columns_with_meta} -col-id @{id_column}"
adams.flow.control.ArrayProcess
adams.flow.control.TryCatch -error-post-processors adams.flow.control.errorpostprocessor.Null
adams.flow.control.SubProcess -name try
adams.flow.transformer.MergeSampleDataFromMap -storage-name sampledata_map
adams.flow.transformer.SetStorageValue -storage-name spectrum
adams.flow.control.Tee -name display
adams.flow.sink.CallableSink -callable Substrate
adams.flow.control.SubProcess -name catch
adams.flow.control.Tee -name display
adams.flow.sink.CallableSink -callable Substrate
adams.flow.transformer.PassThrough -skip true
adams.flow.control.Tee -name save
adams.flow.transformer.SetVariable -var-name name -var-value "Polis Rms"
adams.flow.sink.CallableSink -callable "save CSV"
adams.flow.control.Tee -name "Read Spectra Polis Rss"
adams.flow.transformer.AppendName -suffix raw/inputs/Polis_Rss_spectra_reflectance_nm.tab -use-forward-slashes true
adams.flow.transformer.SpreadSheetFileReader -reader "adams.data.io.input.CsvSpreadSheetReader -data-row-type adams.data.spreadsheet.DenseDataRow -spreadsheet-type adams.data.spreadsheet.DefaultSpreadSheet -separator \\t -no-header true"
adams.flow.transformer.SpreadSheetRemoveRow -skip true -no-copy true -position 26,28
adams.flow.transformer.Convert -conversion "adams.data.conversion.SpreadSheetRowsToSpectra -row-wave-number @{wavenumber_header_row} -wave-number-regexp @{wavenumber_regex} -cols-amplitude @{columns_with_wavenumbers} -rows-amplitude @{rows_with_samples} -row-sampledata-names @{row_sample_data_names} -cols-sampledata @{columns_with_meta} -col-id @{id_column}"
adams.flow.control.ArrayProcess
adams.flow.control.TryCatch -error-post-processors adams.flow.control.errorpostprocessor.Null
adams.flow.control.SubProcess -name try
adams.flow.transformer.MergeSampleDataFromMap -storage-name sampledata_map
adams.flow.transformer.SetStorageValue -storage-name spectrum
adams.flow.control.Tee -name display
adams.flow.sink.CallableSink -callable Substrate
adams.flow.control.SubProcess -name catch
adams.flow.control.Tee -name display
adams.flow.sink.CallableSink -callable Substrate
adams.flow.transformer.PassThrough -skip true
adams.flow.control.Tee -name save
adams.flow.transformer.SetVariable -var-name name -var-value "Polis Rss"
adams.flow.sink.CallableSink -callable "save CSV"
2 changes: 2 additions & 0 deletions output/.gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
*
!.gitignore
2 changes: 2 additions & 0 deletions raw/.gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
*
!.gitignore

0 comments on commit d7d3a20

Please sign in to comment.