Pyatoa v0.3.0 (#39)

* bugfix: pinning cartopy and proj dependency versions as this was causing some issue with proj versions on systems that did not already have proj * bugfixing manager.write not able to write windows and adjoint sources if config.save_to_ds is set to False. This is unintended because write() is called explicitely and should force write, whereas save_to_ds is used to stop passive writing during the processing phase. Also threw in a check for ASDFDataSets opened in read-only during write() which would throw an error * bugfix changing a print statement in read_sem to a logger warning * removed pyaflowa from core objects, removed its tests and docs pages/references. Pyaflowa functionality has been completely shifted to SeisFlows * moved docs up to highest level directory to remove it from main dir and to match seisflows structure changed docs conf to point to the correct location updated changelog with most recent changes * Update README.md * udpating readthedocs.yaml to point to new docs location in package * pinning SciPy<=1.8.0 because SciPy==1.9.0 causes 'ValueErro: Unknown window type' because they changed name 'Hanning' -> 'Hann' causing errors in ObsPy * fixing setup procedure to avoid installing unnecessary requirements via pip * unpinning lowest scipy version due to Conda UnsatisfiedError package dependency conflicts during requirements installation * bugfixing time offset and waveform misfit naming * critical bug fix with filtering behavior for highpass and lowpass flipped from expected behavior. i.e., setting only min period used to set a highpass which is NOT what we want. This has been fixed * Remove external data gathering and I/O routines from Pyatoa (#23) * stripped away all moment tensor related functionality from the gatherer and plugins * further removed moment tensor related functionality and moved into pysep * stripped out I/O read and write functions related to SPECFEM which were redundant with respect to PySEP. 
replaced internal import statements with imports to pysep * stripped out all references to FDSN client in Pyatoa and all gathering that queried FDSN. This functionality will be shifted into PySEP * Removed now defunct docs related to gathering, updated changelog to reflect major changes made in this branch, removed failing tests that relied on removed functionality * update readme to remove reference of data fetchign * fixed missing comma in setup file * Update docs and install (#28) * removed setup.py and .cfg, implemented new pyproject.toml file replacing setup.py. pointed readthedocs to only requirements.txt file in docs directory. removed changelog in preference to set changelog in docs page * removed requirements.txt file * fixed missing suffixes in git dependencies of pyproject.toml file * docs added readme, moved notebooks into own directory, updated convert script. updating docs wording * updated overview page (slimmed down considerably) and bumped version number in conf * setting default filter periods to None and removing float requirement on input * pyflex requires a min period so setting to 1-100 rather than previous 10-30 * moved unused plot scripts out of repo (into simutil), removed unnecessary mgmt plot and moved into manager.plot() function * updated docs environment and main environment * update readme to match seisflows readme * removed manager and config notebooks and converted to a direct rst document added first glance to replace getting started * throws in a '*' on synthetic data gathering to deal with specfem3d_globe synthetics which suffix with 'ascii' * added data discovery rst doc page which is meant to replace the gatherer notebook and page * removed gatherer notebook and rst page * renames storage notebook * manually edited inspector rst doc file to be shorter and remove the notebook dependence. 
figures will be moved into the gallery * changed core_func name to misfit, for misfit quantifictaion * removed inspector notebook in favor of hand edited inspector doc page * updated naming standards page * cleaned out the scripts directory which had a lot of scripts that were unfinished or not used. ones that were important were moved into docs * bugfix inspector raypath plot checking incorrect logic * fixed bugfix * removed old image files from notebooks that are no longer required added an inspector gallery notebook and added rst to gallery page further cleaned up script repository added a load example inspector script * removed logging docs page and notebook, moved a short section into the misfit docs page * manually edited the storage rst file to be more concise and to move away from the notebook configuration. * removed storage notebook in favor of new .rst storage file * added outputs to code cells in inspector doc where relevant, changed ASDFDataSet example reading function to read only to not affect test data * added inspector figure text into gallery and removed text from inspector rst file * removing warnings from insp_gallery * renamed insp gallery notebook to avoid conversion deleting the manually edited version * condensed changelog to remove 0.3.0 changelog since we haven't even bumped to 0.2.0 removed make figures scripts directory * added github relevant files including different issue templates added contributing page modfieid from Pygmt * editing contributing document added cross-referencing into the misfit docs page * fixing typos misfit doc page * added cross-referencing to the storage docs page * finished adding cross-referencing into all docs pages of relevance * removed unnused 'read_seisflows_yaml' from Config class * major config and preprocessing reworking to allow for data-data misfit: removed unnused parameters 'start_pad', 'end_pad' from config, as well as filter corners removed seisflows_yaml and _par reading functionality from 
config, not necessary preprocess function now written more generally, does not take manager as input but rather takes stream and a few other arguments. manager preprocessing function changed to match * finished updating preprocessing function cleaned up logging statements, shortened and exchanged some 'info' for 'debug' statments * working data data example with some TA array data * renamed some files, cleaned up some docs text * last minute doc fixes * update changelog * fix tests which changed due to a change in default config parameters, removed some ununused test data, upated baseline images * bumping version number 0.2.0 for soft release * removing setup * removing setup requirement from readthedocs * doc fixed incorrect URL * Accomodate Pyadjoint API change (#30) * Pyadjoint v0.2.0 updates some API, fixing within Pyatoa tests, Config and Manager * renaming some internal Config parameters to match with Pyadjoint naming schema. Fixed texts for new Config system in PYadjoint * added new test dataset and script to make it. updated tests to reflect new dataset * added Manager.flow_multiband() function which takes multiple period bands as input and returns an averaged adjoint source to address #24 (#26) * Feature mpi (#31) * added an MPI data processing script and related data * added docs page with mpi code snippet and explanation * Revert "added docs page with mpi code snippet and explanation" This reverts commit 8a86339. * Revert "Revert "added docs page with mpi code snippet and explanation"" This reverts commit 91dfdb1. 
* removing autoapi * cleaning up mpi example doc page * divy -> divvy as per NG * update changelog * Update ex_w_mpi.rst added shebang to mpi example script * Update ex_w_mpi.rst Updates MPI example script with information about where results are stored, and fix in the code about number of events/stations * bugfix reduced number of ev-sta paris in mpi example problem script * Update ex_w_mpi.rst typo fix * updates dependencies of pysep and pyadjoint to point to pip installs rather than github links * add missing comma * remove adjtomo github links from environment yaml file * update changelog * remove pyflex github link, point straight to PyPi version * bump version 0.2.1 * bugfix: gatherer was using a removed config parameter causing it to fail when looking for observed data * update changelog and bump version number * installation: resolving deprecation warning: pypdf2 -> pypdf * bugfix incorrect import case for pypdf * update deprecated function name * update readme pypi locaiton * update changelog * Bugfix numerical noise (#35) * added new test to catch Issue \#34, which describes introduction of sub-sample time shifts when resampling data that already has the correct sampling rate * fixed test filtering above nyquist, low freq example still works added a flow control statement to skip resampling if sampling rates are already the same to prevent unncessary resampling of data which can cause sub-sample time shifts * update changelog and add ridvan to contributors * Refactor API for simplicity and to remove unnecessary abstraction (#38) * added feature to allow amplitude normalization during the standardization procedure * GATHERER OVERHAUL: core.gatherer -> utils.gather, demoted from core class to utility package utils.gather functions are being stripped down to bare essentials, no more reliance on internal path attributes etc. 
manager.gather -> manager.gather_from_dataset, used to get internal data from a dataset the underlying motivation here is to make data gathering much more explicit because it is currently very stupidly implicit and difficult to track/manipulate * fully stripped out all unncessary gathering routines which were just redundant fluff and weird abstractions that did not require their own class. the leftovers are two main functions which are used to read events that are not acceptable in ObsPy, and to gather waveform data directly from SEED structured directories * moved gather routines into the already existing 'read' utility function * remove gatherer import from package init * removed manager.gather_from_dataset function because I realized that load already does this * fully removed any reference to 'paths' from Config class as this feature of the package was pretty abstract and not really useful other than in a very specific working case * removed config parameters 'save_to_ds' because gatherer has been cut out removed 'pyflex_preset' config parameter because this was not a useful setup reworked config pyflex and pyadjoint config setting to be much simpler as it was sort of abstracted behind functions before, now its just directly calling the underlying config objects changed default values for config min and max period to 1-100 * removed lingering references to 'pyflex_preset' Config attribute * remove lingering calls to 'pyflex_preset' REMOVED all saving to dataset that occurs during processing, this is saved completely for the 'write' function which is renamed 'write_to_dataset' for clarity removed 'save' argument in window and measure to reflect point 1' * added choices parameter to Manager write_to_dataset function to allow selectively saving. also by default this function writes config object now * RESTRUCTURE preprocessing to move default preprocessing directly into the Manager.preprocess function, rather than obscuring it behind a utility function. 
parameters are set directly in the preprocessing function to be more exlicit. response removal is turned OFF by default changed some function names and allowed both st_obs and st_syn to run through response removal and STF convolution depending on their data type * fully migrated read functions OUT of Pyatoa and into PySEP (or SeisFlows). the intention here is that Pyatoa is simply a misfit quantification package, anything to do with reading files should be left to PySEP * map maker was fetching lat/lon values from inventory at the channel level but Inventories read from SPECFEM STATIONS files will not have the channel level. reduced this to station level which will also have lat and lon values which should anyways be the same as the channel level * map maker was trying to access magnitude information which is not always present in an Event object. now allows a logic loop to ignore magnitude information * removed unncessary comment mapmaker * converted pyflex_presets script into a docs page and removed gatherer docs page from index TOC because gatherer has been nixed * starting to fix tests but many still broken from refactoring * bump version number 0.3.0 because this PR will have backwards incompatible changes removed lingering deps. 
on PySEP by adding back one formatting function that had been copied over to PySEP * update CHANGELOG with v0.3.0 changes * updating windows documentation for better formatting * remove lingering references to gatherer class * fixed ASDF util tests * fixed wave maker tests * fixed Config tests * changed baseline images for wave maker plot * fixing manager tests * removed 'force' argument from flow and flow_multiband Manager functions, and moved kwargs into args to make things more explicit * overhauled 'flow_multiband' in Manager to mimic behavior of 'flow', which is to return internal attributes 'windows' and 'adjsrcs' which can be used for later misfit assessment previously this function returned dictionaries of dictionaries which needed to be manipulated, now the function averages all adjoint sources from all period bands, and also collects all windows, and puts them in the same format as the expected 'windows' and 'adjsrcs' attributes so that the results of flow multiband can be accessed the same way as the results of 'flow' * all tests passing * small update docs based on recent changes * updates RTD Yaml file to comply with build.image deprecation * attempt fix failing RTD build by swapping Python for Mamba * updates changelog doc file * removed pysep import call in test, Pyatoa no longer relies on PySEP as a dep * trimmed down docs/environment.yml because RTD no longer needs to build notebooks remotely, created new environment_local.yml file for those that need a local conda environment to build docs * fixing some Sphinx errors on docs building * BUGFIX: Pandas>=2.0 changed default value of groupby().mean() such that string only columns were not dropped automatically, needed to set a flag to get this to resume previous behavior, caught by test * removing context manager usage of ASDFDataSet from Manager tests as they were throwing RuntimeErrors in tests * remove one additional ASDF context case from test * lower-case changelog docs page --------- 
Co-authored-by: Bryant Chow <bchow@login3.frontera.tacc.utexas.edu>
adjtomo · Aug 29, 2023 · 6eae6c9
1 parent abb9bd4
commit 6eae6c9
Showing 41 changed files with 1,225 additions and 2,378 deletions.
diff --git a/.readthedocs.yaml b/.readthedocs.yaml
@@ -1,8 +1,13 @@
 version: 2
 
+build:
+    os: "ubuntu-22.04"
+    tools:
+        python: "mambaforge-22.9"
+
 sphinx:
     builder: html
-    configuration: ./docs/conf.py
+    configuration: docs/conf.py
 
 conda:
-    environment: ./docs/environment.yml
+    environment: docs/environment.yml
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -1,5 +1,62 @@
 # CHANGELOG
 
+## v0.3.0
+
+>__Note__: The motivation behind the changes in v0.3.0 was that the original
+> data gathering setup used by Pyatoa was very abstract, opaque, and
+> unnecessarily rigid (e.g., building path strings out of various components of
+> filenames and internal attributes). The new approach to data gathering is to
+> use PySEP to perform all data gathering once and for all, including one-time
+> tasks like instrument removal. The resulting SAC files can then be read in
+> with ObsPy and directly fed into the Manager class for misfit quantification.
+> This also gives the User much more control over their data gathering without
+> getting confused by Pyatoa's internal data gathering system. 
+
+- Removed ``pyatoa.core.gatherer.Gatherer`` class from package entirely. All
+  data gathering capabilities have been migrated to PySEP; Pyatoa will now only
+  accept input data as already-defined ObsPy objects
+- Removed Gatherer-related tests and documentation from package
+- Removed ``paths`` attribute from ``pyatoa.core.config.Config`` and all 
+  references to the paths attribute throughout the package as these were only
+  accessed by the now removed ``Gatherer`` class
+- Changed Pyflex and Pyadjoint configuration building procedure in
+  ``pyatoa.core.config.Config`` as it was previously abstracted behind a few 
+  unnecessary functions. ``Config`` now accepts parameters ``pyflex_parameters``
+  and ``pyadjoint_parameters`` (dictionaries) that overwrite default Config
+  parameters in the underlying Config objects
+- Changed ``pyatoa.core.manager.Manager.write()`` to ``write_to_dataset`` to be
+  clearer in explaining its role
+- Exposed the default preprocessing procedures directly in the
+  ``Manager.preprocess`` function, rather than having it hidden behind a 
+  function call to a utility script. Users who want to overwrite the  
+  preprocessing need only skip the call to preprocess and perform their own
+  tasks on the internally defined ``st_obs`` and ``st_syn`` attributes.
+- Removed ``pyatoa.core.manager.Manager``'s ability to save to ASDFDataSet mid
+  workflow (i.e., during window and measure). Manager must now use the 
+  ``write_to_dataset`` function if it wants to save data to an ASDFDataSet
+- Removed the ``pyatoa/plugins`` directory which only contained the pyflex
+  preset dictionaries. These were not very flexible, instead they have been
+  converted to a docs page for easier accessibility.
+- Created Docs page for Pyflex presets that can be copy-pasted into misfit 
+  quantification routines
+- Added a ``plt.close('all')`` to the end of the Manager's plot routine as
+  a final precaution against leaving an excessive number of Matplotlib
+  figures open
+- Overhauled ``pyatoa.core.manager.Manager.flow_multiband`` to mimic the
+  standard behavior of ``Manager.flow``, that is: return internal attributes
+  ``windows`` and ``adjsrcs`` which are component-wise dictionaries that each
+  contain Pyflex Windows and Pyadjoint AdjointSource objects, respectively.
+  Previously this function returned dictionaries of dictionaries which needed
+  to be further manipulated; now the function averages all adjoint sources
+  from all period bands, and also collects all windows.
+- Adjusted and fixed tests based on all the above changes.
+
+## v0.2.2
+
+- Bugfix: Gatherer attempting to access a removed Config parameter
+- Resolve PyPDF2 -> PyPDF dependency deprecation warning
+- Bugfix: Manager.standardize() only resamples if required, otherwise small time shifting is introduced (Issue \#34)
+
 ## v0.2.1
 
 - Updated internal call structures to deal with Pyadjoint v0.2.1 API changes

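The v0.3.0 `flow_multiband` entry in the changelog above describes averaging adjoint sources across period bands and collecting all windows into component-wise dictionaries. The following is a minimal sketch of that idea in plain Python, not Pyatoa's implementation. The function names `average_adjoint_sources` and `collect_windows`, the band labels, and the list-of-floats representation of an adjoint source are hypothetical. Dividing by the number of bands that actually produced a given component is also an assumption.

```python
def average_adjoint_sources(adjsrcs_by_band):
    """Average adjoint source samples component-wise across period bands.

    `adjsrcs_by_band` maps a band label to {component: list of samples}.
    (Hypothetical layout; Pyatoa stores Pyadjoint AdjointSource objects.)
    """
    collected = {}
    for band in adjsrcs_by_band.values():
        for comp, samples in band.items():
            collected.setdefault(comp, []).append(list(samples))

    averaged = {}
    for comp, traces in collected.items():
        # Assumption: divide by the number of bands that have this component
        averaged[comp] = [sum(vals) / len(traces) for vals in zip(*traces)]
    return averaged


def collect_windows(windows_by_band):
    """Gather windows from every band into one list per component."""
    collected = {}
    for band in windows_by_band.values():
        for comp, wins in band.items():
            collected.setdefault(comp, []).extend(wins)
    return collected
```

Both helpers return flat `{component: ...}` dictionaries, which is the shape the changelog says `windows` and `adjsrcs` share with the results of `Manager.flow`.
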
diff --git a/CONTRIBUTORS.txt b/CONTRIBUTORS.txt
@@ -1 +1,2 @@
 Chow, Bryant
+Örsvuran, Ridvan
diff --git a/docs/README.md b/docs/README.md
@@ -13,13 +13,16 @@ In order to build the Docs locally, you will first need to create a separate
 Conda environment with a few packages, you can do this by running:
 
 ``` bash
-conda env create --file environment.yaml
+conda env create --file environment_local.yml
 conda activate pyatoa-docs
 ```
 
 You can then run the make command to generate the .html files. You can find your 
 local docs in the *_build/html* directory
 
+Note that the file ``environment.yml`` is used for building docs on ReadTheDocs,
+but does not contain all required dependencies to build locally.
+
 ```bash
 make html
 ```

diff --git a/docs/changelog.rst b/docs/changelog.rst
@@ -1,8 +1,85 @@
-Change Log
-==============
-
-Version 0.2.0
-~~~~~~~~~~~~~~~
+Changelog
+=========
+
+v0.3.0
+------
+
+   **Note**: The motivation behind the changes in v0.3.0 was that the
+   original data gathering setup used by Pyatoa was very abstract,
+   opaque, and unnecessarily rigid (e.g., building path strings out of
+   various components of filenames and internal attributes). The new
+   approach to data gathering is to use PySEP to perform all data
+   gathering once and for all, including one-time tasks like instrument
+   removal. The resulting SAC files can then be read in with ObsPy and
+   directly fed into the Manager class for misfit quantification. This
+   also gives the User much more control over their data gathering
+   without getting confused by Pyatoa’s internal data gathering system.
+
+-  Removed ``pyatoa.core.gatherer.Gatherer`` class from package
+   entirely. All data gathering capabilities have been migrated to
+   PySEP; Pyatoa will now only accept input data as already-defined
+   ObsPy objects
+-  Removed Gatherer-related tests and documentation from package
+-  Removed ``paths`` attribute from ``pyatoa.core.config.Config`` and
+   all references to the paths attribute throughout the package as these
+   were only accessed by the now removed ``Gatherer`` class
+-  Changed Pyflex and Pyadjoint configuration building procedure in
+   ``pyatoa.core.config.Config`` as it was previously abstracted behind
+   a few unnecessary functions. ``Config`` now accepts parameters
+   ``pyflex_parameters`` and ``pyadjoint_parameters`` (dictionaries)
+   that overwrite default Config parameters in the underlying Config
+   objects
+-  Changed ``pyatoa.core.manager.Manager.write()`` to
+   ``write_to_dataset`` to be clearer in explaining its role
+-  Exposed the default preprocessing procedures directly in the
+   ``Manager.preprocess`` function, rather than having it hidden behind
+   a function call to a utility script. Users who want to overwrite the
+   preprocessing need only skip the call to preprocess and perform their
+   own tasks on the internally defined ``st_obs`` and ``st_syn``
+   attributes.
+-  Removed ``pyatoa.core.manager.Manager``\ ’s ability to save to
+   ASDFDataSet mid workflow (i.e., during window and measure). Manager
+   must now use the ``write_to_dataset`` function if it wants to save
+   data to an ASDFDataSet
+-  Removed the ``pyatoa/plugins`` directory which only contained the
+   pyflex preset dictionaries. These were not very flexible, instead
+   they have been converted to a docs page for easier accessibility.
+-  Created Docs page for Pyflex presets that can be copy-pasted into
+   misfit quantification routines
+-  Added a ``plt.close('all')`` to the end of the Manager’s plot routine
+   as a final precaution against leaving an excessive number of
+   Matplotlib figures open
+-  Overhauled ``pyatoa.core.manager.Manager.flow_multiband`` to mimic
+   the standard behavior of ``Manager.flow``, that is: return
+   internal attributes ``windows`` and ``adjsrcs`` which are
+   component-wise dictionaries that each contain Pyflex Windows and
+   Pyadjoint AdjointSource objects, respectively. Previously this
+   function returned dictionaries of dictionaries which needed to be
+   further manipulated; now the function averages all adjoint sources
+   from all period bands, and also collects all windows.
+-  Adjusted and fixed tests based on all the above changes.
+
+v0.2.2
+------
+
+-  Bugfix: Gatherer attempting to access a removed Config parameter
+-  Resolve PyPDF2 -> PyPDF dependency deprecation warning
+-  Bugfix: Manager.standardize() only resamples if required, otherwise
+   small time shifting is introduced (Issue #34)
+
+v0.2.1
+------
+
+-  Updated internal call structures to deal with Pyadjoint v0.2.1 API
+   changes
+-  Changed internal test ASDFDataSet and created a script to generate
+   new dataset because the old one had no way of being remade.
+-  New Docs + Example + Example data: Processing data with Pyatoa and
+   MPI
+-  Remove GitHub Pip install links for PySEP, Pyflex and Pyadjoint
+
+v0.2.0
+------
 - Renamed 'Quickstart' doc to 'A short example', created a new 'Quickstart' doc which has a short code snippet that creates a figure.
 
 - Revamped documentation, switched to new style of building documentation using only .rst files (rather than building off of Jupyter notebooks directly in RTD, which was high in memory consumption)

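The v0.2.2 entry in the changelog above says `Manager.standardize()` now resamples only when required, because resampling data that already has the correct sampling rate introduces small time shifts (Issue #34). The following is a minimal sketch of that guard, not Pyatoa's code. The names `resample_if_needed` and `linear_resample` are hypothetical, and linear interpolation is used purely as a stand-in for whatever resampler the package calls.

```python
import math


def linear_resample(samples, current_rate, target_rate):
    """Stand-in resampler using linear interpolation (illustration only)."""
    duration = (len(samples) - 1) / current_rate
    n_out = int(round(duration * target_rate)) + 1
    out = []
    for i in range(n_out):
        pos = i / target_rate * current_rate  # fractional index into input
        lo = min(int(math.floor(pos)), len(samples) - 1)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out


def resample_if_needed(samples, current_rate, target_rate):
    """Skip resampling entirely when the rates already match."""
    if current_rate == target_rate:
        # Returning the input untouched avoids sub-sample time shifts
        return samples
    return linear_resample(samples, current_rate, target_rate)
```

The guard returns the original object unchanged when the rates match, so no numerical noise can enter through a redundant resampling pass.
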
diff --git a/docs/conf.py b/docs/conf.py
@@ -25,7 +25,13 @@
 author = 'adjTomo Dev Team'
 
 # The short X.Y version
-version = '0.2.2'
+# Grab version number from 'pyproject.toml'
+with open("../pyproject.toml", "r") as f:
+    _lines = f.readlines()
+for _line in _lines:
+    if _line.startswith("version"):
+        version = _line.split('"')[1].strip()
+
 # The full version, including alpha/beta/rc tags
 release = ''
 
@@ -75,7 +81,7 @@
 #
 # This is also used if you do content translation via gettext catalogs.
 # Usually you set "language" from the command line for these cases.
-language = None
+language = "en"
 
 # List of patterns, relative to source directory, that match files and
 # directories to ignore when looking for source files.
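The first `docs/conf.py` hunk above reads the version by scanning `../pyproject.toml` for a line that starts with `version`. That lookup depends on the working directory and leaves `version` undefined if no such line exists. A more defensive variant could look like the sketch below. The helper name `read_version` is hypothetical and not part of this commit, and the sketch assumes a simple `version = "X.Y.Z"` line with no trailing comment.

```python
from pathlib import Path


def read_version(pyproject_path):
    """Return the first `version = "..."` value found in a pyproject.toml."""
    for line in Path(pyproject_path).read_text().splitlines():
        key, sep, value = line.partition("=")
        # Match the key exactly so e.g. `version_file = ...` is ignored
        if sep and key.strip() == "version":
            return value.strip().strip("\"'")
    raise RuntimeError(f"no version field found in {pyproject_path}")


# Possible use inside conf.py, resolving the path relative to the file
# itself rather than the current working directory:
#   version = read_version(
#       Path(__file__).resolve().parent.parent / "pyproject.toml"
#   )
```

Unlike the loop in the diff, this returns on the first exact match and fails loudly when the field is missing.
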