New data format #502
Open. RemingtonRohel wants to merge 31 commits into develop from new_data_format
* Store static data as top-level metadata of the file
* `data_write.SliceData` -> `utils.file_formats.SliceData`
* Refactored the general HDF5 file format:
  - Top level "metadata" group
  - All data stored as `Dataset`, with `"description"` metadata attached
  - File-level metadata is hard-linked within each record group to the top-level "metadata" group entry
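The layout in this commit message can be sketched with h5py. This is only an illustration of the three ideas (a top-level "metadata" group, a `"description"` attribute on each dataset, and a hard link from a record group back to the metadata entry), not the Borealis writer itself. The function name, file name, record name, and coordinate values are placeholders chosen for the example.

```python
import h5py


def build_layout_sketch():
    """Build a throwaway in-memory HDF5 file with the layout described above."""
    # driver="core" with backing_store=False keeps the file in memory only.
    with h5py.File("layout_sketch.hdf5", "w", driver="core", backing_store=False) as f:
        # Top-level "metadata" group holding file-level (static) values.
        metadata = f.create_group("metadata")
        station = metadata.create_dataset(
            "station_location", data=[0.0, 0.0, 0.0]  # placeholder values
        )
        station.attrs["description"] = "[latitude, longitude, altitude] of the radar"

        # One group per record; this record name is made up for the example.
        record = f.create_group("example-record")
        # Assigning an existing object to a new name creates a hard link, not a copy.
        record["station_location"] = station

        same_object = record["station_location"] == station
        description = record["station_location"].attrs["description"]
        return same_object, description
```

Because the record entry is a hard link, reading `station_location` through the record group returns the same dataset (and the same attributes) as reading it from the "metadata" group, with no duplicated storage.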
* Added labels for dimensions of vector fields in the data files, to aid in data interpretation and usage.
* Created HDF5Writer class to handle turning SliceData dataclass into the correct types for writing to Borealis HDF5 files.
* Removed ability to write JSON files
* Started refactoring DataWrite.output_data() to remove the internal functions
* DataWrite.__init__() now instantiates sockets internally (avoids passing sockets to threads, which is explicitly recommended against by zmq documentation)
* Functionality to send rawacf record to realtime is now internal to write_correlations() function
* Added `rx_main_phases` and `rx_intf_phases` fields
* Removed `--file-type` option to data_write.py script (only hdf5 supported)
* Replaced useless `assert` statement (ignored when script run with `-O` flag, which `steamed_hams.py` uses for release mode)
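The last point is worth a small illustration: Python drops `assert` statements when run with `-O`, so a check that must hold in release mode needs an explicit `raise`. The function and parameter names below are hypothetical and are not taken from data_write.py.

```python
def check_with_assert(num_samples):
    # Removed entirely under `python -O`, so bad input would pass silently.
    assert num_samples > 0, "num_samples must be positive"
    return num_samples


def check_with_raise(num_samples):
    # An explicit check survives regardless of optimization flags.
    if num_samples <= 0:
        raise ValueError("num_samples must be positive")
    return num_samples
```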
* Added support for writing rawacf files directly as DMAP
* Added dimension scales to certain fields for HDF5 files. These are datasets associated with a dimension of another dataset, e.g. associating the `sqn_timestamps` with the "sequence" dimension of `data`.
* Added units metadata for fields in HDF5 files.
* Added "rawacf_format" field to config files, specifying the default format to use when writing rawacf files.
* Added support for overriding the format that rawacf files are written in, using the "--rawacf-format" argument to steamed_hams.py
* Added `darn-dmap` as a dependency, and fixed numpy and pydarnio versions.
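As a rough sketch of what a dimension scale looks like in h5py, the example below attaches a `sqn_timestamps` dataset to the first dimension of a `data` dataset and labels that dimension "sequence". The array shapes, values, record name, and the `"units"` attribute key and value are assumptions made for illustration. The commit message does not state how Borealis names its units metadata.

```python
import h5py


def build_scale_sketch():
    """Attach a dimension scale to a dataset in a throwaway in-memory file."""
    with h5py.File("scale_sketch.hdf5", "w", driver="core", backing_store=False) as f:
        record = f.create_group("example-record")  # made-up record name
        # Placeholder data: 3 sequences of 4 samples each.
        data = record.create_dataset("data", data=[[0.0] * 4 for _ in range(3)])
        timestamps = record.create_dataset("sqn_timestamps", data=[0.0, 1.0, 2.0])
        timestamps.attrs["units"] = "s"  # assumed attribute key and unit

        # Mark the timestamps as a scale, then attach it to dimension 0 of data.
        timestamps.make_scale("sequence")
        data.dims[0].label = "sequence"
        data.dims[0].attach_scale(timestamps)

        label = data.dims[0].label
        attached = data.dims[0][0] == timestamps
        return label, attached
```

A reader can then discover the coordinate values for a dimension from the dataset itself (`data.dims[0][0]`) rather than relying on a naming convention.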
* New `antennas` field which holds all antenna information
  - `main_locations`: {index: [x, y, z]} for each main antenna
  - `intf_locations`: {index: [x, y, z]} for each intf antenna
  - `main_antenna_count`: number of main array antennas
  - `intf_antenna_count`: number of intf array antennas
  - `main_antenna_spacing`: uniform spacing between main-array antennas
  - `intf_antenna_spacing`: uniform spacing between intf-array antennas
  - `standard_positions`: flag indicating whether array antennas follow the standard linear configuration. If so, verifies that positions are parallel to x-axis and equally spaced by [main|intf]_antenna_spacing
* Added tests for the new fields
* Updated config files for each site
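The `standard_positions` check described above could look something like the following. This is a minimal sketch, not the validation code in the PR. The function name and tolerance are hypothetical, and it assumes that antenna indices are integer-like keys and that increasing index means increasing x.

```python
import math


def is_standard_layout(locations, spacing, tol=1e-6):
    """Return True if antennas lie on a line parallel to the x-axis and
    adjacent antennas are separated by `spacing`.

    `locations` maps antenna index -> [x, y, z].
    """
    # Order by index so neighbours in the list are neighbours in the array.
    ordered = [locations[key] for key in sorted(locations, key=int)]
    if len(ordered) < 2:
        return True
    _, y0, z0 = ordered[0]
    for (x_a, _, _), (x_b, y_b, z_b) in zip(ordered, ordered[1:]):
        # Parallel to the x-axis: y and z must not change along the array.
        if not (math.isclose(y_b, y0, abs_tol=tol) and math.isclose(z_b, z0, abs_tol=tol)):
            return False
        # Equal spacing: adjacent x separation must match the configured value.
        if not math.isclose(x_b - x_a, spacing, abs_tol=tol):
            return False
    return True


# Placeholder coordinates for a four-antenna array centred on the origin.
example_locations = {
    "0": [-15.0, 0.0, 0.0],
    "1": [-5.0, 0.0, 0.0],
    "2": [5.0, 0.0, 0.0],
    "3": [15.0, 0.0, 0.0],
}
```

With these placeholder values, `is_standard_layout(example_locations, 10.0)` passes, while moving any antenna off the line or changing the spacing makes it fail.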
* Each type (antennas_iq, bfiq, etc.) has its own data field (antennas_iq_data, bfiq_data, etc.)
* `antenna_locations` field added containing [x, y, z] locations of each antenna
* `antenna_arrays` field added for bfiq files containing descriptors for the array dimension of the data (e.g. ["main", "intf"])
* `required` added to metadata, indicating whether it is an error for a field to be missing or not.
* `data` field removed
* `[main|intf]_antenna_count` fields removed
* `lags` field renamed to `lag_pulses`
* `num_ranges` and `num_samps` fields removed
* `range_gates` field added, simply an array of the range gates for the file (e.g. 0-74)
* `rx_antennas`, `rx_main_antennas`, `rx_intf_antennas`, and `tx_antennas` fields added, giving indices into `antenna_locations` of the antennas used for the experiment
* `station_location` field added, giving lat, lon, altitude of the radar
* Refactored `get_phase_shift()` in `signals.py` to use the antenna positions and interferometer array offsets for beamforming
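To show why full [x, y, z] positions are enough to beamform both arrays, here is a generic plane-wave phase calculation. It is a textbook model and not the `get_phase_shift()` implementation in `signals.py`. The function name, the choice of boresight along +y, the angle definitions, and the sign convention are all assumptions made for this sketch.

```python
import cmath
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def plane_wave_phases(positions, azimuth_deg, elevation_deg, freq_hz):
    """Complex unit-magnitude phase for each antenna position [x, y, z] in metres.

    Assumes the array lies along x, boresight points along +y, azimuth is
    measured from boresight towards +x, and elevation is measured up from
    the horizontal plane.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    # Unit vector pointing in the look direction.
    direction = (
        math.sin(az) * math.cos(el),
        math.cos(az) * math.cos(el),
        math.sin(el),
    )
    wavenumber = 2.0 * math.pi * freq_hz / SPEED_OF_LIGHT  # rad/m
    phases = []
    for x, y, z in positions:
        # Path-length difference is the projection of the position onto the look direction.
        path = x * direction[0] + y * direction[1] + z * direction[2]
        phases.append(cmath.exp(1j * wavenumber * path))
    return phases
```

Because the phase depends only on the dot product of position and look direction, an interferometer antenna offset in y or z from the main array is handled by the same expression, with no separate offset term.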
* also refactored some variable names for simplicity
* Created script file_docs_builder.py to generate .rst files for each file type
* Changed file name when writing DMAP files directly (ensuring slice_id is written as a letter instead of a number)
…ata, bfiq_data, rawrf_data final dimension.
* Given as an array of ints, representing the time of measurement relative to the first pulse in the sequence. Microseconds.
* `tests/simulators/steamed_sham.py` will call a simulator instead of usrp_driver.cpp
* `tests/simulators/driver_sim.py` mocks usrp_driver.cpp, generating noise instead of data (and not currently adding the pulse data to the noise)
* Updated record name format to use hyphens, like `YYYYMMDD-HHMM-SS.fffff`
* Removed `dim_labels` from `lag_pulses` field
* Correctly format SliceData object as DMAP
* Use new pydarnio functions for converting rawacf records
* Fix dmap filename convention
* Serve data for all slices to realtime from data_write
* Replaced test data for realtime sim with single-record dmap
… xarray
* pyDARNio implementation of array-structured fields being added in parallel to this branch. This implementation will only allow in-memory array-structuring, and will not support writing of array-structured data files.
Doreban approved these changes on Sep 19, 2024
… into new_data_format
* Pulling in the merge conflict commit Draven added
This PR introduces a new format for Borealis-produced HDF5 files.
Features
* … `metadata` group, and linked to in each record
* New fields:
  - `antenna_arrays`: descriptors `["main", "intf"]` for bfiq files (or just `["main"]` if intf array not present)
  - `antenna_locations`: [x, y, z] coordinates of each antenna relative to the midpoint of the main array
  - `antennas_iq_data`: replaces `data` for antennas_iq files
  - `bfiq_data`: replaces `data` for bfiq files
  - `lag_numbers`: the lag numbers for each unique pair of pulses in the experiment, in units of `tau_spacing`
  - `lag_pulses`: replaces `lags`, for a more descriptive name for the data inside (which is a 2D array of the unique pulse pairs)
  - `pulse_timing`: replaces `pulse_timing_us`
  - `range_gates`: array of the range gates used for this experiment
  - `rawrf_data`: replaces `data` for rawrf files
  - `rx_antennas`: indices into `antenna_locations` corresponding to a dimension of `antennas_iq_data` or `rawrf_data`
  - `rx_intf_antennas`: indices into `antenna_locations` of all interferometer array antennas used for receiving
  - `rx_main_antennas`: indices into `antenna_locations` of all main array antennas used for receiving
  - `rx_intf_phases`: complex phases for beamforming each interferometer-array antenna stream
  - `rx_main_phases`: complex phases for beamforming each main-array antenna stream
  - `station_location`: [latitude, longitude, altitude] of the radar
  - `tx_antennas`: indices into `antenna_locations` of all antennas used for transmission
* Removed fields:
  - `data`
  - `lags`
  - `num_ranges`
  - `num_samps`
  - `antenna_arrays_order`
  - `data_descriptors`
  - `data_dimensions`
  - `noise_at_freq`
  - `pulse_timing_us`
* … `cfs_freqs` will not be present if the experiment does not conduct a clear frequency search)
* … `antennas_iq` file can be processed to `rawacf` while still maintaining the original `antennas_iq` datasets. Rather than having two or more files to handle, the single file would then contain both the low-level and higher-level data products, with all accompanying metadata.
Config file changes
The fields related to antenna specifications have been moved under a single field `antennas`, containing:
* `[main|intf]_locations`: a map of `{ "antenna index" : [x, y, z] coordinates }` giving the coordinates of each `[main or intf]`-array antenna relative to the midpoint of the main antenna array.
* `[main|intf]_antenna_count`: number of antennas in the respective array
* `[main|intf]_antenna_spacing`: separation in meters between adjacent antennas in the respective array
* `standard_positions`: flag indicating standard array positioning (i.e. arrays of equally-spaced antennas arranged on a line parallel to the x-axis)
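Returning to the `lag_pulses` and `lag_numbers` fields listed under Features, their relationship can be sketched in a few lines. This assumes each row of `lag_pulses` is a `[first_pulse, second_pulse]` pair with both entries in units of `tau_spacing`, so that the lag number is simply the difference. The function name and the example pairs are hypothetical and do not come from a real experiment.

```python
def lag_numbers_from_pulses(lag_pulses):
    """Return the lag number (second pulse minus first pulse) for each pair.

    Assumes both entries of each pair are pulse positions in units of tau_spacing.
    """
    return [second - first for first, second in lag_pulses]


# Made-up pulse pairs for a three-pulse sequence at positions 0, 2 and 5.
example_lag_pulses = [[0, 0], [0, 2], [2, 5], [0, 5]]
example_lag_numbers = lag_numbers_from_pulses(example_lag_pulses)
```

Storing both fields means a reader gets the lag values directly, while the pulse pairs remain available for anyone who needs to know which pulses produced each lag.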