Merge branch 'pre-release'
BrianLondon committed Aug 30, 2024
2 parents b6c4c53 + fb9e3b3 commit e3d63c6
Showing 21 changed files with 705 additions and 172 deletions.
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,4 @@
## Development releases

### 0.1.0 (Aug, 2024)
- Initial release
7 changes: 7 additions & 0 deletions Cargo.toml
@@ -2,6 +2,13 @@
name = "fixcol"
version = "0.1.0"
edition = "2021"
license = "MIT"
keywords = ["fixed", "column", "serialization", "parse", "file"]
categories = ["encoding", "parsing"]
homepage = "https://github.com/BrianLondon/fixcol"
repository = "https://github.com/BrianLondon/fixcol"
readme = "README.md"
description = "A library for reading and writing fixed width / column delimited data files."

[features]
experimental-write = []
18 changes: 18 additions & 0 deletions LICENSE.txt
@@ -0,0 +1,18 @@
Copyright (c) 2024 Brian London

Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in the
Software without restriction, including without limitation the rights to use, copy,
modify, merge, publish, distribute, sublicense, and/or sell copies of the Software,
and to permit persons to whom the Software is furnished to do so, subject to the
following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
76 changes: 57 additions & 19 deletions README.md
@@ -2,30 +2,68 @@

A library for reading fixed width / column delimited data files.

## Strict Mode

We'll start by enabling it only on the field level and then allowing cascades
as a future enhancement. There's currently no parsing of attributes on the
`struct` level, so that also provides an impediment to the cascade behavior.

What strict should enable
- require last field of line to be full length when reading
- require written `Full` aligned text columns to be the correct length
- require `Left` and `Right` aligned text columns to not overflow <!-- TODO: need test coverage for this -->
- require unread columns to contain only whitespace
- require no whitespace in numeric `Full` columns
- left aligned fields cannot start with white space
- right aligned fields cannot end with white space
- error on integer width overflow on write <!-- TODO: need test coverage for this -->
## Basic Usage

Consider the following data file:
```text
Tokyo       13515271   35.689  139.692
Delhi       16753235   28.610   77.230
Shanghai    24870895   31.229  121.475
São Paulo   12252023  -23.550  -46.333
Mexico City  9209944   19.433  -99.133
```

We can create a basic data structure corresponding to the records in the file
and then read the data file as shown.

```rust
use fixcol::ReadFixed;
use std::fs::File;

#[derive(ReadFixed)]
struct City {
#[fixcol(width = 12)]
name: String,
#[fixcol(width = 8, align = "right")]
population: u64,
#[fixcol(skip = 1, width = 8, align = "right")]
lat: f32,
#[fixcol(skip = 1, width = 8, align = "right")]
lon: f32,
}

let file = File::open("cities.txt").unwrap();
let cities: Vec<City> = City::read_fixed_all(file).map(|res| match res {
Ok(city) => city,
Err(err) => {
eprintln!("{}", err);
std::process::exit(1);
}
}).collect();
```

Please see the official documentation for complete usage guidance.
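
To make the column layout concrete, the following is a minimal, standard-library-only sketch of the slicing that the `#[fixcol(...)]` attributes above describe. It is not how `fixcol` is implemented: `CityRow`, `column`, and `parse_city` are hypothetical names, and measuring widths in characters rather than bytes is an assumption made here so that the `São Paulo` row lines up.

```rust
/// Hypothetical record mirroring the `City` struct above.
#[derive(Debug, PartialEq)]
struct CityRow {
    name: String,
    population: u64,
    lat: f32,
    lon: f32,
}

/// Take `width` characters starting at character offset `start`
/// and strip the padding whitespace on both sides.
fn column(line: &str, start: usize, width: usize) -> String {
    line.chars()
        .skip(start)
        .take(width)
        .collect::<String>()
        .trim()
        .to_string()
}

/// Parse one line using offsets derived from the widths and skips:
/// name 0..12, population 12..20, skip 1, lat 21..29, skip 1, lon 30..38.
fn parse_city(line: &str) -> Result<CityRow, String> {
    let name = column(line, 0, 12);
    let population = column(line, 12, 8)
        .parse::<u64>()
        .map_err(|e| e.to_string())?;
    let lat = column(line, 21, 8)
        .parse::<f32>()
        .map_err(|e| e.to_string())?;
    let lon = column(line, 30, 8)
        .parse::<f32>()
        .map_err(|e| e.to_string())?;
    Ok(CityRow { name, population, lat, lon })
}
```

Each `skip = 1` in the attribute list shows up here as a one-character gap between the end of one column and the start of the next, which is why the offsets are 0, 12, 21, and 30 rather than evenly spaced.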

<!--
TODO: need test coverage for:
require `Left` and `Right` aligned text columns to not overflow in strict mode
TODO: need test coverage for:
error on overflow on write (esp. integers)
-->


## Wishlist of new features

- Fixed column offsets
- Error messages for writing operations
- Strict mode
- Add an option for padding enum variants to all be the same length
- Also support shorter than expected lines gracefully
- Better error messages for writing operations
- Make param list data rather than code to support dynamic lists of
valid parameters.
- Allow a function based custom deserialization on individual columns
- Clear error messages of location of error on read errors
- Enable the `ignore_others` parameter

## License

Licensed under the MIT license. See: [LICENSE.txt].
53 changes: 35 additions & 18 deletions examples/habsburgs/alg.rs
@@ -1,5 +1,5 @@
//! Genealogy Algorithms
//!
//!
//! Does some calculations on genealogy records. The functionality
//! here is not important for the serialization example, but is used
//! to transform the data from the input format to an output format
@@ -34,7 +34,10 @@ pub(crate) fn coi_for_data_set(records: Vec<Record>) -> Vec<OutputRecord> {
.iter()
.map(|id| {
parents.get(id).map(|(a, b)| {
(*idx_from_id.get(a).unwrap() as usize, *idx_from_id.get(b).unwrap() as usize)
(
*idx_from_id.get(a).unwrap() as usize,
*idx_from_id.get(b).unwrap() as usize,
)
})
})
.collect();
@@ -46,36 +49,37 @@ pub(crate) fn coi_for_data_set(records: Vec<Record>) -> Vec<OutputRecord> {

let mut matrix = zeros(ids.len());

for row in 0 .. matrix.len() {
for row in 0..matrix.len() {
// Diag element: a_jj = 1 + 0.5 * a_pq
if let Some((p1, p2)) = parents[row] {
matrix[row][row] = 1.0 + 0.5 * matrix[p1][p2]
} else {
matrix[row][row] = 1.0;
}

for col in row + 1 .. matrix.len() {
for col in row + 1..matrix.len() {
// Other elements: a_ij = 0.5(a_ip + a_iq)
if let Some((p1, p2)) = parents[col] {
let f = 0.5 * (matrix[row][p1] + matrix[row][p2]);
matrix[row][col] = f;
matrix[col][row] = f;
}
}
}
}

let coi_values = diag_minus_one(matrix);

let mut out: Vec<OutputRecord> = Vec::new();
for i in 0 .. coi_values.len() {
for i in 0..coi_values.len() {
let coi = coi_values[i];
let name = ordered_people[i].name.clone();

out.push(OutputRecord { name, coi });
}

out.sort_by(|a, b| {
b.coi.partial_cmp(&a.coi)
b.coi
.partial_cmp(&a.coi)
.unwrap_or(a.name.cmp(&b.name))
.then(a.name.cmp(&b.name))
});
@@ -88,19 +92,25 @@ fn records_to_genealogy(records: Vec<Record>) -> HashMap<u8, Person> {
// Note we assume relations always come after the referenced person records
for record in records {
match record {
Record::Person { id, name, regnal_number, birth: _, death: _ } => {
Record::Person {
id,
name,
regnal_number,
birth: _,
death: _,
} => {
let person = Person {
name: cat_name(&name, &regnal_number),
id: id,
children: Vec::new(),
};
people.insert(id, person);
},
}
Record::Relation { rel_type, from, to } => {
if rel_type == RelationType::ParentChild {
people.get_mut(&from).unwrap().children.push(to);
}
},
}
}
}
}

@@ -112,18 +122,22 @@ fn get_parents(people: &HashMap<u8, Person>) -> HashMap<u8, (u8, u8)> {

for (_, parent) in people {
for child in &parent.children {
map.entry(*child).and_modify(|r| r.1 = Some(parent.id)).or_insert((parent.id, None));
map.entry(*child)
.and_modify(|r| r.1 = Some(parent.id))
.or_insert((parent.id, None));
}
}

map.into_iter().map(|(k, v)| (k, (v.0, v.1.unwrap()))).collect()
map.into_iter()
.map(|(k, v)| (k, (v.0, v.1.unwrap())))
.collect()
}

fn zeros(size: usize) -> Vec<Vec<f32>> {
let mut matrix = Vec::with_capacity(size);
for _ in 0 .. size {
for _ in 0..size {
let mut row = Vec::new();
for _ in 0 .. size {
for _ in 0..size {
row.push(0.0);
}
matrix.push(row);
@@ -132,7 +146,10 @@ fn zeros(size: usize) -> Vec<Vec<f32>> {
}

fn diag_minus_one(data: Vec<Vec<f32>>) -> Vec<f32> {
data.iter().enumerate().map(|(r, row)| row[r] - 1.0).collect()
data.iter()
.enumerate()
.map(|(r, row)| row[r] - 1.0)
.collect()
}

// concatenates two strings inserting a space if the second is not empty
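
The recurrences noted in the comments of `coi_for_data_set` above (`a_jj = 1 + 0.5 * a_pq` on the diagonal, `a_ij = 0.5(a_ip + a_iq)` off it) match the tabular method for building a relationship matrix. The following is a standalone sketch of that computation, not the code from this commit: the function name `inbreeding_coefficients` and the index-based `parents` slice are assumptions for illustration, and it assumes every parent appears before its children.

```rust
/// Compute a coefficient of inbreeding for each individual.
///
/// `parents[i]` holds the indices of both parents of individual `i`,
/// or `None` for a founder. Assumption: parent indices are always
/// smaller than the index of their child.
fn inbreeding_coefficients(parents: &[Option<(usize, usize)>]) -> Vec<f32> {
    let n = parents.len();
    let mut a = vec![vec![0.0_f32; n]; n];

    for row in 0..n {
        // Diagonal element: a_jj = 1 + 0.5 * a_pq, or 1 for a founder.
        a[row][row] = match parents[row] {
            Some((p, q)) => 1.0 + 0.5 * a[p][q],
            None => 1.0,
        };

        // Off-diagonal elements: a_ij = 0.5 * (a_ip + a_iq).
        // Founders in column position keep the initial value of zero.
        for col in row + 1..n {
            if let Some((p, q)) = parents[col] {
                let f = 0.5 * (a[row][p] + a[row][q]);
                a[row][col] = f;
                a[col][row] = f;
            }
        }
    }

    // The coefficient of inbreeding is the diagonal minus one.
    (0..n).map(|i| a[i][i] - 1.0).collect()
}
```

For a pedigree of two unrelated founders, two of their children, and one offspring of those siblings, passing `[None, None, Some((0, 1)), Some((0, 1)), Some((2, 3))]` should give zero for the first four individuals and one quarter for the last.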
9 changes: 6 additions & 3 deletions examples/habsburgs/main.rs
@@ -1,4 +1,5 @@
use std::{fs::File, io};
use std::fs::File;
use std::io;
use std::path::Path;

use alg::coi_for_data_set;
@@ -20,7 +21,7 @@ enum RelationType {
#[fixcol(key_width = 1)]
enum Record {
#[fixcol(key = "P")]
Person{
Person {
#[fixcol(width = 3)]
id: u8,
#[fixcol(width = 11, align = "right")]
@@ -74,7 +75,9 @@ pub fn main() {
.collect();

// Run the coi calculation
let results = coi_for_data_set(records).into_iter().filter(|r| r.coi > 0.0);
let results = coi_for_data_set(records)
.into_iter()
.filter(|r| r.coi > 0.0);

// Write the serialized output to STDOUT
let mut stdout = io::stdout();
23 changes: 15 additions & 8 deletions fixcol-derive/src/attrs.rs
@@ -4,7 +4,8 @@ use std::str::FromStr;

use proc_macro2::{Literal, Span, TokenStream, TokenTree};
use quote::quote;
use syn::{spanned::Spanned, Attribute, Ident, Meta, Path};
use syn::spanned::Spanned;
use syn::{Attribute, Ident, Meta, Path};

use crate::error::MacroError;

@@ -322,7 +323,12 @@ struct FieldConfigBuilder {

impl FieldConfigBuilder {
fn new() -> Self {
Self { width: None, skip: None, align: None, strict: None }
Self {
width: None,
skip: None,
align: None,
strict: None,
}
}
}

@@ -385,7 +391,6 @@ pub(crate) fn parse_field_attributes(
.map_err(|_| MacroError::new(err, param.value_span()))?;
let old = conf.strict.replace(val);
check_none("strict", param.key_span(), old)?;

}
key => {
return Err(MacroError::new(
@@ -415,7 +420,7 @@ pub(crate) fn parse_field_attributes(
}

// TODO: confirm these need to be public
struct StructConfigBuilder {
struct StructConfigBuilder {
strict: Option<bool>,
}

@@ -429,9 +434,7 @@ pub(crate) struct StructConfig {
strict: bool,
}

pub(crate) fn parse_struct_attributes(
attrs: &Vec<Attribute>,
) -> Result<StructConfig, MacroError> {
pub(crate) fn parse_struct_attributes(attrs: &Vec<Attribute>) -> Result<StructConfig, MacroError> {
let params = parse_attributes(attrs)?;
let mut conf = StructConfigBuilder::new();

@@ -471,7 +474,11 @@ struct EnumConfigBuilder {

impl EnumConfigBuilder {
pub fn new() -> Self {
Self { ignore_others: None, key_width: None, strict: None }
Self {
ignore_others: None,
key_width: None,
strict: None,
}
}
}
