Releases: Widen/tabitha
0.5.2
0.5.1
Changed
- Add `toString` as an explicit abstract method to `Variant` to make its usage and behavior clearer. (#35)
- Upgrade Gradle from 4 to 6.
- Update release publishing to publish to Maven Central directly instead of Bintray since Bintray is being retired by JFrog. All existing releases are also available in Maven Central independent of JCenter. (#36)
- Migrate CI to GitHub Actions.
0.5.0
Added
- You can now convert a `Row` to a `Header` with `Header.fromRow(Row)`.
- You can now get the name of the page a row was found in with `Row.pageName()`, if the source file supports it.
- Added `RowReader.withSequentialIndexes()`.
- Added `RowReader.withBlankRows()`.
- New `tabitha-json` module, which provides a plugin that allows Tabitha to read newline-separated JSON streams.
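For readers unfamiliar with the format, a newline-separated JSON stream holds one complete JSON document per line. The sketch below only illustrates that framing using the Java standard library. It does not use or represent the `tabitha-json` plugin, it does not parse the JSON itself, and `NdjsonSketch` and `splitRecords` are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; not part of Tabitha.
final class NdjsonSketch {
    // Splits a newline-separated JSON stream into one raw JSON string per non-blank line.
    // Each returned string would still need to be handed to a JSON parser.
    static List<String> splitRecords(String stream) {
        return new BufferedReader(new StringReader(stream)).lines()
                .map(String::trim)
                .filter(line -> !line.isEmpty())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String stream = "{\"name\":\"a\",\"count\":1}\n{\"name\":\"b\",\"count\":2}\n";
        for (String record : splitRecords(stream)) {
            System.out.println("record: " + record);
        }
    }
}
```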
Changed
- Renamed `Row.page()` to `Row.pageIndex()`.
- Multiple unnamed columns are now permitted inside a header, and the "inline headers" option will no longer throw a `DuplicateColumnException` when multiple blank column names are encountered.
- The "inline headers" option no longer shifts row indexes by one.
- The `Row` API has changed, with most constructors being replaced with easier-to-use static factory methods.
0.4.0
Changed
- Split Tabitha into multiple modules. Tabitha is now distributed as a `tabitha-core` module and additional plugin modules that add support for additional file formats. The old `tabitha` package is now deprecated and will not be updated. The new packages are distributed under the `com.widen` group ID.
0.3.0
Added
- `RowReader` can now be transformed into an async reactive stream of incoming rows by calling the appropriately named `rows()` method. This can be used to implement map/reduce, transforms, and parallelism into your data processing with a few simple operators. These operations are provided by RxJava, which is now a dependency of Tabitha.
- Readers now take a `ReaderOptions`, which makes it easier to customize runtime options for reading.
- The page-related methods have been removed from `RowReader` and files are treated as a continuous stream of rows across all pages. To get data for specific pages, you can emulate the old behavior easily with `rows()` and either grouping by or filtering on the page number.
- All `Row`s from a reader now "remember" their position in the source file. Check the page index and row index of the row using the `page()` and `index()` methods, respectively.
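The idea of recovering per-page data by grouping or filtering on the page number can be sketched as follows. This is only an illustration under stated assumptions: it uses `java.util.stream` from the standard library rather than RxJava, and `PositionedRow` is a hypothetical stand-in for a row that remembers its position, not Tabitha's `Row` class.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical stand-in for a row that remembers where it came from; NOT Tabitha's Row.
final class PositionedRow {
    final int page;            // index of the page the row was read from
    final int index;           // index of the row
    final List<String> cells;  // cell values, simplified to strings

    PositionedRow(int page, int index, String... cells) {
        this.page = page;
        this.index = index;
        this.cells = Arrays.asList(cells);
    }
}

final class PageGroupingSketch {
    // Filtering: keep only the rows that belong to a single page.
    static List<PositionedRow> rowsOnPage(List<PositionedRow> rows, int page) {
        return rows.stream()
                .filter(row -> row.page == page)
                .collect(Collectors.toList());
    }

    // Grouping: bucket a continuous run of rows by page index, sorted by page.
    static Map<Integer, List<PositionedRow>> groupByPage(List<PositionedRow> rows) {
        return rows.stream()
                .collect(Collectors.groupingBy(row -> row.page, TreeMap::new, Collectors.toList()));
    }

    public static void main(String[] args) {
        List<PositionedRow> rows = Arrays.asList(
                new PositionedRow(0, 0, "a", "b"),
                new PositionedRow(0, 1, "c", "d"),
                new PositionedRow(1, 0, "e", "f"));
        groupByPage(rows).forEach((page, pageRows) ->
                System.out.println("page " + page + ": " + pageRows.size() + " row(s)"));
    }
}
```

With RxJava the same shape would presumably be expressed with its own filter and group-by operators on the stream returned by `rows()`; the exact types involved are not shown in these notes.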
Changed
- Quite a few classes have been renamed or moved around packages. The "entrypoint" classes, `RowReaderFactory` and `RowWriterFactory`, have been shortened to `RowReaders` and `RowWriters`.
- Row writers no longer work in terms of `Row`s, but instead write `List<Variant>` as rows. This makes it much easier to generate data in the right format for writing.
- Creating a writer with an ambiguous format no longer assumes CSV; the format must be explicit.
Fixed
- Fixed a bug in the XLSX reader for text cells with inline data instead of using the string table.
Removed
- `DataFrame` has been removed.
- Parallel processing utilities have been removed. This can be done using `rows()`, which exposes RxJava's much more powerful parallel processing abilities.
0.2.1
0.2.0
Working runner + bugfix
Not much changed in this release in terms of lines of code, but the changes are pretty important.
- Bugfix: `DelimitedRowReader` and `DelimitedRowWriter` were not handling the `close()` method properly. This was especially an issue for writing, where closing the writer did not guarantee that all written rows were flushed.
- Feature: tabitha-runner is now versioned and set up correctly for distribution. The runner is now packaged as a shadow jar and can be run independently. Distribution zips will now also be included here for regular releases.
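To illustrate why flushing on `close()` matters for a buffered writer, here is a minimal sketch. `SketchDelimitedWriter` is a hypothetical class written for this example and is not Tabitha's `DelimitedRowWriter`; it also skips quoting and escaping entirely.

```java
import java.io.BufferedWriter;
import java.io.Closeable;
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Arrays;
import java.util.List;

// Hypothetical delimited writer for illustration; no quoting or escaping is performed.
final class SketchDelimitedWriter implements Closeable {
    private final BufferedWriter out;
    private final String delimiter;

    SketchDelimitedWriter(Writer target, String delimiter) {
        this.out = new BufferedWriter(target);
        this.delimiter = delimiter;
    }

    // Rows land in an in-memory buffer first, so they may not reach the target yet.
    void write(List<String> cells) throws IOException {
        out.write(String.join(delimiter, cells));
        out.write('\n');
    }

    // BufferedWriter.close() flushes buffered data before closing the target.
    // A close() that skipped this step could silently drop the last rows.
    @Override
    public void close() throws IOException {
        out.close();
    }
}

final class CloseFlushSketch {
    // Writes all rows and returns what actually reached the target after close().
    static String writeAll(List<List<String>> rows) {
        StringWriter target = new StringWriter();
        try (SketchDelimitedWriter writer = new SketchDelimitedWriter(target, ",")) {
            for (List<String> row : rows) {
                writer.write(row);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return target.toString();
    }

    public static void main(String[] args) {
        System.out.print(writeAll(Arrays.asList(
                Arrays.asList("id", "name"),
                Arrays.asList("1", "tabitha"))));
    }
}
```

Using try-with-resources, as in `writeAll`, ensures `close()` runs even if a write fails partway through.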
First non-alpha release
A few things were cleaned up before the full 0.1.0 release, and a few features that were in progress were added.
- Added a command-line script runner. The runner can run any Groovy script, which will be able to use all Tabitha classes.
- Rows can be copied much more easily with the addition of `Row#copyOf()`.
- It is now easier to apply a function to a whole row with `Row#map()`.
- `RowReader#EMPTY` was renamed to `RowReader#VOID` and `RowWriter#NULL` renamed to `RowWriter#VOID` to improve consistency.
- Added `RowWriter#tee()` for writing to multiple outputs simultaneously.
- Fix errors when reading from boolean and blank Excel cell types.
- Excel reader gives much more helpful error messages.
- Updated code styling and JavaDoc comments.
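The "tee" idea in the list above, where every row written goes to more than one output, can be sketched like this. `SketchRowWriter` and its `tee` method are hypothetical and written only for this example; the actual signature of `RowWriter#tee()` is not shown in these notes and may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical minimal writer interface; not Tabitha's RowWriter.
interface SketchRowWriter {
    void write(List<String> row);

    // Returns a writer that forwards each row to this writer first, then to `other`.
    default SketchRowWriter tee(SketchRowWriter other) {
        return row -> {
            this.write(row);
            other.write(row);
        };
    }
}

final class TeeSketch {
    public static void main(String[] args) {
        List<List<String>> first = new ArrayList<>();
        List<List<String>> second = new ArrayList<>();

        // Two in-memory "outputs" that simply collect the rows they receive.
        SketchRowWriter toFirst = first::add;
        SketchRowWriter both = toFirst.tee(second::add);

        both.write(Arrays.asList("id", "name"));
        both.write(Arrays.asList("1", "tabitha"));

        System.out.println("first received " + first.size() + " row(s)");
        System.out.println("second received " + second.size() + " row(s)");
    }
}
```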
This release is meant to be used to gather interest in Tabitha's development, though using it for critical applications is not recommended.
First alpha release
First Tabitha release! This release implements the following features:
- Buffered and in-memory data creation and reading using `DataFrame`
- Column and row schema types
- Reading from and writing to multiple types of data sets using `RowReader` and `RowWriter`
- Functional combinators for row readers
- Multithreaded row reader processing
- Support for the following formats: CSV, TSV, XLSX, XLS
This is a development release and is not recommended for production environments. There could be significant issues in the API or bugs.