Releases: SciRuby/daru
Improvements and bug fixes
Bug fixes:
- Fix reindex vector on argument error #470 (by @Yuki-Inoue)
- DataFrame#set_index can take column name array, which results in multi-index #471 (by @Yuki-Inoue)
- Implement DataFrame#reset_index #473 (by @Yuki-Inoue)
- Make DataFrame.from_activerecord faster #464 (by @paisible-wanderer)
- Optimize aggregation #464 (by @paisible-wanderer)
- Added access_row_tuples_by_indexs method #463 (by @Prakriti-nith)
- Index#dup should copy reference to name too #477 (by @Yuki-Inoue)
- Should support bundler version 2.x.x #483 (by @Shekharrajak)
- Fix table style #489 (by @kojix2)
Small fixes and improvements
Minor Enhancements
- Allow passing singular Symbol to CSV converters option (@takkanm)
- Support calling GroupBy#each_group w/o blocks (@hibariya)
- Refactor grouping and aggregation (@paisible-wanderer)
- Add String Converter to Daru::IO::CSV::CONVERTERS (@takkanm)
- Fix annoying missing libraries warning
- Remove post-install message (nice yet useless)
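As a rough illustration of the singular-Symbol change listed above: an option that may arrive as nil, a single Symbol, or an Array of Symbols can be normalised with Kernel#Array. This is only a sketch. normalize_converters is a hypothetical helper, not a method from daru, and the converter names are sample values.

```ruby
# Hypothetical helper (not daru's code): accept nil, one Symbol,
# or an Array of Symbols and always hand back an Array.
def normalize_converters(option)
  Array(option)
end

single = normalize_converters(:boolean)            # one Symbol is wrapped
many   = normalize_converters([:boolean, :string]) # an Array is returned as-is
none   = normalize_converters(nil)                 # nil becomes an empty Array
```

With this kind of normalisation the calling code can iterate over the converters without checking which form the caller used.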
Fixes
- Fix group_by for DataFrame with single row (@baarkerlounger)
- #rolling_fillna! bugfixes on Daru::Vector and Daru::DataFrame (@mhammiche)
- Fixes #include? on multiindex (@rohitner)
Gradual improvement on a road to 1.0
We are currently working hard on a proper version 1.0 release, with daru-io integration, full codebase cleanup and a lot of cool things.
In the meantime, here is 0.2.0!
Major Enhancements
- Add DataFrame#which query DSL (experimental! @rainchen)
- Add DataFrame/Vector#rolling_fillna (@baarkerlounger)
- Add GroupBy#aggregate (@Shekharrajak)
- Add DataFrame#uniq (@baarkerlounger)
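To give a feel for what a rolling fill of missing values does, here is a minimal plain-Ruby sketch that carries the last seen value forward (or backward) over nils. It is not daru's implementation: rolling_fill and its direction keyword are hypothetical names, and in this sketch a gap with no earlier value to copy stays nil.

```ruby
# Hypothetical sketch of a rolling fill over an Array (not daru's code).
# Each nil is replaced by the nearest non-nil value in the chosen direction.
def rolling_fill(values, direction: :forward)
  ordered = direction == :backward ? values.reverse : values.dup
  last_seen = nil
  filled = ordered.map do |value|
    last_seen = value unless value.nil? # remember the latest real value
    last_seen
  end
  direction == :backward ? filled.reverse : filled
end

forward  = rolling_fill([1, nil, nil, 4, nil])
backward = rolling_fill([nil, 2, nil, 4], direction: :backward)
leading  = rolling_fill([nil, 1])
```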
Minor Enhancements
- Allow Vector#count to be called without param for category type Vector (@rainchen)
- Add option to DataFrame#vector_sum to skip nils (@parthm)
- Add installation instructions to README.md (@koishimasato)
- Add release policy documentation (@baarkerlounger)
- Set index as DataFrame's default x axis for nyaplot (@matugm)
Fixes
- Fix DataFrame/Vector#to_s when name is a symbol (@baarkerlounger)
- Force Vector#proportions to return float (@rainchen)
- DataFrame#new creates empty DataFrame when given empty hash (@parthm)
- Remove unnecessary backports dependencies (@zverok)
- Specify minimum packable dependency (@zverok)
- Preserve key/column order when creating DataFrame from hash (@baarkerlounger)
- Fix DataFrame#add_row for DF with multi-index (@zverok)
- Fix Vector#min, #max, #index_of_min, #index_of_max (0.1.6 regression) (@athityakumar)
- Integrate yard-junk into CI (@rohitner)
- Remove Travis spec restriction (@zverok)
- Fix tuple sorting for DataFrames with nils (@baarkerlounger)
- Fix merge on index dropping default index (@rohitner)
Major IO upgrades, fixes and minor API changes.
Here's the full changelog of this release (all thanks to @baarkerlounger for the hard work!):
Major Enhancements
- Add support for reading HTML tables into DataFrames (@athityakumar)
- Add support for importing remote CSVs (@athityakumar, @anshuman23)
- Allow named indexes (@Shekharrajak)
- DataFrame GroupBy returns MultiIndex DataFrame (@Shekharrajak)
- Add new functions to Vector: max, min, index_of_max, index_of_min, max_by, min_by, index_of_max_by, index_of_min_by (@athityakumar)
- Add summary to DataFrame and Vector without reportbuilder (@ananyo2012)
- Add support for missing data for where clause (@athityakumar)
Minor Enhancements
- Allow inserting or updating DataFrame vectors with single values (@baarkerlounger)
- Add a boolean converter to the CSV importer (@baarkerlounger)
- Fix documentation of replace_values method (@kojix2)
- Improve HTML table code of DataFrame and Vector (@Shekharrajak)
- Support CSV files with empty rows (@baarkerlounger)
- Better DataFrame and Vector to_s methods (@baarkerlounger)
- Add support for histogram to Vector moving average convergence-divergence (@parthm)
- Add support for negative arguments to Vector.lag (@parthm)
- Return Nyaplot instance instead of nil for Nyaplot Vector, Category and DataFrame (@Shekharrajak)
- Add global configurable error stream which allows error stream to be silenced (@sivagollapalli)
- Rubocop update and cleanup (@zverok)
- Improve performance of DataFrame covariance (@genya0407)
- Index [] to only take index value as argument (@ananyo2012)
- Better error raised when Vector is missing from DataFrame (@sivagollapalli)
- Add default order for DataFrame (@athityakumar)
- Add is_values to Index (@Shekharrajak)
- Improve spec style in IO/SQL data source spec (@dshvimer)
- Open SQLite databases by path (@dshvimer)
- Remove unnecessary whitespace (@Shekharrajak)
- Remove the .svg from Travis CI build link (@athityakumar)
- Fix Travis CI icon in README (@athityakumar)
- Replace is_nil?, not_nil? with is_values (@lokeshh)
- Update contributing documentation (@v0dro)
Fixes
- Fix missing axis labels for categorized scatter plot with Gruff (@xprazak2)
- Fix NMatrix Vector initialization when Vector has nils and no nm_type is given (@baarkerlounger)
- Fix head/tail methods on DataFrames with DateTime indexes and on Vector_at splat calls (@baarkerlounger)
- Fix empty DateTime Index (@zverok)
- Fix where clause when data contains missing/undefined values (@Shekharrajak)
- Fix apply_scalar_operator spec (@athityakumar)
- Change nil check to respond_to operator check for apply_scalar_operator (@athityakumar)
- Make where compatible with is_values (@athityakumar)
- Fix vector is_values method (@athityakumar)
Bug fixes, enhancements and some API changes.
This release introduces the following changes:
- Major Enhancements
- Minor Enhancements
- Added a join indicator. (@gnilrets)
- Support an enumerable value as an index of a vector. (Yuichiro Kaneko)
- Add test case for NegativeDateOffset. (Yuichiro Kaneko)
- Add test case for #on_offset?. (Yuichiro Kaneko)
- NegativeDateOffset#- returns DateOffset. (Yuichiro Kaneko)
- Make Vector#resort_index private because its only use was for internal usage in Vector#sort. (Yuichiro Kaneko)
- Add DataFrame#order= method to reorder vectors in a dataframe. (@lokeshh)
- Use Integer instead of Fixnum throughout the gem. (Yuichiro Kaneko)
- Improve error message of Daru::Vector#index=. (@lokeshh)
- Deprecate freqs and make frequencies return a Daru::Vector. (@lokeshh)
- DataFrame#access_row with integer index. (Yusuke Sangenya)
- Add method alias for comparison operator. (Yusuke Sangenya)
- Update Nokogiri version. (Yusuke Sangenya)
- Return Daru::Vector for multiple modal values for Daru::Vector#mode. (baarkerlounger)
- Fixes
- Fix many to one joins. The prior version was shifting values in the left dataframe before checking whether values in the right dataframe should be shifted. They both need to be checked at the same time before shifting either. (@gnilrets)
- Support formatting empty dataframes. They were returning an error before. (@gnilrets)
- method_missing in Daru::DataFrame would not detect the correct vector if it was a String. Fixed that. (@lokeshh)
- Fix docs of contrast_code to specify that the default value is false. (@v0dro)
- Fix occurrence of SystemStackError due to faulty argument passing to Array#values_at. (@v0dro)
- Fix DataFrame#pivot_table regression that raised an ArgumentError if the :index option was not specified. (@zverok)
- Fix DataFrame.rows to accept empty argument. (@zverok)
- Fix bug with false values on dataframe create. DataFrame from an Array of hashes wasn't being created properly when some of the values were false. (@gnilrets)
- Fix Vector#reorder! method. (Yusuke Sangenya)
- Fix DataFrame#group_by for numeric indexes. (@zverok)
- Make DataFrame#index= accept only Daru::Index. (Yusuke Sangenya)
- DataFrame#vectors= now changes the name of vectors contained in the internal @data variable. (Yusuke Sangenya)
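The false-values fix in the list above belongs to a class of bug that is easy to hit in Ruby. The release notes do not say what the actual cause was, so the following is only an assumed illustration: when columns are built from an Array of hashes, a truthiness check such as `||` treats false like a missing value. All names here (SAMPLE_ROWS, column_with_or, column_with_fetch) are hypothetical.

```ruby
# Sample input: the third row has no :active key at all.
SAMPLE_ROWS = [{ id: 1, active: true }, { id: 2, active: false }, { id: 3 }].freeze

# Buggy variant: `||` cannot tell false from a missing key,
# so a legitimate false is replaced by the fallback.
def column_with_or(rows, key)
  rows.map { |row| row[key] || nil }
end

# Safer variant: Hash#fetch only falls back when the key is absent.
def column_with_fetch(rows, key)
  rows.map { |row| row.fetch(key, nil) }
end

lossy = column_with_or(SAMPLE_ROWS, :active)
exact = column_with_fetch(SAMPLE_ROWS, :active)
```

The first variant silently loses the false in the second row, while the second keeps it and still yields nil for the row that has no value.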
Categorical data support. Performance improvements.
0.1.4 (19 August 2016)
- Major Enhancements
- Added new dependency 'backports' to support #to_h in Ruby 2.0. (@lokeshh)
- Greatly improve code test coverage. (@zverok)
- Greatly refactor code and make some methods faster, smaller and more readable. (@zverok)
- Add support for categorical data with different coding schemes and several methods for in built categorical data support. Add a new index 'Daru::CategoricalIndex'. (@lokeshh)
- Removed runtime dependencies on 'spreadsheet' and 'reportbuilder'. They are now loaded if the libraries are already present in the system. (@v0dro)
- Minor enhancements
- Update SqlDataSource to improve the performance of DataFrame.from_sql. (@dansbits)
- Remove default DataFrame name. Now DataFrames will have no name by default. (@zverok)
- Better looking #inspect for Vector and DataFrame. (@zverok)
- Better looking #to_html for Vector and DataFrame. Also better #to_html for MultiIndex. (@zverok)
- Remove monkey patching on Array and add those methods to Daru::ArrayHelper. (@zverok)
- Add a rake task for running RSpec for every Ruby version with a single command. (@lokeshh)
- Add rake tasks for easily setting up and testing test harness. (@lokeshh)
- Added Daru::Vector#to_nmatrix.
- Remove the 'metadata' feature introduced in v0.1.3. (@gnilrets)
- Added DataFrame#to_df and Vector#to_df. (@gnilrets)
- Fixes
- DataFrame#clone preserves order and name. (@wlevine)
- Vector#where preserves name. (@v0dro)
- Fix bug in DataFrame#pivot_table that prevented anything other than Array or Symbol to be specified in the :values option. (@v0dro)
- Daru::Index#each returns an Enumerator if block is not specified. (@v0dro)
- Fixes bug where joins failed when nils were in join keys. (@gnilrets)
- DataFrame#merge now preserves the vector name type when merging. (@lokeshh)
- Deprecations
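The Index#each fix listed above follows a common Ruby idiom: an each method that is called without a block returns an Enumerator instead of failing. A minimal sketch of that idiom, using a hypothetical TinyIndex class rather than Daru::Index itself:

```ruby
# Hypothetical stand-in for an index class (not Daru::Index).
class TinyIndex
  include Enumerable

  def initialize(keys)
    @keys = keys
  end

  # Without a block, hand back an Enumerator so callers can chain
  # with_index, map, and similar methods; with a block, iterate and return self.
  def each(&block)
    return to_enum(:each) unless block_given?

    @keys.each(&block)
    self
  end
end

index = TinyIndex.new(%i[a b c])
enumerator = index.each
pairs = index.each.with_index.to_a
```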
Many more code quality and speed enhancements. Lots of bug fixes.
This release incorporates many new enhancements and bug fixes from numerous contributors. Some of the salient features of this release are:
- Sorting is now MUCH faster and can sort data with nil present.
- Statistics with Missing Data is now supported by all methods.
- The code now conforms to the standards laid down by Rubocop.
- Major performance improvements in various methods like join, merge and concat.
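Sorting data that contains nil needs special handling because Ruby's Array#sort raises an ArgumentError when it has to compare nil with another value. One simple approach, sketched here in plain Ruby, is to set the nils aside, sort the rest, and append the nils afterwards. This is not daru's implementation: sort_with_nils is a hypothetical helper, and placing nils last is an assumption made for the sketch.

```ruby
# Hypothetical helper (not daru's code): sort the non-nil values
# and keep every nil at the end of the result.
def sort_with_nils(values, ascending: true)
  present, missing = values.partition { |value| !value.nil? }
  sorted = present.sort
  sorted.reverse! unless ascending
  sorted + missing
end

ascending  = sort_with_nils([3, nil, 1, 2, nil])
descending = sort_with_nils([3, nil, 1, 2, nil], ascending: false)
```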
For a complete changelog see the HISTORY.md file.
The major contributors for this release were:
Lots of bug fixes and better IO
This release mostly consists of bug fixes or enhancements to various methods from a range of contributors.
Among the new features in this release are:
- A new method DataFrame.from_activerecord to load data from Ruby on Rails.
- Better loading of SQL data and abstraction of SQL specific features to Daru::IO::SqlDataSource.
- Latest development dependencies and a few more optional run time dependencies like bloomfilter-rb.
See the History.md file for a full changelog.
Time series manipulation and arel-like query syntax
This new release brings in lots of new functionality:
- A new index DateTimeIndex for time series manipulations
- Many new time series functions for manipulating time series based data
- Arel-like query syntax
- Joins and concat
- Many new methods for various operations
- Lots of speedups and bug fixes
Complete support for statsample, improved plotting and more functionality.
This release makes daru completely compatible with statsample and statsample-glm for statistical analysis of data by introducing it as a dependency in these gems. Thus you can now use daru data structures in tandem with statsample for statistical analysis.
Apart from this, some salient features of this release are as follows:
- Many new iterators - map, filter, each, recode, collect and their destructive versions.
- Much improved wrapper over nyaplot for plotting.
- Many new statistics functions.
- More functions to deal with missing data.
- Loading and writing to many file formats like CSV files, Excel spreadsheets, plain text and SQL databases.
- Added a new wrapper to wrap over GSL::Vector for super fast computations and optimum storage.
- Several bug fixes
- Better documentation and extensive usage examples.