v0.7.6
🚀 Streaming v0.7.6
Streaming v0.7.6 is released! Install via pip:
pip install --upgrade mosaicml-streaming==0.7.6
💎 New Features
1. device_per_stream batching method
Users can now construct batches such that each device sees samples from only a single stream. This is useful when different data sources have samples or tensors of different sizes, but the model should still see samples from each of these data sources at every optimizer step.
- Adding device_per_stream batching by @snarayan21 in #661
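To make the idea concrete, here is a minimal, library-free sketch of what "each device sees only one stream" means for a single global batch. It is a toy illustration, not the Streaming implementation: the function `toy_device_per_stream`, the stream names, and the uniform random choice of stream per device are all assumptions made for the example. In the library itself, the method is selected through the `batching_method` argument of `StreamingDataset`.

```python
import random


def toy_device_per_stream(streams, num_devices, device_batch_size, seed=0):
    """Build one global batch in which every device batch comes from one stream.

    `streams` maps a (hypothetical) stream name to its list of samples.
    The stream for each device is picked uniformly at random here, which is
    an assumption of this sketch and not necessarily what the library does.
    """
    rng = random.Random(seed)
    stream_names = sorted(streams)
    global_batch = []
    for _ in range(num_devices):
        name = rng.choice(stream_names)
        # All samples for this device are drawn from the same stream.
        device_batch = rng.sample(streams[name], device_batch_size)
        global_batch.append(device_batch)
    return global_batch


# Two hypothetical streams; each sample is tagged with its source stream.
streams = {
    "short_text": [("short_text", i) for i in range(8)],
    "long_text": [("long_text", i) for i in range(8)],
}

global_batch = toy_device_per_stream(streams, num_devices=4, device_batch_size=2)
for rank, device_batch in enumerate(global_batch):
    sources = sorted({source for source, _ in device_batch})
    print(f"device {rank}: {sources}")
```

Because every device batch is homogeneous, samples within it can share a shape, while the global batch across devices can still mix sources within one optimizer step.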
2. Add ndarray type for Spark dataframes
Enables parsing of Spark's ArrayType columns (with ShortType, LongType, IntegerType, FloatType, or DoubleType elements) when converting a Spark dataframe to MDS.
- Add ndarray type by @XiaohanZhangCMU in #623
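As a rough sketch of the type mapping this implies, the snippet below pairs each supported Spark element type with an ndarray encoding of matching width (Spark's ShortType, IntegerType, and LongType are 16-, 32-, and 64-bit signed integers; FloatType and DoubleType are 32- and 64-bit floats). The `ndarray:<dtype>` encoding strings, the dictionary, and the helper name are assumptions for illustration. The real converter works on pyspark type objects rather than type names, so this example stays free of a Spark dependency.

```python
# Hypothetical lookup from Spark ArrayType element type names to
# MDS-style ndarray encodings. The encoding strings are an assumption.
SPARK_ELEMENT_TO_NDARRAY = {
    "ShortType": "ndarray:int16",
    "IntegerType": "ndarray:int32",
    "LongType": "ndarray:int64",
    "FloatType": "ndarray:float32",
    "DoubleType": "ndarray:float64",
}


def ndarray_encoding_for(element_type_name):
    """Return the ndarray encoding for an ArrayType element type name.

    Raises ValueError for element types outside the five listed in the
    release notes.
    """
    try:
        return SPARK_ELEMENT_TO_NDARRAY[element_type_name]
    except KeyError:
        raise ValueError(
            f"Unsupported ArrayType element type: {element_type_name}"
        ) from None


for name in ("ShortType", "LongType", "DoubleType"):
    print(name, "->", ndarray_encoding_for(name))
```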
3. Support for Alipan storage
Adds support for Alipan, Alibaba's cloud storage service.
- Add support for Alipan Storage backend by @PeterDing in #651
What's Changed
- Bump fastapi from 0.110.0 to 0.110.2 by @dependabot in #660
- Bump pydantic from 2.6.4 to 2.7.0 by @dependabot in #653
- Bump pydantic from 2.7.0 to 2.7.1 by @dependabot in #666
- Bump pytest from 8.1.1 to 8.2.0 by @dependabot in #664
- Bump databricks-sdk from 0.23.0 to 0.27.0 by @dependabot in #667
- Version bump to v0.7.6 by @snarayan21 in #669
New Contributors
- @PeterDing made their first contribution in #651
Full Changelog: v0.7.5...v0.7.6