[DAPHNE-daphne-eu#752] oneHot and bin functions in DaphneLib (daphne-eu#760)

This PR closes daphne-eu#752 and makes the oneHot and bin built-in functions available on matrices in DaphneLib.

Features:
- Use oneHot(info) to apply one-hot encoding to matrices in DaphneLib.
- Use bin(numBins, Min, Max) or bin(numBins) to apply binning to matrices in DaphneLib.

Changes:
- Add oneHot and bin built-in functions for matrices in DaphneLib.
- Add a script-level test case in test/api/python/ to check that execution works as expected.
- Update doc/DaphneLib/APIRef.md to add oneHot and bin built-in functions for matrices.

Furthermore, as a follow-up to daphne-eu#666, the inc parameter of the seq function is now also optional in DaphneLib.
- Update doc/DaphneLib/APIRef.md for seq() function with optional inc argument.
- Make argument inc optional with default value 1 in seq() function in DaphneLib.
saminbassiri authored Jun 21, 2024
1 parent ec1835d commit ee40181
Showing 6 changed files with 71 additions and 2 deletions.
6 changes: 5 additions & 1 deletion doc/DaphneLib/APIRef.md
@@ -40,7 +40,7 @@ However, as the methods largely map to DaphneDSL built-in functions, you can fin
 **Generating data in DAPHNE:**

 - **`fill`**`(arg, rows:int, cols:int) -> Matrix`
-- **`seq`**`(start, end, inc) -> Matrix`
+- **`seq`**`(start, end, inc = 1) -> Matrix`
 - **`rand`**`(rows: int, cols: int, min: Union[float, int] = None, max: Union[float, int] = None, sparsity: Union[float, int] = 0, seed: Union[float, int] = 0) -> Matrix`
 - **`createFrame`**`(columns: List[Matrix], labels: List[str] = None) -> 'Frame'`
 - **`diagMatrix`**`(self, arg: Matrix) -> 'Matrix'`
@@ -148,6 +148,10 @@ In the following, we describe only the latter.
 - **`replace`**`(pattern, replacement)`
 - **`order`**`(colIdxs: List[int], ascs: List[bool], returnIndexes: bool)`

+**Data preprocessing:**
+- **`oneHot`**`(info:matrix)`
+- **`bin`**`(numBins:int, Min = None, Max = None)`
+
 **Other matrix operations:**

 - **`diagVector`**`()`
2 changes: 1 addition & 1 deletion src/api/python/daphne/context/daphne_context.py
@@ -366,7 +366,7 @@ def createFrame(self, columns: List[Matrix], labels:List[str] = None) -> 'Frame'

         return Frame(self, 'createFrame', [*columns, *labels])

-    def seq(self, start, end, inc) -> Matrix:
+    def seq(self, start, end, inc = 1) -> Matrix:
         named_input_nodes = {'start':start, 'end':end, 'inc':inc}
         return Matrix(self, 'seq', [], named_input_nodes=named_input_nodes)
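
To illustrate what the new default means for callers, here is a minimal pure-Python stand-in. It is not DaphneLib code: `seq_values` is a hypothetical name, and the sketch assumes that `seq` includes the end value, which the test scripts in this commit rely on when they reshape `seq(-2, 5)` into a 2x4 matrix.

```python
def seq_values(start, end, inc=1):
    """Hypothetical stand-in for seq: values from start to end (inclusive) in steps of inc."""
    if inc == 0:
        raise ValueError("inc must be non-zero")
    values = []
    current = start
    # Walk upwards for a positive step and downwards for a negative one.
    while (inc > 0 and current <= end) or (inc < 0 and current >= end):
        values.append(current)
        current += inc
    return values


print(seq_values(-2, 5))       # inc omitted, so the default of 1 applies
print(seq_values(10, 70, 10))  # inc given explicitly
```

With the default in place, the two-argument call yields eight values, which is exactly what a 2x4 reshape needs.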

11 changes: 11 additions & 0 deletions src/api/python/daphne/operator/nodes/matrix.py
@@ -394,6 +394,17 @@ def outerGt(self, other: 'Matrix') -> 'Matrix':
     def outerGe(self, other: 'Matrix') -> 'Matrix':
         return Matrix(self.daphne_context, 'outerGe', [self, other])

+    def oneHot(self, other: 'Matrix') -> 'Matrix':
+        return Matrix(self.daphne_context, 'oneHot', [self, other])
+
+    def bin(self, numBins, Min = None, Max = None) -> 'Matrix':
+        if (Max is None and Min is not None) or (Min is None and Max is not None):
+            raise RuntimeError("bin: both min and max should be set, or both should be None")
+        if Min is not None and Max is not None:  # explicit None checks, so a bound of 0 still counts as set
+            return Matrix(self.daphne_context, 'bin', [self, numBins, Min, Max])
+        else:
+            return Matrix(self.daphne_context, 'bin', [self, numBins])
+
     def order(self, colIdxs: List[int], ascs: List[bool], returnIndexes: bool) -> 'Matrix':
         if len(colIdxs) != len(ascs):
             raise RuntimeError("order: the lists given for parameters colIdxs and ascs must have the same length")
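
The argument handling in `bin` can be exercised on its own, without a DAPHNE installation. The sketch below is not part of the commit: `bin_arguments` is a hypothetical helper that returns the positional arguments the wrapper would forward after `self`. It compares against `None` explicitly, because a plain truthiness test would treat a bound of 0 as if it had not been given.

```python
def bin_arguments(numBins, Min=None, Max=None):
    """Hypothetical helper: the positional arguments a bin call would forward."""
    # Supplying exactly one bound is an error; both or neither is fine.
    if (Min is None) != (Max is None):
        raise RuntimeError("bin: both min and max should be set, or both should be None")
    # Explicit None checks: 0 is falsy in Python but still a valid bound.
    if Min is not None and Max is not None:
        return [numBins, Min, Max]
    return [numBins]


print(bin_arguments(3))         # bounds omitted
print(bin_arguments(3, 0, 70))  # lower bound of 0 is kept
```

A call such as `bin_arguments(3, Min=10)` raises the same `RuntimeError` message as the wrapper.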
1 change: 1 addition & 0 deletions test/api/python/DaphneLibTest.cpp
@@ -95,6 +95,7 @@ MAKE_TEST_CASE("matrix_outerbinary")
 MAKE_TEST_CASE("matrix_agg")
 MAKE_TEST_CASE("matrix_reorg")
 MAKE_TEST_CASE("matrix_other")
+MAKE_TEST_CASE("matrix_preprocessing")
 MAKE_TEST_CASE_SCALAR("numpy_matrix_ops")
 MAKE_TEST_CASE_SCALAR("numpy_matrix_ops_extended")
 MAKE_TEST_CASE("numpy_matrix_ops_replace")
24 changes: 24 additions & 0 deletions test/api/python/matrix_preprocessing.daphne
@@ -0,0 +1,24 @@
# Copyright 2023 The DAPHNE Consortium
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

arg_m = reshape(seq(-2, 5), 2, 4);
info_m = [-1, 0, 5, 6](1, 4);
print(oneHot(arg_m, info_m));

arg_m_2 = reshape(seq(10, 70, 10), 1, 7);
print(bin(arg_m_2, 3));
print(bin(arg_m_2, 3, 10, 70));

arg_m_3 = t([5.0, 20.0, nan, 40.0, inf, 60.0, 100.0]);
print(bin(arg_m_3, 3, 10.0, 70.0));
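
The `oneHot` call in this script can be read through a small pure-Python sketch. It is not DAPHNE's kernel, and it rests on assumptions about the `info` row that this commit does not spell out: an entry of -1 keeps the column as it is, an entry of 0 drops the column, and an entry d > 0 replaces the column by d indicator columns. It also assumes that `reshape` fills the matrix row by row. The name `one_hot` is hypothetical.

```python
def one_hot(rows, info):
    """Hypothetical one-hot encoder over a list of rows, driven by one info entry per column."""
    encoded = []
    for row in rows:
        out = []
        for value, d in zip(row, info):
            if d == -1:
                out.append(value)  # assumption: -1 keeps the column unchanged
            elif d == 0:
                continue           # assumption: 0 drops the column
            else:
                if not 0 <= value < d:
                    raise ValueError(f"value {value} is outside [0, {d - 1}]")
                indicators = [0] * d
                indicators[value] = 1  # mark the position given by the value
                out.extend(indicators)
        encoded.append(out)
    return encoded


# Same data as arg_m and info_m in the script, assuming a row-wise reshape.
arg_m = [[-2, -1, 0, 1], [2, 3, 4, 5]]
info_m = [-1, 0, 5, 6]
for row in one_hot(arg_m, info_m):
    print(row)
```

Under these assumptions each output row has 1 + 0 + 5 + 6 = 12 entries.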
29 changes: 29 additions & 0 deletions test/api/python/matrix_preprocessing.py
@@ -0,0 +1,29 @@
# Copyright 2023 The DAPHNE Consortium
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import math
from daphne.context.daphne_context import DaphneContext

dc = DaphneContext()

arg_m_1 = dc.seq(-2, 5).reshape(2, 4)
info_m = dc.seq(-1, 0).rbind(dc.seq(5, 6)).reshape(1, 4)
arg_m_1.oneHot(info_m).print().compute()

arg_m_2 = dc.seq(10, 70, 10).reshape(1, 7)
arg_m_2.bin(3).print().compute()
arg_m_2.bin(3, 10, 70).print().compute()

arg_m_3 = dc.seq(5.0, 20.0, 15).rbind(dc.fill(math.nan, 1, 1)).rbind(dc.fill(40.0, 1, 1)).rbind(dc.fill(math.inf, 1, 1)).rbind(dc.seq(60.0, 100.0, 40))
arg_m_3.reshape(1, 7).bin(3, 10.0, 70.0).print().compute()
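
For readers unfamiliar with binning, the sketch below shows plain equal-width binning on the values of `arg_m_2`. It is an illustration under assumptions, not DAPHNE's implementation: it assumes bins of equal width between the bounds, bounds derived from the data when omitted, and out-of-range values clamped to the first or last bin. NaN and infinite inputs, as in `arg_m_3`, are deliberately not handled. The names `equal_width_bin`, `lo`, and `hi` are hypothetical.

```python
import math


def equal_width_bin(values, num_bins, lo=None, hi=None):
    """Hypothetical equal-width binning: map each value to a bin index in [0, num_bins - 1]."""
    if (lo is None) != (hi is None):
        raise ValueError("lo and hi must be given together or both omitted")
    if lo is None:
        lo, hi = min(values), max(values)  # assumption: derive bounds from the data
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    width = (hi - lo) / num_bins
    result = []
    for v in values:
        index = math.floor((v - lo) / width)
        # Clamp so that hi itself and out-of-range values land in a valid bin.
        result.append(min(max(index, 0), num_bins - 1))
    return result


values = [10, 20, 30, 40, 50, 60, 70]
print(equal_width_bin(values, 3))          # bounds taken from the data
print(equal_width_bin(values, 3, 10, 70))  # bounds given explicitly
```

Because the data's minimum and maximum equal the explicit bounds here, both calls produce the same bin indices in this sketch.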
