Merge branch 'master' into release/1.4.0
ralphrass authored Aug 21, 2024
2 parents 77539ae + dd8cefe commit 69293a4
Showing 7 changed files with 194 additions and 1 deletion.
30 changes: 30 additions & 0 deletions .checklist.yaml
@@ -0,0 +1,30 @@
apiVersion: quintoandar.com.br/checklist/v2
kind: ServiceChecklist
metadata:
  name: butterfree
spec:
  description: >-
    A solution for Feature Stores.
  costCenter: C055
  department: engineering
  lifecycle: production
  docs: true

  ownership:
    team: data_products_mlops
    line: tech_platform
    owner: otavio.cals@quintoandar.com.br

  libraries:
    - name: butterfree
      type: common-usage
      path: https://quintoandar.github.io/python-package-server/
      description: A lib to build Feature Stores.
      registries:
        - github-packages
      tier: T0

  channels:
    squad: 'mlops'
    alerts: 'data-products-reports'
17 changes: 17 additions & 0 deletions .github/workflows/skip_lint.yml
@@ -0,0 +1,17 @@
# This step is used only because we want to mark the runner-linter check as required
# for PRs to develop, but not for the merge queue to merge into develop.
# GitHub does not have this functionality yet.

name: 'Skip github-actions/runner-linter check at merge queue'

on:
  merge_group:

jobs:
  empty_job:
    name: 'github-actions/runner-linter'
    runs-on: github-actions-developers-runner
    steps:
      - name: Skip github-actions/runner-linter check at merge queue
        run: |
          echo "Done"
2 changes: 1 addition & 1 deletion butterfree/load/writers/historical_feature_store_writer.py
@@ -242,7 +242,7 @@ def validate(
            if self.interval_mode and not self.debug_mode
            else spark_client.read_table(table_name).count()
        )

        dataframe_count = dataframe.count()

        self._assert_validation_count(table_name, written_count, dataframe_count)
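The hunk above ends by comparing the number of rows written to the table with the number of rows in the input dataframe. A minimal sketch of that kind of check, in plain Python without Spark, might look like the following. The function name `assert_validation_count` and its error message are assumptions for illustration, not butterfree's actual implementation.

```python
def assert_validation_count(
    table_name: str, written_count: int, dataframe_count: int
) -> None:
    """Raise if the written row count differs from the source row count.

    Hypothetical stand-in for the private helper called in the diff.
    """
    if written_count != dataframe_count:
        raise AssertionError(
            f"Data written to '{table_name}' has {written_count} rows, "
            f"but the input dataframe has {dataframe_count} rows."
        )


# Matching counts pass silently; a mismatch raises.
assert_validation_count("feature_set", written_count=3, dataframe_count=3)
try:
    assert_validation_count("feature_set", written_count=2, dataframe_count=3)
except AssertionError as error:
    print(error)
```

Raising on any mismatch keeps the check strict; whether a tolerance would be appropriate depends on the writer's semantics, which this diff does not show.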
20 changes: 20 additions & 0 deletions docs/source/butterfree.configs.rst
@@ -23,6 +23,26 @@ butterfree.configs.environment module
butterfree.configs.logger module
--------------------------------

.. automodule:: butterfree.configs.logger
   :members:
   :undoc-members:
   :show-inheritance:

.. automodule:: butterfree.configs.logger
   :members:
   :undoc-members:
   :show-inheritance:

.. automodule:: butterfree.configs.logger
   :members:
   :undoc-members:
   :show-inheritance:

.. automodule:: butterfree.configs.logger
   :members:
   :undoc-members:
   :show-inheritance:

.. automodule:: butterfree.configs.logger
   :members:
   :undoc-members:
43 changes: 43 additions & 0 deletions docs/source/butterfree.constants.rst
@@ -31,6 +31,29 @@ butterfree.constants.migrations module
butterfree.constants.spark\_constants module
--------------------------------------------

.. automodule:: butterfree.constants.migrations
   :members:
   :undoc-members:
   :show-inheritance:


.. automodule:: butterfree.constants.migrations
   :members:
   :undoc-members:
   :show-inheritance:


.. automodule:: butterfree.constants.migrations
   :members:
   :undoc-members:
   :show-inheritance:


.. automodule:: butterfree.constants.migrations
   :members:
   :undoc-members:
   :show-inheritance:

.. automodule:: butterfree.constants.spark_constants
   :members:
   :undoc-members:
@@ -39,6 +62,26 @@ butterfree.constants.spark\_constants module
butterfree.constants.window\_definitions module
-----------------------------------------------

.. automodule:: butterfree.constants.window_definitions
   :members:
   :undoc-members:
   :show-inheritance:

.. automodule:: butterfree.constants.window_definitions
   :members:
   :undoc-members:
   :show-inheritance:

.. automodule:: butterfree.constants.window_definitions
   :members:
   :undoc-members:
   :show-inheritance:

.. automodule:: butterfree.constants.window_definitions
   :members:
   :undoc-members:
   :show-inheritance:

.. automodule:: butterfree.constants.window_definitions
   :members:
   :undoc-members:
5 changes: 5 additions & 0 deletions docs/source/butterfree.dataframe_service.rst
@@ -23,6 +23,11 @@ butterfree.dataframe\_service.partitioning module
butterfree.dataframe\_service.repartition module
------------------------------------------------

.. automodule:: butterfree.dataframe_service.repartition
   :members:
   :undoc-members:
   :show-inheritance:

.. automodule:: butterfree.dataframe_service.repartition
   :members:
   :undoc-members:
78 changes: 78 additions & 0 deletions tests/unit/butterfree/transform/conftest.py
@@ -209,6 +209,84 @@ def make_multiple_rolling_windows_hour_slide_agg_dataframe(
    return df


def make_rolling_windows_hour_slide_agg_dataframe(spark_context, spark_session):
    data = [
        {
            "id": 1,
            "timestamp": "2016-04-11 12:00:00",
            "feature1__avg_over_1_day_rolling_windows": 266.6666666666667,
            "feature2__avg_over_1_day_rolling_windows": 300.0,
        },
        {
            "id": 1,
            "timestamp": "2016-04-12 00:00:00",
            "feature1__avg_over_1_day_rolling_windows": 300.0,
            "feature2__avg_over_1_day_rolling_windows": 350.0,
        },
        {
            "id": 1,
            "timestamp": "2016-04-12 12:00:00",
            "feature1__avg_over_1_day_rolling_windows": 400.0,
            "feature2__avg_over_1_day_rolling_windows": 500.0,
        },
    ]
    df = spark_session.read.json(
        spark_context.parallelize(data).map(lambda x: json.dumps(x))
    )
    df = df.withColumn("timestamp", df.timestamp.cast(DataType.TIMESTAMP.spark))

    return df


def make_multiple_rolling_windows_hour_slide_agg_dataframe(
    spark_context, spark_session
):
    data = [
        {
            "id": 1,
            "timestamp": "2016-04-11 12:00:00",
            "feature1__avg_over_2_days_rolling_windows": 266.6666666666667,
            "feature1__avg_over_3_days_rolling_windows": 266.6666666666667,
            "feature2__avg_over_2_days_rolling_windows": 300.0,
            "feature2__avg_over_3_days_rolling_windows": 300.0,
        },
        {
            "id": 1,
            "timestamp": "2016-04-12 00:00:00",
            "feature1__avg_over_2_days_rolling_windows": 300.0,
            "feature1__avg_over_3_days_rolling_windows": 300.0,
            "feature2__avg_over_2_days_rolling_windows": 350.0,
            "feature2__avg_over_3_days_rolling_windows": 350.0,
        },
        {
            "id": 1,
            "timestamp": "2016-04-13 12:00:00",
            "feature1__avg_over_2_days_rolling_windows": 400.0,
            "feature1__avg_over_3_days_rolling_windows": 300.0,
            "feature2__avg_over_2_days_rolling_windows": 500.0,
            "feature2__avg_over_3_days_rolling_windows": 350.0,
        },
        {
            "id": 1,
            "timestamp": "2016-04-14 00:00:00",
            "feature1__avg_over_3_days_rolling_windows": 300.0,
            "feature2__avg_over_3_days_rolling_windows": 350.0,
        },
        {
            "id": 1,
            "timestamp": "2016-04-14 12:00:00",
            "feature1__avg_over_3_days_rolling_windows": 400.0,
            "feature2__avg_over_3_days_rolling_windows": 500.0,
        },
    ]
    df = spark_session.read.json(
        spark_context.parallelize(data).map(lambda x: json.dumps(x))
    )
    df = df.withColumn("timestamp", df.timestamp.cast(DataType.TIMESTAMP.spark))

    return df


def make_fs(spark_context, spark_session):
    df = make_dataframe(spark_context, spark_session)
    df = (
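The fixtures above serialize each dict with `json.dumps` and let Spark's JSON reader infer the schema, so rows that omit a key (such as the last two rows of the multiple-window fixture, which have no 2-day columns) end up with nulls in that column. A Spark-free sketch of that behaviour, assuming the reader takes the union of keys across rows, could look like this. `normalize_rows` is a hypothetical helper for illustration, not part of butterfree or PySpark.

```python
import json


def normalize_rows(rows):
    """Fill keys missing from a row with None, mimicking schema inference."""
    columns = sorted({key for row in rows for key in row})
    return [{column: row.get(column) for column in columns} for row in rows]


rows = [
    {
        "id": 1,
        "timestamp": "2016-04-13 12:00:00",
        "feature1__avg_over_2_days_rolling_windows": 400.0,
        "feature1__avg_over_3_days_rolling_windows": 300.0,
    },
    {
        "id": 1,
        "timestamp": "2016-04-14 00:00:00",
        "feature1__avg_over_3_days_rolling_windows": 300.0,
    },
]

# Each row becomes one JSON document, as in the fixtures' json.dumps step.
json_lines = [json.dumps(row) for row in rows]
normalized = normalize_rows([json.loads(line) for line in json_lines])

# The second row has no 2-day value, so the column is filled with None.
print(normalized[1]["feature1__avg_over_2_days_rolling_windows"])
```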

1 comment on commit 69293a4

@chip-n-dale chip-n-dale bot commented on 69293a4 Aug 21, 2024

Hi @ralphrass!

The GitLeaks SecTool reported some possibly exposed credentials/secrets. How about giving them a look?

GitLeaks Alert Sync

In case of false positives, more information is available in the GitLeaks FAQ.
If you have any other problems or questions during this process, contact us in the Security space on GChat!