feat: import datasets

closes #6
elimu-ai · Sep 1, 2024 · 28bbca3
1 parent 9cb50fc
commit 28bbca3

Showing 6 changed files with 65 additions and 72 deletions.
diff --git a/.github/run-all-steps.yml b/.github/run-all-steps.yml
@@ -0,0 +1,37 @@
+name: Run all steps
+
+on:
+  push:
+    branches: [ "main" ]
+  pull_request:
+    branches: [ "main" ]
+
+jobs:
+  run_all_steps:
+    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        python-version: ["3.9", "3.10", "3.11"]
+    steps:
+    - uses: actions/checkout@v4
+    - name: Set up Python ${{ matrix.python-version }}
+      uses: actions/setup-python@v3
+      with:
+        python-version: ${{ matrix.python-version }}
+    - name: Install dependencies
+      run: |
+        python -m pip install --upgrade pip
+        pip install flake8 pytest
+        if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
+    - name: Lint with flake8
+      run: |
+        # stop the build if there are Python syntax errors or undefined names
+        flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics
+        # exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide
+        flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics
+    - name: Test with pytest
+      run: |
+        pytest
+    - name: Run All Steps (1-3)
+      run: |
+        python run_all_steps.py
diff --git a/README.md b/README.md
@@ -1,37 +1,34 @@
-# ml-storybooks-recommender 🤖📚
+# ML: Storybook Recommender 🤖📚
 
-Machine learning model which predicts the rating of unread storybooks based on the student's previously read storybooks.
+> Machine learning model which predicts the likability of unread storybooks based on a child's previously read 
+> storybooks.
 
-One model will be trained per language.
+> [!IMPORTANT]
+> This learning model will be used by Android applications, so it needs to be stored in a format standard compatible 
+> with the Android technology, e.g. [ONNX](https://onnx.ai).
 
 
 ## 1. Prepare the Data
 
-To prepare the data, follow these steps:
-  * Open `prepare_data.py` and select environment and language.
-  * Go to the website corresponding to the chosen environment and language, e.g. https://eng.test.elimu.ai.
-  * Download `storybooks.csv` from https://eng.test.elimu.ai/content/storybook/list.
-  * Download `storybook-learning-events.csv` from https://eng.test.elimu.ai/analytics/storybook-learning-event/list.
-  * Add the two datasets to `RAW_DATA_DIR`.
-  * Execute the script: `python prepare_data.py`
+See [`step1_prepare`](./step1_prepare)
 
 
 ## 2. Train the Model
 
-TODO
+See [`step2_train`](./step2_train)
 
 
 ## 3. Make Predictions on New Samples
 
-TODO
+See [`step3_predict`](./step3_predict)
 
 ---
 
 <p align="center">
   <img src="https://github.com/elimu-ai/webapp/blob/main/src/main/webapp/static/img/logo-text-256x78.png" />
 </p>
 <p align="center">
-  elimu.ai - Free open-source learning software for out-of-school children ✨🚀
+  elimu.ai - Free open-source learning software for out-of-school children 🚀✨
 </p>
 <p align="center">
   <a href="https://elimu.ai">Website 🌐</a>

diff --git a/env-TEST/lang-ENG/data/storybook-learning-events.csv b/env-TEST/lang-ENG/data/storybook-learning-events.csv