The website is deployed in a Kubernetes cluster. A deployment contains the following containers:
- website: A Flask app with static files compiled by Webpack.
- mixer: A Data Commons API server.
- esp: Google Extensible Service Proxy, used for endpoints management.
mixer is a submodule of this Git repo. The exact commit of the submodule is deployed together with the website, so it may not be the same version as in https://api.datacommons.org/version.
Make sure to update and track the mixer changes for a new deployment:
git submodule foreach git pull origin master
git submodule update --init --recursive
For changes that do not test GCP deployment or involve mixer changes, you can simply run the website in a local environment (Mac or Linux machine). This way the local Flask app talks to the autopush mixer.
Note: the autopush mixer contains the latest data and mixer code changes. It is necessary to update the mixer submodule if compatibility is required between website and mixer changes.
WARNING: Make sure to go through each of the following steps.
-
Python
Confirm the Python3 version is 3.11 or above. Otherwise install/upgrade your Python and confirm the version:
python3 --version
Set up your Python environment and update packages with:
./run_test.sh --setup_python
If using Python 3.12.x or above, you also need to run the following command. On macOS:
brew install python-setuptools
On Linux:
pip install python-setuptools
-
Node.js 18.4.0
Install nodejs and nvm. Run the following commands to use Node.js 18.4.0:
nvm install 18.4.0
nvm use 18.4.0
-
Protoc 3.21.9
-
[Optional] gcloud
gcloud is required to make place search work locally. This requires installing gcloud. Then ask the Data Commons team to grant you permission for the Google Maps API key access.
Finally, authenticate locally with:
gcloud auth application-default login
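The prerequisite checks above can be sketched as a small script. This is only an illustration, not part of the repo: the function names are hypothetical, and the credentials path is the default location that `gcloud auth application-default login` writes to on Linux and macOS.

```python
import os
import sys
from pathlib import Path

MIN_PYTHON = (3, 11)       # minimum Python version stated in this guide
SETUPTOOLS_FROM = (3, 12)  # from this version on, python-setuptools is also needed


def python_ok(version=None):
    """Return True if the given (or running) Python version is 3.11 or above."""
    version = tuple(version or sys.version_info)[:2]
    return version >= MIN_PYTHON


def needs_setuptools(version=None):
    """Return True if the extra python-setuptools install step applies."""
    version = tuple(version or sys.version_info)[:2]
    return version >= SETUPTOOLS_FROM


def adc_file(home=None, environ=None):
    """Return the application-default credentials file path, or None if absent.

    GOOGLE_APPLICATION_CREDENTIALS is checked first, then the default
    gcloud location under the home directory (Linux/macOS layout).
    """
    environ = os.environ if environ is None else environ
    override = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if override:
        path = Path(override)
        return path if path.is_file() else None
    home = Path(home) if home else Path.home()
    default = home / ".config" / "gcloud" / "application_default_credentials.json"
    return default if default.is_file() else None


if __name__ == "__main__":
    print("Python >= 3.11:", python_ok())
    print("Needs python-setuptools:", needs_setuptools())
    print("gcloud credentials found:", adc_file() is not None)
```

A missing credentials file only matters if you need place search; the lite environment described below does not require it.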
./run_npm.sh
This watches static files for changes and rebuilds on code edits.
NOTE: On macOS machines with an M1 chip, run the following command before running the above command. See this for more details.
brew install pkg-config cairo pango libpng jpeg giflib librsvg
Start the Flask webserver locally at localhost:8080:
./run_server.sh
If you don't have access to the Data Commons Maps API key, you can bring up the website without place search functionality:
./run_server.sh -e lite
There are multiple environments for the server, specified by the -e option. For example, custom is for custom Data Commons and iitm is for IITM Data Commons.
To start multiple instances, bind each server instance to a different port. The default port is 8080; the following example starts a server on port 8081:
./run_server.sh -p 8081
Note the strict syntax requirements of the script: leave a space after the flag. Use ./run_server.sh -p 8081, not ./run_server.sh -p=8081.
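When running several instances, it can help to check which ports are free before choosing a value for -p. The following is a minimal sketch using only the Python standard library; `port_is_free` and `next_free_port` are hypothetical helpers, not part of the repo.

```python
import socket

DEFAULT_PORT = 8080  # default port of ./run_server.sh, per this guide


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is currently bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except (OSError, OverflowError):
            # OSError: address in use; OverflowError: port outside 0-65535.
            return False
        return True


def next_free_port(start=DEFAULT_PORT, attempts=20):
    """Return the first free port in [start, start + attempts)."""
    for port in range(start, start + attempts):
        if port_is_free(port):
            return port
    raise RuntimeError(f"no free port in {start}..{start + attempts - 1}")
```

For example, if `next_free_port()` returns 8081, start the second instance with `./run_server.sh -p 8081` (with a space after the flag).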
To enable language models:
./run_server.sh -m
Natural language models are hosted on a separate server. For features that depend on it (all NL-based interfaces and endpoints), the NL server needs to be brought up locally (in a separate process):
./run_nl_server.sh -p 6060
By default the NL server runs on port 6060.
If you run into problems starting the server, try running these commands before restarting the server:
./run_test.sh --setup_python
rm -rf ~/.datacommons
rm -rf /tmp/datcom-nl-models
rm -rf /tmp/datcom-nl-models-dev
If a local mixer is needed, start it by following this instruction. This allows development with a custom Bigtable or with mixer code changes. Make sure to also run ESP locally.
Then start the Flask server with the -l option to let it use the local mixer:
./run_server.sh -l
Commit all changes locally, so the local change is identified by a git hash. Then run:
gcloud auth login
gcloud auth configure-docker
./scripts/push_image.sh
./scripts/deploy_gke_helm.sh -e dev
The scripts build a Docker image locally, tag it with the local git commit hash at HEAD, then deploy it to the dev instance in GKE.
View the deployment at link.
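Because the image is tagged with the commit hash at HEAD, uncommitted changes would be deployed under a hash that does not describe them. A pre-deploy check could look like the sketch below. The helper names are hypothetical, and the exact tag format used by push_image.sh is not shown here, so this only reads the full hash and the working tree status with standard git commands.

```python
import subprocess


def is_clean(porcelain_output):
    """Return True if `git status --porcelain` output lists no changes.

    Untracked files (lines starting with ??) also count as changes here.
    """
    return porcelain_output.strip() == ""


def working_tree_status(repo_dir="."):
    """Return the raw output of `git status --porcelain` for repo_dir."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return result.stdout


def head_hash(repo_dir="."):
    """Return the full commit hash at HEAD of the repository in repo_dir."""
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

A wrapper would call `is_clean(working_tree_status())` and stop before `./scripts/push_image.sh` if it returns False.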
❗IMPORTANT: Make sure that your ChromeDriver version is compatible with your local Google Chrome version.
Before running the tests, install the browser and webdriver. Here we recommend you use Google Chrome browser and ChromeDriver.
-
Chrome browser can be downloaded here.
-
ChromeDriver can be downloaded here, or you can download it directly using a package manager:
npm install chromedriver
You can view the latest ChromeDriver version here. Also make sure PATH is updated with the ChromeDriver location.
If using a Linux system, you can run the following commands to download the Chrome browser and ChromeDriver; this also includes the path setup:
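As a first compatibility check, you can compare the major version numbers reported by the two binaries. The sketch below assumes that a matching major version is a reasonable first signal (the official requirement may be stricter) and that you pass in the text printed by each tool's `--version` flag; the function names and sample strings are illustrative only.

```python
import re


def major_version(version_output):
    """Extract the major version number from a `--version` output string."""
    match = re.search(r"(\d+)\.\d+", version_output)
    if match is None:
        raise ValueError(f"no version number in {version_output!r}")
    return int(match.group(1))


def versions_compatible(chrome_output, chromedriver_output):
    """Return True if Chrome and ChromeDriver report the same major version."""
    return major_version(chrome_output) == major_version(chromedriver_output)


# Illustrative version strings, not taken from a real machine.
chrome = "Google Chrome 114.0.5735.90"
driver = "ChromeDriver 114.0.5735.90 (abcdef)"
print(versions_compatible(chrome, driver))
```

If the check fails, update whichever of the two is older before running the tests.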
wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
sudo dpkg -i google-chrome-stable_current_amd64.deb; sudo apt-get -fy install
CHROMEDRIVERV=$(curl https://chromedriver.storage.googleapis.com/LATEST_RELEASE)
wget https://chromedriver.storage.googleapis.com/${CHROMEDRIVERV}/chromedriver_linux64.zip
unset CHROMEDRIVERV
unzip chromedriver_linux64.zip
sudo mv chromedriver /usr/bin/chromedriver
sudo chown root:root /usr/bin/chromedriver
sudo chmod +x /usr/bin/chromedriver
❗ NOTE: If using macOS with an ARM processor (M1 chip), run the local NL server before running the tests:
./run_nl_server.sh -p 6060
./run_test.sh -a
cd static
npm test . -- -u
The autopush instance (autopush.datacommons.org) always has the latest code and data. To bring another dev/demo instance up to date, run the following in a clean git checkout:
./script/deploy_latest.sh <ENV_NAME> <REGION>
-
[Optional] Update variables in 'env' of 'Flask' configurations in .vscode/launch.json as needed.
-
In the left-hand side menu of VS Code, click on "Run and Debug".
-
On top of the "Run and Debug" pane, select "DC Website Flask" and click on the green "Play" button.
-
In "DEBUG CONSOLE" (not "TERMINAL"), check that the server logs show up.
This brings up the Flask server from the debugger. Now you can set breakpoints and inspect variables from the debugger pane.
TIP: you can inspect variables at the bottom of the "DEBUG CONSOLE" window.
A full tutorial on debugging a Flask app in Visual Studio Code is here.
-
Update server/config/chart_config/
<category>.json
with the new chart.
{
  "category": "", // The top level category this chart belongs to. Order of charts in the spec matters.
  "topic": "", // Strongly encouraged - A page-level grouping for this chart.
  "titleId": "", // Strictly for translation purposes.
  "title": "", // Default (EN) display string
  "description": "", // Strictly for translation purposes.
  "statsVars": [""], // List of stat vars to include in the chart
  "isOverview": true, // Optional - default false. If the chart should be added to the overview page.
  "isChoropleth": true, // Optional - default false. If a map should be used to display the data
  "unit": "",
  "scaling": 100,
  "relatedChart": {
    // Defined if there should be comparison charts added
    // All chart fields from above can be specified. If unspecified, it will be inherited.
  }
}
-
Update related files.
-
If adding a new category, create a new config file in server/chart_config and add the new category to:
-
If a new stat var is introduced, also update:
- Labels that appear as chips under comparison charts: static/js/i18n/strings/en/stats_var_labels.json
- Titles on ranking pages: static/js/i18n/strings/en/stats_var_titles.json
- New stat vars which have not been cached: NEW_STAT_VARS
-
If a new unit is required, update:
- static/js/i18n/i18n.tsx
- static/js/i18n/strings/*/units.json (with display names and labels for the unit in ALL languages)
Note: Please add very detailed descriptions to guide our translators. See localization.md for more details.
-
-
Run these commands:
./scripts/extract_messages.sh
./scripts/compile_messages.sh
-
IMPORTANT: Manually restart Flask to reload the config and translations. Most likely, this means re-running
./run_server.sh
-
Test the data on a place page!
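The chart config above says that unspecified fields in relatedChart are inherited from the parent chart. One way to picture that rule is the sketch below. It is an assumed reading of "inherited", not the server's actual code, and the function name and sample values are hypothetical.

```python
import copy


def resolve_related_chart(chart):
    """Return the effective config of a chart's relatedChart, or None.

    Fields set in relatedChart win; all other fields come from the
    parent chart. The parent's own relatedChart entry is not inherited.
    """
    related = chart.get("relatedChart")
    if related is None:
        return None
    inherited = {k: v for k, v in chart.items() if k != "relatedChart"}
    # Deep copy so edits to the result never leak back into the parent.
    return copy.deepcopy({**inherited, **related})


# Hypothetical chart entry for illustration.
chart = {
    "category": "Economics",
    "title": "Unemployment rate",
    "statsVars": ["UnemploymentRate_Person"],
    "unit": "%",
    "scaling": 100,
    "relatedChart": {"title": "Unemployment rate (comparison)", "scaling": 1},
}
print(resolve_related_chart(chart))
```

Here the related chart keeps the parent's category, statsVars and unit, and overrides only title and scaling.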
-
Disable headless mode in webdriver to follow the test in Chrome. Chrome features like the dev inspector are available in this mode, which is useful combined with sleep() to give you time to inspect the page. To enter this mode, comment out this line in base.py:
chrome_options.add_argument('--headless')
-
Another option is to save a screenshot at various points of the test:
self.driver.save_screenshot(filename)
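When saving several screenshots in one test, ordered file names make it easier to replay what happened. The helper below is a hypothetical sketch (it is not in the repo) that only builds file paths; the resulting path would then be passed to save_screenshot.

```python
import re
from pathlib import Path


class ScreenshotNamer:
    """Generate ordered screenshot paths such as <out_dir>/<test>_01_<label>.png."""

    def __init__(self, test_name, out_dir="screenshots"):
        self.test_name = self._safe(test_name)
        self.out_dir = Path(out_dir)
        self.count = 0

    @staticmethod
    def _safe(text):
        """Replace characters that are awkward in file names with underscores."""
        return re.sub(r"[^A-Za-z0-9_-]+", "_", text).strip("_")

    def next_path(self, label):
        """Return the next numbered path, creating the output directory if needed."""
        self.count += 1
        self.out_dir.mkdir(parents=True, exist_ok=True)
        name = f"{self.test_name}_{self.count:02d}_{self._safe(label)}.png"
        return self.out_dir / name


# Inside a test method, usage might look like:
#   namer = ScreenshotNamer("test_place_page")
#   self.driver.save_screenshot(str(namer.next_path("after load")))
```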
The GKE configuration is stored here.
Redis memcache is used for production deployment. Each cluster has a Redis instance located in the same region.
To test .yaml cloudbuild files, you can use cloud-build-local to dry run the file before actually pushing. Find documentation for how to install and use cloud-build-local here.