Add custom HTML elements to SchemaSpy pages using Python scripts
Keywords: Python, Web scraping, CSV, HTML, automation
- Automate my process of inserting descriptions for multiple tables and fields in the data dictionary.
- Optimize my workflow for writing descriptions for more than a thousand fields by retrieving only the unique fields, so each one is described once.
- Add table descriptions dynamically to the SchemaSpy index page
  - Original index.html
  - Resulting index.html with table descriptions for all 139 tables
- Add field descriptions dynamically to each SchemaSpy table page
  - Original cfg_billing_id.html
  - Resulting cfg_billing_id.html with its table description (same as in the index) and field descriptions
- Install pip. (How to install on Mac)
- Install BeautifulSoup4.
cd beautifulsoup4-4.6.0
python setup.py install
- Install Pathlib
sudo pip install pathlib
- Add the SchemaSpy folder to the Data folder.
- [For Settlement] The Result folder should have the following subfolders:
- settlement
- settlement_csv
- settlement_tables_desc
- Date: July 13, 2017
- Table Count: 139
- Field Count: 2,862
- Note: Forgot to publish release (Sorry!)
- Complete Prerequisites
- Export table list to CSV (c/o Google Sheets)
- Update table descriptions of index and table pages using writetabledescriptiontohtml.py
- If no fields yet:
  - Retrieve and save all unique fields to CSV using retrieveuniquefields.py
  - Write fields and field descriptions to CSV using writefieldstocsv.py
- If field descriptions are complete in CSV: Update field descriptions of all tables using writefielddescriptionstohtml.py
- Date: October 16, 2017
- Table Count: 186
- Field Count: 4,066
- Note: Applied web scraping to new tables and fields
- Complete Prerequisites
- Export table list with descriptions to CSV (c/o Google Sheets)
- Update table descriptions of index and table pages using writeTableDescriptiontoHTML.py
- If no more fields yet:
  - Retain SOW9 unique fields (unique_fields-sow9.csv)
  - Retrieve all new SOW10 fields to CSV using retrievenewfields.py
  - Write SOW9 and SOW10 fields to CSV using writeFieldsToCSV.py
- If field descriptions are complete in CSV: Update field descriptions of all tables using writeFieldDescriptionsToHTML.py
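Retrieving only the new SOW10 fields amounts to a set difference between the field names scraped for the current release and those already kept in unique_fields-sow9.csv. The following is a minimal sketch of that idea, not the code in retrievenewfields.py: the `field` column header, the sample field names, and the helper name `find_new_fields` are all assumptions made for illustration.

```python
import csv
import io


def find_new_fields(known_csv_text, scraped_fields):
    """Return scraped field names that are missing from the known-fields CSV.

    Assumes the CSV has a header row with a 'field' column (an assumption,
    not the confirmed layout of unique_fields-sow9.csv).
    """
    reader = csv.DictReader(io.StringIO(known_csv_text))
    known = {row["field"].strip() for row in reader}

    # Keep first-seen order while dropping duplicates and known names.
    new_fields = []
    seen = set()
    for name in scraped_fields:
        name = name.strip()
        if name and name not in known and name not in seen:
            seen.add(name)
            new_fields.append(name)
    return new_fields


# Hypothetical sample data standing in for the SOW9 CSV and a fresh scrape.
sow9_csv = (
    "field,description\n"
    "billing_id,Billing identifier\n"
    "created_date,Row creation date\n"
)
scraped = ["billing_id", "settlement_run", "created_date",
           "settlement_run", "trading_date"]

new_fields = find_new_fields(sow9_csv, scraped)
print(new_fields)
```

Only the names returned here would need new descriptions; the SOW9 descriptions can be reused as they are.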
- Export table masterlist with descriptions to CSV (Google Sheets).
  - Default Directory: ../../Google Drive/Python/CSV_dump/Settlement-Tables-Descriptions.csv
- Run writetabledescriptiontohtml.py
  - Write table description to each table HTML page
  - Result: Result/settlement_tables_desc/tables/
- Run retrieveuniquefields.py
  - Retrieve all common and unique fields from all table HTML pages. Save to CSV
  - Result: Result/settlement_csv/unique_fields.csv
- Update unique_fields.csv
  - User can add descriptions for all unique fields in just one CSV file
- Run writefieldstocsv.py
  - Retrieve fields from table HTML. Add descriptions of common and unique fields from unique_fields.csv
  - Result: Result/settlement_csv/*
- (Optional) Update Result/settlement_csv/* CSV files
  - User can modify descriptions for specific table CSV files
- Run writefielddescriptionstohtml.py
  - HTML Source: Result/settlement_tables_desc/tables/
  - Content Source: Result/settlement_csv/*
  - Write field descriptions from table CSV to each table HTML.
  - Result: Result/settlement/tables
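The step that writes one CSV per table is essentially a lookup: each field found on a table page gets the description stored for it in unique_fields.csv. Here is a rough sketch of that lookup, assuming `field` and `description` column headers; the helper names `load_descriptions` and `build_table_csv` are hypothetical and are not taken from writefieldstocsv.py.

```python
import csv
import io


def load_descriptions(unique_csv_text):
    """Map each field name to its description from the shared CSV text."""
    reader = csv.DictReader(io.StringIO(unique_csv_text))
    return {row["field"]: row["description"] for row in reader}


def build_table_csv(table_fields, descriptions):
    """Return CSV text listing one table's fields with their descriptions.

    Fields without an entry get an empty description so they can be
    filled in by hand later.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["field", "description"])
    for name in table_fields:
        writer.writerow([name, descriptions.get(name, "")])
    return out.getvalue()


# Hypothetical sample data standing in for unique_fields.csv.
unique_csv = (
    "field,description\n"
    "billing_id,Billing identifier\n"
    "created_date,Row creation date\n"
)

descriptions = load_descriptions(unique_csv)
table_csv = build_table_csv(["billing_id", "remarks"], descriptions)
print(table_csv)
```

Because every table CSV is generated from the same mapping, a field shared by many tables only has to be described once; table-specific wording can still be adjusted afterwards in the individual CSV files.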
- Google Apps Script
  - Export Tables Masterlist to CSV
  - Script not in this repository
- Python
  - Read HTML files from SchemaSpy folder (BeautifulSoup)
  - Retrieve select items from HTML pages (BeautifulSoup)
  - Modify HTML tag attributes (BeautifulSoup)
  - Read CSV files
  - Write CSV files
  - Write HTML in HTML files based on CSV content (Pathlib)
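The BeautifulSoup features listed above can be illustrated with a small, self-contained sketch. The HTML fragment below is made up for the example: real SchemaSpy pages are larger, and the `columns` id and `column` class are assumptions about the markup rather than confirmed names.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment; real SchemaSpy pages may use different ids/classes.
PAGE = """
<html><body>
<label for='showComments'><input type=checkbox id='showComments'>Comments</label>
<table id='columns'>
  <tr><th>Column</th><th>Type</th></tr>
  <tr><td class='column'>billing_id</td><td>int</td></tr>
  <tr><td class='column'>created_date</td><td>date</td></tr>
</table>
</body></html>
"""

# Read the HTML with the parser bundled in the standard library.
soup = BeautifulSoup(PAGE, "html.parser")

# Retrieve select items: the text of every cell marked as a column name.
columns_table = soup.find("table", id="columns")
field_names = [
    td.get_text(strip=True)
    for td in columns_table.find_all("td", class_="column")
]

# Modify a tag attribute: tick the Comments checkbox by default.
checkbox = soup.find("input", id="showComments")
checkbox["checked"] = ""

print(field_names)
print(checkbox.has_attr("checked"))
```

Calling `str(soup)` afterwards gives the modified markup, which can then be written back to the page file.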
Check out my logs!
Done
- Line 92: Add the following tag
<!----Table Description---->
<br>
<div><strong>Description: </strong> {Insert description here from csv source}</div>
<br>
<!----Table Description---->
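One way to place this block at a fixed line is to split the page into lines and splice the block in. The sketch below assumes that approach; `insert_description` is a hypothetical helper, the sample page is made up, and the scripts in this repository may do it differently. The description is HTML-escaped so that characters such as `&` from the CSV do not break the page.

```python
import html

DESCRIPTION_BLOCK = (
    "<!----Table Description---->\n"
    "<br>\n"
    "<div><strong>Description: </strong> {description}</div>\n"
    "<br>\n"
    "<!----Table Description---->\n"
)


def insert_description(page_text, description, line_number):
    """Insert the description block so that it starts at line_number (1-based)."""
    lines = page_text.splitlines(keepends=True)
    if not 1 <= line_number <= len(lines) + 1:
        raise ValueError("line_number is outside the page")
    block = DESCRIPTION_BLOCK.format(description=html.escape(description))
    # Make sure the preceding line ends with a newline before splicing.
    if line_number > 1 and not lines[line_number - 2].endswith("\n"):
        lines[line_number - 2] += "\n"
    lines.insert(line_number - 1, block)
    return "".join(lines)


# Tiny stand-in page; the real target is line 92 of a SchemaSpy table page.
page = "<html>\n<body>\n<h1>cfg_billing_id</h1>\n</body>\n</html>\n"
updated = insert_description(page, "Billing IDs & settings", line_number=4)
print(updated)
```

With pathlib, the same function could be applied to a file by reading it with `Path.read_text()` and saving the result with `Path.write_text()`.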
Done
- Line 40: Add checked to the Comments checkbox
<label for='showComments'><input type=checkbox checked id='showComments'>Comments</label>