This project is about building and analysing a Data Warehouse prototype with a Semi Stream Join (INLJ) implementation.
The operation of the data warehouse can be broken down into 3 major steps:
- Creation of the Data Warehouse.
- Extraction, Transformation, and Loading using the INLJ algorithm.
- Creation of Reports using OLAP Queries.
The CUSTOMERS
, PRODUCTS
, and TRANSACTIONS
tables should already be loaded in the database.
- Open SQLDeveloper.
- Run the SQL script file createDW.sql which will create the necessary tables for the data warehouse according to the star-schema.
All existing tables with the same name will be dropped and replaced by empty tables named
D_CUSTOMERS
,D_PRODUCTS
,D_STORES
,D_SUPPLIERS
, andD_TIME
(dimension tables); andW_FACTS
(fact table).
Once completed, the above 6 tables would have been created.
- With SQLDeveloper open, open and run the INLJ.sql PL-SQL file which will implement the INLJ algorithm.
- The algorithm involves the creation of the
TRANSACTIONSCURSOR
which is a cursor that will be used to read the tuples from theTRANSACTIONS
table. - The
BULK COLLECT...LIMIT=100
statement limits the number of tuples per batch read to 100. - It is possible to change this by replacing the number "100" by any number less than the total number of records in the
TRANSACTIONS
table.
Once completed, the script would have loaded the data in the relevant Data Warehouse tables.
- With SQLDeveloper open, select which report is to be created.
- Select the lines relevant to the report to be created - the file contains separations between each query.
- Select the lines from the comment denoted by
/*....*/
to the end of the code which is denoted by a;
and precedes another comment denoting the lines used to generate the next report.