Skip to content

Latest commit

 

History

History
141 lines (126 loc) · 4.22 KB

README.md

File metadata and controls

141 lines (126 loc) · 4.22 KB

This is the best calendar_table with the most columns of interesting date dimensions you'll ever find!

Most recent successful run was on June 7, 2023 using Python 3.7.3, pandas 0.25.1 ; numpy 1.16.2

PURPOSE:

Quickly and easily generate a calendar table with many columns of date dimensions and metadata. Output to dataframe or CSV to ingest into a database or for use in an application like Excel or PowerBI.

WHAT KIND OF COLUMNS?

First of all, you'll find the usual dimensions:
  • year number, month number, day number, yearmonth, yearquarter
  • month name, day name, full date w ordinal suffix (e.g., "January 3rd, 2012")
  • etc.

Additionally, some interesting and unique data elements include:

  • holiday identification including Easter in any year (including pre-1583)
  • moon phase identification
  • sunrise/sunset for UTC as well as user-provided lat/lon coordinates
  • length of daylight / darkness each day/evening

For more information about the date dimensions and fields available, see these supporting documents:

WHY CREATE A CALENDAR TABLE / DATE DIMENSION TABLE?

Calendar tables have myriad uses. Examples:
  • PROCESS FLOW CONTROLS
    Check today's date. If it is a Sunday or the last day of the month, then run a code block.
  • SLICING DATA FOR ANALYSIS
    From the revenue dataset, show me all sales that fall between Thanksgiving and Christmas, and determine which days be best for best door- buster promotions
  • FEATURES FOR MACHINE LEARNING MODELS
    Using a polynomial linear regression, predict the end-of-month sales using the month-to-date sales along with the knowledge of what percent through the month the data is, comparing to similar data from prior months.
  • DATA VISUALIZATION
    In Microsoft PowerBI, or in Excel, or in Tableau, or in any tool... Use the workday of the month as the x-axis in a bar chart that shows the change in time clocked by employees on a given project in a given month
  • AND MORE?
    Send me more examples to add!

INSTALLATION/SETUP:

Clone, checkout, or download the .py and run it in Python 3.7+. No special imports will be needed outside of standard libraries.

NOTES:

  • Example CSV output is provided in git repo
  • When this code is run for a span of 5 years:
    • ~2,200 rows of data are created with ~110 columns (~240k cells)
    • the resulting CSV's filesize is ~1.5 MiB
    • the script takes ~28 seconds on my raspberry pi 4B (very low specs)
    • code will run much faster on a modern laptop or desktop
  • No special import statements needed besides standard Python 3.7+ packages
  • 32-bit OS and hardware may cause limitations in the ability to generate table data for years far in the future (2040) and throw "OverflowError: timestamp out of range for platform time_t"; no limitation has been observed on 64-bit systems.

TO DO:

  • modularize codes with classes/functions
  • use config file to set main params (start, end, holiday rules, lat/lon for localized data like moon phase, sunrise, etc.)
  • add columns
    • percent through w / ym / y / etc
    • add season based on official start/end of seasons? (equinox, etc)
  • fix bugs in the following:
    • night duration utc / night duration local

AUTHOR:

This file was authored by TeneoPython01 with select code excerpts leveraged from other authors where noted. In some cases the noted authors' code has been materially modified. In other cases it has been used verbatim. All other code is original.

ABOUT AUTHOR:

Contact me at https://github.com/TeneoPython01

COLLABORATION?

Collaboration is sought and encouraged on this project!

LICENSE:

See license information in git repo. The software is released in the public domain as-is without any warranty. More details can be found in the license file.