We were tasked with understanding supply chain disruptions in the form of late deliveries and with training a model to predict product delays based on the company's historical inbound/outbound orders. In practical terms, the task is to improve supply chain resilience through disaster prevention and damage control, and to add value by reevaluating business partnerships in the iterative process of supply chain optimization.
What is expected from the technical challenge?
- Provide data insights based on descriptive analytics of historical data
- Regression model to predict likelihood of order delay
What will be measured?
- Business case presentation outlining descriptive insights into the main drivers of order delays, and how those insights can be translated into business actions and a value proposition.
- Predictive model that outputs the likelihood of order delay, evaluated by the area under the ROC curve (AUC) on a given test data set. A Kaggle competition has been created for teams to submit and test their model results. Please sign up for the competition and follow the instructions.
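To make the AUC metric concrete, here is a minimal sketch in plain Python. It computes AUC as the fraction of (late, on-time) order pairs in which the late order received the higher predicted likelihood, with ties counted as half. The function name `roc_auc` and the sample labels and scores are illustrative assumptions, not part of the competition material.

```python
def roc_auc(y_true, y_score):
    """ROC AUC as the share of (late, on-time) pairs ranked correctly.

    A tie between a late and an on-time order counts as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]  # scores of late orders
    neg = [s for y, s in zip(y_true, y_score) if y == 0]  # scores of on-time orders
    if not pos or not neg:
        raise ValueError("AUC needs at least one late and one on-time order")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = late) and predicted likelihoods, for illustration only.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
print(auc)
```

Because AUC depends only on how the orders are ranked, a model does not need calibrated probabilities to score well; any monotonic transformation of the scores gives the same value. The pairwise loop is quadratic, so for a full test set a library implementation would be the more practical choice.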
Some information about the data given in the Kaggle competition.
Transactional historical data of the company's supply chain inbound/outbound shipments
- order_id (string): unique identifier of the transactional order from inbound port to final destination. Primary key of the data set.
- origin_port (string): location of the port where imported orders arrive.
- 3pl (string): ID of the third-party logistics company used for distribution, warehousing, and fulfillment services.
- customs_procedure (string): type of procedure used in the import legal process.
- logistic_hub (string): city name of the company's logistic hub address. Intermediate step between origin_port and customer.
- customer (string): city name of the customer destination address.
- product_id (string): unique identifier of the final product.
- units (integer): order size (quantity).
- late_order (boolean): target variable; 1 if the order_id has been tagged as a late delivery, 0 if it was on time.
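As a starting point for the descriptive analytics, the share of late orders per category of a column can be computed directly from these fields. The sketch below uses only the standard library; the sample rows, their values, and the helper name `late_rate_by` are hypothetical, and it assumes late_order is stored as 0/1 as described above.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows following the column list above; not real competition data.
SAMPLE = """order_id,origin_port,3pl,customs_procedure,logistic_hub,customer,product_id,units,late_order
o1,Rotterdam,v_001,DTP,Venlo,Paris,p1,100,1
o2,Rotterdam,v_002,CRF,Venlo,Berlin,p2,50,0
o3,Athens,v_001,DTP,Rome,Paris,p1,20,1
o4,Rotterdam,v_002,DTP,Venlo,Madrid,p3,10,0
"""


def late_rate_by(rows, column):
    """Return {category: share of late orders} for the given column."""
    counts = defaultdict(lambda: [0, 0])  # category -> [late orders, total orders]
    for row in rows:
        tally = counts[row[column]]
        tally[0] += int(row["late_order"])  # assumes the target is encoded as 0/1
        tally[1] += 1
    return {value: late / total for value, (late, total) in counts.items()}


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
rates = late_rate_by(rows, "3pl")
print(rates)
```

The same helper can be pointed at origin_port, customs_procedure, or logistic_hub to compare candidate drivers of delay. Categories with very few orders give noisy rates, so it is worth reporting the order count alongside each rate.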
Master data of product unit weight
- product_id (string): unique identifier of the final product.
- weight (integer): product weight per unit, in grams.
- material_handling (integer): classification ID for product safety risk and risk of damage, e.g. fragile, toxic, flammable.
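One feature that follows from combining the product master data with the orders is the total weight of an order: units multiplied by weight per unit. The sketch below is one possible way to derive it; the sample values and the names `unit_weight_g`, `add_order_weight_kg`, and `order_weight_kg` are assumptions for illustration.

```python
# Hypothetical product master data: product_id -> weight per unit in grams.
unit_weight_g = {"p1": 250, "p2": 1200}

# Hypothetical orders, reduced to the columns needed here.
orders = [
    {"order_id": "o1", "product_id": "p1", "units": 100},
    {"order_id": "o2", "product_id": "p2", "units": 50},
    {"order_id": "o3", "product_id": "p9", "units": 10},  # product missing from master data
]


def add_order_weight_kg(orders, unit_weight_g):
    """Return copies of the orders with an added total weight in kilograms."""
    enriched = []
    for order in orders:
        grams = unit_weight_g.get(order["product_id"])
        # Keep the order even when the product is unknown; mark the weight as missing.
        weight_kg = None if grams is None else order["units"] * grams / 1000
        enriched.append({**order, "order_weight_kg": weight_kg})
    return enriched


enriched = add_order_weight_kg(orders, unit_weight_g)
for order in enriched:
    print(order["order_id"], order["order_weight_kg"])
```

Keeping orders whose product is missing from the master data (instead of dropping them) preserves every row of the test set, which still needs a prediction.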
Geographic coordinates of the cities involved in the supply chain, including the distance between each pair of cities
- city_from_name (string): name of the starting city.
- city_to_name (string): name of the destination city.
- city_from_coord (tuple): coordinates of city_from as (latitude, longitude).
- city_to_coord (tuple): coordinates of city_to as (latitude, longitude).
- distance (float): distance in kilometers between the pair of cities.
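The description does not say how the distance column was computed (great-circle or road distance). As a sanity check, or to fill in a city pair that is missing, the great-circle distance can be derived from the coordinates with the haversine formula. The sketch below assumes the coordinate tuples arrive as text such as "(48.8566, 2.3522)"; the helper names and the two example cities are illustrative.

```python
import ast
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def parse_coord(text):
    """Parse a '(latitude, longitude)' string into a tuple of floats."""
    lat, lon = ast.literal_eval(text)
    return float(lat), float(lon)


def haversine_km(coord_from, coord_to):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, coord_from)
    lat2, lon2 = map(math.radians, coord_to)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Example coordinates for two well-known cities, used only for illustration.
paris = parse_coord("(48.8566, 2.3522)")
berlin = parse_coord("(52.52, 13.405)")
d = haversine_km(paris, berlin)
print(round(d))
```

If the provided distance values are consistently larger than the haversine result, they are probably route distances, and the ratio between the two could itself be an informative feature.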
Same as orders.csv, but the variable late_order has been truncated. This is the target variable.