This repository contains three examples of extracting data from an HTML page using jsoup. I originally wrote them for a Turkish news website, and I can't share the cpanel pages for legal reasons.
CaptionGenerator extracts data from the shared news page and generates a list of the news topics shared during the last day, in alphabetical order. A separate section lists the same topics ordered by views.
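The two orderings can be sketched in plain Java as follows. This is only an illustration, not the code in this repository: the `NewsItem` class, its fields, and the sample titles are hypothetical, and "ordered by views" is assumed to mean most viewed first. The jsoup extraction step is left out so the sketch runs with the standard library alone.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class CaptionOrderingSketch {
    // Hypothetical holder for one shared news item; the real field names are not given here.
    static final class NewsItem {
        final String title;
        final int views;

        NewsItem(String title, int views) {
            this.title = title;
            this.views = views;
        }
    }

    // Returns a new list sorted by title, ignoring case. The input list is not modified.
    static List<NewsItem> alphabetical(List<NewsItem> items) {
        List<NewsItem> sorted = new ArrayList<>(items);
        sorted.sort(Comparator.comparing((NewsItem n) -> n.title, String.CASE_INSENSITIVE_ORDER));
        return sorted;
    }

    // Returns a new list sorted by view count, highest first (assumed direction).
    static List<NewsItem> byViews(List<NewsItem> items) {
        List<NewsItem> sorted = new ArrayList<>(items);
        sorted.sort(Comparator.comparingInt((NewsItem n) -> n.views).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for the extracted news list.
        List<NewsItem> items = List.of(
                new NewsItem("Weather update", 120),
                new NewsItem("Election results", 950),
                new NewsItem("Local sports", 300));

        System.out.println("Alphabetical:");
        for (NewsItem n : alphabetical(items)) {
            System.out.println("  " + n.title);
        }

        System.out.println("By views:");
        for (NewsItem n : byViews(items)) {
            System.out.println("  " + n.views + "  " + n.title);
        }
    }
}
```

For Turkish titles, a `java.text.Collator` created for the Turkish locale would probably give a more natural alphabetical order than `String.CASE_INSENSITIVE_ORDER`; the simpler comparator is used here to keep the sketch short.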
CaptionGeneratorMonthly generates the monthly shared news for a single user and orders them by views. It uses exactly the same logic as CaptionGenerator, but it also generates the user's most viewed news.
CaptionTakerPlagiarism extracts a chosen range of news links from the shared news panel. It then visits each of those news pages and generates their plagiarism rates.
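The flow described for CaptionTakerPlagiarism (pick a range of links, then process each one) might look roughly like the sketch below. Everything in it is an assumption made for illustration: the class and method names are hypothetical, the range is treated as 1-based and inclusive, and the plagiarism check is passed in as a function because the way the rate is computed is not described here. In the real tool the link list would come from the jsoup document, for example through a selector such as `a[href]`, rather than from a hard-coded list.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToDoubleFunction;

public class PlagiarismRangeSketch {
    // Returns the links at positions first..last (1-based, inclusive),
    // clamped to the bounds of the list. An empty list is returned when
    // the range does not overlap the list at all.
    static List<String> selectRange(List<String> links, int first, int last) {
        int from = Math.max(first, 1) - 1;
        int to = Math.min(last, links.size());
        if (from >= to) {
            return List.of();
        }
        return List.copyOf(links.subList(from, to));
    }

    // Applies the supplied checker to every link and keeps the results in visiting order.
    // The checker is a stand-in for "visit the page and compute its plagiarism rate".
    static Map<String, Double> rateAll(List<String> links, ToDoubleFunction<String> checker) {
        Map<String, Double> rates = new LinkedHashMap<>();
        for (String link : links) {
            rates.put(link, checker.applyAsDouble(link));
        }
        return rates;
    }

    public static void main(String[] args) {
        // Placeholder links; example.com is used because the real URLs are not public.
        List<String> links = List.of(
                "https://example.com/news/1",
                "https://example.com/news/2",
                "https://example.com/news/3",
                "https://example.com/news/4",
                "https://example.com/news/5");

        List<String> chosen = selectRange(links, 2, 4);

        // Dummy checker that always reports 0.0; a real one would fetch and analyse the page.
        Map<String, Double> rates = rateAll(chosen, url -> 0.0);
        rates.forEach((url, rate) -> System.out.println(url + " -> " + rate));
    }
}
```

Passing the checker as a parameter keeps the range selection testable without network access, which is the only reason for that design choice in this sketch.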