LinkedIn Profile: https://www.linkedin.com/in/rohan-raj-8ba608b8/
- A Visual Narrative of Ramayana using Extractive Summarisation, Topic Modeling and NER tagging. link
- Building a pipeline to process real-time data using Spark and MongoDB/PostgreSQL. link
- ETL workflow and data analysis: an ETL workflow using Prefect and pygrametl with slowly changing dimensions (SCD), plus product classification based on product name. link
- Data-Driven-Storytelling-Old-Car-Price-Prediction link
Thesis Title: End to End UX Analytics Framework.
The research focuses on building an infrastructure that supports analytical insights drawn from all relevant data. The platform will cover various forms of data and analytics: transactional data, order data, app usage data, user data, and so forth. It will also establish an adaptable, scalable IT infrastructure, tuned for a complex data environment and designed to benefit from cloud technologies. The end goal is to enable data analytics that informs business and design decisions.
Key terms: Data Infrastructure, Data Warehouse, Business Intelligence, ETL, Continuous Integration and Continuous Deployment (CI/CD) Pipeline, Docker, Kubernetes (OKD), GitLab, PostgreSQL, Tableau, Python, AWS services, GCP services, Azure services, Cloud computing
Thesis Title: New avenues in opinion mining : Considering dual sentiment analysis.
To address this problem in sentiment classification, dual sentiment analysis (DSA) has been expanded from a 2-facet classification to a 3-facet classification, which also considers neutral reviews in the data set for better accuracy and understanding. For each training and test review, a novel data expansion technique is proposed that uses the opposite class labels of positive and negative sentiment in one-to-one correspondence for a dual training and dual prediction algorithm. A corpus-based pseudo-antonym dictionary is also proposed to remove the restriction to a single language (English) and to maintain domain consistency, since it pairs words on the basis of sentiment strength.
paper
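The data expansion step described in the abstract above can be illustrated with a minimal sketch. It is not the thesis implementation: the tiny `ANTONYMS` dictionary is a hypothetical stand-in for the corpus-based pseudo-antonym dictionary, the function names are invented for illustration, and keeping neutral reviews labeled neutral after reversal is an assumption.

```python
# Hypothetical toy antonym dictionary. The thesis proposes deriving a
# pseudo-antonym dictionary from a corpus; that step is not reproduced here.
ANTONYMS = {"good": "bad", "bad": "good", "love": "hate", "hate": "love"}

# Opposite class labels in one-to-one correspondence.
# Assumption: a neutral review stays neutral when reversed.
OPPOSITE_LABEL = {
    "positive": "negative",
    "negative": "positive",
    "neutral": "neutral",
}


def reverse_review(text, label):
    """Return the sentiment-reversed review and its opposite label."""
    tokens = text.lower().split()
    # Swap each sentiment word for its (pseudo-)antonym; keep other words.
    reversed_tokens = [ANTONYMS.get(tok, tok) for tok in tokens]
    return " ".join(reversed_tokens), OPPOSITE_LABEL[label]


def expand(dataset):
    """Pair every (text, label) example with its reversed counterpart."""
    expanded = []
    for text, label in dataset:
        expanded.append((text, label))
        expanded.append(reverse_review(text, label))
    return expanded


if __name__ == "__main__":
    reviews = [("I love this good phone", "positive")]
    for text, label in expand(reviews):
        print(label, "|", text)
```

A classifier trained on the expanded set sees each review together with its reversed form, which is the idea behind dual training; dual prediction would score a test review and its reversal in the same way.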