Kirill Rubashevskiy kirill-rubashevskiy

Hi! I'm Kirill Rubashevskiy

About Me

I'm a Machine Learning graduate student at the Higher School of Economics, passionate about developing data-driven solutions. Based in Berlin, I'm seeking an entry-level position or internship in Data Science or Data Analysis.

Skills

Programming & Data Tools

Machine Learning & Deep Learning

Web Scraping & Automation

MLOps & DevOps

Professional Goals

Currently seeking an entry-level position or internship in Data Science or Data Analysis in Berlin, with a focus on:

Machine Learning Engineering
Data Analysis
MLOps & Data Pipeline Development

Education

Master's in Machine Learning and High Load Systems

Higher School of Economics, 2023-2025

Languages

🇷🇺 Russian (Native)
🇬🇧 English (Fluent)
🇩🇪 German (A2)

Work in Progress

Graildient Descent

A machine learning project that predicts sold prices for high-end, pre-owned, and limited-edition fashion items on the Grailed marketplace. The project demonstrates a complete ML system featuring:

🔄 Data Pipeline

Selenium-based web scraper for Grailed listings
Apache Airflow ETL pipeline for automated data collection
Comprehensive data cleaning and preprocessing workflow

🔍 ML Development

Multimodal approach combining tabular and text features
Extensive ML experiments with various models and architectures
Best model: CatBoost, achieving a 37.1% improvement over the baseline (RMSLE: 0.64)
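
For readers unfamiliar with the metric, here is a minimal sketch of how RMSLE and a relative improvement over a baseline are commonly computed. It is not taken from the project's code: the function names `rmsle` and `relative_improvement` and the example prices are hypothetical, and the project may define "improvement" differently.

```python
import math


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error for non-negative targets."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    # log1p(x) = log(1 + x), so a price of 0 is handled safely
    squared_log_errors = [
        (math.log1p(pred) - math.log1p(true)) ** 2
        for true, pred in zip(y_true, y_pred)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))


def relative_improvement(baseline_error, model_error):
    """Fraction by which the model's error is lower than the baseline's."""
    return (baseline_error - model_error) / baseline_error


# Hypothetical sold prices and predictions, for illustration only
actual_prices = [120.0, 450.0, 80.0]
predicted_prices = [100.0, 500.0, 95.0]
print(f"RMSLE: {rmsle(actual_prices, predicted_prices):.3f}")
```

Because the error is measured on log-transformed prices, RMSLE penalizes relative rather than absolute mistakes, which suits listings whose prices span several orders of magnitude.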

🚀 Production System

FastAPI service providing real-time price predictions
Interactive Streamlit app for model exploration and predictions
CI/CD pipeline with GitHub Actions

⚙️ Tech Stack: Python, FastAPI, Streamlit, Apache Airflow, Weights & Biases, scikit-learn, CatBoost

🔗 Live Demo | GitHub Repository

Completed Projects

Authorship Identification

A machine learning service that identifies the author of a text among ten prominent 19th-century Russian writers. The project showcases a complete ML pipeline featuring:

🔄 Data Pipeline

Web scraping of literary texts from open sources
Comprehensive text preprocessing and cleaning workflow
Feature engineering pipeline for text data

🔍 ML Development

Text feature extraction using statistical metrics and Bag-of-Words
Extensive experimentation with various NLP approaches
Best model: Logistic Regression, achieving a weighted F1 score of 0.85
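
As a rough illustration of what a weighted F1 score measures in a multi-class setting like this one, the sketch below computes per-class F1 and averages it by class support. It is an assumption-laden example, not the project's evaluation code: the function name `weighted_f1` and the labels `author_a` and `author_b` are hypothetical placeholders.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0
    for label, count in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = count - tp
        denominator = 2 * tp + fp + fn
        f1 = 2 * tp / denominator if denominator else 0.0
        # Each class contributes in proportion to its share of the data
        score += f1 * count / total
    return score


# Hypothetical labels, for illustration only
true_authors = ["author_a", "author_a", "author_b"]
predicted_authors = ["author_a", "author_b", "author_b"]
print(f"Weighted F1: {weighted_f1(true_authors, predicted_authors):.3f}")
```

Weighting by support means classes with more texts influence the score more, which matters when the corpus is unevenly split across authors.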

🚀 Production System

FastAPI service for real-time authorship predictions
Telegram bot for user-friendly interaction
Cloud storage integration for model artifacts

⚙️ Tech Stack: Python, FastAPI, scikit-learn, NLTK, Docker, Telegram API

Note: This project is primarily in Russian.

🔗 GitHub Repository
