Hi! I'm Kirill Rubashevskiy

About Me

I'm a Machine Learning graduate student at the Higher School of Economics, passionate about developing data-driven solutions. Based in Berlin, I'm seeking an entry-level position or internship in Data Science and Data Analysis.

Skills

Programming & Data Tools

Python Pandas NumPy SciPy PostgreSQL Matplotlib

Machine Learning & Deep Learning

scikit-learn PyTorch

Web Scraping & Automation

Selenium

MLOps & DevOps

Apache Airflow MLflow Weights & Biases Docker FastAPI Git Amazon S3

Professional Goals

Currently seeking an entry-level position or internship in Data Science and Data Analysis in Berlin, with a focus on:

  • Machine Learning Engineering
  • Data Analysis
  • MLOps & Data Pipeline Development

Education

Master's in Machine Learning and High Load Systems

Higher School of Economics, 2023-2025

Languages

  • 🇷🇺 Russian (Native)
  • 🇬🇧 English (Fluent)
  • 🇩🇪 German (A2)

Work in Progress

Graildient Descent

A machine learning project that predicts sold prices for high-end, pre-owned, and limited-edition fashion items on the Grailed marketplace. The project demonstrates a complete ML system featuring:

🔄 Data Pipeline

  • Selenium-based web scraper for Grailed listings
  • Apache Airflow ETL pipeline for automated data collection
  • Comprehensive data cleaning and preprocessing workflow

🔍 ML Development

  • Multimodal approach combining tabular and text features
  • Extensive ML experiments with various models and architectures
  • Best model: CatBoost achieving 37.1% improvement over baseline (RMSLE: 0.64)
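
For readers unfamiliar with the metric above, here is a minimal sketch of how RMSLE and a relative improvement over a baseline are typically computed. It uses only the standard library; the function names and the toy prices are illustrative assumptions, not code or data from the project.

```python
import math


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # log1p(x) = log(1 + x), which keeps zero prices well-defined
    squared_log_errors = [
        (math.log1p(pred) - math.log1p(true)) ** 2
        for true, pred in zip(y_true, y_pred)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))


def relative_improvement(baseline_score, model_score):
    """Fraction by which an error metric drops relative to a baseline."""
    return (baseline_score - model_score) / baseline_score


# Toy sold prices in USD (made up for illustration only)
actual = [120.0, 450.0, 60.0]
predicted = [100.0, 500.0, 75.0]
print(f"RMSLE: {rmsle(actual, predicted):.3f}")
```

Because the error is measured on log prices, a prediction that is off by a given ratio costs the same whether the item sold for $50 or $5,000, which suits a marketplace with a wide price range.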

🚀 Production System

  • FastAPI service providing real-time price predictions
  • Interactive Streamlit app for model exploration and predictions
  • CI/CD pipeline with GitHub Actions

⚙️ Tech Stack: Python, FastAPI, Streamlit, Apache Airflow, Weights & Biases, scikit-learn, CatBoost

🔗 Live Demo | GitHub Repository

Completed Projects

Authorship Identification

A machine learning service that identifies the authorship of texts among ten prominent Russian 19th-century writers. The project showcases a complete ML pipeline featuring:

🔄 Data Pipeline

  • Web scraping of literary texts from open sources
  • Comprehensive text preprocessing and cleaning workflow
  • Feature engineering pipeline for text data

🔍 ML Development

  • Text feature extraction using statistical metrics and Bag-of-Words
  • Extensive experimentation with various NLP approaches
  • Best model: Logistic Regression achieving 0.85 weighted F1 score
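
As a rough illustration of the weighted F1 score reported above, the sketch below computes per-class F1 and averages it by class support, which is what the "weighted" averaging mode does in common ML libraries. The function name and the two-author toy labels are hypothetical and do not come from the project.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0
    for label, count in support.items():
        true_pos = sum(
            1 for t, p in zip(y_true, y_pred) if t == label and p == label
        )
        false_pos = sum(
            1 for t, p in zip(y_true, y_pred) if t != label and p == label
        )
        false_neg = count - true_pos
        denom = 2 * true_pos + false_pos + false_neg
        f1 = 2 * true_pos / denom if denom else 0.0
        # Each class contributes in proportion to how often it occurs
        score += f1 * count / total
    return score


# Hypothetical labels for four text fragments
true_authors = ["author_a", "author_a", "author_a", "author_b"]
predicted_authors = ["author_a", "author_a", "author_b", "author_b"]
print(f"weighted F1: {weighted_f1(true_authors, predicted_authors):.3f}")
```

Weighting by support keeps the score representative when some authors have far more text fragments than others.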

🚀 Production System

  • FastAPI service for real-time authorship predictions
  • Telegram bot for user-friendly interaction
  • Cloud storage integration for model artifacts

⚙️ Tech Stack: Python, FastAPI, scikit-learn, NLTK, Docker, Telegram API

Note: This project is primarily in Russian.

🔗 GitHub Repository