I'm a Machine Learning graduate student at the Higher School of Economics, passionate about developing data-driven solutions.
I'm based in Berlin and currently seeking an entry-level position or internship in Data Science or Data Analysis, with a focus on:
- Machine Learning Engineering
- Data Analysis
- MLOps & Data Pipeline Development
Master's in Machine Learning and High Load Systems
Higher School of Economics, 2023-2025
- 🇷🇺 Russian (Native)
- 🇬🇧 English (Fluent)
- 🇩🇪 German (A2)
A machine learning project that predicts the sold prices of high-end, pre-owned, and limited-edition fashion items on the Grailed marketplace. The project demonstrates a complete ML system featuring:
Data Pipeline
- Selenium-based web scraper for Grailed listings
- Apache Airflow ETL pipeline for automated data collection
- Comprehensive data cleaning and preprocessing workflow
ML Development
- Multimodal approach combining tabular and text features
- Extensive ML experiments with various models and architectures
- Best model: CatBoost achieving 37.1% improvement over baseline (RMSLE: 0.64)
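
The RMSLE figure above can be made concrete with a small sketch. It assumes the standard definition of root mean squared logarithmic error and assumes that "improvement over baseline" means the relative reduction in that error; the function names and the sample prices are hypothetical and are not taken from the project.

```python
import math


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error for non-negative values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_log_errors = [
        (math.log1p(pred) - math.log1p(true)) ** 2
        for true, pred in zip(y_true, y_pred)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))


def relative_improvement(baseline_score, model_score):
    """Fractional reduction of an error metric relative to a baseline.

    Assumption: this is how an "improvement over baseline" percentage
    would be computed for an error metric where lower is better.
    """
    return (baseline_score - model_score) / baseline_score


# Hypothetical sold prices in USD, for illustration only (not project data).
actual_prices = [120.0, 450.0, 80.0]
predicted_prices = [100.0, 500.0, 95.0]
print(round(rmsle(actual_prices, predicted_prices), 3))
```

Because the error is computed on log-transformed prices, a fixed percentage miss costs roughly the same on a cheap item as on an expensive one, which suits listings whose prices span orders of magnitude.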
Production System
- FastAPI service providing real-time price predictions
- Interactive Streamlit app for model exploration and predictions
- CI/CD pipeline with GitHub Actions
Tech Stack: Python, FastAPI, Streamlit, Apache Airflow, Weights & Biases, scikit-learn, CatBoost
Live Demo | GitHub Repository
A machine learning service that identifies the author of a text among ten prominent 19th-century Russian writers. The project showcases a complete ML pipeline featuring:
Data Pipeline
- Web scraping of literary texts from open sources
- Comprehensive text preprocessing and cleaning workflow
- Feature engineering pipeline for text data
ML Development
- Text feature extraction using statistical metrics and Bag-of-Words
- Extensive experimentation with various NLP approaches
- Best model: Logistic Regression achieving a weighted F1 score of 0.85
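
For readers unfamiliar with the metric, here is a minimal sketch of how a weighted F1 score is computed for a multi-class task such as authorship attribution. It assumes the usual definition (per-class F1 averaged with weights proportional to each class's number of true samples); the function name and the author labels in the example are hypothetical and do not come from the project.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to each class's support."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    support = Counter(y_true)  # number of true samples per class
    weighted_sum = 0.0
    for label, count in support.items():
        true_pos = sum(
            1 for t, p in zip(y_true, y_pred) if t == label and p == label
        )
        false_pos = sum(
            1 for t, p in zip(y_true, y_pred) if t != label and p == label
        )
        false_neg = count - true_pos
        denominator = 2 * true_pos + false_pos + false_neg
        f1 = 2 * true_pos / denominator if denominator else 0.0
        weighted_sum += f1 * count
    return weighted_sum / len(y_true)


# Hypothetical labels, for illustration only (not project data).
true_authors = ["Tolstoy", "Tolstoy", "Chekhov", "Chekhov"]
predicted_authors = ["Tolstoy", "Chekhov", "Chekhov", "Chekhov"]
print(round(weighted_f1(true_authors, predicted_authors), 3))
```

Weighting by support means authors with more texts in the evaluation set contribute more to the final score, so the metric stays meaningful when the classes are imbalanced.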
Production System
- FastAPI service for real-time authorship predictions
- Telegram bot for user-friendly interaction
- Cloud storage integration for model artifacts
Tech Stack: Python, FastAPI, scikit-learn, NLTK, Docker, Telegram API
Note: This project is primarily in Russian.
GitHub Repository