To excel as a data scientist and effectively manage code, it's crucial to develop strong software engineering (SWE) skills. Here are the skills I am developing.
- Ensure code is easily interpretable by all team members.
- Prioritize reproducibility, cleanliness, and thorough documentation.
- Aim for scalable code and understand system design principles.
- Extract, Transform, Load:
- Understand the principles of ETL processes for data integration and manipulation.
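
To make the ETL idea concrete for myself, here is a minimal sketch using only the standard library. The inline CSV text, the `payments` table, and the function names are hypothetical placeholders, not part of any real pipeline.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; in practice this would come from a file or an API.
RAW_CSV = """name,amount
alice,10.5
bob,
carol,7
"""

def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: drop rows with a missing amount and cast types."""
    return [
        (row["name"].title(), float(row["amount"]))
        for row in rows
        if row["amount"].strip()
    ]

def load(records, conn):
    """Load: write the cleaned records into a SQLite table."""
    conn.execute("CREATE TABLE IF NOT EXISTS payments (name TEXT, amount REAL)")
    conn.executemany("INSERT INTO payments VALUES (?, ?)", records)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
rows = conn.execute("SELECT name, amount FROM payments ORDER BY name").fetchall()
print(rows)
```

Keeping each stage in its own function makes it easy to test the transform step without touching a database.
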
- Deep Learning:
- Libraries:
- TensorFlow, PyTorch, Keras
- Topics:
- Neural networks, convolutional neural networks (CNNs), recurrent neural networks (RNNs), transformers
- Practice:
- Apply deep learning techniques to various data science tasks.
- Clean Code:
- Understand Python code optimization, SOLID principles, and PEP 20 (the Zen of Python).
- Design Patterns:
- Learn common design patterns for solving recurring design problems, and work through the NeetCode collection of common LeetCode problems.
- System Design:
- Study architectural patterns for scalability and performance.
- ML Library Development:
- Develop ML libraries and learn the types of objects used in MLflow.
- Big Data Processing:
- Explore Spark, Dask, and DAG workflows for scalable data processing.
- Optimization:
- Learn performance profiling and optimization techniques.
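
As a small profiling exercise, the sketch below uses the standard library's `cProfile` and `pstats` to measure a deliberately slow loop, then checks it against a closed-form version. Both functions are hypothetical examples I made up for practice.

```python
import cProfile
import io
import pstats

def slow_sum_of_squares(n):
    """Sum i*i for i in range(n) with an explicit loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total

def fast_sum_of_squares(n):
    """Same sum via the closed form (n-1) * n * (2n-1) / 6."""
    return (n - 1) * n * (2 * n - 1) // 6

# Profile only the slow version.
profiler = cProfile.Profile()
profiler.enable()
slow_result = slow_sum_of_squares(200_000)
profiler.disable()

# Collect the five most expensive entries by cumulative time into a string.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

The timings in the report depend on the machine, so the useful habit is comparing runs before and after a change rather than reading absolute numbers.
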
- Concurrency and Parallelism:
- Understand multi-threading, parallel processing, and asynchronous programming.
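
To compare two of these approaches side by side, here is a minimal sketch that runs the same simulated I/O-bound task with a thread pool and with `asyncio`. The `slow_square` functions are hypothetical stand-ins for something like a network call.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

def slow_square(x):
    """Blocking version: simulate I/O latency with time.sleep."""
    time.sleep(0.05)
    return x * x

async def slow_square_async(x):
    """Non-blocking version: yield to the event loop while waiting."""
    await asyncio.sleep(0.05)
    return x * x

async def run_async(values):
    # gather schedules all coroutines concurrently and preserves input order.
    return await asyncio.gather(*(slow_square_async(v) for v in values))

values = [1, 2, 3, 4]

# Multi-threading: each blocking call runs in its own worker thread.
with ThreadPoolExecutor(max_workers=4) as pool:
    threaded = list(pool.map(slow_square, values))

# Asynchronous programming: a single thread interleaves the waits.
async_results = asyncio.run(run_async(values))

print(threaded, async_results)
```

For CPU-bound work in Python, threads usually do not help because of the global interpreter lock, so process-based parallelism would be the option to explore there.
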
- Simulation:
- Study simulation techniques for modeling complex systems.
- Version Control:
- Master Git and understand best practices for version control.
- Testing:
- Write unit tests, integration tests, and practice test-driven development (TDD).
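
A minimal sketch of what a unit test looks like with the standard library's `unittest`. The `normalize` function is a hypothetical example; in a TDD workflow I would write the test cases first and then the function.

```python
import unittest

def normalize(values):
    """Scale values to the range [0, 1]; hypothetical function under test."""
    lo, hi = min(values), max(values)
    if lo == hi:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

class NormalizeTests(unittest.TestCase):
    def test_range(self):
        self.assertEqual(normalize([2, 4, 6]), [0.0, 0.5, 1.0])

    def test_constant_input(self):
        self.assertEqual(normalize([3, 3]), [0.0, 0.0])

    def test_empty_input_raises(self):
        # min() on an empty list raises ValueError.
        with self.assertRaises(ValueError):
            normalize([])

# Run the suite programmatically so the script does not exit the interpreter.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(NormalizeTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Covering the edge cases (constant and empty input) alongside the typical case is the part I most want to make a habit.
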
- CI/CD:
- Familiarize myself with continuous integration and deployment pipelines. Continuously update repos.
- Database Optimization:
- Learn indexing, query optimization, and database normalization.
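
To see the effect of an index, the sketch below uses SQLite through Python's `sqlite3` module and inspects the query plan before and after creating one. The `events` table and `idx_events_user` index are hypothetical, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, kind TEXT)"
)
conn.executemany(
    "INSERT INTO events (user_id, kind) VALUES (?, ?)",
    [(i % 100, "click") for i in range(1000)],
)

query = "SELECT COUNT(*) FROM events WHERE user_id = 42"

def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

before = plan(query)  # without an index, SQLite scans the whole table
conn.execute("CREATE INDEX idx_events_user ON events (user_id)")
after = plan(query)   # with the index, SQLite can search by user_id

count = conn.execute(query).fetchone()[0]
print(before, after, count)
```

Checking the plan, rather than only timing the query, shows whether the optimizer is actually using the index.
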
- Community and Forums:
- Engage with classmates (for example, in my MSDS classes), YouTube channels, and forums to stay updated and collaborate on projects.
- HackerRank Profile: My HackerRank Profile
- 30 Days of Code Badge: