diff --git a/examples/Sort_Google_Scholar_No_Code_Version.ipynb b/examples/Sort_Google_Scholar_No_Code_Version.ipynb new file mode 100644 index 0000000..c098247 --- /dev/null +++ b/examples/Sort_Google_Scholar_No_Code_Version.ipynb @@ -0,0 +1,929 @@ +{ + "nbformat": 4, + "nbformat_minor": 0, + "metadata": { + "colab": { + "provenance": [], + "include_colab_link": true + }, + "kernelspec": { + "name": "python3", + "display_name": "Python 3" + } + }, + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "view-in-github", + "colab_type": "text" + }, + "source": [ + "" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "M36VFanosbkb" + }, + "source": [ + "# Sort Google Scholar - No Code Version\n", + "\n", + "\n", + "## 1. Type or paste your search query below (including Google Scholar search operators such as AND/OR or 'exact keyword')\n", + "For more details on search operators, please refer to [this reference](https://guides.library.ucsc.edu/c.php?g=745384&p=5361954).\n", + "\n" + ] + }, + { + "cell_type": "code", + "source": [ + "search_query = \"large language models\" # @param {type:\"string\"}" + ], + "metadata": { + "cellView": "form", + "id": "xlpCibrIV4Nk" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "\n", + "Some examples:\n", + "\n", + "- `Large Language Models` → General search\n", + "- `\"Large Language Models\"` → Exact phrase search\n", + "- `Large Language Models -transformer` → Exclude a specific term\n", + "- `Large Language Models author:\"Geoffrey Hinton\"` → Search by author\n", + "- `Large Language Models source:Nature` → Search within a specific publication\n", + "- `(\"Large Language Models\" OR \"Transformer Models\") AND (GPT OR BERT)` → Boolean search\n", + "- `intitle:\"Large Language Models\"` → Search in the title only\n" + ], + "metadata": { + "id": "fawjia86vL63" + } + }, + { + "cell_type": "markdown", + "source": [ + "### Optional Parameters" + ], + "metadata": { + "id":
"MnCQ9KOmYD3z" + } + }, + { + "cell_type": "code", + "source": [ + "# Expanded form with extra parameters\n", + "sortby = \"cit/year\" # @param [\"Citations\", \"cit/year\"] {type:\"string\"}\n", + "nresults = 100 # @param {type:\"number\"}\n", + "startyear = None # @param {type:\"string\"}\n", + "endyear = None # @param {type:\"string\"}\n", + "\n", + "# Constructing the base command\n", + "cmd = f\"sortgs '{search_query}' --sortby '{sortby}' --nresults {nresults}\"\n", + "\n", + "if startyear:\n", + "    cmd += f\" --startyear {startyear}\"\n", + "\n", + "if endyear:\n", + "    cmd += f\" --endyear {endyear}\"\n", + "\n", + "\n" + ], + "metadata": { + "cellView": "form", + "id": "VbVaoz3wYGQY" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "# 2. Next, click Runtime > Run all" + ], + "metadata": { + "id": "HOM1xu6daIGD" + } + }, + { + "cell_type": "code", + "metadata": { + "id": "oPot8aWcsfei", + "cellView": "form" + }, + "source": [ + "# @title\n", + "!pip install sortgs --quiet" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "# @title\n", + "!{cmd}" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "cellView": "form", + "id": "qgfQT7i2XrPf", + "outputId": "a3ea8b37-154f-420e-95df-72471e0a53c0" + }, + "execution_count": null, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "Running with the following parameters:\n", + "Keyword: large language models, Number of results: 100, Save database: True, Path: /content, Sort by: cit/year, Plot results: False, Start year: None, End year: 2024, Debug: False\n", + "Loading next 10 results\n", + "Loading next 20 results\n", + "Loading next 30 results\n", + "Loading next 40 results\n", + "Loading next 50 results\n", + "Loading next 60 results\n", + "Loading next 70 results\n", + "Loading next 80 results\n", + "Loading next 90 results\n", + "Loading next 100 results\n",
+ " Author ... cit/year\n", + "Rank ... \n", + "57 Wei, X Wang, D Schuurmans… ... 2406\n", + "80 Hu, Y Shen, P Wallis, Z Alle ... 1652\n", + "1 Kasneci, K Seßler, S Küchemann, M Bannert… ... 1390\n", + "78 Yao, D Yu, J Zhao, I Shafran… ... 1304\n", + "3 Chang, X Wang, J Wang, Y Wu, L Yang… ... 1198\n", + "... ... ... ...\n", + "84 Maatouk, N Piovesan, F Ayed… ... 29\n", + "45 Li, L Xia, J Tang, Y Xu, L Shi, L Xia, D Yin… ... 23\n", + "82 Cheng, S Huang, F Wei ... 22\n", + "50 Zhu, Q Zhao, H Chen, J Wang, X Xie ... 14\n", + "73 Ren, J Tang, D Yin, N Chawla, C Huang ... 6\n", + "\n", + "[100 rows x 8 columns]\n", + "Results saved to /content/large_language_models.csv\n" + ] + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "s_nuxpy_s_9c" + }, + "source": [ + "> _**NOTE:** Some warnings are normal, for example 'year not found' or 'author not found'. However, if you get the robot-check warning, the tool may no longer work from the IP address assigned to your Google Colab runtime. Try 'Runtime' > 'Disconnect and delete runtime' to get a new IP. If the problem persists, you will have to run it locally using Selenium and solve the CAPTCHAs manually. Avoid running this code too often, since frequent runs can trigger the robot check._" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "QQIb9oYou9GM" + }, + "source": [ + "# 3. 
Download the results" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "09OFwDdM2K5h" + }, + "source": [ + "\n", + "To download the `.csv` file, click the **folder icon** on the left to open the **Files** panel, locate the file with the same name as your search keyword, click the **three dots** next to the file, and select **Download** from the options menu.\n", + "\n", + "\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "source": [ + "You can also visualize the top results here:" + ], + "metadata": { + "id": "LsZmVT5rd7-8" + } + }, + { + "cell_type": "code", + "metadata": { + "id": "pM_Bb4MH14eI", + "colab": { + "base_uri": "https://localhost:8080/", + "height": 2951 + }, + "outputId": "8cb7847d-28b4-47cd-ad16-21f20e72b745", + "cellView": "form" + }, + "source": [ + "# @title\n", + "import pandas as pd\n", + "results = pd.read_csv(search_query.replace(' ', '_')+'.csv')\n", + "results" + ], + "execution_count": null, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/plain": [ + " Rank Author \\\n", + "0 57 Wei, X Wang, D Schuurmans… \n", + "1 80 Hu, Y Shen, P Wallis, Z Alle \n", + "2 1 Kasneci, K Seßler, S Küchemann, M Bannert… \n", + "3 78 Yao, D Yu, J Zhao, I Shafran… \n", + "4 3 Chang, X Wang, J Wang, Y Wu, L Yang… \n", + ".. ... ... \n", + "95 84 Maatouk, N Piovesan, F Ayed… \n", + "96 45 Li, L Xia, J Tang, Y Xu, L Shi, L Xia, D Yin… \n", + "97 82 Cheng, S Huang, F Wei \n", + "98 50 Zhu, Q Zhao, H Chen, J Wang, X Xie \n", + "99 73 Ren, J Tang, D Yin, N Chawla, C Huang \n", + "\n", + " Title Citations Year \\\n", + "0 Chain-of-thought prompting elicits reasoning i... 7219 2022 \n", + "1 Lora: Low-rank adaptation of large language mo... 6608 2021 \n", + "2 ChatGPT for good? On opportunities and challen... 2779 2023 \n", + "3 Tree of thoughts: Deliberate problem solving w... 1304 2024 \n", + "4 A survey on evaluation of large language models 1198 2024 \n", + ".. ... ... ... 
\n", + "95 Large language models for telecom: Forthcoming... 29 2024 \n", + "96 Urbangpt: Spatio-temporal large language models 23 2024 \n", + "97 Adapting large language models via reading com... 44 2023 \n", + "98 Promptbench: A unified library for evaluation ... 14 2024 \n", + "99 A survey of large language models for graphs 6 2024 \n", + "\n", + " Publisher Venue \\\n", + "0 proceedings.neurips.cc Advances in neural … \n", + "1 arxiv.org arXiv preprint arXiv … \n", + "2 Elsevier Learning and individual … \n", + "3 proceedings.neurips.cc Advances in … \n", + "4 dl.acm.org ACM Transactions on … \n", + ".. ... ... \n", + "95 ieeexplore.ieee.org IEEE … \n", + "96 dl.acm.org Proceedings of the 30th … \n", + "97 openreview.net The Twelfth International Conference on … \n", + "98 jmlr.org Journal of Machine Learning … \n", + "99 dl.acm.org Proceedings of the 30th … \n", + "\n", + " Source cit/year \n", + "0 https://proceedings.neurips.cc/paper_files/pap... 2406 \n", + "1 https://arxiv.org/abs/2106.09685 1652 \n", + "2 https://www.sciencedirect.com/science/article/... 1390 \n", + "3 https://proceedings.neurips.cc/paper_files/pap... 1304 \n", + "4 https://dl.acm.org/doi/abs/10.1145/3641289 1198 \n", + ".. ... ... \n", + "95 https://ieeexplore.ieee.org/abstract/document/... 29 \n", + "96 https://dl.acm.org/doi/abs/10.1145/3637528.367... 23 \n", + "97 https://openreview.net/forum?id=y886UXPEZ0 22 \n", + "98 https://www.jmlr.org/papers/v25/24-0023.html 14 \n", + "99 https://dl.acm.org/doi/abs/10.1145/3637528.367... 6 \n", + "\n", + "[100 rows x 9 columns]" + ], + "text/html": [ + "\n", + "
\n", + "