started process to get formatted refs only once and not every time in the loop #1021

Merged 4 commits on Jul 21, 2023
23 changes: 23 additions & 0 deletions news/patch_read_lists.rst
@@ -0,0 +1,23 @@
**Added:**

* <news item>

**Changed:**

* <news item>

**Deprecated:**

* <news item>

**Removed:**

* <news item>

**Fixed:**

* Changed how the reading-list builder fetches the references from Crossref so that it only fetches each needed reference once.

**Security:**

* <news item>
28 changes: 19 additions & 9 deletions regolith/builders/readinglistsbuilder.py
@@ -41,6 +41,22 @@
         """Render latex template"""
         rc = self.rc

+        # build the collection of formatted references so that we only go
+        # and fetch the formatted references once per doi
+        dois, formatted_refs = [], {}
+        for rlist in self.gtx["reading_lists"]:
+            for paper in rlist['papers']:
+                dois.append(paper.get('doi', ''))
+        dois = list(set(dois))
+        # drop placeholder DOIs; removing only the item that is present
+        # avoids a ValueError when just one of the two occurs
+        for item in ['tbd', '']:
+            if item in dois:
+                dois.remove(item)
+        for doi in dois:
+            ref_and_date = get_formatted_crossref_reference(doi)
+            formatted_refs.update({doi: ref_and_date})

[Codecov / codecov/patch] Added lines #L46-L57 in regolith/builders/readinglistsbuilder.py were not covered by tests.

+        # loop through the reading lists to build the files
         for rlist in self.gtx["reading_lists"]:
             listid = rlist["_id"]
             outfile_bib = listid
@@ -48,20 +64,14 @@
             n = 1
             for paper in rlist['papers']:
                 doi = paper.get('doi')
+                paper['text'] = paper['text'].strip('.').strip()

[Codecov / codecov/patch] Added line #L67 in regolith/builders/readinglistsbuilder.py was not covered by tests.
Added line #L67 was not covered by tests
                 if doi == 'tbd' or doi == '':
                     doi = None
                 url = paper.get('url')
                 if doi:
-                    # if rc.verbose:
-                    #     print(f"getting {doi} for {paper.get('tite')}")
-                    ref, ref_date = get_formatted_crossref_reference(doi)
-                    # if rc.verbose:
-                    #     try:
-                    #         print(f"got ref: {ref}")
-                    #     except:
-                    #         print("obtained ref but print error")
-                    paper.update({'reference': ref, 'ref_date': ref_date, 'n': n, 'label': 'DOI'})
+                    paper.update({'reference': formatted_refs.get(doi)[0],
+                                  'ref_date': formatted_refs.get(doi)[1],
+                                  'n': n, 'label': 'DOI'})

[Codecov / codecov/patch] Added line #L72 in regolith/builders/readinglistsbuilder.py was not covered by tests.

                     n += 1
                 elif url:
                     paper['doi'] = url
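
The fetch-once pattern in this change can be sketched in isolation. This is a minimal illustration under stated assumptions, not the regolith implementation: `collect_formatted_refs` and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for `get_formatted_crossref_reference` so the example runs without network access.

```python
def collect_formatted_refs(reading_lists, fetch):
    """Return {doi: fetch(doi)}, calling fetch once per distinct DOI.

    Placeholder DOIs ('tbd', '') and missing DOIs are skipped.
    """
    dois = {paper.get('doi')
            for rlist in reading_lists
            for paper in rlist.get('papers', [])}
    dois -= {'tbd', '', None}
    return {doi: fetch(doi) for doi in sorted(dois)}


calls = []


def fake_fetch(doi):
    # Hypothetical stand-in for the Crossref lookup; records every call.
    calls.append(doi)
    return (f"Reference for {doi}", "2023-07-21")


# Illustrative data only: the same DOI appears in two reading lists.
reading_lists = [
    {"_id": "a", "papers": [{"doi": "10.1/x"}, {"doi": "tbd"}]},
    {"_id": "b", "papers": [{"doi": "10.1/x"}, {"doi": ""},
                            {"url": "https://example.com"}]},
]
formatted_refs = collect_formatted_refs(reading_lists, fake_fetch)
```

Using a set difference to drop the placeholders sidesteps `list.remove`, which raises `ValueError` when the item is absent. After the call, `calls` holds a single entry even though the DOI is listed twice.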
2 changes: 1 addition & 1 deletion regolith/tools.py
@@ -1980,7 +1980,7 @@ def get_formatted_crossref_reference(doi):

     authorlist = [
         f"{a['given'].strip()} {a['family'].strip()}"
-        for a in article.get('message').get('author')]
+        for a in article.get('message',{}).get('author','')]
     try:
         journal = \
             article.get('message').get('short-container-title')[0]
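
The defensive lookup in this hunk can be illustrated on its own. The sketch below assumes a Crossref-style response shaped like `{'message': {'author': [...]}}`. `author_names` is a hypothetical helper, and both the `[]` default and the tolerance for a missing `given` or `family` key are choices made for this example, not something taken from the change.

```python
def author_names(article):
    """Build 'Given Family' strings, tolerating missing keys."""
    # Chained .get() calls with defaults avoid AttributeError/TypeError
    # when 'message' or 'author' is absent from the response.
    authors = article.get('message', {}).get('author', [])
    return [
        f"{a.get('given', '').strip()} {a.get('family', '').strip()}".strip()
        for a in authors
    ]


# Illustrative inputs only.
no_message = author_names({})
one_author = author_names(
    {'message': {'author': [{'given': ' Ada ', 'family': 'Lovelace'}]}})
family_only = author_names({'message': {'author': [{'family': 'Consortium'}]}})
```

Note that a default only applies when the key is missing. A response in which `message` is present but set to `None` would still fail on the second `.get`.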
1 change: 1 addition & 0 deletions tests/test_builders.py
@@ -26,6 +26,7 @@
     "grantreport",
     "resume",
     "review-man",
+    # reading-lists need tests for this
     "reimb"
 ]
 db_srcs = ["mongo", "fs"]