ARXIVCE-1264: group by ip, when counting pdf downloads #297

Merged
bmaltzan merged 5 commits into develop from ARXIVCE-1264-count-pdf-downloads-by-country
Aug 6, 2024

Conversation

bmaltzan
Contributor

Create a table subquery to filter out partial-content PDF downloads, then group by paper ID.
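As a rough illustration of the subquery approach described above, here is a minimal sketch using an in-memory SQLite database as a stand-in. The `requests` table, its columns, and the reading of "partial-content" as HTTP status 206 are assumptions made for this example, not details taken from the PR. The de-duplication by IP follows the PR title.

```python
import sqlite3


def count_downloads(rows):
    """Count full downloads per paper, one per distinct IP.

    rows: iterable of (paper_id, ip, status) tuples. The schema is
    hypothetical and only serves to illustrate the query shape.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE requests (paper_id TEXT, ip TEXT, status INTEGER)"
    )
    conn.executemany("INSERT INTO requests VALUES (?, ?, ?)", rows)

    query = """
        SELECT paper_id, COUNT(*) AS downloads
        FROM (
            -- Subquery: drop partial-content responses (assumed to be
            -- HTTP 206) and keep one row per (paper, IP) pair.
            SELECT DISTINCT paper_id, ip
            FROM requests
            WHERE status != 206
        ) AS full_downloads
        GROUP BY paper_id
        ORDER BY paper_id
    """
    result = dict(conn.execute(query).fetchall())
    conn.close()
    return result


# Made-up sample data: (paper_id, ip, status)
sample_rows = [
    ("2401.00001", "10.0.0.1", 200),
    ("2401.00001", "10.0.0.1", 206),  # partial content, filtered out
    ("2401.00001", "10.0.0.1", 200),  # same IP again, counted once
    ("2401.00001", "10.0.0.2", 200),
    ("2401.00002", "10.0.0.1", 206),  # only partial requests, no row
]

print(count_downloads(sample_rows))
```

With this sample data, a paper that only ever received partial-content requests does not appear in the result at all, which may or may not be the behavior wanted in the real query.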

@bmaltzan bmaltzan requested review from cbf66, kyokukou and a team July 22, 2024 15:34
Contributor

@bdc34 bdc34 left a comment

I'm concerned about overcounting conference proceedings due to the /html/ in the regex.

@bmaltzan
Contributor Author

> I'm concerned about overcounting conference proceedings due to the /html/ in the regex.

It's possible for Erin to also look up the format, but I don't think we should.

I think the distinction between someone choosing to view the TeX-generated HTML and choosing to view the PDF is interesting, and worthwhile for us to extract. But I think the primary thing people are looking at is the number of people reading the paper, so the count is not misleading, even if it is a little bit off.
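To make the trade-off in this thread concrete, here is a small sketch of counting /pdf/ and /html/ requests together as reads while still recording the format separately. The pattern `PATH_RE`, the path layout, and the sample paths are hypothetical. The regex actually used in the PR is not shown in this conversation.

```python
import re
from collections import Counter

# Hypothetical pattern, NOT the regex from the PR: matches /pdf/<id> and
# /html/<id>, with an optional version suffix and optional ".pdf".
PATH_RE = re.compile(
    r"^/(?P<fmt>pdf|html)/(?P<paper_id>\d{4}\.\d{4,5})(?:v\d+)?(?:\.pdf)?$"
)


def tally(paths):
    """Return (reads per paper, reads per (paper, format)) for matching paths."""
    reads = Counter()
    by_format = Counter()
    for path in paths:
        match = PATH_RE.match(path)
        if match is None:
            continue  # not a pdf/html request under this assumed layout
        paper_id = match.group("paper_id")
        reads[paper_id] += 1
        by_format[(paper_id, match.group("fmt"))] += 1
    return reads, by_format


# Made-up request paths for illustration
sample_paths = [
    "/pdf/2401.00001v2",
    "/html/2401.00001",
    "/abs/2401.00001",  # ignored: neither pdf nor html
    "/pdf/2401.00002.pdf",
]

reads, by_format = tally(sample_paths)
print(dict(reads))
print(dict(by_format))
```

Keeping the per-format tally alongside the combined count would leave room to separate HTML views later without changing the headline number.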

@bmaltzan bmaltzan requested review from kyokukou and bdc34 August 1, 2024 14:58
@bmaltzan bmaltzan merged commit 661952f into develop Aug 6, 2024
1 check passed
@bmaltzan bmaltzan deleted the ARXIVCE-1264-count-pdf-downloads-by-country branch August 6, 2024 14:48