Changes in this version:
- Fix bug in definition of `Service` class in `services/base.py`.
- Update some package versions in `requirements.txt`.
- Pin more versions of packages in `requirements.txt`.
- Change name of runnable program to have dash between the name `eprints2archives` and the version number.
- Updated copyright year in file headers and other places.
Changes in this version:
- Fix handling of occasionally unexpected timemap result from Internet Archive.
- Update versions of some dependencies in `requirements.txt`.
- Add `CITATION.cff` file.
- Add requirement for a package imported by another requirement but that does not, for some reason, get imported properly when a clean virtual environment is created in the process of making pyz apps.
- Update all requirement versions to latest versions of packages. This was not done for over a year and it showed...
- Define `console_scripts` for setuptools to produce a better wrapper script.
- Fix a mishandled exception when a server returns a code 500.
- Improve catching interrupts on Windows.
- Use CommonPy network utilities, file utilities, and data utilities instead of internal copies.
- Use Bun user interface code instead of internal copy. Also change some colors of messages printed by `eprints2archives`.
- Update internal imports and some requirements.
- Update copyright year.
- Use Sidetrack instead of internal `debug.py` version of the same.
- Internally, use different approach to recording version number and other metadata.
- Use updated release procedure codified in `Makefile`.
- Check that URLs obtained from EPrints records appear to be valid URLs, before trying to send them to web archives. (This is mostly to catch bad values in the `official_url` record field.)
- Be more careful about which `/view/X/N.html` pages are sent.
- Do a better job with HTTP code 400 from Internet Archive.
- Do some more internal network code refactoring.
- Add some more debug log statements.
- Retry network operations one time if an HTTP code 400 is received.
- Internal network code refactoring.
- Add missing `requirements.txt` dependency for `h2` package.
- Make parsing of malformed id ranges slightly more robust.
- Fix incorrect pluralization of an info message.
- Remove accidentally left-in invocation of `pdb` upon errors even if debugging not enabled.
- Edit the README.md file slightly.
- In addition to the record pages, `eprints2archives` now also harvests general URLs from the server, including the top-level URL and `/view` and 2 levels of pages underneath it. However, if a subset of records is requested, only gets those particular `/view/X/N.html` pages rather than all pages under `/view/X/`.
- Internal changes allow it to use protocol HTTP/2, which was necessary to communicate with Archive.Today (because it appears to have stopped accepting save requests unless HTTP/2 is used).
- Now tries to add `https://` or `http://` if the user forgets to provide it, and also removes `/eprint` and adds `/rest` if needed. This makes it possible for the user to just provide a host name and `eprints2archives` will figure out the rest.
- Minor improvements to some of the run-time status messages.
- More progress bars!
- Improvements to debug logging.
- Improvements to README.md.
- Internal code refactoring.
- Include the top-level server URL among the URLs sent to archives, as well as `/view` and two levels of pages under `/view`.
- Make sure the set of URLs sent to archives is unique.
- Improve debug logging from low-level network module.
- Clarify some things in the README file.
First working version. Supports sending EPrints pages to the Internet Archive and Archive.Today. Runs with parallel threads and handles rate limits automatically. Currently implements a command-line interface only.
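
As an illustration of the server-URL handling described in the notes above (adding `https://` or `http://` when missing, removing `/eprint`, and adding `/rest`), here is a minimal sketch. It is not the code used by eprints2archives: the function name `normalize_server_url` and the `default_scheme` parameter are hypothetical, and the sketch simply assumes a default scheme rather than testing which scheme the server actually answers on.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_server_url(raw, default_scheme="https"):
    """Guess the REST API base URL of an EPrints server from user input.

    Hypothetical helper for illustration; not taken from eprints2archives.
    """
    text = raw.strip()
    # Add a scheme if the user gave only a host name (or host plus path).
    if "://" not in text:
        text = f"{default_scheme}://{text}"
    parts = urlsplit(text)
    path = parts.path.rstrip("/")
    # Drop a trailing "/eprint" so ".../rest/eprint" reduces to ".../rest".
    if path.endswith("/eprint"):
        path = path[: -len("/eprint")]
    # Make sure the path ends in "/rest".
    if not path.endswith("/rest"):
        path += "/rest"
    # Query strings and fragments are discarded in this sketch.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


if __name__ == "__main__":
    for example in ("example.org", "http://example.org/rest/eprint/"):
        print(example, "->", normalize_server_url(example))
```

With this sketch, a bare host name such as `example.org` and a full address such as `http://example.org/rest/eprint/` both reduce to a URL ending in `/rest`, which is the behavior the note describes from the user's point of view.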