Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Default toc length config #2042

Merged
merged 18 commits into from
Sep 4, 2024
Merged

Default toc length config #2042

merged 18 commits into from
Sep 4, 2024

Conversation

voisardf
Copy link
Collaborator

@voisardf voisardf commented Aug 27, 2024

A new parameter default_toc_length in the print configuration allows to deactivate the compute_toc_pages function and set a default toc page length which applies in almost every case (exceptions, see description)

  • new print config parameter(s)
  • tests

The TOC page breaks depend on the overall number of published topics, the settings regarding the land register elements and the individual disclaimer options (QR code).
In Neuchâtel, with only one or two concerned themes, we get one TOC page, with three or more concerned topics the TOC will be at least two pages.

@voisardf voisardf added enhancement mapfish print usergroup work This lable is used to mark issues which will be done by usergroup members labels Aug 27, 2024
@voisardf voisardf self-assigned this Aug 27, 2024
Copy link

codecov bot commented Aug 28, 2024

Codecov Report

Attention: Patch coverage is 59.25926% with 11 lines in your changes missing coverage. Please review.

Project coverage is 85.45%. Comparing base (86fe195) to head (ea54626).
Report is 19 commits behind head on master.

Files with missing lines Patch % Lines
...contrib/print_proxy/mapfish_print/mapfish_print.py 59.25% 11 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2042      +/-   ##
==========================================
- Coverage   85.51%   85.45%   -0.07%     
==========================================
  Files         120      120              
  Lines        5276     5281       +5     
==========================================
+ Hits         4512     4513       +1     
- Misses        764      768       +4     
Flag Coverage Δ
unittests 85.45% <59.25%> (-0.07%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@voisardf voisardf marked this pull request as ready for review August 29, 2024 10:20
@@ -116,6 +119,7 @@ def __call__(self, value, system):
data=json.dumps(spec)
)
try:
log.debug('Validation of the TOC length with compute_toc_pages set to {} and default_toc_length set to {}'.format(print_config.get('compute_toc_pages'), print_config.get('default_toc_length'))) # noqa
if Config.get('print', {}).get('compute_toc_pages', False):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can leave the if statement away.
The check if the toc pages from the actual pdf fits the extract_as_dict['nbTocPages'] we've set above should be done anyways and will be the same for all three cases.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree.

@svamaa
Copy link
Collaborator

svamaa commented Aug 29, 2024

I've tested this PR with compute_toc_pages=false and default_toc_length=2 and found a 30% faster pdf extract!
Great work! thank you @voisardf

@@ -88,6 +88,16 @@ pyramid_oereb:
# more time to generate the PDF. If set to false, it will assume that only one TOC page exists, and this can
# lead to wrong numbering in the TOC.
compute_toc_pages: true
# To avoid the potentially time consuming second computing of the PDF extract and skip the the computation
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe the in the comment, you could point out, that if the compute_toc_pages is set to true, it will always overwrite the default_toc_length.

@@ -88,6 +88,16 @@ pyramid_oereb:
# more time to generate the PDF. If set to false, it will assume that only one TOC page exists, and this can
# lead to wrong numbering in the TOC.
compute_toc_pages: true
# To avoid the potentially time consuming second computing of the PDF extract and skip the the computation
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe:

In order to skip the computation of the estimated number of TOC pages which might return an erroneous result for your setting, you can specify a default for the number of TOC pages. For most of the cantons the number of TOC pages is pretty constant unless a real estate is concerned by none or a huge number of restrictions.
In both cases (computing an estimate or setting a default for the number of TOC pages) the exact number of TOC pages is extracted from the created PDF and if it differs from the expected value the PDF is created a second time with the correct page numbers.
Note that if "compute_toc_pages" is set true the "default_toc_length" is not taken into account.

@@ -6,6 +6,13 @@ Changes/Hints for migration
This chapter will give you hints on how to handle version migration, in particular regarding what you may need
to adapt in your project configuration, database etc. when upgrading to a new version.

Version 2.6.0
-------------
* New parameter 'default_toc_length' allows to define a default table of content pages number avoiding a second
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would omit the statemtents about the 95% and the default setting. I guess default is to not set this parameter.

# In both cases (computing an estimated length or setting a default length) the exact number of TOC pages is
# extracted from the created PDF and if it is different from the expected value the PDF extract is called a
# second time with the correct page numbers.
default_toc_length: 2
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the name of the variable could be more precise. E.g. "most_frequent_toc_length"

@@ -73,17 +73,20 @@ def __call__(self, value, system):
extract_as_dict = self._render(extract_record, value[1])
feature_geometry = mapping(extract_record.real_estate.limit)

print_config = Config.get('print', {})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the variable 'print_config' is defined, it should thoroughly be used in subsequent parts of the code.

if Config.get('print', {}).get('compute_toc_pages', False):
extract_as_dict['nbTocPages'] = TocPages(extract_as_dict).getNbPages()
else:
extract_as_dict['nbTocPages'] = 1
if Config.get('print', {}).get('default_toc_length', False):
extract_as_dict['nbTocPages'] = print_config.get('default_toc_length', 2)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not specify a default, but test if it is a positive number.

@@ -131,14 +135,43 @@ def __call__(self, value, system):
except ValueError:
true_nb_of_toc = 1

log.debug('True number of TOC pages is {}'.format(true_nb_of_toc))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the expected number can also be written to the log (in case it is computed).

@@ -116,6 +119,7 @@ def __call__(self, value, system):
data=json.dumps(spec)
)
try:
log.debug('Validation of the TOC length with compute_toc_pages set to {} and default_toc_length set to {}'.format(print_config.get('compute_toc_pages'), print_config.get('default_toc_length'))) # noqa
if Config.get('print', {}).get('compute_toc_pages', False):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree.

@michmuel
Copy link
Collaborator

Thanks for the pull request! Looks good to me. I added some remarks, some of them are discussible.

@voisardf
Copy link
Collaborator Author

@svamaa @michmuel I adapted the code with most of your suggestions. Thanks!

Copy link
Collaborator

@lopo977 lopo977 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

@svamaa
Copy link
Collaborator

svamaa commented Aug 30, 2024

Looks good to me! Thank you @voisardf !

Copy link
Collaborator

@michmuel michmuel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Thanks! I noted one small thing (maybe its good to change it here in order to not forget it).

* New parameter 'expected_toc_length' allows to define a default table of content pages number avoiding a second
call for the pdf extract in most cases. This value should be set if most of the PDF extracts have the same number
of TOC pages.
Default setting: 2
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is omitted now.

A change in this pull request is that the pdf is built a second time whenever the numbers of toc pages disagree between extract_as_dict['nbTocPages'] (computed/expected/1, depending on the setting) and the one in the returned pdf.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the current version it is only rebuild if 'compute_toc_pages' is true.

@voisardf
Copy link
Collaborator Author

voisardf commented Sep 4, 2024

thanks for all your input!

@voisardf voisardf merged commit a6c972f into master Sep 4, 2024
10 of 12 checks passed
@voisardf voisardf deleted the default_toc_length_config branch September 4, 2024 07:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement mapfish print usergroup work This lable is used to mark issues which will be done by usergroup members
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants