Specifying special characters in "insert_htmlbox" #3606
-
Description of the bug
Trying to write this: page.insert_htmlbox( # page is a PDF Page object
The text is not rendered. I tried opening the PDF in Acrobat Reader and in the Chrome browser as well; the text, though present, is not visible. Need help on this. Thanks.

How to reproduce the bug
Same as the description above.

PyMuPDF version: 1.24.5
Operating system: Windows
Python version: 3.10
-
I have also called page.clean_contents(sanitize=True) after each insert_htmlbox call, but this line is still not printed properly.
-
"अधिक जानकारी के लिए customerservice@axismf.com**।** निवेशकों को केवल पंजीकृत म्यूचुअल फंड से ही लेनदेन करना चाहिए, जिसका विवरण www.sebi.gov.in पर उपलब्ध है -" It's because of the character highlighted in bold (the danda) in the text string.
-
If the pipe character is present in the HTML text to be inserted, I replaced it with "-". Alternatively, use one of the HTML entities for the pipe character (a numeric one, or the more meaningful named one). Now it's printing the characters correctly. Is there any other way to handle special characters in insert_htmlbox when printing text that contains them? Thanks.
-
I must confess that I am confused. In any case, I see no problem caused by either PyMuPDF or MuPDF. At best, this is a Discussions item; therefore, I am going to transfer it.
-
Here is an easy solution with one of the fonts in package pymupdf-fonts:

import pymupdf

doc = pymupdf.open()  # new, empty PDF
page = doc.new_page()
rect = pymupdf.Rect(100, 100, 300, 300)
text = "अधिक जानकारी के लिए customerservice@axismf.com। निवेशकों को केवल पंजीकृत म्यूचुअल फंड से ही लेनदेन करना चाहिए, जिसका विवरण www.sebi.gov.in पर उपलब्ध है -"
arch = pymupdf.Archive()  # will hold the font files referenced by the CSS
# Build CSS for the "figo" font from pymupdf-fonts and expose it as "sans-serif"
css = pymupdf.css_for_pymupdf_font("figo", archive=arch, name="sans-serif")
css += "* {font-family: sans-serif;}"  # apply that font to all elements
page.insert_htmlbox(rect, text, css=css, archive=arch)
doc.ez_save("x.pdf")
-
Thanks, it's working for the languages that I am handling. The font to be used for each text is dynamic and comes from a source; Noto Sans is one of them. I need to check whether I can proceed with the font that you have suggested, as it handles the languages being used. Much appreciate your help on this.
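For the dynamic-font case, a minimal sketch, assuming the font files (for example a Noto Sans TTF) sit in a local folder: an `@font-face` rule points at the file, and a `pymupdf.Archive` over that folder lets `insert_htmlbox` resolve it. The helper names `font_face_css` and `insert_with_font`, the family name `dynfont`, and the file name used in the example are hypothetical, not part of PyMuPDF.

```python
def font_face_css(family: str, font_file: str) -> str:
    """Build CSS that declares `family` from `font_file` and applies it everywhere.

    Hypothetical helper: it only produces a CSS string.
    """
    return (
        f"@font-face {{font-family: {family}; src: url({font_file});}}\n"
        f"* {{font-family: {family};}}"
    )


def insert_with_font(page, rect, text, font_dir, font_file, family="dynfont"):
    """Insert `text` into `rect` using a font file found in `font_dir`.

    Assumes PyMuPDF is installed and `font_file` exists inside `font_dir`.
    """
    import pymupdf  # imported lazily so the CSS helper works without PyMuPDF

    arch = pymupdf.Archive(font_dir)  # folder in which the font file is looked up
    css = font_face_css(family, font_file)
    return page.insert_htmlbox(rect, text, css=css, archive=arch)


# Example of the generated CSS (file name is an assumption):
print(font_face_css("dynfont", "NotoSans-Regular.ttf"))
```

Whether a given font covers every character in the text still has to be checked per font.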
-
Using this instead now:
It's printing the previously invisible characters, except that the "danda" comes out as a box. Thank you very much for your inputs and suggestions; I really appreciate it. I will have to check on this.
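A character that comes out as a box usually means the chosen font has no glyph for it. One way to check this up front, sketched under the assumption that PyMuPDF's `Font.has_glyph` is used as the lookup (it returns 0 when the glyph is absent); `missing_chars`, `describe` and `check_with_pymupdf` are hypothetical helper names:

```python
import unicodedata
from typing import Callable, List

DANDA = "\u0964"  # DEVANAGARI DANDA, the character reported as a box


def missing_chars(text: str, has_glyph: Callable[[int], int]) -> List[str]:
    """Return distinct non-space characters for which has_glyph(codepoint) is falsy."""
    missing: List[str] = []
    for ch in text:
        if ch.isspace() or ch in missing:
            continue
        if not has_glyph(ord(ch)):
            missing.append(ch)
    return missing


def describe(chars: List[str]) -> List[str]:
    """Format characters as 'U+XXXX NAME' for a readable report."""
    return [f"U+{ord(c):04X} {unicodedata.name(c, 'UNKNOWN')}" for c in chars]


def check_with_pymupdf(text: str, fontcode: str = "figo") -> List[str]:
    """Assumes PyMuPDF (and pymupdf-fonts for codes such as 'figo') is installed."""
    import pymupdf  # lazy import; the helpers above do not need it

    font = pymupdf.Font(fontcode)
    return describe(missing_chars(text, font.has_glyph))
```

A glyph being present does not guarantee correct shaping for complex scripts, so this only catches characters that are missing outright.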
Using either of the two HTML entities for the pipe symbol, instead of the literal character, solves the issue for now: find the symbol in the text and replace it with one of the entities.
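The find-and-replace workaround can be sketched as follows. The thread does not preserve which entities were used, so the numeric reference `&#124;` here is an assumption, and `escape_pipes` is a hypothetical helper name:

```python
import html

PIPE_ENTITY = "&#124;"  # numeric character reference for "|" (U+007C); assumed choice


def escape_pipes(markup: str, entity: str = PIPE_ENTITY) -> str:
    """Replace every literal pipe in the HTML text with an HTML entity."""
    return markup.replace("|", entity)


sample = "left | right"
escaped = escape_pipes(sample)
print(escaped)

# An HTML parser turns the entity back into the pipe character
assert html.unescape(escaped) == sample
```

The escaped string would then be passed to insert_htmlbox in place of the original text.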