From 93d0bde9115f4d3c54ea83c66b631afd1e741bc7 Mon Sep 17 00:00:00 2001
From: decalage2
Date: Fri, 26 Jan 2018 22:27:51 +0100
Subject: [PATCH] doc: added FAQ

---
 doc/FAQ.rst   | 27 +++++++++++++++++++++++++++
 doc/index.rst |  1 +
 2 files changed, 28 insertions(+)
 create mode 100644 doc/FAQ.rst

diff --git a/doc/FAQ.rst b/doc/FAQ.rst
new file mode 100644
index 0000000..190651c
--- /dev/null
+++ b/doc/FAQ.rst
@@ -0,0 +1,27 @@
+==========================
+Frequently Asked Questions
+==========================
+
+Can I extract all images from MS OLE2 documents with olefile?
+-------------------------------------------------------------
+
+Not directly: images are not always stored the same way, and it also depends on the format.
+
+For example in Powerpoint presentations, you may find a stream named "Pictures"
+when running "olefile yourfile.ppt". You may extract the stream by using the
+openstream() method on the OleFileIO object, but you will usually get a binary
+stream containing several picture files. You may also extract it manually using
+tools such as SSView (http://www.mitec.cz/ssv.html).
+
+Then the only way I've found so far is to use file carving tools which are
+able to determine the beginning and the end of each picture in a binary file.
+These tools are not always easy to use but if you're interested have a look
+at http://pypi.python.org/pypi/hachoir-subfile
+and http://www.forensicswiki.org/wiki/Tools:Data_Recovery#Carving.
+
+If you really need to automate the process then you have to study Microsoft
+specifications (at http://www.microsoft.com/interop/docs/officebinaryformats.mspx)
+and find the right way to parse MS Office documents...
+
+A lot of people (including me) would be very interested if you find a solution! ;-)
+
diff --git a/doc/index.rst b/doc/index.rst
index cd75c9a..e465614 100644
--- a/doc/index.rst
+++ b/doc/index.rst
@@ -35,6 +35,7 @@ Microscopy file formats, McAfee antivirus quarantine files, etc.
    Howto
    OLE_Overview
    olefile
+   FAQ
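
The FAQ entry added by this patch describes two steps: reading the "Pictures" stream with openstream(), then carving individual pictures out of the resulting binary blob. A minimal sketch of both steps could look like the following. The helper names `read_stream` and `carve_pngs` are hypothetical and not part of olefile. The carving logic is an assumption for illustration only: it recognises PNG files alone, by walking PNG chunks from the signature to the IEND chunk, and other picture formats would need their own start/end detection.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_stream(path, stream_name="Pictures"):
    """Return the raw bytes of one stream of an OLE2 file, or None if absent.

    Hypothetical helper; it only wraps olefile's OleFileIO, exists() and
    openstream() calls.
    """
    import olefile  # third-party package: pip install olefile

    ole = olefile.OleFileIO(path)
    try:
        if not ole.exists(stream_name):
            return None
        return ole.openstream(stream_name).read()
    finally:
        ole.close()


def carve_pngs(data):
    """Naively carve complete PNG files out of a binary blob.

    Hypothetical helper: finds each PNG signature, then walks the chunk
    list until the IEND chunk to locate the end of the picture.
    """
    images = []
    start = data.find(PNG_SIGNATURE)
    while start != -1:
        pos = start + len(PNG_SIGNATURE)
        end = None
        # A PNG chunk is: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        while pos + 8 <= len(data):
            (length,) = struct.unpack(">I", data[pos:pos + 4])
            chunk_type = data[pos + 4:pos + 8]
            pos += 8 + length + 4
            if pos > len(data):
                break  # chunk runs past the end of the blob: truncated picture
            if chunk_type == b"IEND":
                end = pos
                break
        if end is not None:
            images.append(data[start:end])
            search_from = end
        else:
            search_from = start + 1  # skip this false or truncated match
        start = data.find(PNG_SIGNATURE, search_from)
    return images
```

Typical use would be `carve_pngs(read_stream("yourfile.ppt") or b"")`, writing each returned item to its own `.png` file. CRCs are not checked, so a dedicated carving tool remains the safer choice for anything beyond a quick look.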