Group:  Microsoft Word » microsoft.public.word.conversions
Thread: RTF generated by Office 2005 and Wordpad

RTF generated by Office 2005 and Wordpad
miztaken <justjunktome[ at ]gmail.com> 11/11/2008 11:05:22 AM
Hi there,
What is the difference between an RTF file generated by WordPad and one
generated by Office 2007?

I find some differences while storing images.

Are there any other differences?

Also, I want to extract image files and other embedded files from an RTF
file. How is this possible?

thank you for your time
miztaken
Re: RTF generated by Office 2005 and Wordpad
"Klaus Linke" <info[ at ]fotosatz-kaufmann.de> 11/12/2008 4:05:46 AM
"miztaken" <justjunktome[ at ]gmail.com> wrote:
> Hi there,
> What is the difference between an RTF file generated by WordPad and one
> generated by Office 2007?
>
> I find some differences while storing images.
>
> Are there any other differences?

Wordpad is based on a much older version of Word, and doesn't support a lot
of features even of older Word versions (97, 2000, ...).
I guess the RTF it exports will correspond to maybe RTF version 1.5 or so,
Word 2007 uses 1.9.

So if you save Word 2007 documents as RTF in WordPad, a lot of stuff will be
stripped out.

> Also, I want to extract image files and other embedded files from an
> RTF file. How is this possible?

You could save as HTML. The images then end up in a separate folder.
Word 2007 *.docx files are ZIP files... I haven't tried it yet, but I guess
you could save as *.docx, unzip it, and probably find the images and other
embedded files in one of the subfolders.
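
Since a *.docx file is a ZIP package, the unzip-and-look idea can be sketched
in a few lines of Python. This is only a sketch under assumptions: the folder
names word/media/ (pictures) and word/embeddings/ (embedded objects) are where
such parts are normally stored, and extract_docx_parts is a hypothetical
helper name, not part of Word or any library.

```python
import zipfile
from pathlib import Path, PurePosixPath

# Assumed locations of pictures and embedded objects inside a .docx package.
PART_PREFIXES = ("word/media/", "word/embeddings/")


def extract_docx_parts(docx_path, out_dir):
    """Copy pictures and embedded objects from a .docx into out_dir.

    Returns the list of file names written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(docx_path) as package:
        for name in package.namelist():
            # Skip directory entries and anything outside the assumed folders.
            if name.startswith(PART_PREFIXES) and not name.endswith("/"):
                target = out_dir / PurePosixPath(name).name
                target.write_bytes(package.read(name))
                written.append(target.name)
    return written
```

Calling extract_docx_parts("report.docx", "extracted") would write every
matching part into the folder "extracted"; both names are placeholders.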

Regards,
Klaus

Re: RTF generated by Office 2005 and Wordpad
miztaken <justjunktome[ at ]gmail.com> 11/12/2008 6:14:58 AM
Hey Klaus,

> Wordpad is based on a much older version of Word, and doesn't support a lot
> of features even of older Word versions (97, 2000, ...).
> I guess the RTF it exports will correspond to maybe RTF version 1.5 or so,
> Word 2007 uses 1.9.
How can I tell which RTF version a file uses?
I may have a bunch of RTF files.
So do I have to prepare different parsers for different RTF
versions?
Either way, what do you suggest?


> You could save as HTML. The images then end up in a separate folder.
If I save those RTF files as HTML then, I believe, all the embedded
objects and attachments will be dumped in a single binary file named
oledata.mso, and I have no clue how to read that.

I want to extract image files from an RTF file without saving it as
HTML. Are there any other ways?

Thank you
miztaken
Re: RTF generated by Office 2005 and Wordpad
"Klaus Linke" <info[ at ]fotosatz-kaufmann.de> 11/12/2008 6:39:25 AM
"miztaken" <justjunktome[ at ]gmail.com> wrote:
> How can I tell which RTF version a file uses?

I don't think you can tell from the RTF file.
An RTF reader (say some version of Word or Wordpad) will just ignore things
it does not understand.
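
For what it is worth, the header can still be inspected. The sketch below
reads the major version from the opening \rtfN control word (in practice
always 1) and the optional \*\generator destination, which names the program
that wrote the file rather than a spec version. Whether a given writer emits
\*\generator at all is an assumption, and describe_rtf is a hypothetical
helper name.

```python
import re


def describe_rtf(text):
    """Return (major_version, generator) read from the start of an RTF file.

    Either value is None when the corresponding control word is absent.
    """
    version = re.match(r"\s*\{\\rtf(\d+)", text)
    # The generator destination looks like {\*\generator Some Name;}
    generator = re.search(r"\{\\\*\\generator\s*([^;}]*)", text)
    return (
        int(version.group(1)) if version else None,
        generator.group(1).strip() if generator else None,
    )
```

The generator string only helps to tell, say, WordPad output from Word
output; it does not say which features of the spec the file uses.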

> I may have a bunch of RTF files.
> So do I have to prepare different parsers for different RTF
> versions?
> Either way, what do you suggest?

Your parser will likely ignore things you aren't interested in, so you won't
need different parsers.
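
To illustrate the "ignore what you aren't interested in" idea, here is a
rough Python sketch that looks only at {\pict ...} groups and skips every
other control word and nested group. It assumes the picture data is
hex-encoded (it does not handle \bin binary data), and extract_pictures and
the FORMATS table are names made up for this example. Metafile data may need
further handling before other programs can open it.

```python
import re

# Picture-format control words mapped to a file extension; all others are ignored.
FORMATS = {"pngblip": "png", "jpegblip": "jpg", "emfblip": "emf", "wmetafile": "wmf"}
CONTROL_WORD = re.compile(r"\\([a-zA-Z]+)(-?\d+)? ?")


def extract_pictures(rtf):
    """Return a list of (extension, data) for each hex-encoded \\pict group."""
    pictures = []
    for opener in re.finditer(r"\{\\pict(?![a-zA-Z])", rtf):
        i, depth, ext, digits = opener.end(), 1, "bin", []
        while i < len(rtf) and depth > 0:
            ch = rtf[i]
            if ch == "\\":
                word = CONTROL_WORD.match(rtf, i)
                if word:
                    # Only control words directly inside \pict can set the format.
                    if depth == 1:
                        ext = FORMATS.get(word.group(1), ext)
                    i = word.end()
                else:
                    i += 2  # control symbol such as \* or \{
                continue
            if ch == "{":
                depth += 1
            elif ch == "}":
                depth -= 1
            elif depth == 1 and ch in "0123456789abcdefABCDEF":
                digits.append(ch)  # hex data of the picture itself
            i += 1
        pictures.append((ext, bytes.fromhex("".join(digits))))
    return pictures
```

Because unknown control words are simply stepped over, the same code runs on
files from different RTF writers. Some writers store the same picture twice
in different formats, so duplicates are possible.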

If your goal is to extract images and embedded objects, I wouldn't try to
parse the RTF though. There are likely better, safer and simpler ways ...
either exporting to some other format as described in my last post, or using
VBA in Word.

>> You could save as HTML. The images then end up in a separate folder.
> If I save those RTF files as HTML then, I believe, all the embedded
> objects and attachments will be dumped in a single binary file named
> oledata.mso, and I have no clue how to read that.

Have you tried to export as "HTML, filtered"?

> I want to extract image files from an RTF file without saving it as
> HTML. Are there any other ways?

Maybe someone else can help... As I said, I wouldn't try it.
If you want to try using VBA, you could post in one of the VBA groups.
I suspect the code would depend on the format of the image or embedded
object though, and it might be necessary to, say, copy/paste the image into
that [graphics] program, and save it from there.

Regards,
Klaus

Re: RTF generated by Office 2005 and Wordpad
"Klaus Linke" <info[ at ]fotosatz-kaufmann.de> 11/18/2008 5:56:05 AM
>> If I save those RTF files as HTML then, I believe, all the embedded
>> objects and attachments will be dumped in a single binary file named
>> oledata.mso, and I have no clue how to read that.

Graham Mayor has an article with detailed tips:
http://www.gmayor.com:80/extract_images_from_word.htm

Klaus
