Neil Day
Hi All :)
Would be grateful to hear from anyone who has successfully managed
to convert word/rtf files into HTML without using the 'save as webpage'
option provided by MS Office.
Or maybe you have used that option but then successfully managed to
use a piece of software to easily clean up the terrible HTML coding
that it seems to produce?
Eagerly waiting ;)
Cheers
Neil D
George Chapin
Hi Neil,
If you don't have many to do, I would open the Word doc, select the
entire document, and copy and paste it into FrontPage.
FrontPage will keep the same formatting, but the HTML will be cleaner
than if you converted to HTML using Word, although FrontPage hardly
makes clean HTML either.
If you don't have FrontPage and you would like me to convert some
files for you, let me know.
Take care,
Neil Day
Cheers George
DUH!!! Don't know why I didn't think of that. Thanks for the info.
I will go and have a go at that now. It has got to produce cleaner
coding than MS Office so shouldn't take long to re-edit if needed.
Later
Neil
George Chapin
I have to admit I didn't try it so let me know how it goes!
Louis
Hi Neil
What I do, which is a bit of a tricky process and does lose some
formatting from Word (so it's best not to put too much in, in the first
place), is:
From the "File" menu in Word I choose "Save As Web Page" to save the
document in HTML format. (This is in Word 2000.)
However, the HTML it saves the document out as isn't clean AT ALL.
I then use the Dreamweaver utility from the "Commands" menu:
"Clean Up Word HTML". That cleans up a lot of the excess HTML.
Please note however that I am using an older version of Dreamweaver
- Version 4. So perhaps this function is even more powerful now.
I then sometimes go through the resulting HTML, and any messy or
unnecessary code still in there I remove with a global "Search &
Replace", replacing the text with nothing and therefore deleting it.
What I then have left is very bland HTML, but all the text is there
and the original formatting of bold, underline, italic...etc. should
remain.
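
For anyone without Dreamweaver, the manual "Search & Replace" pass above could be approximated with a short script. This is only a rough sketch, not what Dreamweaver's cleanup actually does: the function name clean_word_html is made up for illustration, and the patterns are assumptions about what Word-saved HTML typically contains (conditional comments, <o:p> style tags, Mso classes, inline styles, spans). Check the output by eye, since regular expressions are not a real HTML parser.

```python
import re


def clean_word_html(html: str) -> str:
    """Strip some common Word-specific markup (assumed patterns, not exhaustive)."""
    # Conditional comments such as <!--[if gte mso 9]> ... <![endif]-->
    html = re.sub(r"<!--\[if.*?<!\[endif\]-->", "", html, flags=re.DOTALL)
    # Namespaced Office tags such as <o:p> and </o:p> (the text between them is kept)
    html = re.sub(r"</?[a-z]+:[^>]*>", "", html, flags=re.IGNORECASE)
    # class attributes whose value starts with "Mso", quoted or unquoted
    html = re.sub(r"\s+class=(\"Mso[^\"]*\"|'Mso[^']*'|Mso\w+)", "", html,
                  flags=re.IGNORECASE)
    # Quoted inline style attributes
    html = re.sub(r"\s+style=(\"[^\"]*\"|'[^']*')", "", html, flags=re.IGNORECASE)
    # <span ...> wrappers (their contents are kept)
    html = re.sub(r"</?span\b[^>]*>", "", html, flags=re.IGNORECASE)
    return html


if __name__ == "__main__":
    sample = "<p class=MsoNormal style='margin:0in'><b>Bold</b> text<o:p></o:p></p>"
    print(clean_word_html(sample))
```

Tags like <b>, <i> and <u> are left alone, so the bold, underline and italic formatting should survive.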
I then simply apply my usual formatting to the document - choosing
the font style, adding tables...etc.
It's a bit of a convoluted process but it works well. This is in fact
what I'm going to do for my RREM March product - I've been writing
it in Word applying little formatting, and then I'll go through this
process to convert it to HTML so that the brandable PDF can be created.
I have also spent time looking at download.com for other utilities
for cleaning "Word HTML" but haven't yet found anything
close to the functionality Dreamweaver offers.
Sincerely,
Louis
Neil Day
Hi Louis
Thanks for your input.
Eventually I went with what George suggested and copied and pasted
the original RTF document into FrontPage.
It still gives a bit of excess HTML coding but nowhere near as much
as Office does when you save as a web page. All I did then was to use
the search and replace facility within Notepad to get rid of the unwanted
coding.
This appears to have worked OK but I wouldn't want to do it with a
lot of pages :eek2:
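
If there ever were a lot of pages, the Notepad search and replace could be repeated over a whole folder with a small script. A minimal sketch, under stated assumptions: the function names are invented, the fragments in UNWANTED are only placeholders for whatever leftover coding you actually find, and the files are assumed to end in .htm. Reading as latin-1 is also an assumption, chosen because it round-trips every byte, so text the script does not touch is written back unchanged. Work on copies of the files, since they are overwritten in place.

```python
from pathlib import Path

# Placeholder fragments; replace these with the leftover coding you actually see.
UNWANTED = ["<o:p></o:p>", ' class="MsoNormal"']


def strip_literals(text, unwanted):
    """Replace each unwanted fragment with nothing, like Notepad's Replace All."""
    for fragment in unwanted:
        text = text.replace(fragment, "")
    return text


def clean_folder(folder, unwanted, pattern="*.htm"):
    """Clean every matching file in the folder and return the names that changed."""
    changed = []
    for path in sorted(Path(folder).glob(pattern)):
        original = path.read_text(encoding="latin-1")
        cleaned = strip_literals(original, unwanted)
        if cleaned != original:
            path.write_text(cleaned, encoding="latin-1")
            changed.append(path.name)
    return changed
```

Calling clean_folder("pages", UNWANTED) would process a hypothetical folder named "pages".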
Cheers
Neil
Louis
Hi Neil
If you're not too worried about losing the formatting during the conversion
from RTF to HTML, would outputting the document from Word as "Text
Only" and then opening this in FrontPage or Dreamweaver perhaps
work for you?
You'll then have all the text and paragraphs laid out, and will just
need to add bold, italic, font style...etc. to your liking.
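
The "Text Only" route could also be scripted so the paragraphs arrive already wrapped in tags. This is a sketch only: the function name is made up, and it assumes the text file has one paragraph per line, which is how Word's plain "Text Only" option (as opposed to "Text Only with Line Breaks") is generally understood to save. Characters such as & and < are escaped so they display as typed.

```python
import html


def text_to_html_paragraphs(text):
    """Wrap each non-empty line of plain text in <p>...</p>, escaping special characters."""
    paragraphs = []
    for line in text.splitlines():
        line = " ".join(line.split())  # collapse tabs and runs of spaces
        if line:
            paragraphs.append("<p>" + html.escape(line) + "</p>")
    return "\n".join(paragraphs)
```

The result is the same kind of bland HTML as before, ready for bold, italic and font styling to be added by hand.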
Sincerely,
Louis
George Chapin
If you are using the HTML in the PDF creator, Louis's last suggestion
is the way to go.
g