Spiga

PHP – Convert a Microsoft Word doc to PDF

by Gabi Solomon

A lot of people are searching for a way to do this conversion on the fly. I don't know any programming language other than PHP, so I will only write about how you can do this in PHP.

There are two main situations:
1. Doing it on a Windows server
2. Doing it on a Linux server

In both cases, chances are that you will need root access to the server in order to apply all the settings required for this conversion.
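
If the same script has to run in both environments, it can pick its approach at runtime. Here is a minimal sketch of that idea. The function name chooseBackend and the labels "com" and "external" are made up for this example, and it assumes the COM class is only present on Windows with the com_dotnet extension loaded.

```php
<?php
// Hypothetical helper: the backend names are labels for this sketch only.
function chooseBackend($osName, $hasCom)
{
    // Compare only the first three letters so "Darwin" is not mistaken for Windows.
    $isWindows = strtoupper(substr($osName, 0, 3)) === 'WIN';
    if ($isWindows && $hasCom) {
        return 'com';      // talk to Word or OpenOffice through COM
    }
    return 'external';     // run an external converter script instead
}

// PHP_OS is e.g. "WINNT" or "Linux"; class_exists('COM') is false when
// the COM extension is not loaded.
echo chooseBackend(PHP_OS, class_exists('COM')), "\n";
```

Passing the OS name and the COM check in as arguments keeps the decision easy to try out on a machine that is not the production server.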

On Windows you can use COM objects to access either Word or OpenOffice and do the conversion. Here is a piece of code that uses the OpenOffice API to do the conversion:
[php]
<?php
set_time_limit(0);

// Build a com.sun.star.beans.PropertyValue struct for the OpenOffice API
function MakePropertyValue($name, $value, $osm) {
    $oStruct = $osm->Bridge_GetStruct("com.sun.star.beans.PropertyValue");
    $oStruct->Name = $name;
    $oStruct->Value = $value;
    return $oStruct;
}

function word2pdf($doc_url, $output_url) {
    // Invoke the OpenOffice.org service manager
    $osm = new COM("com.sun.star.ServiceManager") or die("Please be sure that OpenOffice.org is installed.\n");
    // Set the application to remain hidden to avoid flashing the document onscreen
    $args = array(MakePropertyValue("Hidden", true, $osm));
    // Launch the desktop
    $oDesktop = $osm->createInstance("com.sun.star.frame.Desktop");
    // Load the .doc file, and pass in the "Hidden" property from above
    $oWriterDoc = $oDesktop->loadComponentFromURL($doc_url, "_blank", 0, $args);
    // Set up the arguments for the PDF output
    $export_args = array(MakePropertyValue("FilterName", "writer_pdf_Export", $osm));
    // Write out the PDF
    $oWriterDoc->storeToURL($output_url, $export_args);
    $oWriterDoc->close(true);
}

$output_dir = "C:/dev/openofficeintegration/docconverter/";
$doc_file = "C:/dev/openofficeintegration/docconverter/DpmR5Reqv1.20.doc";
$pdf_file = "DpmR5Reqv1.20.pdf";
$output_file = $output_dir . $pdf_file;
// OpenOffice expects file:/// URLs, not plain filesystem paths
$doc_file = "file:///" . $doc_file;
$output_file = "file:///" . $output_file;
word2pdf($doc_file, $output_file);
?>
[/php]
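
The script above builds its file:/// URLs by plain string concatenation, which is fine for the sample paths but can break on backslashes or spaces in a file name. Here is a small sketch of a helper that takes care of both. The name pathToFileUrl is made up for this example and is not part of the OpenOffice API.

```php
<?php
// Hypothetical helper (not part of the original script): turn a local
// filesystem path into a file:/// URL for loadComponentFromURL()/storeToURL().
function pathToFileUrl($path)
{
    // Use forward slashes, even for Windows paths.
    $path = str_replace('\\', '/', $path);
    // Encode each segment separately so the slashes survive.
    $segments = array_map('rawurlencode', explode('/', $path));
    $encoded = implode('/', $segments);
    // rawurlencode() turns the drive colon into %3A; restore it.
    $encoded = preg_replace('/^([A-Za-z])%3A/', '$1:', $encoded);
    // Unix paths already start with "/", Windows paths start with a drive letter.
    $needsSlash = ($encoded === '' || $encoded[0] !== '/');
    return 'file://' . ($needsSlash ? '/' : '') . $encoded;
}

echo pathToFileUrl('C:\\dev\\docs\\My Report.doc'), "\n";
echo pathToFileUrl('/var/www/docs/report.doc'), "\n";
```

The result can be passed straight to word2pdf() in place of the hand-built "file:///" strings.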

On Linux things get a little trickier, though not by much. You can't use COM objects any more, but you can still install OpenOffice and drive its API with a Python script.

PyODConverter, for Python OpenDocument Converter, is a Python script that automates office document conversions from the command line using OpenOffice.org.

The script does basically the same thing as the command line tool that comes with JODConverter but is much simpler. In fact the Python script was released for the people who use JODConverter only from the command line (not as a Java library or web service) and would like a simpler alternative.

The full documentation and the script itself are available on the PyODConverter project page.
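
Calling the script from PHP comes down to running a shell command. The sketch below shows one way to do it, with several assumptions: the script file is DocumentConverter.py and takes the input and output file as its two arguments, the "python" binary on your server has the OpenOffice UNO bindings, and an OpenOffice.org instance is already running and listening for connections (as far as I know the script expects port 8100). The function names and paths are made up for this example.

```php
<?php
// Sketch only: the script location and the "python" binary name are
// assumptions about your server, not fixed values.
function buildPyodCommand($script, $input, $output, $python = 'python')
{
    // escapeshellarg() keeps spaces and quotes in file names from breaking
    // (or hijacking) the shell command.
    return $python . ' ' . escapeshellarg($script) . ' '
        . escapeshellarg($input) . ' ' . escapeshellarg($output) . ' 2>&1';
}

function convertWithPyod($script, $input, $output)
{
    exec(buildPyodCommand($script, $input, $output), $lines, $exitCode);
    // Treat a non-zero exit code or a missing output file as a failure.
    if ($exitCode !== 0 || !is_file($output)) {
        throw new RuntimeException("Conversion failed:\n" . implode("\n", $lines));
    }
    return $output;
}

// Print the command that would be run for a sample file.
echo buildPyodCommand('/opt/pyod/DocumentConverter.py', '/tmp/in file.doc', '/tmp/out.pdf'), "\n";
```

Redirecting stderr with 2>&1 means any Python error message ends up in the exception text, which helps when the conversion fails silently.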

Converter for PHP5

While researching this post I found yet another solution. It is only available for PHP5, but it works on both Linux and Windows. It is a PHP module written in C++ called Punno.

This project is a PHP5 module written in C++ that brings the OpenOffice.org UNO Programming API to the PHP userspace.

You can use it to write scripts that create, modify, read and save OpenOffice.org documents (Writer, Spreadsheet, Drawing). Also, you can export these documents in various formats, like PDF or HTML for example.

It can be installed on any Linux/Unix or Windows platform where PHP5 and OpenOffice.org are also available.

It is released under PHP License 3.01.

That is it.

Do you know of any other solutions for converting a doc to PDF with PHP?

  • santy

    Hello,

    I am trying to use this script to convert a .doc file to a .pdf file, but I get errors such as “COM” object not found, and others related to this line: $osm = new COM("com.sun.star.ServiceManager")
    I have installed OpenOffice.org.

    What do I have to do to run this script properly?
    Can I integrate Java with PHP to run "com.sun.star.ServiceManager"?

    Please suggest how I can get error-free output as soon as possible.

    Thank you

  • http://www.gsdesign.ro/ Gabi Solomon

    Are you on Windows or Linux?

  • santy

    I have Windows on localhost

    and a dedicated Linux server for testing.

    I want to run this script in either of these environments, no problem!

    If the code is for Windows on localhost, I can change some settings for that, or
    if it is for Linux, I am ready to change my Linux server settings!

    Please reply with how to get error-free output as soon as possible.

    Thank you

  • frenchfrog

    I want to run the PHP script at the top from a service/web server on Windows.

    OpenOffice needs a graphical interface to run unless you specify -headless on the command line.
    (Good URL about this: http://www.oooninja.com/2008/02/batch-command-line-file-conversion-with.html)

    So I changed the COM registration in the Windows registry:

    1) Start regedit.exe (a Windows utility)
    2) Search for “com.sun.star.ServiceManager”
    3) Find the class key HKEY_CLASSES_ROOT\CLSID\{[UUID of the class]}
    4) Change the value of HKEY_CLASSES_ROOT\CLSID\{[UUID of the class]}\LocalServer32\(Default)

    Add the following arguments at the end of the command line:
    -headless -norestore -nofirststartwizard

    ex:
    C:\Program Files\OpenOffice.org 3\program\soffice.exe -nodefault -nologo -headless -norestore -nofirststartwizard

    ————–

    Also to close OpenOffice: $oDesktop->terminate();

  • http://convert-wma-to-mp3.biz/convert-protected-wma.html Lesley M.

    Does anyone know some tips to convert Excel XLSX files into PDF?

  • Anonymous

    Nice post. Thanks for sharing.
    You may check out the article below on how to convert a PDF document to a Word or DOC file through a website.
    I hope that you will find it useful. cheers.
    http://www.quertime.com/article/arn-2010-10-04-1-how-to-convert-pdf-document-to-word-or-doc-file-through-website/

    • http://twitter.com/linrx Lin Rongxiang.

      I copied and downloaded the script mentioned by Gabi Solomon.
      quertime, is your script suitable for a Linux environment? The OpenOffice library suggested by Gabi doesn’t seem to be installed by default on some Linux servers. I'm going to take a look at yours in just a couple of moments.

  • zxzxzx

    did anyone actually get this to work?