Friday, August 04, 2006

OpenOffice Viewer

In the posts about Crystal Reports and Word 2007 I brought up the issue of printer metrics (and page layout), so I searched for "printer independent layout". I found a document titled "Printer Independent Layout" [2003-MAR-12] on the following page: "Spec Proposals for OOo 1.1.x".

The document has the extension SXW, and I have no application associated with it in my Windows environment. According to Wikipedia, SXW stands for StarOffice XML Writer and is the file extension for text documents. Is there a tiny SXW viewer for Windows? Yes, there is: Visioo-Writer (Visionneuse), but it is still in development (version 0.6.1). It is based on Python, and here is the result (part of the first page):

Unfortunately, printing is not implemented yet ... I shall wait for version 1.0.1 ...

What happens if we want to print the SXW file via doknir? This:

In Kanotix there is no default application for the SXW extension, so let's select KWord, which is currently the default for the DOC file type:
And this is the result:
This is much better than Visioo-Writer, but still not perfect. If we take a closer look:

we can see that the letters a, i and e in the word "applies" are different from the letters in "StarOffice".
To be sure whether this is correct or not, we have to install OpenOffice:

  • sux
  • apt-get update
  • apt-get install
Need to get 96.4MB of archives.
After unpacking 219MB of additional disk space will be used.

Quite a lot of megabytes just to view an SXW file ;). In my case, version 2.0.3-6 was installed - here is the current changelog. Anyhow, the following screenshot shows how the above document is displayed in OpenOffice Writer, which is now the default application for the SXW extension:

We can see that KWord was not able to display the SXW document correctly.

BTW, here is the abstract of the above document:

Many users complain about changing layouts of their document once these documents are printed to different printers or even on different driver versions of one and the same printer.

Even if this behaviour is technically correct (due to different printer and font metrics) it is confusing for the user. This is especially true if the number of pages increases or decreases.

To avoid such re-formatting a printer independent layout mode will be introduced.

As a side-effect we will buy in better MS Office compatibility ;o)

IMHO this behaviour is technically correct if you are a printer. But I'm not a printer, and for me technically correct means that the page layout does not change when the printer changes. Fortunately, this is the default in the latest OpenOffice:


Thursday, August 03, 2006

Incoming folder

Programming is one of the most
difficult branches of applied mathematics;
the poorer mathematicians
had better remain pure mathematicians.
--- EDSGER WYBE DIJKSTRA, How do we tell truths that might hurt? (1975)

Arrogance in computer science is measured in nanodijkstras.
--- ALAN KAY, ?

Implementation of the "right click" functionality was just the first step on our journey to the Grand Unification of the Host and her Guest. Today we are going to make the second step: "incoming folder".

As we can see in the "Crystal Reports" or "Word 2007" examples, the following actions are required:
  • open the folder
  • find the file
  • right-click on it
  • select 'doknir' in pop-up menu
By implementing the "incoming folder" in the Windows host we will eliminate these actions. How will the "incoming folder" work? Very simply: we will modify our Windows Delphi utility so that it monitors a specific subfolder, [_doknir_], in the "My Documents" folder. Whenever a new file appears in that folder, it will be sent to the 'doknir' virtual appliance. What follows is a technical description of the Delphi source changes:
  • We need to split the method wodShellMenu1Click; the new method is called CreateJob:

procedure TForm1.wodShellMenu1Click(ASender: TObject;
  const Item: IMenuItem; const Name, Directory, Targets: WideString);
var
  cFile: string;
begin
  if Pos(';',Targets)>0 then begin
    ShowMessage('Please select only one file!');
  end else begin
    if UpperCase(ExtractFileExt(Targets)) = '.LNK' then begin
      // ... (omitted in this listing)
    end else begin
      // ... (omitted in this listing)
    end;
    // ... (omitted in this listing)
  end;
end; // wodShellMenu1Click

procedure TForm1.CreateJob(cFile: string);
var
  SearchRec: TSearchRec;
  cTmpFile, cJob, cJb, cFile2: string;
  nSize: integer;
  cExt, cExt2: string;
begin
  if FileExists(cFile) then begin
    if not Forms.Application.Active then
      if WindowState=wsMinimized then WindowState:=wsNormal;
    FindFirst(cFile, faAnyFile, SearchRec); // +cExt.Caption
    if (Length(SearchRec.Name)>0) and (SearchRec.Size>0) then begin
      // ... (omitted in this listing)
      Memo1.Lines.Add('size='+IntToStr(nSize) ) ;
      if nSize>50000000 then begin
        if MessageDlg('File '+cFile+' is large! Do you want to continue?',
          mtConfirmation, [mbYes, mbNo], 0) <> mrYes then begin
          // ... (omitted in this listing)
        end;
      end;
      // build the extension; a double extension is kept whole
      cExt:=ExtractFileExt( cFile );
      if cExt<>'' then begin
        cFile2:=LeftStr(cFile, Length(cFile)-Length(cExt) );
        cExt2:=ExtractFileExt( cFile2 );
        if cExt2<>'' then cExt:=cExt2+cExt;
        // ShowMessage(cExt);
      end;
      cTmpFile:=TmpFile( cTmpPath, cExt );
      FileCopy(cFile, cTmpFile);
      // ... (omitted in this listing)
      ShowMessage('Error copy: '+cFile);
      // ... (omitted in this listing)
      if FileExists( cTmpFile ) then begin
        // create job (random file name)
        // ... (omitted in this listing)
        Memo1.Lines.Add('job='+ cJob);
        with TIniFile.Create(cJob) do begin
          WriteString('doknir job', 'tmp file', ExtractFileName(cTmpFile));
          WriteString('doknir job', 'original file', cFile);
          // ... (omitted in this listing)
        end;
        RenameFile(cJb, cJob);
      end;
    end;
  end;
end; // CreateJob

  • We need a function to get the "My Documents" folder:

// ShGetSpecialFolderPath and csidl_Personal are declared in the unit ShlObj
function GetMyDocuments: string;
var
  Res: Bool;
  Path: array[0..Max_Path] of Char;
begin
  Res := ShGetSpecialFolderPath(0, Path, csidl_Personal, False);
  if not Res then raise
    Exception.Create('Could not determine My Documents path');
  Result := Path;
end; // GetMyDocuments

  • We need two timers: "TimerMon" and "TimerChg". Both are disabled at design time, with intervals of 500 and 2500 ms, respectively. The first timer monitors the "incoming folder". When a new file appears, it stops itself and starts the second timer, which checks whether the size of the incoming file is still changing. Once the size is stable, the second timer stops, creates a job to send the document to the virtual appliance, and restarts the first timer.
  • The specific subfolder in "My Documents" will be called [_doknir_] (so that it sorts to the top of the list of folders). The method FormCreate has these new lines:

TimerMon.Enabled:=TRUE; // start monitor cMyDok

  • And finally, here are two methods for the timer events:

procedure TForm1.TimerMonTimer(Sender: TObject);
// monitor incoming folder
var
  nRes: integer;
begin
  if bFindFirst then begin
    nRes:=FindFirst(cMyDok+'\*.*', faArchive , srDok);
  end else begin
    // ... (omitted in this listing)
  end;
  if (nRes=0) then begin
    if bFindFirst then begin
      Memo1.Lines.Add('findfirst: '+srDok.Name+
        ' size='+IntToStr(srDok.Size) );
      // ... (omitted in this listing)
    end;
    if (srDok.Name<>'.') and (srDok.Size>0) then begin
      // ... (omitted in this listing)
    end else begin
      // ... (omitted in this listing)
    end;
  end else begin // nRes<>0
    if not bFindFirst then begin
      // ... (omitted in this listing)
    end;
  end;
end; // TimerMonTimer

procedure TForm1.TimerChgTimer(Sender: TObject);
// is size of file changing ...
var
  nSize: Int64;
  fs: TFileStream;
  cTo: string;
begin
  Memo1.Lines.Add('incoming: '+srDok.Name);
  // ... (omitted in this listing)
  Memo1.Lines.Add('stream size: ' + IntToStr(nSize) );
  if (srDok.Size=nSize) then begin // size ok
    Memo1.Lines.Add('size ok: '+IntToStr(nSize) );
    // ... (omitted in this listing)
    if FileExists(cTo) then
      DeleteFile(cTo); // temporary ***
    if RenameFile(cMyDok+'/'+srDok.Name, cTo) then begin
      // ... (omitted in this listing)
    end;
    TimerMon.Enabled:=TRUE; // monitor for next file
  end;
end; // TimerChgTimer
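
The hand-off between the two timers can also be tried outside the form. What follows is only a minimal console sketch of the same idea, not the utility's actual code: the helper names FileSizeOf, FirstIncomingFile and SizeIsStable are made up for this illustration, plain Sleep calls stand in for TimerMon and TimerChg, and the folder to watch is passed on the command line.

```pascal
program IncomingFolderSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: size of a file in bytes, or -1 if it is not found.
function FileSizeOf(const FileName: string): Int64;
var
  sr: TSearchRec;
begin
  Result := -1;
  if FindFirst(FileName, faAnyFile, sr) = 0 then
  begin
    Result := sr.Size;
    FindClose(sr);
  end;
end;

// Phase 1 (the role of TimerMon): full path of the first non-empty file
// in Folder, or '' if nothing has arrived yet.
function FirstIncomingFile(const Folder: string): string;
var
  sr: TSearchRec;
  cDir: string;
begin
  Result := '';
  cDir := IncludeTrailingPathDelimiter(Folder);
  if FindFirst(cDir + '*.*', faAnyFile, sr) = 0 then
  begin
    repeat
      // skip '.', '..' and other directories, and files that are still empty
      if ((sr.Attr and faDirectory) = 0) and (sr.Size > 0) then
      begin
        Result := cDir + sr.Name;
        Break;
      end;
    until FindNext(sr) <> 0;
    FindClose(sr);
  end;
end;

// Phase 2 (the role of TimerChg): the file counts as complete when two
// size readings taken IntervalMs apart are equal.
function SizeIsStable(const FileName: string; IntervalMs: Cardinal): Boolean;
var
  nBefore: Int64;
begin
  nBefore := FileSizeOf(FileName);
  Sleep(IntervalMs);
  Result := (nBefore > 0) and (FileSizeOf(FileName) = nBefore);
end;

var
  cFile: string;
begin
  if ParamCount < 1 then
  begin
    Writeln('usage: IncomingFolderSketch <folder>');
    Halt(1);
  end;
  repeat
    cFile := FirstIncomingFile(ParamStr(1));
    if cFile = '' then
      Sleep(500);                       // same interval as TimerMon
  until cFile <> '';
  repeat
  until SizeIsStable(cFile, 2500);      // same interval as TimerChg
  Writeln('ready: ', cFile);            // the utility would create a job here
end.
```

Comparing two size readings is a heuristic: a writer that pauses for longer than the interval will look finished. The sketch also waits forever if the file is removed while it is being watched.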

So, from now on, every time we save or put a document into the subfolder [_doknir_] of folder "My Documents", it will appear in the virtual appliance.
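
The job itself is a small INI file with the section 'doknir job'. CreateJob uses two names (cJb and cJob) and a RenameFile call; my reading is that the job is written under one name and renamed when it is complete, so that whoever picks up jobs never sees a half-written file. Below is a minimal sketch of that pattern under this assumption. The function WriteJob, the '.jb' / '.job' extensions and the timestamp-based file name are invented for the example (the utility uses a random file name).

```pascal
program JobFileSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, IniFiles;

// Hypothetical helper: writes the job under a temporary name and renames it
// afterwards. Returns the final job file name.
function WriteJob(const JobDir, TmpFileName, OriginalFile: string): string;
var
  cJb, cJob: string;
  Ini: TIniFile;
begin
  // Assumed naming scheme: <timestamp>.jb while writing, <timestamp>.job when done.
  // ExpandFileName gives TIniFile a full path to work with.
  cJob := ExpandFileName(IncludeTrailingPathDelimiter(JobDir) +
          FormatDateTime('yyyymmddhhnnsszzz', Now) + '.job');
  cJb := ChangeFileExt(cJob, '.jb');
  Ini := TIniFile.Create(cJb);
  try
    // same section and keys as in CreateJob
    Ini.WriteString('doknir job', 'tmp file', TmpFileName);
    Ini.WriteString('doknir job', 'original file', OriginalFile);
  finally
    Ini.Free;
  end;
  if not RenameFile(cJb, cJob) then
    raise Exception.Create('Could not rename job file to ' + cJob);
  Result := cJob;
end;

begin
  if ParamCount < 1 then
  begin
    Writeln('usage: JobFileSketch <job folder>');
    Halt(1);
  end;
  // the file names are sample values only
  Writeln('job=', WriteJob(ParamStr(1), 'tmp0001.pdf', 'C:\Docs\report.pdf'));
end.
```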

Word 2007 (.docx) and ISO 19005 (PDF/A)

Word 2007 is the fourth generation of the most popular Windows word processor. The first generation comprises Word 1.0, 1.1, 1.2 and 2.0. The second comprises Word 6.0 and Word 95 (which is a 32-bit Word 6.0 with support for long file names and "red-squiggle underlined spell-checking"). The third generation (in which VBA replaced WordBasic) consists of Word 97, Word 2000, Word XP and Word 2003.

We can characterize Word versions by the library riched20.dll. For example, Word 2000 came with version, and Word 2007 beta is using version 12.0.4017.1003. BTW, for Unicode support, the file usp10.dll (Uniscribe Unicode script processor) is important.

Word's default file extension used to be .doc. Let's download one document and try to open it in doknir: "Windows Vista Hardware Start Button Specification". This is the result:

Upgrading KWord to version 1.5.2 does not help - again "The application KWord (kword) crashed and caused the signal 6 (SIGABRT)". So we will need to use Word - in our case that will be version 2007 beta.

Before starting Word 2007, stop the Windows print spooler just like in the "Crystal Reports" example. Word 2007 is friendlier than Crystal Reports, but again, only a limited number of paper sizes is available (Letter, A4, Legal, A3, B4, B5) - fortunately, it is possible to define a custom paper size.

There is no File menu in the new Word - we must use the "Office Button" instead:

For example, if we click on "Print", nothing happens, because the print spooler is not running. BTW, because of the new button it is not possible to close the window by double-clicking the little horizontal line icon in the upper-left corner:

That's why I've added little '×' next to the "Office Button" to close Word...

So, how do we print a Word file using doknir? Fortunately, Word 2007 has a very useful new feature: "Save As PDF". You must select a folder, enter the name of the PDF file and click on the Publish button. To display the PDF in 'doknir', open that folder, select the newly created PDF, right-click on it and select 'doknir' in the pop-up menu. Here is the first page of the above document, displayed in 'doknir' - VMware virtual appliance:

What kind of joke is this ?!!! Aren't PDF documents portable? Ehm, no! Let's take a look at Word "Save As PDF" options:

Option "ISO 19005-1 compliant (PDF/A)" seems interesting - let's try it:

Aha, now it is OK. So what's the difference between the two PDF documents? If we look at the "Document Properties"->"Fonts" in Adobe Reader:

we can see that the fonts in the correct PDF document are embedded, and this is the main difference between PDF and PDF/A:

The constraints include:

  • Audio and video content are forbidden
  • Javascript and executable file launches are prohibited
  • All fonts must be embedded and also must be legally embeddable for unlimited, universal rendering
  • Colorspaces specified in a device-independent manner
  • Encryption is disallowed
  • Use of standards-based metadata is mandated

At the end of this post, let's take a look at the new Word extension .docx. 'X' means that this is the (compressed) Open XML format, and I've found an excellent example with a lot of mathematical formulas. Below is the second page printed via 'doknir':

The quality of formulas in PDF files is still not perfect, but I hope Microsoft will correct this in the final version of Word 2007.
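
The "(compressed)" part of .docx is easy to check: the file is a ZIP container, and ZIP files normally begin with the four bytes 'P', 'K', 3, 4. Here is a small sketch that reads only this signature; the function name LooksLikeZip is made up for the illustration, and the check says nothing about whether the XML inside is valid.

```pascal
program DocxSignatureSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Hypothetical helper: True if the file starts with the ZIP local file
// header signature 'PK'#3#4 (hex 50 4B 03 04).
function LooksLikeZip(const FileName: string): Boolean;
var
  fs: TFileStream;
  Sig: array[0..3] of Byte;
begin
  Result := False;
  fs := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    if fs.Read(Sig, SizeOf(Sig)) = SizeOf(Sig) then
      Result := (Sig[0] = $50) and (Sig[1] = $4B) and
                (Sig[2] = $03) and (Sig[3] = $04);
  finally
    fs.Free;
  end;
end;

begin
  if ParamCount < 1 then
  begin
    Writeln('usage: DocxSignatureSketch <file>');
    Halt(1);
  end;
  if LooksLikeZip(ParamStr(1)) then
    Writeln('ZIP container, as expected for .docx')
  else
    Writeln('not a ZIP container, for example an old binary .doc');
end.
```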

In the future, we will write a VBA macro to automate Word printing via 'doknir' ...


Wednesday, August 02, 2006


Simplicity is prerequisite for reliability.
--- EDSGER WYBE DIJKSTRA, How do we tell truths that might hurt? (1975)

Jeffrey Jaffe (CTO for Novell) says in his blog:
"True, Windows is far from perfect. It is not comfortable to “ctrl-alt-del” every time my printer driver gets confused and hangs the system."

Why does the printer driver get confused? I don't know. Why does the printer driver hang the system? Because the Windows print spooler gets confused. What is a print spooler?

Software that manages printing in the computer. When an application is requested to print a document, it quickly generates the output on disk and sends it to the print spooler, which feeds the print images to the printer at slower printing speeds. The printing is then done in the background while the user interacts with other applications in the foreground.

Why does the print spooler get confused? Because the printer driver gets confused ...

Windows Printing Architecture (WPA) is one of the major components of the Windows architecture and it is indeed quite complex (here are two articles: "Printing Architecture" and "How Network Printing Works"). Basically, the print spooler is an executable implemented as a service (spoolsv.exe) that waits for an RPC call from the client side of the spooler (winspool.drv). Spoolsv.exe calls the print router (spoolss.dll), which determines which print provider to call, based on a printer name or handle supplied with each function call, and passes the function call to the correct provider ...

What follow are the consequences of this complexity:

And here are two links to solutions:

Another story is printing via Remote Desktop (RDP), Terminal Services or Citrix:

The above are only partial solutions - the radical approach is to stop the Windows print spooler service and use an alternative solution. One of those radical ways of not using the Windows print spooler is to replace Windows, as the Novell CTO suggests, with Linux:

"... that the Linux desktop has more than arrived – it has become the better desktop."

Well, here is my way to tame the print spooler: continue to use the Windows desktop, but stop the print spooler service and use "doknir" (VMware virtual appliance) as a print spooler replacement.
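
Stopping the service can be done by hand (for example with "net stop spooler" from an administrator command prompt), but it can also be done from Delphi through the Service Control Manager. The following is only a sketch of that route, assuming the usual service name 'Spooler' and sufficient rights; the function StopServiceByName is a name invented for this example, and it is not part of the 'doknir' utility.

```pascal
program StopSpoolerSketch;
{$APPTYPE CONSOLE}

uses
  Windows, WinSvc, SysUtils;

// Hypothetical helper: asks the Service Control Manager to stop a service.
// Returns True if the stop request was accepted; it does not wait until
// the service has actually stopped.
function StopServiceByName(const ServiceName: string): Boolean;
var
  hSCM, hSvc: SC_HANDLE;
  Status: TServiceStatus;
begin
  Result := False;
  hSCM := OpenSCManager(nil, nil, SC_MANAGER_CONNECT);
  if hSCM = 0 then
    Exit;
  try
    hSvc := OpenService(hSCM, PChar(ServiceName), SERVICE_STOP);
    if hSvc = 0 then
      Exit;
    try
      Result := ControlService(hSvc, SERVICE_CONTROL_STOP, Status);
    finally
      CloseServiceHandle(hSvc);
    end;
  finally
    CloseServiceHandle(hSCM);
  end;
end;

begin
  // 'Spooler' is the service name of the Windows print spooler
  if StopServiceByName('Spooler') then
    Writeln('stop request sent to the print spooler')
  else
    Writeln('could not stop the print spooler (already stopped, or no rights?)');
end.
```

The request fails if the service is already stopped or if other running services depend on it, so a False result is not necessarily a problem.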