help files



Notes

International websites

While many web pages use simple ASCII characters, some international pages use UTF-8 encoding.

If you are going to analyze or manipulate UTF-8 strings, first decode the UTF-8 into a widestring. See the Free Pascal UTF8Decode and UTF8Encode functions.

The StrLoadFile example program on this page works without any encoding or decoding only because StrLoadFile is a binary function: it reads the file's bytes directly into a string, and those bytes are sent to the browser unchanged.

Normally your web programs would do more than load a file into a string and display it. To work on UTF-8 text with functions such as Pos(), first decode the UTF-8 string into a widestring with UTF8Decode. When you have finished working with the widestring, convert it back to an ansistring with the UTF8Encode function that comes with Free Pascal. This prepares the string for webwrite or webwriteln. The character set must also be declared, as the SetHeader call in the Test procedure of the StrLoadFile example does; otherwise your UTF-8 characters will show up as ???? question marks or other gibberish.

Here is a simple example program that shows how to override the default character set with your own.

program utf8;  {$mode objfpc} {$H+}

uses
  pwinit,   // 1.7.x only
  pwmain, 
  strwrap1; // miscellaneous string functions

procedure Head;
begin
  Out(
    '<HTML>' +
    '<HEAD>' +
      '<TITLE> New Document </TITLE>' +
      '<meta http-equiv="content-type" content="text/html; charset=utf-8">' +
    '</HEAD>' +
    '<BODY>'
  );
end;

procedure Foot;
begin
  Out(
    '</BODY>' + 
    '</HTML>');
end;

procedure Test;
begin
  SetHeader('Content-Type', 'text/html; charset=utf-8');
  Head;
  Out('UTF-8 Test <hr>');
  Out(strloadfile('sample-utf8.txt')); // open a file containing UTF-8 text
  Out('<p>');
  Foot;
end;

begin
  Test;
end.
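
The decode, search, re-encode round trip described in the notes above can be sketched as a small standalone program. This is only an illustration under stated assumptions: it uses the Free Pascal UTF8Decode and UTF8Encode functions and plain WriteLn rather than any Powtils call, and the sample text and variable names are made up for the example. The non-ASCII characters are written as explicit byte values so the source file's own encoding does not matter.

```pascal
program utf8pos;  {$mode objfpc} {$H+}

var
  Raw: AnsiString;    // UTF-8 bytes, as they would arrive from a file or a request
  Wide: WideString;   // decoded text, one element per UTF-16 code unit
  Needle: WideString; // text to search for (hypothetical sample value)
  Idx: SizeInt;
begin
  // Sample text: 'na' + i-diaeresis + 've caf' + e-acute.
  // Each accented letter occupies two bytes in UTF-8.
  Raw := 'na' + #$C3#$AF + 've caf' + #$C3#$A9;

  // Decode first, so that positions and lengths count characters, not bytes.
  Wide := UTF8Decode(Raw);

  Needle := 'caf';
  Idx := Pos(Needle, Wide);

  WriteLn('byte length:      ', Length(Raw));
  WriteLn('character length: ', Length(Wide));
  WriteLn('position of caf:  ', Idx);

  // Re-encode to UTF-8 before sending the text to the browser.
  Raw := UTF8Encode(Wide);
  WriteLn('re-encoded bytes: ', Length(Raw));
end.
```

In a Powtils program, the re-encoded string would then be passed to Out, webwrite or webwriteln after the Content-Type header has been set to charset=utf-8. Calling Pos() on the undecoded string would report byte offsets instead, which drift from the character positions as soon as a multi-byte character appears before the match.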




