PLChars

DOS charset converter for Polish diacritical characters (legacy).

 ┌──────────────────────────────────────────────────────────────────────────┐
 │                                                                          │
 │         ▀██▀▀▀█▄▀██▀    ▄█▀▀▀█▄ ██  C O N V E R T E R  ver 5.3
 │          ██   ██ ██     ██   ▀▀ ██▄▄▄▄   ▄▄▄▄▄▄ ▄▄ ▄▄▄▄  ▄▄▄▄▄
 │          ██   ██ ██     ██      ██   ██ ██   ██  ██     ▀█▄▄
 │          ██▀▀▀▀  ██     ██   ▄▄ ██   ██ ██   ██  ██        ▀▀█▄
 │         ▄██▄    ▄██▄▄▄█ ▀█▄▄▄█▀ ██   ██ ▀█▄▄▀██ ▄██▄     ▄▄▄▄█▀
 │
 │            F R E E ─ W A R E   F O R   F R E E   P E O P L E
 │
 │               written by Marcin Gryszkalis (c) 1997 - 1999
 │
 └───────────────┐
     DESCRIPTION │
 ┌───────────────┘
 │
 │    PLC  is  a  file converter that converts ASCII text files between
 │  different standards for encoding Polish diacritical characters. It
 │  supports 34 standards and can recognize the standard of a file (using
 │  two analysers).
 │
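 │    A byte-for-byte conversion between two single-byte standards can be
 │  sketched with a 256-entry translation table. This is not PLC's code:
 │  the sketch uses Python and its built-in cp852 and iso8859_2 codecs
 │  (LAT and ISO in the list of standards), and the names build_table
 │  and convert are made up for illustration.
 │

```python
POLISH = "ąćęłńóśźżĄĆĘŁŃÓŚŹŻ"

def build_table(src_codec: str, dst_codec: str) -> bytes:
    """Return a 256-byte table that remaps only the Polish letters."""
    src = POLISH.encode(src_codec)
    dst = POLISH.encode(dst_codec)
    return bytes.maketrans(src, dst)

def convert(data: bytes, table: bytes) -> bytes:
    # Every byte that is not a Polish letter in the source standard
    # (plain ASCII, for instance) passes through unchanged.
    return data.translate(table)

LAT_TO_ISO = build_table("cp852", "iso8859_2")
sample = "Zażółć gęślą jaźń".encode("cp852")
converted = convert(sample, LAT_TO_ISO)
print(converted.decode("iso8859_2"))
```

 │    One table lookup per byte keeps the conversion itself cheap; only
 │  the recognition step has to look at letter statistics.
 │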
 └───────┐
     TIP │
 ┌───────┘
 │
 │    You  can  interrupt  the  standard analysis with the ESC key (PLC will
 │  use the information collected before the interruption).
 │
 └──────────────┐
     DISCLAIMER │
 ┌──────────────┘
 │
 │    Here goes usual disclaimer and other useless stuff... [skipped]
 │
 └─────────┐
     USAGE │
 ┌─────────┘
 │
 │  PLC filename.ext [code] [options]
 │
 │    ¨  filename.ext - name of the source text file. You can use wildcards
 │  (like  *  and  ?)  to specify multiple files. You can specify a full
 │  path (the converted file will be saved in the directory where the
 │  source file is placed). If only the filename is specified, PLC will
 │  show its standard.
 │    ¨  code - three-letter code of the destination Polish chars standard,
 │  e.g. MAZ for Mazovia or LAT for Latin-2.
 │
 │  options:
 │
 │    ¨  /2 - use alternative method of recognizing standard.
 │    ¨  /3  -  use  both  (normal  and alternative) methods of recognizing
 │  standard.
 │    ¨  /S:code - force source standard, disable auto-recognizing
 │    ¨  /A  -  (useless when no wildcards specified) recognize standard in
 │  first file ONLY and assume the rest of the files use the same standard.
 │    ¨  /T:.. - name of target file; if not specified, filename.PLC will be
 │  saved.  You  cannot  use  wildcards  in  /T parameter argument. You can
 │  specify a full path and file or a path only (ending with '\').
 │    ¨  /D - delete source file afterwards
 │    ¨  /O - overwrite (rename destination file to source file afterwards)
 │    ¨  /R  -  auto replace (if the target file already exists, it will be
 │  overwritten without asking - see the "surprise" section in this doc)
 │    ¨  /Q - quiet mode (nothing is written to the screen)
 │    ¨  /? - short help
 │
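 │    How the target name follows from the /T rules and the examples can
 │  be sketched as below. This is an illustration in Python, not PLC's
 │  code; target_path is a hypothetical helper, and returning a bare /T
 │  file name unchanged is an assumption (the text does not say which
 │  directory it is resolved against).
 │

```python
import ntpath  # DOS/Windows path rules, usable on any platform
from typing import Optional

def target_path(source: str, t_arg: Optional[str] = None) -> str:
    """Return the output path for one source file and an optional /T value."""
    stem = ntpath.splitext(ntpath.basename(source))[0]
    default_name = stem + ".plc"
    if t_arg is None:
        # No /T: the result is saved next to the source file.
        return ntpath.join(ntpath.dirname(source), default_name)
    if t_arg.endswith("\\"):
        # /T names a directory only: keep the default file name.
        return t_arg + default_name
    # /T names a file (with or without a path): used as given (assumption).
    return t_arg

print(target_path("a1.txt", "C:\\NEW\\"))
```
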
 └────────────┐
     EXAMPLES │
 ┌────────────┘
 │
 │  PLC a1.txt                     -- recognizes standard of a1 and shows
 │                                    results on screen, no conversion
 │                                    performed
 │  PLC C:\OLD\a1.txt /2           -- same as above but a1.txt is not in
 │                                    current subdirectory and alternative
 │                                    analyser is used
 │  PLC a1.txt ISO /3              -- recognizes standard of a1.txt (using
 │                                    both analyzers) and converts it to
 │                                    a1.plc as ISO-Latin2
 │  PLC a1.txt ISO /T:a2.txt /3    -- same as above but saves a2.txt
 │  PLC a1.txt ISO /T:a2.txt /D /3 -- same as above but saves a2.txt and
 │                                    erases a1.txt
 │  PLC a1.txt ISO /T:C:\NEW\      -- saves a1.plc in NEW subdirectory on C:
 │                                    drive
 │  PLC a1.txt ISO /O              -- converts a1.txt to a1.plc, erases
 │                                    a1.txt, renames a1.plc to a1.txt
 │  PLC a1.txt ISO /S:MAZ          -- doesn't perform recognition, assumes
 │                                    a1.txt being in Mazovia standard and
 │                                    converts to a1.plc in ISO-Latin2
 │  PLC *.txt ISO /A               -- recognizes standard of first file
 │                                    matching *.txt mask (for example -
 │                                    Mazovia), converts it to ISO
 │                                    and converts all other files matching
 │                                    *.txt from Mazovia to ISO.
 │
 └─────────────────┐
     FOOL EXAMPLES │
 ┌─────────────────┘
 │
 │  PLC a1.txt ISO /O /D          -- gives the same result as "DEL a1.txt"
 │                                   but takes more time and disk space ;)
 │  PLC a1.txt ISO /A             -- /A switch has nothing to do here
 │                                   (because PLC will work on 1 file only)
 │  PLC a1.txt ISO /T:a2.txt /O   -- works like /O only but uses a2.txt
 │                                   as a temporary file instead of the
 │                                   default a1.plc
 └─────────────┐
     STANDARDS │
 ┌─────────────┘
 │
 │    The following standards are accepted and supported:
 │
 │  01. BEZ - Bez polskich znakow (no Polish characters)
 │  02. ADB - Adobe Type Manager (old)
 │  03. AMI - AmigaPL
 │  04. ST  - Atari ST
 │  05. ST2 - Atari ST (z-z)
 │  06. COR - Corel 2.0
 │  07. CSK - Computer Studio Kajkowski
 │  08. CRD - Corel Draw (old)
 │  09. CFR - Cyfromat
 │  10. DHN - Dom Handlowy Nauki / ChiWriter pl
 │  11. EFT - Efekt
 │  12. ELW - Elwro Junior (CP/J) or Rodos
 │  13. FAT - Fat Agnus zine (amiga)
 │  14. HCT - Hector / Univex
 │  15. IEA - Instytut Energii Atomowej (IEA) Swierk
 │  16. IIN - IINTE-ISIS
 │  17. ISO - ISO 8859/2 Latin-2
 │  18. KWK - KWK Club
 │  19. LAT - Latin-2 (cp852)
 │  20. LOG - Logic
 │  21. WIN - MS Windows 3.x (cp1250)
 │  22. MAC - Macintosh v1
 │  23. MC2 - Macintosh v2
 │  24. MAZ - Mazovia (cp991)
 │  25. MFD - Mazovia - Fido net
 │  26. MIC - Microvex
 │  27. FOR - PC sp. Format
 │  28. PN3 - Polish Norm #3 (Polska Norma #3)
 │  29. SKL - Skalmierski
 │  30. TAG - TAG
 │  31. TEX - TeX.pl
 │  32. VNT - Ventura
 │  33. XJP - XJP Amiga
 │  34. XRD - XRD 2nd edition
 │
 └─────────────────────────────┐
    POSSIBLE ERRORLEVEL VALUES │
 ┌─────────────────────────────┘
 │
 │      0 - No error
 │
 │  PLC internal/user Errors:
 │
 │    241 - Wildcards used in destination filename (/T: switch)
 │    242 - Unknown standard code
 │    243 - Unknown standard code (/S: switch)
 │    244 - File not found (file search)
 │    245 - File not found (proceed)
 │
 │  DOS errors reported by PLC:
 │
 │      1 - Invalid function number
 │      2 - File not found
 │      3 - Path not found
 │      4 - Too many open files
 │      5 - File access denied
 │      6 - Invalid file handle
 │     12 - Invalid file access code
 │     15 - Invalid drive number
 │     16 - Cannot remove current directory
 │     17 - Cannot rename across drives
 │     18 - No more files
 │    100 - Disk read error
 │    101 - Disk write error
 │    102 - File not assigned
 │    103 - File not open
 │    104 - File not open for input
 │    105 - File not open for output
 │    106 - Invalid numeric format
 │    150 - Disk is write-protected
 │    151 - Bad drive request struct length
 │    152 - Drive not ready
 │    154 - CRC error in data
 │    156 - Disk seek error
 │    157 - Unknown media type
 │    158 - Sector Not Found
 │    159 - Printer out of paper
 │    160 - Device write fault
 │    161 - Device read fault
 │    162 - Hardware failure
 │    200 - Division by zero
 │    201 - Range check error
 │    202 - Stack overflow error
 │    203 - Heap overflow error
 │    204 - Invalid pointer operation
 │    205 - Floating point overflow
 │    206 - Floating point underflow
 │    207 - Invalid floating point operation
 │    208 - Overlay manager not installed
 │    209 - Overlay file read error
 │    210 - Object not initialized
 │    211 - Call to abstract method
 │    212 - Stream registration error
 │    213 - Collection index out of range
 │    214 - Collection overflow error
 │    215 - Arithmetic overflow error
 │    216 - General Protection fault
 │
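 │    A calling script can tell PLC's own errors from the DOS ones by
 │  looking at the value. The sketch below (Python, with a hypothetical
 │  helper describe_errorlevel) covers only the five PLC-specific codes
 │  listed above and treats every other non-zero value as a DOS or
 │  runtime error.
 │

```python
# PLC internal/user errors, copied from the table above.
PLC_ERRORS = {
    241: "Wildcards used in destination filename (/T: switch)",
    242: "Unknown standard code",
    243: "Unknown standard code (/S: switch)",
    244: "File not found (file search)",
    245: "File not found (proceed)",
}

def describe_errorlevel(code: int) -> str:
    """Turn an ERRORLEVEL value returned by PLC into a short message."""
    if code == 0:
        return "No error"
    if code in PLC_ERRORS:
        return "PLC error: " + PLC_ERRORS[code]
    # Everything else comes from DOS or the runtime library.
    return f"DOS or runtime error {code}"

print(describe_errorlevel(242))
```
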
 └────────────────────┐
     AUTO RECOGNIZING │
 ┌────────────────────┘
 │
 │  1. First method (default)
 │
 │    This  method  of auto recognizing is based on how frequently each of
 │  the Polish diacritical letters occurs in typical text. I checked a
 │  huge set of different kinds of documents in Polish (18 megabytes) to
 │  obtain the best factors for the auto-recognition module. It may still
 │  fail on a file containing frames or ASCII graphics; in such a case
 │  use the /S switch to choose the source standard. I have an idea for
 │  improving it, but for now it is an idea only.
 │
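 │    The frequency idea can be sketched as follows. This is NOT PLC's
 │  algorithm or its factors: the weights are rough letter shares chosen
 │  for illustration, only three standards are tried (through Python's
 │  built-in cp852, iso8859_2 and cp1250 codecs), and counting capitals
 │  at 5% of the lowercase weight is an arbitrary assumption.
 │

```python
# Illustrative weights (approximate percent share in Polish text).
WEIGHTS = {"ą": 0.99, "ć": 0.40, "ę": 1.11, "ł": 1.82, "ń": 0.20,
           "ó": 0.85, "ś": 0.66, "ź": 0.06, "ż": 0.83}
CAPITAL_FACTOR = 0.05  # assumed: capitals are much rarer
CANDIDATES = {"LAT": "cp852", "ISO": "iso8859_2", "WIN": "cp1250"}

def frequency_scores(data: bytes) -> dict:
    """Score each candidate standard by weighted Polish-letter counts."""
    counts = [0] * 256
    for b in data:
        counts[b] += 1
    scores = {}
    for code, codec in CANDIDATES.items():
        score = 0.0
        for letter, weight in WEIGHTS.items():
            lower = letter.encode(codec)[0]
            upper = letter.upper().encode(codec)[0]
            score += weight * counts[lower]
            score += CAPITAL_FACTOR * weight * counts[upper]
        scores[code] = score
    return scores

def guess_standard(data: bytes) -> str:
    scores = frequency_scores(data)
    return max(scores, key=scores.get)

print(guess_standard("Zażółć gęślą jaźń".encode("cp1250")))
```

 │    Frame characters that share a byte value with a frequent letter
 │  inflate the score of the wrong standard, which is one way to see why
 │  this kind of analysis struggles with ASCII graphics.
 │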
 │  2. Alternative method (option '/2')
 │
 │    This method checks every char in a file, and each time a char does
 │  not exist among the letters of a standard, that standard gets some
 │  'fail points'. The standard with the smallest number of 'fail points'
 │  is probably the standard of the file. This method fails on docs with
 │  frames but can help in particular cases (for example, when the whole
 │  file is written in uppercase). Note also that this algorithm cannot
 │  tell apart two standards that use the same charset in a different
 │  order (like Mazovia and Mazovia FIDO).
 │
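 │    A minimal sketch of the 'fail points' idea, under assumptions: only
 │  bytes above 127 are checked, each miss costs one point, and three
 │  standards are tried through Python's built-in codecs. PLC's real
 │  scoring is not documented here, so treat the names and numbers as
 │  illustrative.
 │

```python
POLISH = "ąćęłńóśźżĄĆĘŁŃÓŚŹŻ"
CANDIDATES = {"LAT": "cp852", "ISO": "iso8859_2", "WIN": "cp1250"}

def fail_points(data: bytes) -> dict:
    """Count, per standard, the high bytes that are not its Polish letters."""
    points = {}
    for code, codec in CANDIDATES.items():
        letters = set(POLISH.encode(codec))  # a set: order does not matter
        points[code] = sum(1 for b in data if b >= 0x80 and b not in letters)
    return points

def guess_by_fail_points(data: bytes) -> str:
    points = fail_points(data)
    return min(points, key=points.get)

print(fail_points("ZAŻÓŁĆ GĘŚLĄ JAŹŃ".encode("cp852")))
```

 │    Because each standard is reduced to a set of byte values, two
 │  standards that use the same bytes in a different order always get the
 │  same number of points.
 │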
 │    Note  that  you  can  use  both methods with '/3' option. Result is a
 │  simple average of 1st and 2nd method results.
 │
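 │    Averaging the two results might look like the sketch below. It
 │  assumes both methods report a percentage per standard on the same
 │  0-100 scale, which the text does not confirm; the numbers and the
 │  name combine are invented for the example.
 │

```python
def combine(first: dict, second: dict) -> dict:
    """Simple average of two per-standard results on the same scale."""
    return {code: (first[code] + second[code]) / 2 for code in first}

# Hypothetical results of the two analysers for one file.
method_1 = {"MAZ": 80.0, "LAT": 20.0}
method_2 = {"MAZ": 60.0, "LAT": 40.0}

averaged = combine(method_1, method_2)
best = max(averaged, key=averaged.get)
print(best, averaged[best])
```
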
 └─────────────┐
     BENCHMARK │
 ┌─────────────┘
 │
 │    On  my  AMD  K5  100  MHz  it  takes 02:15 (more than two minutes) to
 │  convert  an  18-megabyte  file  (from hard disk to nul device) using
 │  both  standard  analysers  (/3  option).  Converting the same file with
 │  source standard specified (/S: option) takes about 5 seconds - actually
 │  it is FASTER than "copy file.txt nul". Pretty good, huh?
 │
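 │    Put as throughput, these figures work out as below. The 5 seconds
 │  is "about 5" in the text, so the results are rough: around 0.13 MB/s
 │  with both analysers against 3.6 MB/s with a forced source standard,
 │  a ratio of about 27.
 │

```python
SIZE_MB = 18                 # size of the test file
analysed_s = 2 * 60 + 15     # 02:15 with both analysers (/3)
forced_s = 5                 # "about 5 seconds" with /S:

analysed_rate = SIZE_MB / analysed_s
forced_rate = SIZE_MB / forced_s
speedup = analysed_s / forced_s

print(f"{analysed_rate:.2f} MB/s with /3")
print(f"{forced_rate:.1f} MB/s with /S:")
print(f"about {speedup:.0f} times faster")
```
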
 └────────────┐
     SURPRISE │
 ┌────────────┘
 │
 │    Sometimes you may get a message:
 │  FILE.EXT exist. [O]verwrite [S]kip [R]ename [Q]uit
 │    O - overwrite existing file with new one
 │    S - don't try to convert the file
 │    Q - quit immediately
 │    R - choose a name for EXISTING file, NOT for file that will be saved.
 │
 └───────────┐
     HISTORY │
 ┌───────────┘
 │
 │  1.0 ─ First Release, 8 standards
 │  1.1 ─ Auto-recognizing, /O and /D switches added
 │  1.3 ─ Bugfixed, 14 standards
 │  1.5 ─ Bugfixed, 18 standards [no release]
 │  2.0 ─ New command line (easier/faster to use), 26 standards
 │  3.0 ─ Auto-recognizing improved, 29 standards
 │  3.1 - 30th standard added [no release]
 │  3.5 ─ Wildcards support added, /S switch added
 │  3.6 - Minor bug fixed, minor display change, 31 standards
 │  3.7 - /A switch added, /T: switch bug fixed, 32 standards
 │  3.8 - fixed bug added in 3.7 :), fixed this text, 33 standards
 │  3.9 - fixed another bug (sorry...), auto-recognizing proofed
 │  4.0 - Percentage show fixed, auto-recognizing proofed again
 │  4.1 - Small fixes, extended proofing used [no release]
 │  4.2 - "I'm alive" indicator fixed to time dependent [no release]
 │  4.3 - 34th standard added [no release]
 │  4.4 - Some floating point code fixed [no release]
 │  4.5 - Some memory optimizations [no release]
 │  5.0 - Alternative method of recognizing implemented (/2 and /3)
 │  5.1 - Code fixed to work around Borland's CRT bug [no release]
 │  5.2 - Converting speed up (using xlation tables) [no release]
 │  5.3 - /R and /Q switches added, documentation extended
 │
 └────────────┐
     PROBLEMS │
 ┌────────────┘
 │
 │    Known problems you may have using PLC:
 │
 │    ¨  When  using  wildcards  no  more than 5041 files will be converted
 │  (this  is because names of files are stored in an array of size limited
 │  to  64K).  In most cases those 5041 should be enough but if not you can
 │  always use DOS' "for" command (I'm not really sure if it'd help...)
 │
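 │    The figure 5041 is consistent with 13 bytes per stored name. That
 │  entry size is an assumption (a 12-character 8.3 name such as
 │  FILENAME.EXT plus one extra byte); the text only gives the 64K limit.
 │

```python
ARRAY_LIMIT = 64 * 1024   # "64K" array size limit from the text
NAME_BYTES = 12 + 1       # assumed: 8.3 file name plus one extra byte

max_files = ARRAY_LIMIT // NAME_BYTES
print(max_files)
```
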
 └────────────────┐
     FUTURE PLANS │
 ┌────────────────┘
 │
 │    ¨ Rewrite whole PLC into c++ (watcom/gcc)
 │    ¨ Third version of standard analyser (I have one pretty good idea but
 │  it's really sophisticated...)
 │    ¨ Unix/linux/vms/aix port
 │    ¨ Unknown standard converter (with this thingy PLC will convert file
 │  in a standard that doesn't match any of known standards)
 │    ¨ windoze port (?)
 │    ¨ source code release (?)
 │
 └────────────────────┐
     BLA, BLA, BLA... │
 ┌────────────────────┘
 │
 │    Feel  free  to  ask  me  any  questions.  Wishes  and bug-reports are
 │  welcome. You can use email ([email protected]) or meet me (nickname
 │  "dagoon")  on  #PolishScene  or  #Trax  (EFNet IRC). I need info on
 │  unsupported standards (more, more, MORE!!! ;)
 │
 │    I'd like to thank the following people for their support and help:
 │    ¨ Pawel "KrawietZ" Krawczyk - author of ConvPL
 │    ¨ Maciej Haudek - author of Witaj (standards info)
 │    ¨ Artur Pietruk (docs)
 │    ¨ Artur Olech (Borland's CRT bug information & additional advices)
 │
 │    You  can  find the latest version of PLC in SAC archive or in Cryogen's
 │  distro sites (see cryogen.nfo)
 │
 │    If  you  want some theory on standards converting and recognizing (in
 │  Polish) - try these articles:
 │    ¨ Wladyslaw Majewski "Z komputerem po polsku", Komputer 10/87
 │    ¨ Marcin Borkowski "Polskie litery", Bajtek 2/91
 │    ¨ Grzegorz Eider "Nieco porzadku", Enter 9/91
 │    ¨ Stanislaw  Weslawski  "Problemy rozpoznawania i konwersji polskich
 │  znakow", Magazyn Amiga 1/97
 │
 │    Lots  of  material  for all hardware platforms and systems is available
 │  at http://sunsite.icm.edu.pl/ogonki
 │
 └───────┐
     TIP │
 ┌───────┘
 │
 │    To get a Polish version of M$ Windows type: "PLC win.com WIN /O"
 │
 └───────┐
     SAC │
 ┌───────┘
 │
 │    You  can  download the latest versions of PLC (and other utilities made
 │  by me) from Slovak Antivirus Center FTP sites (PLC is in the /UTILTEXT/
 │  subdirectory). Here is the complete list of SAC mirrors:
 │
 │    Poland          ftp.pwr.wroc.pl/pub/pc/sac
 │    Czech Republic  ftp.vse.cz/pub/mirror/ftp.elf.stuba.sk/pc
 │    Germany         ftp.cs.tu-berlin.de/pub/msdos/mirrors/stuba/pc
 │    Hungary         ftp.bke.hu/pub/mirrors/sac
 │    Italy           cert.unisa.it/pub/PC/SAC
 │    Italy           ftp2.itb.it/pub/PC/SAC
 │    Slovakia        ftp.netlab.sk/pub/sac
 │    Slovakia        ftp.sac.sk/pub/sac
 │    Slovakia        ftp.gratex.sk/sac
 │    Slovakia        ftp.uakom.sk/pub/mirrors/sac
 │    Taiwan          ftp.nsysu.edu.tw/PC/SAC
 │    U.S.A.          ftp.cdrom.com/pub/sac
 │
 └───────────┐
     CONTACT │
 ┌───────────┘
 │
 │  Marcin Gryszkalis aka Dagoon of Cryogen
 │  ul.xxxxxxxxxxxxx xx m.xx
 │  xx-xxx Lodz
 │  Poland
 │
 │  email: [email protected]
 │
 │  phone: (0-48-42) xxx-xx-xx (CET)
 │
 │  WWW: http://rs.math.uni.lodz.pl/~dagoon
 │                                                                          │
 └──────────────────────────────────────────────────────────────────────────┘
