ANN: Strings edit v3.10

The library provides string handling facilities such as I/O formatting and support for Unicode and obsolete code pages.

String processing for Ada

This update provides full Unicode normalization support. Unicode normalization is intended to equalize glyphs that look the same so that they can be compared. In particular, normalization applies to diacritic marks like ü = u + ◌̈, ligatures like ﬁ = f + i, symbols like the ohm sign Ω = Greek capital letter Ω, subscripts, and superscripts. However, normalization does not equate look-alike Latin and Cyrillic letters, nor does it apply to ligatures like the German ß.
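As an illustration of the cases listed above, here is a minimal sketch using Python's standard `unicodedata` module. It is not the Strings_Edit API; it only shows what the standard normalization forms do to the same examples, and the variable names are made up for this sketch.

```python
import unicodedata

# Diacritic mark: precomposed "ü" versus "u" + combining diaeresis (U+0308).
composed = "\u00fc"
decomposed = "u\u0308"
same_before = composed == decomposed                      # differ as raw code points
same_after_nfc = unicodedata.normalize("NFC", decomposed) == composed
same_after_nfd = unicodedata.normalize("NFD", composed) == decomposed

# Symbol: the ohm sign (U+2126) canonically maps to Greek capital omega (U+03A9).
ohm_is_omega = unicodedata.normalize("NFC", "\u2126") == "\u03a9"

# Ligature and superscript: only the compatibility forms (NFKC, NFKD) unfold these.
ligature_nfc = unicodedata.normalize("NFC", "\ufb01")     # stays the single ligature
ligature_nfkc = unicodedata.normalize("NFKC", "\ufb01")   # becomes "f" + "i"
superscript_nfkc = unicodedata.normalize("NFKC", "\u00b2")  # superscript two -> "2"

# Not affected: German sharp s, and Latin "a" versus Cyrillic "а" (U+0430).
sharp_s_nfkc = unicodedata.normalize("NFKC", "\u00df")
latin_equals_cyrillic = (
    unicodedata.normalize("NFKC", "a") == unicodedata.normalize("NFKC", "\u0430")
)
```

The canonical forms (NFC, NFD) handle the diacritic and the ohm sign; the ligature and the superscript need a compatibility form.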

Changes since the previous version:

  • Function Compare to compare arrays of code points was added to the package Strings_Edit.UTF8;
  • Function Image to convert an array of code points to a UTF-8 string was added to the package Strings_Edit.UTF8;
  • Function Compare to compare arrays of code points using a code points mapping was added to the package Strings_Edit.UTF8.Maps;
  • The package Strings_Edit.UTF8.Normalization was added to provide Unicode decompositions (NFD and NFKD), composition, normalization (NFC, NFKC) as well as comparisons of normalized strings. The canonical composition rules are respected;
  • The application Strings_Edit.UTF8.Normalization_Generator was added to support updates of the UnicodeData.txt database;
  • The test case test_utf8 was added.
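The idea behind comparing arrays of code points, with and without normalization, can be sketched in Python as follows. The function names `image`, `compare` and `compare_normalized` are hypothetical stand-ins chosen for this sketch. They are not the signatures of the Ada subprograms listed above, and the -1/0/1 result convention is an assumption.

```python
import unicodedata

def image(code_points):
    """Encode a sequence of code points as UTF-8 bytes."""
    return "".join(chr(cp) for cp in code_points).encode("utf-8")

def compare(left, right):
    """Three-way lexicographic comparison: -1, 0 or 1 (assumed convention)."""
    left, right = list(left), list(right)
    return (left > right) - (left < right)

def compare_normalized(left, right, form="NFD"):
    """Compare two code point sequences after normalizing both to `form`."""
    def normalize(code_points):
        text = "".join(chr(cp) for cp in code_points)
        return [ord(ch) for ch in unicodedata.normalize(form, text)]
    return compare(normalize(left), normalize(right))

u_umlaut = [0x00FC]            # precomposed ü
u_plus_mark = [0x0075, 0x0308]  # u + combining diaeresis
fi_ligature = [0xFB01]
f_and_i = [0x0066, 0x0069]
```

For example, `compare(u_umlaut, u_plus_mark)` reports a difference, while `compare_normalized(u_umlaut, u_plus_mark)` treats the two as equal. The ligature pair compares equal only under a compatibility form such as `"NFKD"`.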

Thanks @dmitry-kazakov, your work is awesome as always and much appreciated!

Could it be a useful idea to make type-setting functions with LaTeX style/formatted strings as input?

IMO parsing LaTeX is very simple. You can always determine where the current token ends, and there are no precedence rules in play. The first difficulty arises when you must match the longest alternative from a table; another is infix expressions.

Of course, you need some lookup context with tables containing the names of visible macros, and you must reparse macro expansions.

Maybe you meant something specific?

I find LaTeX typesetting syntax useful for organising text output under OpenGL (GLOBE_3D). I have only made some simple attempts. It would be great to use LaTeX syntax for terminal output. One reason is that I have LaTeX at my fingertips :-)

All LaTeX syntax is \xyz{}, repeat. There is basically nothing to parse. Or do you mean the reverse action, rendering LaTeX?
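To make the "\xyz{}, repeat" point concrete, here is a minimal tokenizer sketch in Python. It is only an illustration of how little lookahead is needed, under simplifying assumptions: no macro expansion, no comments, no math mode, and spaces after a command name are kept as text. The token names are made up for this sketch.

```python
import re

# One alternative per token kind; the named group that matched gives the kind.
TOKEN = re.compile(
    r"\\(?P<command>[A-Za-z]+)"   # control word: backslash followed by letters
    r"|\\(?P<symbol>.)"           # control symbol: backslash followed by one character
    r"|(?P<open>\{)"              # start of a group
    r"|(?P<close>\})"             # end of a group
    r"|(?P<text>[^\\{}]+)",       # plain text up to the next special character
    re.DOTALL,
)

def tokenize(source):
    """Split LaTeX-like input into (kind, value) pairs."""
    tokens = []
    position = 0
    while position < len(source):
        match = TOKEN.match(source, position)
        if match is None:
            # The only input no alternative covers is a lone trailing backslash.
            raise ValueError(f"unexpected input at offset {position}")
        kind = match.lastgroup
        tokens.append((kind, match.group(kind)))
        position = match.end()
    return tokens

sample = tokenize(r"\emph{hi} there")
```

Each token's end is decided by looking at the current character only, which is the property described above. Everything harder (macro tables, reparsing expansions) would sit on top of this layer.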

For example, I would like to emphasise some words with special fonts or colours inside a text. Math typeset output would of course be nice :-)

Yes, of course. I use it in a similar case for HTML output.