xindy

A Flexible Indexing System

Re: Announce: Alpha versions available


> > I've just put alpha versions of the 2.0 release at
> >
> >
> It seems that isolatin1[ms]-tex.xdy lack entries. At least the acute-
> and grave- accented characters are missing, e.g.
>   (merge-rule "\'e" "é" :string :again)

Yup, there was one grep rule too many. Fixed it.

> Which original file did you use to auto-generate the file?


> Is there some file that describes the differences between 1.2.3
> and 2.0?

Not yet. The only visible change is that SORT-RULE now accepts the
option :RUN <phase-num>. In other words, 2.0 contains the
implementation of the sort phases as described in my memo about the
new sorting scheme.

For an example see the file


which is now part of the testsuite that comes with the distribution.
It contains some of the French sort-rules as outlined in my memo.

Here is part of the file:

;; This style-file tests the mapping scheme with several runs. We use
;; some examples in French and German to test these new features.

[stuff deleted]

;; The sorting rules in our example are a mixture of German and French
;; sorting rules.

(define-sort-rule-orientations (forward forward backward))

;; RUN 0

;; Case-insensitive run 0
(sort-rule "A" "a" :run 0)
(sort-rule "R" "r" :run 0)
(sort-rule "M" "m" :run 0)

;; Ignore accents in first run
(sort-rule "é" "e" :run 0)
(sort-rule "ô" "o" :run 0)

;; RUN 1

;; The previous rules leave the keyword groups
;; {arm,Arm,ARM}, {cote,côte,côté,coté}.

;; The next run decides that uppercase follows lowercase. This is done
;; by the following rules. We have only defined the rules necessary in
;; our concrete example.

(sort-rule "a" "a0" :run 1)
(sort-rule "A" "a1" :run 1)
(sort-rule "r" "r0" :run 1)
(sort-rule "R" "r1" :run 1)
(sort-rule "m" "m0" :run 1)
(sort-rule "M" "m1" :run 1)

;; Ignore accents in this run, too.
(sort-rule "é" "e" :run 1)
(sort-rule "ô" "o" :run 1)

;; RUN 2

;; The previous rules leave the keyword group {cote,côte,côté,coté}.
;; The other group is now sorted. This run must now sort backwards,
;; due to the French sorting rules.

;; Now define an order on the accents. Since the comparison now is
;; from right to left (backwards) the tokens must define their order
;; number *before* the character.

(sort-rule "e" "0e" :run 2)
(sort-rule "é" "1e" :run 2)
(sort-rule "o" "0o" :run 2)
(sort-rule "ô" "1o" :run 2)
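To make the effect of the three runs concrete, here is a minimal
Python sketch that models them. It is a simplified model for
illustration only, not xindy's implementation: it assumes each rule
replaces a single precomposed character, that a backward run simply
compares the mapped string reversed, and that the runs are compared
in order as a tuple. The names ORIENTATIONS, RULES, map_key and
sort_key are hypothetical.

```python
# Simplified model of the three runs in the style file above.
# Assumption: rules act on single characters; unmapped characters
# pass through unchanged.
ORIENTATIONS = ["forward", "forward", "backward"]

RULES = {
    0: {"A": "a", "R": "r", "M": "m", "é": "e", "ô": "o"},
    1: {"a": "a0", "A": "a1", "r": "r0", "R": "r1",
        "m": "m0", "M": "m1", "é": "e", "ô": "o"},
    2: {"e": "0e", "é": "1e", "o": "0o", "ô": "1o"},
}


def map_key(word, rules):
    """Apply one run's rules character by character."""
    return "".join(rules.get(ch, ch) for ch in word)


def sort_key(word):
    """Build one mapped key per run; reverse it for backward runs."""
    parts = []
    for run, orientation in enumerate(ORIENTATIONS):
        mapped = map_key(word, RULES[run])
        if orientation == "backward":
            mapped = mapped[::-1]
        parts.append(mapped)
    return tuple(parts)


words = ["côté", "ARM", "cote", "Arm", "coté", "arm", "côte"]
print(sorted(words, key=sort_key))
# ['arm', 'Arm', 'ARM', 'cote', 'côte', 'coté', 'côté']
```

In this model run 0 only separates the arm group from the cote group,
run 1 orders the arm group by case, and run 2 orders the cote group by
accents, compared from the right.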

If we look at this example, we see that we need a mechanism to reuse
rule sets in different runs. For example, the upper/lowercase mappings
appear twice, in runs 0 and 1. The question is whether we need a
mechanism to group rules without enabling them immediately, such as

  (define-rule-set "upper-to-lowercase"
        (sort-rule "A" "a")
        (sort-rule "R" "r")
        (sort-rule "M" "m"))

  (define-rule-set "ignore-accents"
        (sort-rule "ô" "o")
        (sort-rule "é" "e"))

that can be instantiated in different runs, e.g.,

  (use-rule-set "upper-to-lowercase" :run 0)
  (use-rule-set "ignore-accents"     :run 0)

  (use-rule-set "upper-to-lowercase" :run 1)

or something similar.
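The idea of defining rules without enabling them, and instantiating
them per run later, can be sketched in a few lines of Python. This
only illustrates the proposal; the functions define_rule_set and
use_rule_set and the tables RULE_SETS and RUNS are hypothetical and
mirror the proposed syntax, not any existing xindy command.

```python
from collections import defaultdict

RULE_SETS = {}             # name -> list of (pattern, replacement)
RUNS = defaultdict(list)   # run number -> rules enabled in that run


def define_rule_set(name, rules):
    """Store rules under a name without enabling them in any run."""
    RULE_SETS[name] = list(rules)


def use_rule_set(name, run):
    """Enable the named rules in the given run."""
    RUNS[run].extend(RULE_SETS[name])


define_rule_set("upper-to-lowercase", [("A", "a"), ("R", "r"), ("M", "m")])
define_rule_set("ignore-accents", [("ô", "o"), ("é", "e")])

use_rule_set("upper-to-lowercase", run=0)
use_rule_set("ignore-accents", run=0)
use_rule_set("upper-to-lowercase", run=1)

for run in sorted(RUNS):
    print(run, RUNS[run])
```

Defining a set has no effect on any run until use_rule_set is called,
so the same set can be reused in as many runs as needed.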

This has several advantages:

 1. There is much less to write.

 2. We can offer many of the mappings that appear in practice as
    prepared style files, ready to be applied at any level, so
    casual users are not burdened with learning all the details of
    a full specification.

 3. This would open the way for a nice frontend where we can ask
    users how to sort their stuff, based on the predefined rule sets.
    In my experience this would cover most typical users.

Coming back to the isolatin mappings discussed earlier: the isolatin
style files should simply define these rule sets (at least the
sort-rules) and allow them to be applied later at any phase desired.
Currently, they are only applicable at run 0, which is not
sufficient, in my view.

What do you think?


Roger Kehr
Computer Science Department         Darmstadt University of Technology