Issue 122337 - Spanish: line break before Mdash (should be after mdash).
Summary: Spanish: line break before Mdash (should be after mdash).
Status: CONFIRMED
Alias: None
Product: Internationalization
Classification: Code
Component: ui
Version: 3.3.0 or older (OOo)
Hardware: All All
Importance: P3 Normal
Target Milestone: ---
Assignee: AOO issues mailing list
QA Contact:
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-05-18 12:06 UTC by arcalaus
Modified: 2015-10-29 21:39 UTC

See Also:
Issue Type: DEFECT
Latest Confirmation in: 4.1.0-dev
Developer Difficulty: ---


Description arcalaus 2013-05-18 12:06:09 UTC
In Spanish, lines should NOT break after m-dash (Alt+0151 in Windows, Unicode U+2014, "—"), since it is a parenthetical character just as quotes or brackets are. An n-dash could be used instead to work around this issue, but the official Spanish spelling rules say that the m-dash, not the n-dash, should be used.


In Spanish, m-dash is used:
-At the beginning of paragraphs, to mark a literal quote in dialogues.
-Inside paragraphs, as a substitute for brackets, or to break a dialogue quote.

This is a sample of m-dash usage in Spanish (m-dash represented as "---"):

   ---Blah, blah, blah ---comment about speaker--- blah, blah...

Notice spaces are used OUTSIDE the m-dash-encircled comment, just as they are used outside quotes. This is the main difference in m-dash usage between Spanish and French.

The example line should break this way:
   ---Blah, blah, blah ---comment about speaker---
blah, blah, blah.

The example line should NOT break this way:
   ---Blah, blah, blah ---comment about speaker
---blah, blah, blah.
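
The difference between the two behaviours can be sketched in a few lines of Python. This is only an illustration, not OpenOffice code: `break_offsets`, `wrap` and the `dash_breaks` flag are made-up names, and the "default" rule here simply assumes that a break is offered directly before or after an m-dash even when there is no space, which is what produces the unwanted break.

```python
EM_DASH = "\u2014"

def break_offsets(text, dash_breaks=True):
    """Return the offsets at which a new line is allowed to start."""
    offsets = []
    for i in range(1, len(text)):
        prev, cur = text[i - 1], text[i]
        if cur == " ":
            continue                    # never start a line with a space
        if prev == " ":
            offsets.append(i)           # ordinary break after a space
        elif dash_breaks and EM_DASH in (prev, cur):
            offsets.append(i)           # assumed default: break next to a dash
    return offsets

def wrap(text, width, dash_breaks=True):
    """Greedy wrap: each line takes the longest allowed prefix that fits."""
    lines, start = [], 0
    candidates = break_offsets(text, dash_breaks)
    while len(text) - start > width:
        fitting = [o for o in candidates
                   if o > start and len(text[start:o].rstrip()) <= width]
        if not fitting:
            break                       # no legal break: leave the line overlong
        cut = fitting[-1]
        lines.append(text[start:cut].rstrip())
        start = cut
    lines.append(text[start:])
    return lines

SAMPLE = f"{EM_DASH}Blah, blah, blah {EM_DASH}comment about speaker{EM_DASH} blah, blah..."

for flag in (True, False):
    print(flag, wrap(SAMPLE, 40, dash_breaks=flag))
```

With a width of 40, the first call leaves the closing m-dash alone at the start of the second line, while the second call keeps it attached to "speaker", because only spaces count as break points.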

Please note that MS Word has had the same issue for years (and still has it), so if you solve this, you would be a step ahead of MS Word.

(A user- or locale-selectable configuration of breaking and non-breaking characters would also be useful.)
Comment 1 arcalaus 2013-05-18 12:08:54 UTC
I marked it as a defect because it makes OpenOffice generate incorrectly typeset documents. I understand that 99% of Spanish people don't know how to typeset or how to spell, but I'm a Spanish teacher and I'm interested in generating well-typeset documents for my pupils.
Comment 2 Rob Weir 2013-05-24 00:52:07 UTC
Thanks for the clear and detailed defect report!

-Rob
Comment 3 arcalaus 2014-03-05 15:08:34 UTC
This defect is still present in 4.0.1.
I will try to download 4.1.0 to test it.
Is anyone working on it?

The solution should be something similar to the "insert non-breaking spaces next to some French punctuation marks" option (with a small difference: we want no spaces).
Comment 4 arcalaus 2014-03-05 17:26:48 UTC
Please excuse me for bugging you once more.

I don't have enough hard disk space or time to compile OpenOffice.

But could someone check whether this issue could be solved by adding a
"<LineBreakHangingCharacters>[utf-code-for-emdash]</LineBreakHangingCharacters>" to *_ES.xml in the main\i18npool\source\localedata\data folder?

Another way could be to classify it as a "double quote" when the locale is Spanish (ES). But I can't find where the character classification routine is (I found many references to it in the i18n folder, but I found no real classification data for any language).
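
To make the second idea concrete, here is a minimal Python sketch of a locale-dependent classification. It is not taken from the OpenOffice sources: `line_break_class`, `DEFAULT_CLASS` and `LOCALE_OVERRIDES` are hypothetical names, and the table is a tiny placeholder. The class names follow Unicode UAX #14, where U+2014 is B2 (break opportunity before and after) and quotation marks are QU.

```python
# Tiny excerpt of default line-break classes (names from Unicode UAX #14).
DEFAULT_CLASS = {
    "\u2014": "B2",   # EM DASH: break opportunity before and after
    '"': "QU",        # quotation mark
    "\u00ab": "QU",   # left-pointing double angle quotation mark
    "\u00bb": "QU",   # right-pointing double angle quotation mark
    " ": "SP",        # space
}

# Hypothetical per-language overrides: Spanish treats the m-dash like a quote.
LOCALE_OVERRIDES = {
    "es": {"\u2014": "QU"},
}

def line_break_class(ch, locale):
    """Return the line-break class of ch, honouring per-language overrides."""
    lang = locale.replace("_", "-").split("-")[0].lower()
    override = LOCALE_OVERRIDES.get(lang, {})
    # "AL" (ordinary alphabetic) is a simplified fallback for everything else.
    return override.get(ch, DEFAULT_CLASS.get(ch, "AL"))

print(line_break_class("\u2014", "es_ES"))
print(line_break_class("\u2014", "en-US"))
```

A break algorithm that consults such a function would then apply its existing quotation-mark rules to the m-dash for Spanish text only, leaving other locales unchanged.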
Comment 5 Ariel Constenla-Haile 2014-03-06 00:32:56 UTC
(In reply to arcalaus from comment #3)
> This defect is still present in 4.0.1.
> I will try to download 4.1.0 to test it.

It will be pointless to test this bug with the Beta, because:

> Is anyone working on it?

nobody is working on this bug.
Comment 6 Ariel Constenla-Haile 2014-03-06 02:00:55 UTC
(In reply to arcalaus from comment #4)
> Please excuse me for bugging you once more.
> 
> I don't have enough harddisk space or time to compile OpenOffice. 
> 
> But could someone check whether this issue could be solved by adding a
> "<LineBreakHangingCharacters>[utf-code-for-emdash]</LineBreakHangingCharacters>"
> to *_ES.xml in the main\i18npool\source\localedata\data folder?
> 
> Another way could be to classify it as a "double quote" when the locale is
> Spanish (ES). But I can't find where the character classification routine is
> (I found many references to it in the i18n folder, but I found no real
> classification data for any language).

If I understood bug 92577 comment 12 correctly, LineBreakHangingCharacters was introduced only in the locale data files and was never implemented at the source code level, because doing so would introduce incompatible API changes, and those are allowed only in major versions.