Apache OpenOffice (AOO) Bugzilla – Issue 93870
let spellcheck filter rarely-used words
Last modified: 2014-02-24 18:01:03 UTC
One barrier to providing a long dictionary is that rare words are often similar in spelling to misspellings of common words. Thus, a misspelling can be accepted by OOo because a rare word with a different meaning matches it. Words that are correct but rarely appear should be approved only after the user is reminded of their rarity, so that the user can reconsider the usage or check for a misspelling. This includes spelling variants, technical terminology, foreign borrowings, words used in one field of scholarship but not elsewhere, and ancient language being quoted. A workaround already exists: create or install multiple dictionaries and load sometimes all of them, sometimes only the main one. One supplemental dictionary would list rarities only. But it would be easier to simply click a button in the Spellcheck dialog to approve rarities of one or more categories. In short, if floccinaucinihilipilification were in a document and in a Spellcheck word list, running Spellcheck would show the word as challenged, but instead of describing it as wrong or not in the dictionary it would describe it as being in the dictionary but rare. This rarity check could be optional, with radio buttons to treat all rarities as always wrong, as always right, or as rarities to be considered at every appearance. I'm using OOo Writer 2.4.0 without Java Runtime Environment on Linux Fedora Core 4 with Gnome 2.10.0 desktop on a Pentium 4 laptop. I didn't see this feature. Thank you. -- Nick
There have been discussions about similar spelling of rare words and common words for the French dictionary which I maintain. Removing rare words suggests that these words have the same spelling as common words. And people usually don't know how to write rare ones. Rare words, in French, don't interfere so often with common-word misspellings, but common words often appear similar to common-word misspellings. So the problem is not so much about rarity. Imho, it is the job of a grammar checker to analyse sentences and indicate where there is a possible confusion. And this is already what LanguageTool does. We can write rules to prevent confusion between two correct words. I don't know what has been done for English, but rules about this problem have been created for French. http://www.languagetool.org/ http://community.languagetool.org/
Reassigned to SBA
In English, grammar checkers are horrible. My main experience with them is with the one in Microsoft Word. It's my understanding that few people install standalone grammar checkers. I think only once have I gotten useful advice from any (I think it was Word's). I could try tweaking them for register and particulars, but I'd have to have a bunch of -- not just rules -- whole rule sets to switch between according to purpose and, in my experience, what rule applies is a judgment call. So I'm going to differ often even from a computer I've carefully tweaked for my own use. Example: something may be correct but likely to be misperceived by the reader for whom I'm writing. Writing these rules is not a project to be completed in a week or two of hard work. It's enormously complicated. What would you do with the clause "Believe you me"? Verb-subject-object is a very unusual ordering in most languages, but this one is idiomatically correct. When would you forbid splitting an infinitive? Winston Churchill is said to have answered a critique that he had ended a sentence with a preposition, "There are some things up with which I will not put.", which I take as allowing that sometimes. The sentence adverb was considered wrong until it was eventually accepted. French has l'Académie française as a prescriptive institutional authority on the language; English has nothing of the sort. Spelling alone: If rare words interfering with common-word misspellings is not a problem in some languages, the feature can be disabled (dimmed) for those languages, to avoid making a user suffer feature overload. But, in my experience with English, the problem is with some parts of English, some writing assignments, some kinds of training, and the like. I often add words to my Lotus spellchecker dictionaries (I use Word Pro on my Win95a platform), including rarities. As a result, misspellings probably escape detection.
People who never use rare words may never encounter that problem, but if they do add them to their user dictionary the nondetection problem grows, and is itself hard to detect. You don't know that you made a mistake, and the very purpose of a spellcheck is eroded. An option is not to add rarities, but then I get a ton of alleged misspellings marking up my document. A better option is to have 2 different graphical signals that a word might be problematic: one if definitely misspelled and the other if the spelling is questionable because it matches a rarity. -- Nick
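The two-signal idea above, combined with the three radio-button choices from the original request, could be sketched roughly as follows. This is only an illustration of the requested behaviour, not how OOo's spellchecker works: the names Verdict, RarePolicy and check_word, and the tiny COMMON and RARE word sets, are all hypothetical.

```python
from enum import Enum


class Verdict(Enum):
    OK = "ok"                  # no marking
    RARE = "rare"              # second signal: in the dictionary but rare
    MISSPELLED = "misspelled"  # first signal: not in any dictionary


class RarePolicy(Enum):
    ALWAYS_WRONG = "always wrong"
    ALWAYS_RIGHT = "always right"
    ASK_EACH_TIME = "ask at every appearance"


# Hypothetical sample word lists; a real checker would load dictionaries.
COMMON = {"the", "form", "from", "word"}
RARE = {"floccinaucinihilipilification", "forme"}


def check_word(word, common, rare, policy=RarePolicy.ASK_EACH_TIME):
    """Classify one word against a common list and a separate rarity list."""
    key = word.lower()
    if key in common:
        return Verdict.OK
    if key in rare:
        if policy is RarePolicy.ALWAYS_RIGHT:
            return Verdict.OK
        if policy is RarePolicy.ALWAYS_WRONG:
            return Verdict.MISSPELLED
        return Verdict.RARE
    return Verdict.MISSPELLED
```

With the default policy, check_word("forme", COMMON, RARE) gives Verdict.RARE, so a likely typo for "form" is still brought to the user's attention instead of being silently accepted, while a word in neither list is reported as misspelled.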
This is not an enhancement but a full-blown feature (request) => Change issue type. Bringing "statistic values" into the spell check proposal list, with the long-term goal that the software ALWAYS knows what you wanted to write when you misspelled a word. Ending in something like a self-adjusting autocorrection? Note that "rarely used" might be measured via "internet statistic", but this hardly ever reflects the individual view of "I want this proposal when I mis-type that word, thus I want THAT proposal and no other". After all, whether to use or to avoid single words is influenced by the subject and by personal skill, taste and direction when choosing words. Change component to lingucomponent.