Kinook Software Forum

  #1  
Old 12-20-2008, 10:38 AM
pereh
Registered User
 
Join Date: 12-01-2008
Posts: 9
German language / Umlaute

Hello,

I am new to UR and just taking my first steps, and I have a question that I could not find an answer to in the documentation or the forum.
Most of the documents I use are in German, so they contain a lot of special characters like Ä, ä, Ö, etc. (Umlaute). Not only are these characters missing from the normal text display of the items, but words containing them are also ignored when the automatic keywords are built. As a result, searching can become pointless.
Is there a way to tell UR to handle these characters like normal ones?

Thanks for your help,
Peter.
  #2  
Old 12-21-2008, 02:43 AM
hartmut
Registered User
 
Join Date: 06-12-2005
Posts: 70
I have now tested this and found that there is no problem with German umlauts in item titles or in plain-text, HTM, or Word documents imported into Ultra Recall.
In PDF files the umlauts are displayed but cannot be searched, not even with wildcards like "?" and "*".
Searching within the PDF document using the PDF reader's own search function works well.
In the Item Keyword window, the umlauts are missing from every keyword that should contain one, but as far as I have seen this happens only with PDF documents and not with the other document types mentioned above.
For example, "Küche" is found with the search value "Küche" in all documents except PDFs. In a PDF it is found with "Kche" but not with "K?che".

Hartmut
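
Editorial sketch: the behavior reported above, where "Küche" ends up in the PDF keyword list as "Kche", can be imitated in a few lines of Python to build a fallback search term. This is not Ultra Recall's code and says nothing about how the program works internally; `strip_non_ascii` and `search_variants` are hypothetical helper names, and the sketch assumes that every non-ASCII character is simply dropped, which is what the "Kche" example suggests.

```python
from typing import List


def strip_non_ascii(term: str) -> str:
    """Drop every non-ASCII character, imitating the observed keyword list."""
    return "".join(ch for ch in term if ord(ch) < 128)


def search_variants(term: str) -> List[str]:
    """Return the original term plus its stripped form, if the two differ."""
    variants = [term]
    stripped = strip_non_ascii(term)
    if stripped != term:
        variants.append(stripped)
    return variants


# search_variants("Küche") returns ["Küche", "Kche"]
# search_variants("Tisch") returns ["Tisch"]
```

Searching for both variants would cover documents that were keyworded correctly as well as PDFs whose keywords lost their umlauts.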
  #3  
Old 12-22-2008, 08:37 AM
pereh
Registered User
 
Join Date: 12-01-2008
Posts: 9
Hello Hartmut,

thanks for your reply. You are right; I will have to use the search without Umlaute.
  #4  
Old 12-22-2008, 09:11 AM
kinook
Administrator
 
Join Date: 03-06-2001
Location: Colorado
Posts: 6,003
There did seem to be a problem with capturing accented Latin characters when keywording PDF documents. The main download at http://www.kinook.com/Download/UltraRecallProEval.exe has been updated with a fix for this problem (UltraRecall.exe 3.5.3.1 in Help | About | Install Info after installing). You will need to re-import or synchronize (Item | Synchronize on the menu) PDF documents after installing to re-keyword.
  #5  
Old 12-22-2008, 09:33 AM
hartmut
Registered User
 
Join Date: 06-12-2005
Posts: 70
Thank you for your prompt attention.

Hartmut
  #6  
Old 12-22-2008, 11:05 AM
pereh
Registered User
 
Join Date: 12-01-2008
Posts: 9
Hello to all at Kinook,

this was in fact the fastest answer I ever got for any problem I ever had with any kind of software! Great!
I installed the new version, and it is shown in 'About | Install Info', but now, instead of being left out, these special characters are replaced with even stranger ones:
"AbkŸrzungen" instead of 'Abkürzungen', "PortrŠt" instead of 'Porträt', "der Gro§e" instead of 'der Große'. Maybe I get these results because my XP is running with the regional setting "German / Germany"?

Best regards,
Peter.
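
Editorial sketch: the thread does not establish the cause of these substitutions. One reading that is consistent with all three examples is that the text extracted from the PDF is in Mac OS Roman but gets decoded as Windows-1252: the byte for "ü" in Mac OS Roman is the byte for "Ÿ" in Windows-1252, and likewise "ä" maps to "Š" and "ß" to "§". The Python below reproduces the three examples under that assumption. `misdecode` and `repair` are hypothetical helper names, and nothing here describes what the PDF component actually does.

```python
def misdecode(text: str, actual: str = "mac_roman", assumed: str = "cp1252") -> str:
    """Encode text in its real encoding, then decode it with the wrong one."""
    return text.encode(actual).decode(assumed)


def repair(garbled: str, actual: str = "mac_roman", assumed: str = "cp1252") -> str:
    """Undo misdecode(): recover the bytes, then decode them correctly."""
    return garbled.encode(assumed).decode(actual)


# The three word pairs reported in the post above.
REPORTED = {
    "Abkürzungen": "AbkŸrzungen",
    "Porträt": "PortrŠt",
    "der Große": "der Gro§e",
}

for good, bad in REPORTED.items():
    assert misdecode(good) == bad
    assert repair(bad) == good
```

If this reading is right, the garbled keywords are recoverable, since the mapping is reversible for these characters.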
  #7  
Old 12-22-2008, 12:08 PM
kinook
Administrator
 
Join Date: 03-06-2001
Location: Colorado
Posts: 6,003
Apparently. It works ok in our testing here on English XP when configured for German locale, but we don't have a German XP to test with. You might be able to temporarily change to English locale when importing. Or you can get back the old behavior by unzipping and double-clicking the .reg file in the attached zip file and restarting UR.
Attached Files
File Type: zip reg.zip (332 Bytes, 1028 views)
  #8  
Old 12-22-2008, 12:20 PM
pereh
Registered User
 
Join Date: 12-01-2008
Posts: 9
OK, I am back to the old behaviour. Will there be a fix for this?
  #9  
Old 12-22-2008, 01:10 PM
kinook
Administrator
 
Join Date: 03-06-2001
Location: Colorado
Posts: 6,003
We will report the problem to the vendor of the PDF component.

And please ZIP and send a couple of problem PDF files to support@kinook.com so we can verify whether the problem is specific to your files. Thanks.
  #10  
Old 12-22-2008, 03:42 PM
hartmut
Registered User
 
Join Date: 06-12-2005
Posts: 70
I have the German XP and don't have a problem with PDFs as far as I can see.
Peter, did you follow these instructions from Kinook:
"You will need to re-import or synchronize (Item | Synchronize on the menu) PDF documents after installing to re-keyword."

I searched for PDF, selected everything in the search results window, and chose Item | Synchronize.
Hartmut
  #11  
Old 12-23-2008, 03:36 AM
pereh
Registered User
 
Join Date: 12-01-2008
Posts: 9
Quote:
Originally posted by hartmut
I have the German XP and don't have a problem with PDFs as far as I can see.
Peter, did you follow these instructions from Kinook:
"You will need to re-import or synchronize (Item | Synchronize on the menu) PDF documents after installing to re-keyword."

I searched for PDF, selected everything in the search results window, and chose Item | Synchronize.
Hartmut
Hello Hartmut,

I have tried re-import and synchronize. Now I have reinstalled UR (the version mentioned above), but the problem is still there. Please find a page attached for testing.

Best regards,
Peter.
Attached Files
File Type: zip exp.zip (29.4 KB, 1231 views)
  #12  
Old 12-23-2008, 04:50 AM
pereh
Registered User
 
Join Date: 12-01-2008
Posts: 9
I just tested with PDF2TXT V3.2 and it worked fine.
  #13  
Old 12-23-2008, 06:52 AM
pereh
Registered User
 
Join Date: 12-01-2008
Posts: 9
I have now found a few PDFs for which the keywording is sometimes OK ("möglich") and sometimes wrong ("mglich") within the same document. I now suspect that it might have something to do with the fonts. For files that use only fonts that Reader lists as Type 1 (embedded), the keywording is always wrong. For files that additionally use fonts listed as TrueType, the results are mixed. Maybe this is the right track for finding the error?
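
Editorial sketch: a check like the one described above can be automated by comparing a list of words known to occur in a PDF against the keyword list produced for it. The Python below is only an illustration. `classify_keywords` is a hypothetical helper, both input lists are assumed to have been collected by hand, and "stripped" assumes the faulty keyword is the word with its non-ASCII characters removed, as in the "mglich" example.

```python
from typing import Dict, Iterable, List


def classify_keywords(expected_words: Iterable[str],
                      extracted_keywords: Iterable[str]) -> Dict[str, List[str]]:
    """Report, per expected word, whether it was keyworded intact ("ok"),
    with its non-ASCII characters dropped ("stripped"), or not at all."""
    extracted = {k.lower() for k in extracted_keywords}
    report = {}
    for word in expected_words:
        lowered = word.lower()
        stripped = "".join(ch for ch in lowered if ord(ch) < 128)
        states = []
        if lowered in extracted:
            states.append("ok")
        if stripped != lowered and stripped in extracted:
            states.append("stripped")
        report[word] = states or ["missing"]
    return report


# A document that yields both forms, as in the mixed TrueType case:
# classify_keywords(["möglich"], ["möglich", "mglich"])
# returns {"möglich": ["ok", "stripped"]}
```

Grouping the reports by the font types each PDF uses would show whether the Type 1 versus TrueType pattern holds across more files.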
  #14  
Old 12-23-2008, 09:17 AM
kinook
Administrator
 
Join Date: 03-06-2001
Location: Colorado
Posts: 6,003
Please ZIP and send a .urd file containing all problem PDFs imported (stored) to support@kinook.com. Thanks.
  #15  
Old 12-23-2008, 12:24 PM
kinook
Administrator
 
Join Date: 03-06-2001
Location: Colorado
Posts: 6,003
It seems that our licensed version of the PDF2TXT component has some issues. We are trying to get a working version of the licensed component from the vendor.
All times are GMT -5.


Copyright © 1999-2023 Kinook Software, Inc.