#1
Text export sometimes as expected, sometimes all lowercase
I understand that SQLite's FTS probably switches to all-lowercase storage, so "text" export should be retrieved not from the FTS index but from the ("stripped") rtf? (The alternative would be to store THREE versions of the content in UR: the rtf, the FTS text, AND the "text" with its original lowercase / uppercase; that would probably be "insane", though...)
I cannot say HOW / WHEN it happens, but I'm absolutely positive, having observed the phenomenon many times: some "text" export is all-lowercase (except perhaps for some "proper names"!), whilst other (i.e. "plain") "text" export is correct, i.e. preserves the original case for the whole text. From my point of view, this is totally unpredictable - I admit I haven't spent additional hours on further analysis.
So for the time being, when I need "text" export, I use "rtf" export instead, which works faultlessly (! I then use "¬" as the necessary "page" / item separator), open the result in TextMaker, save "as text" from there, and then run my AHK scripts on that TextMaker "plain text" output. (I have a paid TextMaker license, as well as one for MS Word, which is much too much of a "fuss", but TextMaker's FREE version is as good as the paid one for this task - I checked that!) This works perfectly, so anybody encountering the problem gets a free and robust solution to it.
On the other hand, it's quite weird that UR's "text" output switches between plain text with correct case and - unusable - almost-all-lowercase text, perhaps depending on the "original origins" of those contents? (I had imported many UR contents from other sources / applications, while lots of my UR contents are "originally UR" - perhaps that explains it? I really don't know...)
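For readers who do not use AHK, the last step of that workaround (cutting the TextMaker plain-text output back into items at the "¬" separator) might look roughly like the following Python sketch. The function name `split_items` and the sample text are hypothetical, chosen only for illustration; nothing here is part of UR or TextMaker.

```python
SEPARATOR = "¬"  # the item separator chosen for the rtf export in the post above


def split_items(text: str, separator: str = SEPARATOR) -> list[str]:
    """Split exported plain text into items, dropping empty chunks.

    Hypothetical helper: it only splits on the separator and trims
    surrounding whitespace; the case of the text is left untouched.
    """
    return [chunk.strip() for chunk in text.split(separator) if chunk.strip()]


if __name__ == "__main__":
    # Made-up sample standing in for the "save as text" output.
    sample = "First Item\nBody With Original Case\n¬\nSecond Item\nMore Text\n¬\n"
    for number, item in enumerate(split_items(sample), start=1):
        print(number, repr(item))
```

Because the text comes from the rtf rather than from the search index, the original upper / lower case survives the split unchanged.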
#2
Prior to v6.2, the default was to not convert to lowercase when indexing item text, and items indexed before then would have the original casing. Starting in v6.2, the default is now to lowercase to support case-insensitive search of non-ASCII content, and items indexed since then would be converted to lowercase.
If you only deal with ASCII text, you can turn this off to get the old behavior (see https://www.kinook.com/Forum/showthread.php?t=5757), but you will need to re-index existing items that were indexed with the lowercase option enabled (re-import, sync, or change the item text).
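To illustrate the trade-off, here is a minimal Python sketch. It assumes an indexer that folds only the ASCII letters A-Z when matching; that assumption, and the names `normalize`, `build_index` and `search`, are hypothetical and are not a description of how UR or SQLite actually index text. Under that assumption, lowercasing the whole text with a Unicode-aware function before indexing is what lets "émile" find "Émile", at the price of the indexed text no longer carrying its original case.

```python
import string

# Lowercases A-Z only: a stand-in for an ASCII-only case-folding tokenizer (assumption).
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def normalize(text: str, unicode_lower: bool) -> str:
    """Optionally apply Unicode-aware lowercasing, then the ASCII-only fold."""
    if unicode_lower:
        text = text.lower()  # also folds non-ASCII letters such as "É"
    return text.translate(_ASCII_LOWER)


def build_index(items: dict[int, str], unicode_lower: bool) -> dict[str, set[int]]:
    """Map each whitespace-separated token to the ids of the items containing it."""
    index: dict[str, set[int]] = {}
    for item_id, text in items.items():
        for token in normalize(text, unicode_lower).split():
            index.setdefault(token, set()).add(item_id)
    return index


def search(index: dict[str, set[int]], term: str, unicode_lower: bool) -> set[int]:
    """Look up a single term using the same normalization as the index."""
    return index.get(normalize(term, unicode_lower), set())


if __name__ == "__main__":
    items = {1: "Émile Zola", 2: "émile durkheim"}
    old_style = build_index(items, unicode_lower=False)  # original case kept
    new_style = build_index(items, unicode_lower=True)   # lowercased before indexing
    print(search(old_style, "émile", unicode_lower=False))
    print(search(new_style, "émile", unicode_lower=True))
```

In this sketch the first search misses item 1 because "É" is never folded, while the second finds both items.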
#3
Sorry, my bad: I had seen - but not read - the other, very recent, thread, and had not made the connection between my "export problem" and the "columns problem" discussed over there.
Thank you very much, Kyle, for clarifying, and I entirely understand the "more valid" interest in correctly retrieving diacritics by FTS! (All the more so since it's optional.)
Last edited by Spliff; 02-25-2023 at 08:31 AM.
#4
In order to get correct, i.e. case-preserving, xml export ("attrib: Item Text"), and considering that in my "workflow" it is not really important for a search on lowercase diacritics to also find their uppercase forms, and vice versa, I changed the registry setting back to the old behavior, then did a new Compact-and-Repair (C&R), with FTS enabled as always. This did NOT correct the new behavior for existing items (those that had not been individually re-saved after an edit in the meantime)... and that "repair" C&R had not taken much time after all, so...
HINT: Thus, I then did an intermediate C&R with FTS DIS-abled, and afterwards did the "real repair" C&R with FTS EN-abled again, and this was fully successful: all FTS data, old as well as new, now follows the registry setting. Btw, as expected, both the "disabled" C&R and the "enabled" C&R took a long time this time, and they reduced the db size by 20 and 19 p.c., respectively. Once "you" know how it's done correctly, there is no problem anymore whatsoever.
And quite a few UR registry settings could be very helpful as db-specific settings / attributes. This is just one of them, but I also think of 1-line vs. multi-line item display in the tree, and some others... What they all have in common is that the "extra" setting is useful only - but then, yes! - in exceptional situations, whilst for most of the users' stuff the "regular" setting is the "good" one; so if these settings were db-specific, the user could "extract" their data into some "special" db with those special settings. This being said, it's obvious that some of those settings could be made db-specific quite easily, whilst for others this could become a real chore...
EDIT: Thank you for the link update; I couldn't get it to run, and now like my solution better, but that's just me.
It just occurred to me, re the (obvious) global scope of the registry settings mentioned above: would it be possible to uninstall UR "for all users" (I'm the only user anyway), re-install it just for the main user with the "regular" registry settings... and then install it for another user, created just for that purpose, with its own UR registry settings? That would be a viable "intermediate" solution for getting multi-line item title display in the tree ("Data Explorer") for just ONE UR db... instead of using another pc for that (or setting up a virtual Windows installation, which needs a second Windows license on top of all the fuss that would imply).
Last edited by Spliff; 05-15-2023 at 07:30 AM.