Thread: Export problems
06-27-2021, 05:59 AM
Spliff
The best you can get, as far as I can see (ideas welcome):

In UR: File - Export - Items to XML OPML - Attributes: Check All - File to export selected item attributes/data to: "yourpath\none.xml" (export child items recursive = checked, html = none).

Open the none.xml file in some good editor (wordwrap yes or no). If you do not need the indentation levels (you could identify them by counting the tabs) and want a better overview before running your scripts, regex-replace "\t{1,}<" with "<".
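For those who would rather script that clean-up than do it in an editor, here is a minimal Python sketch. It applies the same replacement ("\t+" being equivalent to "\t{1,}") and also shows how the tabs could be counted if the indentation levels are needed after all. The function names and the sample string are made up for illustration; they are not taken from a real UR export.

```python
import re


def strip_tabs_before_tags(xml_text: str) -> str:
    """Remove every run of tabs that directly precedes a '<'."""
    return re.sub(r"\t+<", "<", xml_text)


def indent_level(line: str) -> int:
    """Count the leading tabs of one line (= indentation level)."""
    return len(line) - len(line.lstrip("\t"))


# Made-up sample, only to show the effect.
sample = "<opml>\n\t<body>\n\t\t<outline/>\n\t</body>\n</opml>"
print(strip_tabs_before_tags(sample))
print([indent_level(line) for line in sample.split("\n")])
```

If you need the levels, count the tabs first and strip them afterwards, or skip the stripping altogether.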

Scrape the file for every "<outlinecreated" (= outer loop; do NOT search for "between <outlinecreated and </outline" instead), then (= inner loop) for the next-following "text" (= item title), "flag" (value may be empty) and "item ID".

("item text" then lists the content, NOT formatted; "item details RTF" is redundant for our use; the respective values may be empty if the item has no content, but just a title. Do NOT search for "</outline": if you need indentation levels, count the tabs instead of deleting them.)

This will give you the IDs, the titles and the flags (or an empty value for the flag); from that, you can select or deselect the IDs to be processed; then you create an xml file from what you have got.
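The outer/inner loop described above could be sketched in Python roughly as follows. This is only a sketch under assumptions: the start marker is used as quoted in this post (in your file it may read slightly differently), and the attribute names "text", "flag" and "itemID" are guesses that you will have to check against your own export. The sample string is made up.

```python
import html
import re

ITEM_START = "<outlinecreated"  # marker as quoted in the post; adjust to your file
# Hypothetical attribute names; verify them in your own export.
TITLE_ATTR, FLAG_ATTR, ID_ATTR = "text", "flag", "itemID"


def attr_value(chunk: str, name: str) -> str:
    """Return the value of the first name="..." in chunk, or "" if absent."""
    match = re.search(r"\b" + re.escape(name) + r'="([^"]*)"', chunk)
    return html.unescape(match.group(1)) if match else ""


def scan_items(xml_text: str, item_start: str = ITEM_START) -> list:
    """Outer loop over item starts; inner lookups stay within one item."""
    starts = [m.start() for m in re.finditer(re.escape(item_start), xml_text)]
    items = []
    for i, pos in enumerate(starts):
        end = starts[i + 1] if i + 1 < len(starts) else len(xml_text)
        chunk = xml_text[pos:end]  # never reads into the next item
        line_start = xml_text.rfind("\n", 0, pos) + 1
        items.append({
            "id": attr_value(chunk, ID_ATTR),
            "title": attr_value(chunk, TITLE_ATTR),
            "flag": attr_value(chunk, FLAG_ATTR),
            "level": xml_text[line_start:pos].count("\t"),  # tabs = indentation
        })
    return items


# Made-up sample lines, not real UR output.
sample = (
    '<outlinecreated="2021-06-27" text="Inbox" flag="" itemID="1001">\n'
    '\t<outlinecreated="2021-06-27" text="Draft &amp; notes" flag="red" itemID="1002">\n'
)
items = scan_items(sample)
selected = [item for item in items if item["flag"]]  # e.g. keep flagged items only
print(selected)
```

Cutting each chunk at the next item start is what keeps the inner lookups from running into the following item, without ever searching for "</outline".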

Then you will need the formatted contents, to go after the respective titles, in case there is such content, i.e. IF the ID has got a corresponding sub-sub-folder within the "yourpath\none_StoredContent" sub-folder.

If that is the case for the ID in question, you then process the html body within the .html file within that sub-sub-folder, deleting the unnecessary html codes and converting the html codes needed for formatting to their xml equivalents; then you insert the result after the respective title in your xml file.
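A rough Python sketch of that step, using only the standard library. Several things here are assumptions, not facts about UR: that the sub-sub-folder is named exactly like the item ID, that the first .html file found in it is the one you want, that the file can be read as UTF-8, and above all the tag mapping, since the target xml tag names ("bold", "italic" etc.) are purely hypothetical.

```python
from html.parser import HTMLParser
from pathlib import Path
from xml.sax.saxutils import escape

# Hypothetical mapping of html formatting tags to target xml tags.
TAG_MAP = {"b": "bold", "strong": "bold", "i": "italic", "em": "italic",
           "u": "underline", "p": "para"}


class BodyToXml(HTMLParser):
    """Keep text and mapped tags inside <body>; drop everything else."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.in_body = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif self.in_body and tag in TAG_MAP:
            self.parts.append(f"<{TAG_MAP[tag]}>")

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False
        elif self.in_body and tag in TAG_MAP:
            self.parts.append(f"</{TAG_MAP[tag]}>")

    def handle_data(self, data):
        if self.in_body:
            self.parts.append(escape(data))  # re-escape &, <, > for xml


def html_body_to_xml(html_text: str) -> str:
    parser = BodyToXml()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.parts).strip()


def formatted_content(export_dir: str, item_id: str):
    """Return converted content for item_id, or None if it has no folder."""
    folder = Path(export_dir) / "none_StoredContent" / item_id  # assumed layout
    if not folder.is_dir():
        return None
    html_files = sorted(folder.glob("*.html"))
    if not html_files:
        return None
    raw = html_files[0].read_text(encoding="utf-8", errors="replace")
    return html_body_to_xml(raw)


print(html_body_to_xml("<html><body><p>Hello <b>world</b></p></body></html>"))
```

Note that this does nothing to repair unclosed tags, so if the html is sloppy, the result may not be well-formed xml and will need an extra clean-up pass.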


Thus, UR will natively create a 1-file XML export, but just of the un-formatted kind as far as the item contents are concerned. (So this is similar, in a way, to CSV export, and it's also similar to some un-formatted flat html export I got in some way from my previous tries here.) In order to preserve the content text formatting in your final output, it seems you have to go the way I have described here. (?)

Alternatively, you could, in addition to your UR content's native rtf formatting, insert "markdown" (just google "markdown") or similar codes, which would then be preserved in UR export's text-only xml format, and thus could be further processed within the UR export .xml file in order to match the requested xml target format, i.e. without you having to write the script accessing, analyzing and transforming the contents of the formatted UR-created .html files.
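To give an idea of how little scripting that alternative would need, here is a minimal Python sketch that turns two markdown-style codes into xml tags. It only handles "**bold**" and "*italic*" within a single line, and the target tag names are again hypothetical; a real markdown library would cover far more.

```python
import re
from xml.sax.saxutils import escape


def inline_markdown_to_xml(text: str) -> str:
    """Convert **bold** and *italic* to (hypothetical) xml tags."""
    out = escape(text)  # escape &, <, > before adding our own tags
    out = re.sub(r"\*\*(.+?)\*\*", r"<bold>\1</bold>", out)  # bold first
    out = re.sub(r"\*(.+?)\*", r"<italic>\1</italic>", out)
    return out


print(inline_markdown_to_xml("a **b** and *c* & d"))
```

The bold pattern has to run before the italic one, otherwise the double asterisks would be eaten as two empty italics.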


EDIT: You could further refine the content processing by inserting code characters, e.g. AFTER the content part to be exported from the respective item, and/or paragraph lead characters which would indicate that the paragraph is not to be exported (i.e. it would natively go into the UR export bodies, yes, but would then be discarded by the scripts you'll need anyway).
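In script terms, such markers could be handled like this. A minimal Python sketch; the two marker strings are made up here, so pick whatever never occurs in your real content.

```python
END_MARKER = "@@END"  # hypothetical: nothing after this gets exported
SKIP_PREFIX = "//"    # hypothetical: paragraphs starting with this are dropped


def exportable_text(item_text: str) -> str:
    """Cut the item text at END_MARKER, then drop the marked paragraphs."""
    head, _, _ = item_text.partition(END_MARKER)
    kept = [p for p in head.split("\n") if not p.startswith(SKIP_PREFIX)]
    return "\n".join(kept).rstrip()


print(exportable_text("keep\n//private\nalso keep\n@@END\nnot exported"))
```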

That being said, I would have preferred UR to do formatted (!) xml export natively; it would have been so much easier. Here again, I think it's a little bit unfortunate that the user is spared frequent (i.e. up to "yearly" even) paid updates (i.e. of the version number before the dot), for a program that almost ALL of its users will use as their main PC application, almost any minute their PC is "on", in an age (i.e. the "Twenty-Twenties") where much lesser applications have forced subscriptions upon their users, many of them successfully even. Don't get me wrong here, subscriptions are utterly nasty, even if they allow you to continue to access your "stuff" after the subscription has ended (for whatever reason), but I'd be happy to see UR develop from "good", "very good even, considering the competition", to something really smooth... and I'd be eager to pay for every substantial update ($50 every 18 months or so would be a steal indeed)... as my findings show, there would be so much room for further, active development... PAID development, that is.

((My html syntax checker virtually runs amok on UR's html output... which seems to be from 2006? (UR's html output... not my syntax checker)... and many a user could very probably do with something much more modern in that field as well?))

Last edited by Spliff; 06-27-2021 at 06:33 AM.