Put XML child element into variable

    Dec 31, 2019#1

    Hi, I have a problem that I first thought was easy, but now I don't. I have a file with multiple procedure elements. Within each procedure element are multiple clstep elements. I need to capture each procedure element, iterate through its clstep elements, and modify the children of each clstep. I've got the capture of the procedure element figured out, but I'm drawing a blank on how to capture and manipulate the clstep children. Does UES have any type of XML node functions?
    Thank you for your help,
    Max

    Sample XML:

            <procedure id="proc1" next-id="proc2">
                <title>Crewstation</title>
                    <clstep1 id="clstepAtt">
                        <challenge>HDR</challenge>
                        <response>MUTE</response>
                        <role>P, Cole</role>
                        <page></page>
                        <pane></pane>
                    </clstep1>
                    <clstep1>
                        <challenge>Light switch</challenge>
                        <response>OFF</response>
                        <role>P, SO</role>
                        <page></page>
                        <pane></pane>
                    </clstep1>
                    <end/>
                </procedure>

    Code so far:

    function XMLConv() {
        //Define environment for this script.
    
        if (UltraEdit.document.length > 0) //Is any file opened?
        {
            //  Define environment for this script.
            UltraEdit.insertMode();
            if (typeof(UltraEdit.columnModeOff) == "function") UltraEdit.columnModeOff();
            else if (typeof(UltraEdit.activeDocument.columnModeOff) == "function") UltraEdit.activeDocument.columnModeOff();
            UltraEdit.selectClipboard(8);
            //Get document index number
            var nActiveFileIndex = UltraEdit.activeDocumentIdx;
            UltraEdit.perlReOn();
            UltraEdit.document[nActiveFileIndex].top();
            UltraEdit.document[nActiveFileIndex].findReplace.mode = 0;
            UltraEdit.document[nActiveFileIndex].findReplace.matchCase = false;
            UltraEdit.document[nActiveFileIndex].findReplace.matchWord = false;
            UltraEdit.document[nActiveFileIndex].findReplace.regExp = true;
            UltraEdit.document[nActiveFileIndex].findReplace.searchDown = true;
            UltraEdit.document[nActiveFileIndex].findReplace.searchInColumn = false;
    
            UltraEdit.document[nActiveFileIndex].top();
            while (UltraEdit.document[nActiveFileIndex].findReplace.find("<procedure")) {
    
                //Remember in two variables the current caret position of
                //the chapter beginning in file. There is nothing selected!
                var nChLine = UltraEdit.document[nActiveFileIndex].currentLineNum;
                //UltraEdit.activeDocument.key("HOME");
                var nChColumn = UltraEdit.document[nActiveFileIndex].currentColumnNum;
    
                if (UltraEdit.document[nActiveFileIndex].findReplace.find("</procedure>")) {
                    UltraEdit.activeDocument.key("RIGHT ARROW");
                    UltraEdit.document[nActiveFileIndex].gotoLineSelect(nChLine, nChColumn);
    
                    var procedure = UltraEdit.document[nActiveFileIndex].selection;
    
                    var Regex = /<clstep.*?>.*?<\/clstep1>/img;
                    var clStep = Regex.exec(procedure);
                    while (clStep != null) {
                        UltraEdit.outputWindow.write("clStep " + clStep[0]);
                        clStep = Regex.exec(procedure);
                    }
                    UltraEdit.outputWindow.write("Outside of clStep while");
                }
            }
        }
    }
    XMLConv();

      Jan 03, 2020#2

      The main problem is the line

      var Regex = /<clstep.*?>.*?<\/clstep1>/img;

      The dot does not match the newline characters carriage return and line-feed. For that reason this JavaScript regular expression finds nothing when executed on a procedure block string that spans several lines. One solution would be the flag s as documented on the RegExp documentation page. But whether this flag is supported at all depends on the version of the JavaScript core engine and therefore on the version of UltraEdit.
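
Whether the flag s is available can be checked at run time. The following is only a minimal sketch and not part of the original script. The function name supportsDotAll and the sample string are made up for illustration. If the engine rejects the flag, the sketch falls back to a character class that matches any character.

```javascript
// Hypothetical helper: returns true if the engine accepts the flag s (dotAll).
function supportsDotAll() {
    try {
        new RegExp(".", "s");
        return true;
    } catch (e) {
        return false;   // Older engines throw a SyntaxError on unknown flags.
    }
}

// With flag s the dot also matches line ends; otherwise use [\s\S] instead.
var rStep = supportsDotAll()
    ? new RegExp("<clstep.*?>.*?</clstep\\d+>", "gs")
    : /<clstep.*?>[\s\S]*?<\/clstep\d+>/g;

// Made-up sample block with DOS/Windows line ends.
var sSample = "<clstep1>\r\n<challenge>HDR</challenge>\r\n</clstep1>";
var asFound = sSample.match(rStep);
```

Building the expression with new RegExp() inside try/catch matters here: a regular expression literal with an unsupported flag would already fail when the script file is parsed.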

      The flag m does not help with this regular expression search: it only changes how ^ and $ match, and the expression uses neither. The flag i is not needed either because XML is case-sensitive. A slower case-insensitive search should therefore never be necessary when a script or macro processes an XML file with finds/replaces searching for the case-sensitive tags.

      The solution working with any version of JavaScript core engine is:

      var Regex = /<clstep.*?>[\s\S]*?<\/clstep\d+>/g;

      The character class definition [\s\S] matches any whitespace character (including carriage return and line-feed) or any non-whitespace character, in other words really any character. The regular expression object with this search string and the flag g should be defined once outside of all loops for best efficiency. When such a global object is reused with exec() on a new string, its property lastIndex should be reset to 0 first.
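
The difference between the dot and [\s\S] can be tried outside of UltraEdit with a few lines of plain JavaScript. This is a small sketch with made-up sample data; only the two regular expressions are taken from the thread.

```javascript
// Made-up sample data; "\r\n" stands for the line ends of a DOS/Windows file.
var sProcedure = '<procedure id="proc1">\r\n' +
                 '<clstep1 id="a">\r\n<challenge>HDR</challenge>\r\n</clstep1>\r\n' +
                 '<clstep1>\r\n<challenge>Light switch</challenge>\r\n</clstep1>\r\n' +
                 '</procedure>';

var rDotOnly = /<clstep.*?>.*?<\/clstep\d+>/g;       // The dot stops at line ends.
var rAnyChar = /<clstep.*?>[\s\S]*?<\/clstep\d+>/g;  // [\s\S] crosses line ends.

var asDotOnly = sProcedure.match(rDotOnly);  // null because nothing is found
var asAnyChar = sProcedure.match(rAnyChar);  // array with both clstep1 elements
```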

      Another point is that the method exec() does not return a single string. It returns null if nothing is found, and otherwise an array with additional properties: the first element is the found string, the property index is the character index in the input string at which the found string begins, and the property input is the entire input string searched by the method. This can be seen by using the function var_dump, which UltraEdit and UEStudio add to every script executed by UE/UES. The function is added in the script file created in the temporary files directory, together with all other script files included with the special // include directive, before the script is really executed by the built-in JavaScript core engine.
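
The result of exec() can also be inspected in plain JavaScript without var_dump. The sketch below uses made-up sample strings and variable names. Its second part shows a general pitfall of regular expression objects with the flag g; it is not claimed to be the cause of any problem reported in this thread.

```javascript
// Made-up sample block; the element names follow the thread.
var sBlock = "x<clstep1>\nA\n</clstep1>y<clstep1>\nB\n</clstep1>";
var rStep = /<clstep.*?>[\s\S]*?<\/clstep\d+>/g;

// Collect every match. exec() returns null when nothing more is found.
var aoFound = [];
var oMatch;
while ((oMatch = rStep.exec(sBlock)) !== null) {
    // oMatch[0] is the found string, oMatch.index its start in oMatch.input.
    aoFound.push({ text: oMatch[0], index: oMatch.index });
}

// Pitfall: a global regular expression object remembers in lastIndex where
// the last match ended, even if the next call searches a different string.
var rTag = /<clstep1>/g;
var oFirst = rTag.exec("<clstep1>first");    // found, lastIndex is now 9
var oSecond = rTag.exec("<clstep1>second");  // null, the search started at 9
rTag.lastIndex = 0;                          // explicit reset before a new string
var oThird = rTag.exec("<clstep1>third");    // found again at index 0
```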

      Here is your script with the small changes as described above.

      function XMLConv() {
          //Define environment for this script.
      
          if (UltraEdit.document.length > 0) // Is any file opened?
          {
              //  Define environment for this script.
              UltraEdit.insertMode();
              if (typeof(UltraEdit.columnModeOff) == "function") UltraEdit.columnModeOff();
              else if (typeof(UltraEdit.activeDocument.columnModeOff) == "function") UltraEdit.activeDocument.columnModeOff();
              UltraEdit.selectClipboard(8);
              // Get document index number
              var nActiveFileIndex = UltraEdit.activeDocumentIdx;
              UltraEdit.perlReOn();
              UltraEdit.document[nActiveFileIndex].top();
              UltraEdit.document[nActiveFileIndex].findReplace.mode = 0;
              UltraEdit.document[nActiveFileIndex].findReplace.matchCase = true;
              UltraEdit.document[nActiveFileIndex].findReplace.matchWord = false;
              UltraEdit.document[nActiveFileIndex].findReplace.regExp = true;
              UltraEdit.document[nActiveFileIndex].findReplace.searchDown = true;
              UltraEdit.document[nActiveFileIndex].findReplace.searchInColumn = false;
      
              var Regex = /<clstep.*?>[\s\S]*?<\/clstep1>/g;
              UltraEdit.document[nActiveFileIndex].top();
              while (UltraEdit.document[nActiveFileIndex].findReplace.find("<procedure")) {
      
                  // Remember in two variables the current caret position of
                  // the chapter beginning in file. There is nothing selected!
                  var nChLine = UltraEdit.document[nActiveFileIndex].currentLineNum;
                  //UltraEdit.activeDocument.key("HOME");
                  var nChColumn = UltraEdit.document[nActiveFileIndex].currentColumnNum;
      
                  if (UltraEdit.document[nActiveFileIndex].findReplace.find("</procedure>")) {
                      UltraEdit.activeDocument.key("RIGHT ARROW");
                      UltraEdit.document[nActiveFileIndex].gotoLineSelect(nChLine, nChColumn);
      
                      var procedure = UltraEdit.document[nActiveFileIndex].selection;
      
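                      // Reset the search position because the global regular
                      // expression object is reused for each procedure block.
                      Regex.lastIndex = 0;
                      var clStep = Regex.exec(procedure);
                      if (clStep) var_dump(clStep);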
                  }
              }
          }
      }
      XMLConv();
      
      However, for a small XML file with less than 20 MB it would most likely be better to load the entire file content into memory and process the XML data in memory. If changes are made, the processed data is written back to the file, overwriting the entire file content. That is much faster as it avoids window updates, and it produces only one undo step to restore the file content as it was before script execution.

      Here is the code for this alternate method to process a small XML file:

      function XMLConv() {
          //  Define environment for this script.
          UltraEdit.insertMode();
          if (typeof(UltraEdit.columnModeOff) == "function") UltraEdit.columnModeOff();
          else if (typeof(UltraEdit.activeDocument.columnModeOff) == "function") UltraEdit.activeDocument.columnModeOff();
          // Select entire file content.
          UltraEdit.activeDocument.selectAll();
      
          // Get the file content as array of strings with splitting the
          // file on each occurrence of end tag of XML element procedure.
          var asProcedures = UltraEdit.activeDocument.selection.split("</procedure>");
          if (!asProcedures)
          {
              UltraEdit.activeDocument.top();
              return;
          }
      
          var rStep = /<clstep.*?>[\s\S]*?<\/clstep[0-9]+>/g;
      
          // The last string in array must be always ignored.
          var nProcedureCount = asProcedures.length - 1;
      
          for (var nProcedureIndex = 0; nProcedureIndex < nProcedureCount; nProcedureIndex++)
          {
              var asStep = asProcedures[nProcedureIndex].match(rStep);
              if (asStep)
              {
                  for (var nStepIndex = 0; nStepIndex < asStep.length; nStepIndex++)
                  {
                      UltraEdit.outputWindow.write("Procedure " + nProcedureIndex +
                                                   ", step " + nStepIndex + ":\n" +
                                                   asStep[nStepIndex]);
                  }
              }
              // Append the end tag removed on file content splitting.
              asProcedures[nProcedureIndex] += "</procedure>";
          }
          // UltraEdit.activeDocument.write(asProcedures.join(""));
      }
      
      if (UltraEdit.document.length > 0) XMLConv();
      
      The entire file content is selected and split into substrings on each occurrence of the string </procedure>. The last string in the array of strings is the file content after the last </procedure>, which can also be an empty string. For that reason this last string in the array must be ignored.
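
The behavior of split() on the end tag can be checked with a short plain JavaScript sketch. The sample string stands in for the selected file content and is made up for illustration.

```javascript
// Made-up file content with text before the first and after the last block.
var sFile = 'head\n<procedure id="p1">a</procedure>\n' +
            '<procedure id="p2">b</procedure>\ntail';

// Splitting removes the end tags and returns one more string than
// there are procedure blocks. The last string is the rest of the file.
var asProcedures = sFile.split("</procedure>");
var nProcedureCount = asProcedures.length - 1;

// Append the removed end tag to every block except the last string.
for (var nIndex = 0; nIndex < nProcedureCount; nIndex++) {
    asProcedures[nIndex] += "</procedure>";
}

// Joining all strings restores the original content exactly.
var sRebuilt = asProcedures.join("");
```

If the file ends directly with </procedure>, the last string is simply empty, and joining still reproduces the content.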

      A regular expression search is again executed on each procedure block to get the clstep1 elements because I am not sure if the number is really fixed in the real XML file, i.e. whether there are clstep1, clstep2, etc. within a procedure element. Otherwise the string method split() with the string </clstep1> would be more efficient to get an array of clstep1 elements (with the procedure data at the beginning of the first element and the last string being ignored).

      Each procedure block is missing </procedure> at its end because the string method split() removes this string. This must be taken into account when finally joining the procedure blocks again and writing the entire file content back to the file, overwriting the still selected entire file content. The example code appends </procedure> to each procedure block inside the loop. The line which writes the joined blocks back to the file is currently commented out.

      Attention: The simple usage of the string method split() on processing an XML file is possible only if the element on whose end tag the split is done can never be nested. In other words, the splitting method cannot be used if a procedure element itself contains a procedure element or a clstep element itself contains a clstep element.
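
A script could test for nesting before it relies on split(). The following is a rough sketch under simple assumptions: the function names countStartTags and isSafeToSplit are hypothetical, and comments, CDATA sections and self-closing forms of the element are not considered.

```javascript
// Hypothetical helper: counts the start tags of element sName in sText.
function countStartTags(sText, sName) {
    var rStart = new RegExp("<" + sName + "[\\s>]", "g");
    var asFound = sText.match(rStart);
    return asFound ? asFound.length : 0;
}

// Hypothetical helper: returns true if every string before an end tag
// contains exactly one start tag, which means no element is nested.
function isSafeToSplit(sText, sName) {
    var asParts = sText.split("</" + sName + ">");
    for (var nIndex = 0; nIndex < asParts.length - 1; nIndex++) {
        if (countStartTags(asParts[nIndex], sName) !== 1) return false;
    }
    return true;
}

// Made-up sample data.
var sFlat = '<procedure id="a">1</procedure><procedure id="b">2</procedure>';
var sNested = '<procedure id="outer"><procedure id="inner">x</procedure>y</procedure>';
var bFlatSafe = isSafeToSplit(sFlat, "procedure");      // true
var bNestedSafe = isSafeToSplit(sNested, "procedure");  // false
```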

      UltraEdit has a built-in XML parser which is used for the XML Manager view. But in UltraEdit for Windows v26.20.0.68 and all former versions the DOM (document object model) of the XML file is not accessible from within the scripting environment. So a script cannot parse the nodes and modify the node objects to reformat an XML file based on its XML nodes.
      Best regards from an UC/UE/UES for Windows user from Austria

        Jan 03, 2020#3

        Mofi, this is great! Thank you so much. I agree loading the whole document into memory is much easier.

        I am having one problem though. It never selects more than two steps to cycle through. I got the length of the clsteps and it's 33. I then added some small code to manipulate it.

        var clstep = asStep[nStepIndex];
                        clstep = clstep.replace(/<clstep1>/img, '<crewDrillStep stepLabel="' + a + '">');
                        clstep = clstep.replace(/<clstep1 conditional\s? ="(\d+)">/img, '<crewDrillStep crewStepCondition="$1" stepLabel"' + a + '\">');

        My result is that only 2 clstep items are found and changed. I've verified there are more than two steps. I've added an example file with the XML markup I'm trying to manipulate.

        As always thank you for all the help.
        Max

          Jan 03, 2020#4

          I don't know what the variable a is, but look at this code.

          function XMLConv() {
              //  Define environment for this script.
              UltraEdit.insertMode();
              if (typeof(UltraEdit.columnModeOff) == "function") UltraEdit.columnModeOff();
              else if (typeof(UltraEdit.activeDocument.columnModeOff) == "function") UltraEdit.activeDocument.columnModeOff();
              // Select entire file content.
              UltraEdit.activeDocument.selectAll();
          
              // Get the file content as array of strings with splitting the
              // file on each occurrence of end tag of XML element procedure.
              var asProcedures = UltraEdit.activeDocument.selection.split("</procedure>");
              if (!asProcedures)
              {
                  UltraEdit.activeDocument.top();
                  return;
              }
          
              var rStep = /<clstep.*?>[\s\S]*?<\/clstep[0-9]+>/g;
          
              // The last string in array must be always ignored.
              var nProcedureCount = asProcedures.length - 1;
          
              for (var nProcedureIndex = 0; nProcedureIndex < nProcedureCount; nProcedureIndex++)
              {
                  var asStep = asProcedures[nProcedureIndex].split("</clstep1>");
                  if (asStep)
                  {
                      var nStepCount = asStep.length - 1;
                      for (var nStepIndex = 0; nStepIndex < nStepCount; nStepIndex++)
                      {
                          var clstep = asStep[nStepIndex].replace(/<clstep1>/g, '<crewDrillStep stepLabel="' + (nStepIndex+1) + '">');
                          // Tolerate optional whitespace on both sides of the equal sign.
                          clstep = clstep.replace(/<clstep1 conditional\s*=\s*"(\d+)">/g, '<crewDrillStep crewStepCondition="$1" stepLabel="' + (nStepIndex+1) + '">');
                          asStep[nStepIndex] = clstep + "</crewDrillStep>";
                      }
                      asProcedures[nProcedureIndex] = asStep.join("");
                  }
                  // Append the end tag removed on file content splitting.
                  asProcedures[nProcedureIndex] += "</procedure>";
              }
              UltraEdit.activeDocument.write(asProcedures.join(""));
          }
          
          if (UltraEdit.document.length > 0) XMLConv();
          

            Jan 14, 2020#5

            Hi Mofi, I've attached an output file. I've gone back to the basics and am now working with the original JavaScript file I pasted in my earlier post. The problem seems to be that the for loop is skipping every other <procedure> element. So I get an output window result saying proc1, proc3, proc5. The other weird thing is that the added </crewDrillStep> appears twice in the output.

            Thank you for looking at this, and I'm sorry for the confusion.
            Max
            script and file.zip (3.71 KiB)
            PROC 5 XXX X_DRAFT-.xml (1.53 KiB)

              Jan 15, 2020#6

              I still have problems understanding the task. This is one input XML block to process:

              <procedure id="proc5">
                  <title>PROC 5 XXX X</title>
                  <note type="c">
                      <para>PROC 5 NOTEXXX X</para>
                  </note>
                  <clstep1>
                      <challenge>PROC 5 XXX X</challenge>
                      <response>PROC 5 XXX X</response>
                      <role>P, Comm</role>
                      <checkVar>PROC 5 XXX X</checkVar>
                      <page>PROC 5 XXX X</page>
                      <pane>PROC 5 XXX X</pane>
                  </clstep1>
                  <clstep1>
                      <challenge>XXX X</challenge>
                      <response>XXX X</response>
                      <role>Comm</role>
                      <checkVar>XXX X</checkVar>
                      <page>XXX X</page>
                      <pane>XXX X</pane>
                  </clstep1>
                  <clstep1>
                      <challenge>XXX X</challenge>
                      <response>XXX X</response>
                      <role>P, Comm</role>
                      <checkVar>XXX X</checkVar>
                      <page>XXX X</page>
                      <pane>XXX X</pane>
                  </clstep1>
                  <end/>
              </procedure>
              
              This is the expected output XML file for this block, with several mistakes corrected to get a valid XML file:

              <content>
                  <crew id="proc5">
                      <crewRefCard>
                          <crewDrill>
                              <note type="c">
                                  <para>PROC 5 NOTE XXX X</para>
                              </note>
                              <crewDrillStep stepLabel="1">
                                  <crewMemberGroup>
                                      <crewMember crewMemberType="cm02"/>
                                      <crewMember crewMemberType="cm51"/>
                                  </crewMemberGroup>
                                  <crewProcedureName>
                                      <para></para>
                                  </crewProcedureName>
                                  <challengeAndResponse>
                                      <challenge>
                                          <para>PROC 5 XXX X</para>
                                      </challenge>
                                      <response>
                                          <para>PROC 5 XXX X</para>
                                      </response>
                                  </challengeAndResponse>
                                  <!-- <checkVar>PROC 5 XXX X</checkVar> -->
                                  <!-- <page>PROC 5 XXX X</page> -->
                                  <!-- <pane>PROC 5 XXX X</pane> -->
                              </crewDrillStep>
                              <crewDrillStep stepLabel="2">
                                  <crewMemberGroup>
                                      <crewMember crewMemberType="cm51"/>
                                  </crewMemberGroup>
                                  <crewProcedureName>
                                      <para></para>
                                  </crewProcedureName>
                                  <challengeAndResponse>
                                      <challenge>
                                          <para>XXX X</para>
                                      </challenge>
                                      <response>
                                          <para>XXX X</para>
                                      </response>
                                  </challengeAndResponse>
                                  <!-- <checkVar>XXX X</checkVar> -->
                                  <!-- <page>XXX X</page> -->
                                  <!-- <pane>XXX X</pane> -->
                              </crewDrillStep>
                              <crewDrillStep stepLabel="3">
                                  <crewMemberGroup>
                                      <crewMember crewMemberType="cm02"/>
                                      <crewMember crewMemberType="cm51"/>
                                  </crewMemberGroup>
                                  <crewProcedureName>
                                      <para></para>
                                  </crewProcedureName>
                                  <challengeAndResponse>
                                      <challenge>
                                          <para>XXX X</para>
                                      </challenge>
                                      <response>
                                          <para>XXX X</para>
                                      </response>
                                  </challengeAndResponse>
                                  <!-- <checkVar>XXX X</checkVar> -->
                                  <!-- <page>XXX X</page> -->
                                  <!-- <pane>XXX X</pane> -->
                              </crewDrillStep>
                              <end />
                          </crewDrill>
                      </crewRefCard>
                  </crew>
              </content>
              
              I can see some relationships by looking at the input and output data as well as the code. But the input data is too garbled to ever get this output.

              For example, there should be Pilot somewhere in the input data block, which is responsible for writing into the output file the XML element:

              <crewMember crewMemberType="cm02"/>

              I think your approach to get from input to output data is not very good, which causes all the trouble. It would be much better to load the entire input file as procedure blocks into memory. Then run a loop interpreting the data of each procedure and create the output XML data block also in memory. For the clstep1 elements an inner loop is necessary to process the data and finally get the expected crewDrillStep elements in the output file. That would be much more efficient and fail-safe.
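
The outer and inner loop described above could look roughly like the sketch below. It is plain JavaScript working on a string, not the rewritten script attached later in this thread. The function name convertSteps and the sample data are made up, only the renaming of clstep1 to crewDrillStep with a stepLabel is shown, and existing attributes of clstep1 are dropped. In UltraEdit the input would come from the selection after selectAll() and the result would be written back with write().

```javascript
// Hypothetical function: renames all clstep1 elements of each procedure
// block to crewDrillStep and numbers them per procedure with stepLabel.
function convertSteps(sInput) {
    var asProcedures = sInput.split("</procedure>");
    var nProcedureCount = asProcedures.length - 1;   // Ignore the file rest.

    for (var nProc = 0; nProc < nProcedureCount; nProc++) {
        var asSteps = asProcedures[nProc].split("</clstep1>");
        var nStepCount = asSteps.length - 1;         // Ignore the block rest.

        for (var nStep = 0; nStep < nStepCount; nStep++) {
            // Each string contains exactly one start tag of clstep1.
            asSteps[nStep] = asSteps[nStep].replace(/<clstep1(\s[^>]*)?>/,
                '<crewDrillStep stepLabel="' + (nStep + 1) + '">') +
                "</crewDrillStep>";
        }
        asProcedures[nProc] = asSteps.join("") + "</procedure>";
    }
    return asProcedures.join("");
}

// Made-up input with two procedure blocks.
var sInput = '<procedure id="p1">\n<clstep1 id="a">\n<challenge>AAA 1</challenge>\n' +
             '</clstep1>\n<clstep1>\n<challenge>AAA 2</challenge>\n</clstep1>\n' +
             '<end/>\n</procedure>\n<procedure id="p2">\n<clstep1>\n' +
             '<challenge>BBB 1</challenge>\n</clstep1>\n</procedure>\n';
var sOutput = convertSteps(sInput);
```

Because every block is built completely in memory and written once, each procedure and each step is visited exactly once, and the step numbering restarts at 1 for every procedure.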

              It would be good for me to have an input example with some procedure blocks covering all variants the script has to support, and the appropriate output files for this input file. The input and output files should contain the data in an anonymous form, but it must nevertheless be possible to really write the script code that produces the expected output files. It would be good to use individual, unique data in each element instead of XXX X everywhere. What about AAA 1, BBB 1, CCC 1, AAA 2, BBB 2, CCC 2, etc. in the input and output files? Important strings like Pilot must be present in the input file.

              I offer to write the whole script for you completely new and fully commented, based on a good input to output example. I know it is hard for a developer to stop a development in which lots of hours have already been invested and start from the beginning, but I think in this case that would be the best. The agile development process that you apparently used here sometimes leads to this dramatic step of restarting from the beginning.

              It is of course also possible to go further with this script and fix just the issues you have to get the expected output. That would be hard for me as I see so many issues with that code which are potentially problematic on future changes. But if you want to keep the code and have just the errors fixed, I will try my best to get your development approach finally working. I nevertheless need an input file and the appropriate output files with data with which the script can really work, and which show me that the output files produced by the script after fixing the errors contain all data correctly.

                Jan 15, 2020#7

                Hi Mofi, thank you for the help. I have the script working, except that the rearangeElements() function works only on every other <clstep1> element. I can't figure out why. I've attached the input, output, and script files. I'd really like to see your approach to making the script. I know mine is proving difficult.
                script and file.zip (6.41 KiB)

                  Jan 17, 2020#8

                  I first revised the last version of the script file parseXML_Submitted.js by fixing small errors and simplifying many regular expressions. This script executed on the files xmlExampleNormalProc.xml and MultipleProcs_demo.xml produces the same files as your last version with two differences: the empty XML elements crewMember are written correctly into the output files with /> at the end instead of wrongly with \>, and the two spaces around the equal sign of the attribute of this element are removed. So this script is not the final solution, just an improved version of your script.

                  The next step is fixing the remaining issue you have with element clstep1 to get your script finally working, and then rewriting the entire script into what I think is more efficient and fail-safe.
                  parseXML_Submitted.zip (3.86 KiB)
                  This ZIP file contains the improved version of your last version of script file parseXML_Submitted.js.

                    Jan 17, 2020#9

                    Thank you Mofi

                      Jan 19, 2020#10

                      I could not easily see what the remaining issues with your version of the script are and so decided to rewrite the entire script.

                      The attached script file contains the rewritten script, which produces the output files correctly from my point of view. On comparing the output files of this script with the output files of the improved version of your last script, I could immediately see what the remaining issues with your script approach are. So I could try to fix those issues in your script, but I decided to save the time and just upload the rewritten and commented script.

                      The indenting tabs can be removed by selecting all XML tag strings at the top of the script and running on this selection a simple, non-regular-expression replace all searching for \t and replacing it with an empty string.

                      Let me know if you have any questions on the script code to completely understand it for future similar tasks.
                      parseXML_Rewritten.zip (3.24 KiB)
                      Rewritten script to convert input XML file to the output XML files.

                        Jan 27, 2020#11

                        Sorry, I've been out of town. Mofi, this is great. Thank you so much for taking the time to do this.

                        Max