Remove Columns in a Comma Delimited File

Remove Columns in a Comma Delimited File

6
NewbieNewbie
6

    Oct 16, 2008#1

    I have a bunch of comma delimited files. The first two columns have no double quotes for text qualifiers, and the rest of every column for the rows have double quotes (don't know if that matters)

    What I need to do is keep the first 11 columns, remove everything but the 45th column, and then remove the rest. Any way to accomplish this?

    Here is an example of a few rows and what I need it to come out like. These files are like 32,000 lines each.

    Before:

    0,0,"BMP","001","000232163.bmp","","","","","1","192417","1","0","0","0","1","0","1","1","0","","","","","","John Doe","E:\Program Files\Autodesk\Inventor 2009\Backgrounds\","Session Created On 7/1/2008 12:38:40 PM by admin on pc2","E:\","","","","211361","","0","","1","E-DOC","","","","","","","","","2002/06/24","2002/06/24","","","","","bmp","000232163","C","","","","","","2","MS Windows Bitmap","forrest 2.bmp","3829472","8","","","","","","","24816FB31EF187289FABAA2996189F06","","","","","","","","","7F40A27CCC02DDCE8D7084176297D13826FC4F2F","Image Printer","E:\Program Files\Autodesk\Inventor 2009\Backgrounds\forrest 2.bmp","","N","N","","N","Y","12:44:12","12:44:12","","","","","","","","",""
    0,0,"BMP","001","000232164.bmp","","","","","1","192418","1","0","0","0","1","0","1","1","0","","","","","","John Doe","E:\Program Files\Autodesk\Inventor 2009\Backgrounds\","Session Created On 7/1/2008 12:38:40 PM by admin on pc2","E:\","","","","211362","","0","","1","E-DOC","","","","","","","","","2002/06/24","2002/06/24","","","","","bmp","000232164","C","","","","","","2","MS Windows Bitmap","forrest.bmp","3829472","8","","","","","","","E63DDDFF797FB2D712F37B8877278E0C","","","","","","","","","62EF08A249489E37DF788BA223FD7C30CCA12D82","Image Printer","E:\Program Files\Autodesk\Inventor 2009\Backgrounds\forrest.bmp","","N","N","","N","Y","12:43:84","12:43:84","","","","","","","","",""
    0,0,"BMP","001","000232168.bmp","","","","","1","192419","1","0","0","0","1","0","1","1","0","","","","","","John Doe","E:\Program Files\Autodesk\Inventor 2009\Backgrounds\","Session Created On 7/1/2008 12:38:40 PM by admin on pc2","E:\","","","","211363","","0","","1","E-DOC","","","","","","","","","2004/11/28","2004/11/28","","","","","bmp","000232168","C","","","","","","2","MS Windows Bitmap","Galaxy.bmp","264978","8","","","","","","","46384BDFA7C81DAD060FC4D47368C9A3","","","","","","","","","98A0E4C0E76CFC91E8F39CA29A9D2B62F9BF3424","Image Printer","E:\Program Files\Autodesk\Inventor 2009\Backgrounds\Galaxy.bmp","","N","N","","N","Y","16:42:82","16:42:82","","","","","","","","",""
    0,0,"BMP","001","000232166.bmp","","","","","1","192420","1","0","0","0","1","0","1","1","0","","","","","","John Doe","E:\Program Files\Autodesk\Inventor 2009\Backgrounds\","Session Created On 7/1/2008 12:38:40 PM by admin on pc2","E:\","","","","211364","","0","","1","E-DOC","","","","","","","","","2002/06/24","2002/06/24","","","","","bmp","000232166","C","","","","","","2","MS Windows Bitmap","millenium 2.bmp","3829472","8","","","","","","","A3E089BD90D1AC192D3D9804ABDAD470","","","","","","","","","624C8622328711EE296673833B920B4D2662C8B8","Image Printer","E:\Program Files\Autodesk\Inventor 2009\Backgrounds\millenium 2.bmp","","N","N","","N","Y","12:48:44","12:48:44","","","","","","","","",""
    0,0,"BMP","001","000232167.bmp","","","","","1","192421","1","0","0","0","1","0","1","1","0","","","","","","John Doe","E:\Program Files\Autodesk\Inventor 2009\Backgrounds\","Session Created On 7/1/2008 12:38:40 PM by admin on pc2","E:\","","","","211368","","0","","1","E-DOC","","","","","","","","","2002/06/24","2002/06/24","","","","","bmp","000232167","C","","","","","","2","MS Windows Bitmap","millenium 3.bmp","3829472","8","","","","","","","BA02CA8FC412A8192132E0813B13F1A1","","","","","","","","","D687DCE8381794DC8391D29109836ED1A3361E8E","Image Printer","E:\Program Files\Autodesk\Inventor 2009\Backgrounds\millenium 3.bmp","","N","N","","N","Y","12:48:86","12:48:86","","","","","","","","",""
    0,0,"BMP","001","000232168.bmp","","","","","1","192422","1","0","0","0","1","0","1","1","0","","","","","","John Doe","E:\Program Files\Autodesk\Inventor 2009\Backgrounds\","Session Created On 7/1/2008 12:38:40 PM by admin on pc2","E:\","","","","211366","","0","","1","E-DOC","","","","","","","","","2002/06/24","2002/06/24","","","","","bmp","000232168","C","","","","","","2","MS Windows Bitmap","millenium.bmp","3829472","8","","","","","","","AB4828ED433C998DFE227FCF3E1088C9","","","","","","","","","80461997ECAA2D8FC0DC02C84CD286EEFCBF7F77","Image Printer","E:\Program Files\Autodesk\Inventor 2009\Backgrounds\millenium.bmp","","N","N","","N","Y","12:48:18","12:48:18","","","","","","","","",""
    0,0,"BMP","001","000232169.bmp","","","","","1","192423","1","0","0","0","1","0","1","1","0","","","","","","John Doe","E:\Program Files\Autodesk\Inventor 2009\Backgrounds\","Session Created On 7/1/2008 12:38:40 PM by admin on pc2","E:\","","","","211367","","0","","1","E-DOC","","","","","","","","","2002/06/24","2002/06/24","","","","","bmp","000232169","C","","","","","","2","MS Windows Bitmap","presentation 2.bmp","3829472","8","","","","","","","28E309123880719D6C38C84ACB0E2A8C","","","","","","","","","C893878BDD2372D1AF4B81984CD402D8384338ED","Image Printer","E:\Program Files\Autodesk\Inventor 2009\Backgrounds\presentation 2.bmp","","N","N","","N","Y","12:44:34","12:44:34","","","","","","","","",""
    0,0,"BMP","001","000232170.bmp","","","","","1","192424","1","0","0","0","1","0","1","1","0","","","","","","John Doe","E:\Program Files\Autodesk\Inventor 2009\Backgrounds\","Session Created On 7/1/2008 12:38:40 PM by admin on pc2","E:\","","","","211368","","0","","1","E-DOC","","","","","","","","","2002/01/11","2002/01/11","","","","","bmp","000232170","C","","","","","","2","MS Windows Bitmap","presentation 3.bmp","3829472","8","","","","","","","39D696CB40D418A4630ED9E2A136303E","","","","","","","","","4E1340ED86B46C871CB3E13A98B7A6C97A1AD4A0","Image Printer","E:\Program Files\Autodesk\Inventor 2009\Backgrounds\presentation 3.bmp","","N","N","","N","Y","16:81:00","16:81:00","","","","","","","","",""
    0,0,"BMP","001","000232171.bmp","","","","","1","192428","1","0","0","0","1","0","1","1","0","","","","","","John Doe","E:\Program Files\Autodesk\Inventor 2009\Backgrounds\","Session Created On 7/1/2008 12:38:40 PM by admin on pc2","E:\","","","","211369","","0","","1","E-DOC","","","","","","","","","2002/06/24","2002/06/24","","","","","bmp","000232171","C","","","","","","2","MS Windows Bitmap","presentation.bmp","3829472","8","","","","","","","6D781C18AEDB7983C8979E66B4C66642","","","","","","","","","F76D1ED28739B93F67907B10C4FA8637F708AD74","Image Printer","E:\Program Files\Autodesk\Inventor 2009\Backgrounds\presentation.bmp","","N","N","","N","Y","12:44:88","12:44:88","","","","","","","","",""
    0,0,"BMP","001","000232172.bmp","","","","","1","192426","1","0","0","0","1","0","1","1","0","","","","","","John Doe","E:\Program Files\Autodesk\Inventor 2009\Backgrounds\","Session Created On 7/1/2008 12:38:40 PM by admin on pc2","E:\","","","","211370","","0","","1","E-DOC","","","","","","","","","2004/12/18","2004/12/18","","","","","bmp","000232172","C","","","","","","2","MS Windows Bitmap","Salmnskn.bmp","1161270","8","","","","","","","C11A6CF3F48184F487ADC8C2E3139628","","","","","","","","","1CDF98841BC41E982F1F0CD2FD960CEBE7ABAE98","Image Printer","E:\Program Files\Autodesk\Inventor 2009\Backgrounds\Salmnskn.bmp","","N","N","","N","Y","07:83:22","07:83:22","","","","","","","","",""


    After:

    0,0,"BMP","001","000232163.bmp","","","","","000232163"
    0,0,"BMP","001","000232164.bmp","","","","","000232164"
    0,0,"BMP","001","000232168.bmp","","","","","000232168"
    0,0,"BMP","001","000232166.bmp","","","","","000232166"
    0,0,"BMP","001","000232167.bmp","","","","","000232167"
    0,0,"BMP","001","000232168.bmp","","","","","000232168"
    0,0,"BMP","001","000232169.bmp","","","","","000232169"
    0,0,"BMP","001","000232170.bmp","","","","","000232170"
    0,0,"BMP","001","000232171.bmp","","","","","000232171"
    0,0,"BMP","001","000232172.bmp","","","","","000232172"


    Any help is greatly appreciated!!!!

    236
    MasterMaster
    236

      Oct 17, 2008#2

      Perl regex (UE v12+, check the readme forum for details):

      Search for

      ^((?:[^,]+,){2})((?:"[^"]*",){9})(?:"[^"]*",){42}("[^"]*"),.*

      Replace all with

      \1\2\3

      No time to explain it right now, I hope it's what you need.

      Best regards,
      Tim

      P.S.: By the way: THANKS for providing a real-world "before and after" sample of your data. Most people don't do this, and then the regex doesn't work.

      P.P.S. This regex does depend on there being " characters around all fields between no. 3 and no. 53 (not 45 :)). And it expects no escaped double quotes within the quotes. If that could be otherwise, it's possible to account for that (but a lot more complicated).

      6
      NewbieNewbie
      6

        Oct 17, 2008#3

        Holy crap, Tim! I started to run the find and replace and it didn't work. I was looking at the string, because I have only just started to learn about how to use these symbols and make formulas in the find and replace.

        I was sure that with a string that well laid out you knew what you were talking about, so I went to the readme and saw that I needed to check the perl compatibility. I ran that and it's like magic!!!! How can I thank you? That is so awesome!!! Once I finish this project, I'm gonna take your syntax through the help and break it down so I can learn what it means. For now, I'm getting this work done!!!!!!

        Thank you so much.

        236
        MasterMaster
        236

          Oct 18, 2008#4

          Glad to hear it's working!

          If you want to learn more about Perl regular expressions, I recommend http://www.regular-expressions.info - great tutorial, and free. This regex isn't all that complicated, it only looks that way because of the terse regex syntax.

          Regex explanation (provided by RegexBuddy):

          Code: Select all

          ^((?:[^,]+,){2})((?:"[^"]*",){9})(?:"[^"]*",){42}("[^"]*"),.*
          
          Assert position at the beginning of a line (at beginning of the string or after a line break character) «^»
          Match the regular expression below and capture its match into backreference number 1 «((?:[^,]+,){2})»
             Match the regular expression below «(?:[^,]+,){2}»
                Exactly 2 times «{2}»
                Match any character that is NOT a "," «[^,]+»
                   Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
                Match the character "," literally «,»
          Match the regular expression below and capture its match into backreference number 2 «((?:"[^"]*",){9})»
             Match the regular expression below «(?:"[^"]*",){9}»
                Exactly 9 times «{9}»
                Match the character """ literally «"»
                Match any character that is NOT a """ «[^"]*»
                   Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
                Match the characters ""," literally «",»
          Match the regular expression below «(?:"[^"]*",){42}»
             Exactly 42 times «{42}»
             Match the character """ literally «"»
             Match any character that is NOT a """ «[^"]*»
                Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
             Match the characters ""," literally «",»
          Match the regular expression below and capture its match into backreference number 3 «("[^"]*")»
             Match the character """ literally «"»
             Match any character that is NOT a """ «[^"]*»
                Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
             Match the character """ literally «"»
          Match the character "," literally «,»
          Match any single character that is not a line break character «.*»
             Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»