Removing quotes within quotes (CSV)

Removing quotes within quotes (CSV)

3
NewbieNewbie
3

    Jun 25, 2007#1

    Hello all..

    I am using UE version 12.20.

    I am trying to remove quotes within quotes on a rather large data file that we need to process into our DB.

    For some reason this one is give me a run for my money I am sure it's an easy one for the experts here!!

    Sample text:

    ,"003","N041100","","LASER WELD WIRE 10""LG","350906",

    in this line the quotes (in bold) are within another set of quotes which is bad. I would like to search and remove this condition from my data. The quotes will always occur within another set of qoutes that are located between two comma's.

    Ok condition , "",
    This is acceptable and should not be removed.

    Any ideas? Thanks for any help you can provide!!
    Kizzy

    29
    Basic UserBasic User
    29

      Jun 26, 2007#2

      My first thought is, why try to remove the quotes and mangle your data in the process? Your sample text is valid CSV (quotes within each value are escaped by doubling them), so any tools designed to work with CSV should parse it correctly. For example:

      Code: Select all

      Plain text:  LASER WELD WIRE 10"LG
      CSV encoded:  "LASER WELD WIRE 10""LG"
      If you really must remove the quotes for some reason, I think it would be easier to use a tool designed to work with CSV files. For example, open the CSV file in Excel, search+replace the quotes, and re-save as CSV.

      3
      NewbieNewbie
      3

        Jun 26, 2007#3

        scallanh I am with you 100% we are importing about 2.2 GB of data from AS400 land into a SQL Server 2007 DB via an extracted CVS file. For some crazy reason SQL Server 2007 throws up all over the quotes within the quotes. Researching online says this seems to be a know bug. /sigh

        The replace idea has potential will give it a go!

        344
        MasterMaster
        344

          Jun 26, 2007#4

          hi dudes,

          if I understood this correctly, a replace like "no comma, double quotes, no comma" should solve the problem.
          Perl regexp:
          find:

          Code: Select all

          ([^,])\"\"([^,])
          replace with:

          Code: Select all

          $1$2
          so you get:

          Code: Select all

          ,"003","N041100","","LASER WELD WIRE 10LG","350906",
          Is that ok ?

          rds Bego
          Normally using all newest english version incl. each hotfix. Win 10 64 bit

          236
          MasterMaster
          236

            Jun 26, 2007#5

            Hi bego,

            clever idea :)

            However, you don't need to escape the quotes.

            Code: Select all

            ([^,])""([^,])
            should work too...

            And of course if you use negative lookaround, you can search for

            Code: Select all

            (?<!,)""(?!,)
            and replace with nothing (or something like "inch" etc.) which is a LOT faster.

            Cheers,
            Tim

            3
            NewbieNewbie
            3

              Jun 26, 2007#6

              You guys rock! I really need to start learning the Perl RegExpressions aspect. I mostly do very basic stuff using the Unix Reg. Thanks for all your help!!

              Kizzy

              344
              MasterMaster
              344

                Jun 26, 2007#7

                Hi Tim,

                you're right. Thx for the tip.

                kizzy, you can do the same with UE regexp or Unix regexp. Perl is just my personal favorite style. Glad to hear it works.

                Bego
                Normally using all newest english version incl. each hotfix. Win 10 64 bit