I am receiving data into a REST API, and I want to insert it as XML into a database. When I later read the record back, I expect well-formed XML.
data = str(request.data)
cur = mydb.cursor()
currentDT = datetime.datetime.now()
val = data.replace("'", "''")
cur.execute("insert into MyXMLApi(dateofInsertion,xmlData) values('%s','%s')" % (str(currentDT), val))
mydb.commit()
This is what I expect to see in the database:
"<note>
Don't forget me this weekend!
</note>"
However, this is what I actually get:
'b"<note>
Don''t forget me this weekend!
</note>"'
So I have three issues here:
1. I have to deal with single quotes in the XML.
2. The data should be stored as proper XML, not a bytestring representation.
3. When I read from the database, I can't get the original XML back.
request.data is a bytestring in Flask (see the data property and get_data() in the docs), but you want to save it to your database as a plain string. Converting with str() is not the way to do it: calling str() on a bytes object produces its b"..." representation, which is exactly the artifact you see in your stored data.
Assuming you want a UTF-8 string, replace your first line with
data=request.data.decode('UTF-8')
Then you will be able to save it to your database.
About the single quotes, I don't think you should escape them yourself. Use parameter binding, and the library will do it for you.
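A minimal sketch of parameter binding, using sqlite3 so it is self-contained (MySQL connectors follow the same DB-API pattern but use %s placeholders instead of ?; the table and column names are taken from the question):

```python
import datetime
import sqlite3

# In-memory stand-in for the MySQL table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyXMLApi (dateofInsertion TEXT, xmlData TEXT)")

xml = "<note>\nDon't forget me this weekend!\n</note>"

# Parameter binding: the driver handles quoting, including the single
# quote in "Don't", so no manual .replace("'", "''") is needed.
conn.execute(
    "INSERT INTO MyXMLApi (dateofInsertion, xmlData) VALUES (?, ?)",
    (str(datetime.datetime.now()), xml),
)
conn.commit()

stored = conn.execute("SELECT xmlData FROM MyXMLApi").fetchone()[0]
print(stored)  # the XML comes back exactly as it went in
```

Reading the row back returns the original string unchanged, which is the round-trip behavior the question asks for.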
(By the way, this sounds like a very strange use-case. Why not store data in your table as field note?)
Related
I'm going to a pick a csv file in BizTalk and after some process I wanted to update it with two or more different systems.
To read the CSV file, I'm using the default Flat File Disassembler to break it up and construct XML with the help of a generated schema. I can do that successfully with consistent data, but if the data contains a comma (other than the delimiters), BizTalk fails!
Any other way to do this without using a custom pipeline component?
Expecting a simple configuration within the flatfile disassembler component!
So, here's the deal. BizTalk is not failing. Well, it is, but that is the expected and correct behavior.
What you have is an invalid CSV file. The CSV specification disallows the comma in field data unless a wrap character is used; either way, both are reserved characters.
To accept the comma in field data, you must choose a wrap character and set that in the Wrap Character property in the Flat File Schema.
This is valid:
1/1/01,"Smith, John", $5000
This is not:
1/1/01,Smith, John, $5000
Since your schema definition has ',' as the delimiter, the flat file disassembler will treat data containing a comma as two fields and will fail due to the mismatch in column count.
You have a few options:
Either add a new field to the schema, if you know the comma in the data will only ever appear in one particular field.
Or change the flat-file delimiter from , to | (pipe) or some other character that cannot appear in the data.
Or, as you mentioned, manipulate the flat file in a custom pipeline component, which should be the last resort if the two options above are not feasible.
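The wrap-character rule can be sketched outside BizTalk with Python's csv module, which applies the same convention: only a field containing the delimiter gets wrapped in quotes, and a compliant parser recovers the fields intact.

```python
import csv
import io

# The "Smith, John" field contains the delimiter, so it must be wrapped.
row = ["1/1/01", "Smith, John", "$5000"]

buf = io.StringIO()
csv.writer(buf, lineterminator="\n").writerow(row)
print(buf.getvalue().strip())  # 1/1/01,"Smith, John",$5000

# Reading it back recovers the original three fields, comma and all.
parsed = next(csv.reader(io.StringIO(buf.getvalue())))
print(parsed)
```

This is exactly why the unwrapped line in the question splits into four fields instead of three.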
I have a csv file that looks like this:
varCust_id,varCust_name,varCity,varStateProv,varCountry,varUserId,varUsername
When I run the HTTP Post Request to create a new customer, I get a JSON response. I am extracting the cust_id and cust_name using the json extractor. How can I enter this new value into the csv for the correct variable? For example, after creating the customer, the csv would look like this:
varCust_id,varCust_name,varCity,varStateProv,varCountry,varUserId,varUsername
1234,My Customer Name
Or once I create a user, the file might look like this:
varCust_id,varCust_name,varCity,varStateProv,varCountry,varUserId,varUsername
1234,My Customer Name,,,,9876,myusername
In my searching through the net, I have found ways to append these extracted variables as a new line, but in my case I need to replace the value in the correct position so it is associated with the correct variable I have set up in the CSV file.
I believe what you're looking to do can be done via a BeanShell PostProcessor and is answered here.
Thank you for the reply. I ended up using User Defined Variables for some things and BeanShell PreProcessors for other bits vs. using the CSV.
Well, I've never tried this, but what you can do is create all these variables and initialize them to null/0.
Then update them during your execution. At the end, concatenate them with any delimiter (say ; or tab) and push the result into the CSV as a single string.
Once the data is in the CSV, you can easily split it in MS Excel.
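Outside of JMeter, the idea of writing each extracted value into its correct column by name (rather than appending blindly) can be sketched with Python's csv.DictWriter; the header and values below are taken from the question:

```python
import csv
import io

header = ["varCust_id", "varCust_name", "varCity", "varStateProv",
          "varCountry", "varUserId", "varUsername"]

# Values extracted from the JSON response; everything else stays blank.
extracted = {"varCust_id": "1234", "varCust_name": "My Customer Name"}

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=header, restval="", lineterminator="\n")
writer.writeheader()
writer.writerow(extracted)  # fills only the named columns
print(buf.getvalue())
```

Each value lands under its own column, and unfilled columns are emitted as empty fields so positions stay aligned.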
Importing a CSV into RapidMiner is not loading the data properly into the attributes/columns, and it returns errors.
I have set the parameter values correctly in the 'Data Import Wizard'.
Column separation is set to comma, and when I check the "Use Quotes" parameter I see too many "?" values in the columns even though there is data in the actual CSV file.
And when I do not check the "Use Quotes" option, I notice that the content is distributed across different columns, i.e., data does not appear in the correct column. It also gives an error for the date column.
How do I resolve this? Any suggestions please? I have watched a lot of RapidMiner videos and read about it, but that did not help.
I am trying to import twitter conversations data which I exported from a 3rd party SaaS tool which extracts Twitter data for us.
Could someone help me soon please? Thanks, Geeta
It's virtually impossible to debug this without seeing the data.
The use quotes option requires that each field is surrounded by double quotes. Do not use this if your data does not contain these because the input process will import everything into the first field.
When you use comma as the delimiter, the observed behaviour is likely to be because there are additional commas contained in the data. This seems likely if the data is based on Twitter. This confuses the import because it is just looking for commas.
Generally, if you can get the input data changed, try to get it produced using a delimiter that cannot appear in the raw text data. Good examples would be | or tab. If you can get quotes around the fields, this will help because it allows delimiter characters to appear in the field.
Dates formats can be handled using the data format parameter but my advice is to import the date field as a polynominal and then convert it later to date using the Nominal to Date operator. This gives more control especially when the input data is not clean.
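The "import as text, convert later" advice can be sketched in plain Python: keep the raw strings, then parse with an explicit format so a malformed value is flagged instead of failing the whole import (the format string and sample values here are illustrative):

```python
from datetime import datetime

raw = ["2014-03-01 12:30:00", "not a date"]

parsed = []
for s in raw:
    try:
        parsed.append(datetime.strptime(s, "%Y-%m-%d %H:%M:%S"))
    except ValueError:
        parsed.append(None)  # keep the row, flag the bad value
print(parsed)
```

This mirrors what the Nominal to Date operator gives you in RapidMiner: explicit control over the format and over what happens to unclean values.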
I have a CSV file that I need to format (i.e., turn into) a SQL file for ingestion into MySQL. I am looking for a way to add the text delimiters (single quote) to the text, but not to the numbers, booleans, etc. I am finding it difficult because some of the text that I need to enclose in single quotes have commas themselves, making it difficult to key in to the commas for search and replace. Here is an example line I am working with:
1239,1998-08-26,'Severe Storm(s)','Texas,Val Verde,"DEL RIO, PARKS",'No',25,"412,007.74"
This is FEMA data file, with 131246 lines, I got off of data.gov that I am trying to get into a MySQL database. As you can see, I need to insert a single quote after Texas and before Val Verde, so I tried:
s/,/','/3
But that only replaced the first occurrence of the comma on the first three lines of the file. Once I get past that, I will need to find a way to deal with "DEL RIO, PARKS", as that has a comma that I do not want to place a single quote around.
So, is there a "nice" way to manipulate this data to get it from plain CSV to a proper SQL format?
Thanks
CSV files are notoriously dicey to parse. Different programs export CSV in different ways, possibly including strangeness like embedding new lines within a quoted field or different ways of representing quotes within a quoted field. You're better off using a tool specifically suited to parsing CSV -- perl, python, ruby and java all have CSV parsing libraries, or there are command line programs such as csvtool or ffe.
If you use a scripting language's CSV library, you may also be able to leverage the language's SQL import as well. That's overkill for a one-off, but if you're importing a lot of data this way, or if you're transforming data, it may be worthwhile.
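A minimal sketch of that approach in Python: let the csv module split the row (it handles the quoted commas in "DEL RIO, PARKS" and "412,007.74"), then build a parameterized INSERT instead of hand-inserting single quotes. The table name and the simplified sample line are illustrative, not the actual FEMA file:

```python
import csv
import io

line = '1239,1998-08-26,Severe Storm(s),Texas,Val Verde,"DEL RIO, PARKS",No,25,"412,007.74"'

# The parser keeps quoted commas inside their fields: 9 fields, not 11.
row = next(csv.reader(io.StringIO(line)))
print(row)

# One placeholder per field; the DB driver handles all quoting/escaping.
placeholders = ", ".join(["%s"] * len(row))
sql = "INSERT INTO disasters VALUES (%s)" % placeholders
print(sql)
```

Feeding `sql` and `row` to a DB-API `executemany` loop avoids the quoting problem entirely, since no SQL string ever contains the field text.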
I think I would also want to do some troubleshooting to find out why the CSV import into MySQL failed.
I would take an approach like this:
:%s/,\("[^"]*"\|[^,"]*\)/,'\1'/g
:%s/^\("[^"]*"\|[^,"]*\)/'\1'/g
In words: after each comma, match either a double-quoted field ("[^"]*") or an unquoted field ([^,"]*), and wrap the match in single quotes.
The second command does the same for the first column of each row, anchoring at the start of the line (^) instead of at a comma.
Try the csv plugin. It can convert the data into other formats, and its help includes an example of converting the data for import into a database.
Just to bring this to a close, I ended up using @Eric Andres's idea, which was the MySQL LOAD DATA option:
LOAD DATA LOCAL INFILE '/path/to/file.csv'
INTO TABLE MYTABLE FIELDS TERMINATED BY ',' LINES TERMINATED BY '\r\n';
The initial .csv file still took a little massaging, but not as much as if I had done it by hand.
When I commented that LOAD DATA had truncated my file, I was incorrect. I was treating the file as a typical .sql file and assumed the "ID" column I had added would auto-increment. That turned out not to be the case. I had to write a quick script that prepended an ID to the front of each line; after that, the LOAD DATA command worked for all lines in my file. In other words, all data has to be in place in the file before the load, or the load will not work.
Thanks again to all who replied, and to @Eric Andres for his idea, which I ultimately used.
I have a .csv that I need to convert to a coldfusion query. I have used the cflib.org CSVtoQuery method which works fine... but...
If a 'cell' in the CSV includes a comma in the string, such as a list, the query row for that record gets messed up, because the parser sees the comma in the string as a new value.
I have no control over how the data is going in, so I can't have it written or passed inside quotes or the like.
Does anyone know if there is a way to process a .csv (convert to a query or other workable struct) that may have commas in the values?
No, there isn't. Whoever is making the CSV is not making it properly: no CSV parser can tell the difference between commas that separate fields and commas that are part of the data if nothing in the file marks that difference.
Whoever is making the file should choose a different delimiter.
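A tiny sketch of why this is unrecoverable after the fact: with no quoting and no alternative delimiter, a line meant as one field containing "a, b" is byte-for-byte identical to a line meant as two fields, so any parser can only pick one reading.

```python
import csv
import io

# The line "a, b" could be one field or two; a CSV parser must split on
# every comma, so the single-field reading is lost.
parsed = next(csv.reader(io.StringIO("a, b")))
print(parsed)  # ['a', ' b']
```

This is the formal version of the answer above: the information needed to undo the split simply is not in the file.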