When I export a CSV from Access 2007, it automatically converts decimals into scientific notation.
Unfortunately, the tool that receives the file treats these fields as text and displays them as is.
The values being exported come from a query run against some linked Excel tables, and they appear correctly in the query view.
Is there any way to disable the automatic conversion to scientific notation?
That is, if a value appears as 0.007 in the query, it should appear as 0.007 in the output CSV rather than 7E-3.
Note: I'm constrained to use Excel and Access for this. As much as I'd like to switch to SQL Server, my wife would be unhappy if I put it on her work laptop!
You have a couple of choices:
You can use the Format() function directly in your query to force the data in the offending columns to be formatted a certain way, for instance:
SELECT ID, Format([Price],"standard") as Pricing FROM ORDERS;
You can write your own CSV export routine in VBA.
I posted one recently as an answer to this question.
You can easily modify the code to format numeric types a certain way.
If you don't know how, let me know and I'll modify the code and post it here.
You could write a short piece of VBA code in Access to query the data from the linked table or Access query and write it out to a text file, thus creating your own .CSV and forgoing the "Wizard". I never liked Access's export "wizard" much, and just created the files myself.
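The core of such a hand-written export is rendering each number as fixed-point text before it reaches the file. The sketch below shows that idea in Python rather than VBA, purely for illustration; the function names and the sample rows are made up, and reading the rows out of Access is not shown.

```python
import csv
import io
from decimal import Decimal


def to_plain_text(value):
    """Render a number as fixed-point text, never in scientific notation."""
    is_number = isinstance(value, (int, float, Decimal)) and not isinstance(value, bool)
    if is_number:
        # str() gives the short form of a float (0.007); Decimal then
        # formats it as fixed-point, so 7e-05 becomes 0.00007.
        return format(Decimal(str(value)), "f")
    return "" if value is None else str(value)


def write_csv(rows, header, out):
    """Write the header and the rows to the open text stream `out`."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    for row in rows:
        writer.writerow([to_plain_text(v) for v in row])


# Hypothetical sample data standing in for the query result.
buffer = io.StringIO()
write_csv([(1, 0.007), (2, 7e-05)], ["ID", "Price"], buffer)
print(buffer.getvalue())
```

The same approach carries over to VBA: format each numeric field explicitly while writing the line, instead of letting the export wizard choose the representation.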
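One easy way to handle this in a query is to double-convert the value: first to a long integer and then to a string. Note that this only suits whole numbers, since the conversion to a long integer rounds away any fractional part.
For CSV export it ends up as character data anyway.
myValue:ZString(ZLong(123456789))
(ZString and ZLong are the German-locale function names; in English-language Access they are CStr and CLng.)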
I am unable to import the DATE columns of a CSV table into BigQuery.
The dates are not recognized, even though they are in the correct format (YYYY-MM-DD) according to this documentation:
https://cloud.google.com/bigquery/docs/schema-detect
The date columns are not recognized and are renamed to _2020_01_22, _2020_01_23, ...
Is the issue that the dates are in the first row, as column names?
If so, how can I import the dates when I want to use them in time series charts (Data Studio)?
Here is a sample of the source CSV:
Province/State,Country/Region,Lat,Long,2020-01-22,2020-01-23,2020-01-24,2020-01-25,2020-01-26
Anhui,China,31.8257,117.2264,1,9,15,39,60
Beijing,China,40.1824,116.4142,14,22,36,41,68
Chongqing,China,30.0572,107.874,6,9,27,57,75
If you have a finite number of days, you can try unpivoting the table when you use it. See blog post.
Otherwise, if you don't know how many day columns the CSV file has, choose as the CSV delimiter a character that does not occur in the file, so that each line loads as a single column into a staging table, and then use the SPLIT function. You'll also need UNNEST. This approach requires a full scan and will be more expensive, especially as the file gets bigger.
The issue is that a column name cannot have a date type. For this reason, when the CSV is imported, BigQuery takes the dates in the header and transforms them to the format with underscores.
The first way to tackle the problem would be to modify the CSV file, because any import that treats the first row as a header will change the date format, and then it will be harder to get back to a date type. If you have experience in any programming language, you can do the transformation very easily. I can help with this, but I do not know your use case, so maybe it is not possible. Where does this CSV come from?
If modifying the CSV beforehand is not possible, the second option is what ktopcuoglu said: import the whole file as one column and process it using SQL functions. This is much harder than the first option, and because all the data lands in a single column, it will all have the same data type, which will be a headache too.
If you could explain where the CSV comes from, we may be able to influence it before it is ingested by BigQuery. Otherwise, you'll need to dig into SQL a bit.
Hope it helps!
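As a rough illustration of that transformation, the Python sketch below turns the wide layout (one column per date) into a long layout with one row per place and date, so the date becomes a value in a column rather than a column name. The output column names (`province_state`, `date`, `confirmed`, and so on) are assumptions chosen for the example, as is the premise that the first four columns identify the place.

```python
import csv
import io

# Assumed output names for the four identifying columns in the sample.
ID_NAMES = ("province_state", "country_region", "lat", "long")


def unpivot(src, dst, id_names=ID_NAMES, value_name="confirmed"):
    """Rewrite a wide CSV with date headers as one row per (place, date)."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    header = next(reader)
    dates = header[len(id_names):]  # every remaining header cell is a date
    writer.writerow(list(id_names) + ["date", value_name])
    for row in reader:
        ids = row[:len(id_names)]
        for date, value in zip(dates, row[len(id_names):]):
            writer.writerow(ids + [date, value])


sample = (
    "Province/State,Country/Region,Lat,Long,2020-01-22,2020-01-23\n"
    "Anhui,China,31.8257,117.2264,1,9\n"
)
out = io.StringIO()
unpivot(io.StringIO(sample), out)
print(out.getvalue())
```

With the data in this shape, the `date` column holds YYYY-MM-DD values that schema auto-detection can recognize as dates.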
Hi, now I can help you further.
First, I found some COVID datasets among the BigQuery public datasets. The one you are taking from GitHub is already in BigQuery, but there are many others that may work better for your task, such as the one called "covid19_ecdc", which is inside bigquery-public-data. This last one has the confirmed cases and deaths per date and country, so it should be easy to make a time series.
Second, I found an interesting link that does what you described using Python and Data Studio. It's a Kaggle discussion, so you may not be familiar with it, but it is worth a look. Moreover, the author uses the same dataset you are trying to use.
Hope it helps. Do not hesitate to ask!
I often have to cleanse and import messy CSV and Excel files into my MS SQL Server 2014 (but the question would be the same if I were using Oracle or another database).
I have found a way to do this with Alteryx. Can you help me understand if I can do the same with Pentaho Kettle or SSIS? Alternatively, can you recommend another ETL software which addresses my points below?
I often have tables of, say, 100,000 records where the first 90,000 records may be null. Most ETL tools scan only the first few hundred records to guess data types and therefore fail to guess the types of these fields. Can I force Pentaho or SSIS to scan the WHOLE file before guessing types? I understand this may not be efficient for huge files of many GBs, but for the files I handle, scanning the entire file is much better than wasting a lot of time working out each field's type manually.
As above, but with the length of a string. If the first 10,000 records are, say, a 3-character string but the subsequent ones are longer, SSIS and Pentaho tend to guess nvarchar(3) and the import will fail. Can I force them to scan all rows before guessing the length of the strings? Or, alternatively, can I easily force all strings to be nvarchar(x) , where I set x myself?
Alteryx has a multi-field tool, which is particularly convenient when cleansing or converting multiple fields. E.g. I have 10 date columns whose datatype was not guessed automatically. I can use the multi-field formula to get Alteryx to convert all 10 fields to date and create new fields called $oldfield_reformatted. Do Pentaho and SSIS have anything similar?
Thank you!
A silly suggestion: in Excel, add a row at the top of the list with a formula that creates a text string as long as the longest value in the column.
This formula, entered as an array formula, would do it:
=REPT("X",MAX(LEN(A:A)))
You could also use a more advanced VBA function to create other dummy values to force datatypes in SSIS.
I've not used SSIS or anything like it, but in the past I would have loaded the file into a staging table whose columns are ALL, say, varchar(1000), so that all the data loads, and then moved it across into the main table using SQL that casts or discards the values as required.
This gives YOU ultimate control, not a package or driver. I was very surprised to hear how the type guessing works!
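If you would rather size the columns from the data than pick an arbitrary width, a small script can scan the entire file first. The following Python sketch is one way to do that under simple assumptions (a header row, a comma delimiter); the function name and sample data are invented for the example.

```python
import csv
import io


def max_lengths(src):
    """Scan every row and return the longest value length per column."""
    reader = csv.reader(src)
    header = next(reader)
    longest = [0] * len(header)
    for row in reader:
        # Ignore any extra cells beyond the header width.
        for i, value in enumerate(row[:len(header)]):
            longest[i] = max(longest[i], len(value))
    return dict(zip(header, longest))


# Hypothetical data: short values first, a longer one further down.
sample = "code,name\nabc,x\nabcdef,\n"
print(max_lengths(io.StringIO(sample)))
```

The resulting lengths can then be used to declare nvarchar(x) columns for the staging table by hand, instead of relying on a guess from the first few hundred rows.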
I'm migrating content out of an old proprietary database into a new, more structured solution. The new solution asks for CSV files. For the approval process (the files are to be checked by human eyeballs), I need to have the column names as the first line of this CSV file.
select b.Title as Title,
b.listinguuid as UID,
.
.
.
FROM biblioRecord as b
-- more join magic
INTO OUTFILE '/tmp/biblio-import.csv'
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n';
Given the above snippet from an otherwise larger statement, can I direct MySQL to include the column headers as the first line?
Richard
Having looked at the MySQL docs for data output, what you are asking for doesn't look like it is possible.
You have some options for data validation.
Assuming you have some scripting knowledge, you may be able to create a stored procedure that outputs the whole table (including column headings). If memory serves, stored procedures are written in MySQL's own SQL-based procedural syntax.
However, why not ask whether the validation can be done via a web interface? There are a large number of tools (phpMyAdmin comes to mind) that can be used to view the tables (with header info). phpMyAdmin may even be able to output the tables in CSV format for you :)
A better solution, depending on how much data needs to be validated and what the constraints are, may be to create a dedicated set of validation scripts. This is something you may need anyway as part of the larger project; it could be run after a system upgrade, for example. You should talk to the client. In fact, a script would be a better way to confirm that everything has transferred correctly, as it could compare the old and new databases directly and report any anomalous results.
Other possibilities:
Do you have an XML schema for your new database structure? If you do, you could dump your data into an XML database, then view it in something like Xl, or use an XSLT to present it in a web page.
I'm sure there are other possibilities, but they are all going to involve some work to get to your desired end result. They will all be more time-consuming, but will have other potentially useful knock-on effects that need to be elucidated and presented to the client.
Personally, if you have a lot of data, I would go for some form of validation script: human eyes get tired looking at lots of rows of data, and tired eyes confuse brains and cause mistakes.
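If a post-processing step is acceptable, another low-effort route is to add the header line after the export has run. The Python sketch below writes a quoted header row and then copies the exported rows behind it; the column names and the in-memory streams are placeholders, and the quoting is chosen on the assumption that the export used ENCLOSED BY '"' and '\n' line endings as in the question.

```python
import csv
import io
import shutil


def add_header(column_names, exported, combined):
    """Write a quoted header line, then append the exported rows unchanged."""
    writer = csv.writer(combined, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(column_names)
    shutil.copyfileobj(exported, combined)


# Hypothetical stand-ins for the exported file and the file to hand over.
exported = io.StringIO('"A title","uid-1"\n')
combined = io.StringIO()
add_header(["Title", "UID"], exported, combined)
print(combined.getvalue())
```

With real files, the two streams would be opened with `open(..., newline="")` so that the csv module controls the line endings.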
Hello all.
I need to run 'Replace([column], [old], [new])' in a query executing on an Access 2003 DB. I know of all the equivalent stuff I could use in SQL, and believe me I would love to, but I don't have this option now. I'm trying to do a query where all the non-digit characters are stripped out of a column, i.e. '(111) 111-1111' simply becomes '1111111111'. I can also write an awesome custom VBA function and execute the query using that, but once again, I can't use these functions through JET. Any ideas?
Thanks for the replies, guys. OK, let me clarify the situation. I'm running a .NET web application. This app uses an Access 2003 DB. I'm trying to do an upgrade where I incorporate a type of search page. This page executes a query like: SELECT * FROM [table] WHERE replace([telnumber], '-', '') LIKE '1234567890'. The problem is that many records in the [telnumber] column contain non-digit characters, for instance '(123) 123-1234'. I need to filter these out before I do the comparison. The query using a built-in VBA function executes fine when I run it in a testing environment IN ACCESS, but when I run the query from my web app, it throws an exception stating something like "Replace function not found". Any ideas?
Based on the sample query from your comment, I wonder if it could be "good enough" to rewrite your match pattern using wildcards to account for the possible non-digit characters?
SELECT * FROM [table] WHERE telnumber LIKE '*123*456*7890'
Note that if the query runs through OLE DB from .NET, the wildcard character is % rather than *.
Your question is a little unclear, but Access does allow you to use VBA functions in Queries. It is perfectly legal in Access to do this:
SELECT replace(mycolumn,'x','y') FROM myTable
It may not perform as well as a query without such functions embedded, but it will work.
Also, if it is a one-off query and you don't have concerns about locking a bunch of rows from other users who are working in the system, you can also get away with just opening the table and doing a find and replace with Ctrl+H.
As JohnFx already said, using VBA functions (no matter if built in or written by yourself) should work.
If you can't get it to work with the VBA function in the query (for whatever reason), maybe doing it all in code would be an option?
If it's a one-time action and/or not performance critical, you could just load the whole table in a Recordset, loop through it and do your replacing separately for each row.
EDIT:
Okay, it's a completely different thing when you query an Access database from a .net application.
In this case it's not possible to use any built-in or self-written VBA functions, because .net doesn't know them. No way.
So, what other options do we have?
If I understood you correctly, this is not a one-time action...you need to do this replacing stuff every time someone uses your search page, correct?
In this case I would do something completely different.
Even if doing the replace in the query would work, performance wise it's not the best option because it will likely slow down your database.
If you don't write that often to your database, but do a lot of reads (which seems to be the case according to your description), I would do the following:
Add a column "TelNumberSearch" to your table
Every time you save a record, you save the phone number in the "TelNumber" column, then do the replacing on the phone number and save the stripped number in the "TelNumberSearch" column
--> When you do a search, you already have the TelNumberSearch column with all the stripped numbers...no need to strip them again for every single search. And you still have the column with the original number (with the non-digit characters) for display purposes.
Of course you need to fill the new column once, but this is a one-time action, so looping through the records and doing a separate replace for each one would be okay in this case.
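The stripping step itself is small in any language. Here is a sketch of it in Python, for illustration only (the application in question is .NET, where a regular expression replace would do the same job); the function name is made up.

```python
import re

# Anything that is not an ASCII digit gets removed.
NON_DIGITS = re.compile(r"[^0-9]")


def searchable_number(tel_number):
    """Return only the digits of a phone number, e.g. for TelNumberSearch."""
    return NON_DIGITS.sub("", tel_number or "")


print(searchable_number("(111) 111-1111"))
```

The same function would be applied twice: once per record when back-filling the new column, and once to the user's search term before comparing it with TelNumberSearch.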
I have a problem loading a .CSV file, as the connection manager editor settings are beyond my knowledge.
When I load the .CSV file, the first 18 rows load into the table without a problem.
However, from the 19th column the data is not partitioning correctly.
The row delimiter is {CR}{LF}.
The column delimiter is Comma {,}.
How can I partition the data correctly?
Any help?
Here are some ideas I have with no details.
What happens when you try to import the same .CSV file into Excel? Anything interesting around row 19?
Does there appear to be anything different about row 19?
If you delete row 19, what happens?
See, I bet you've thought of these things as well, and probably more, since you have the details. If you want anything more than superficial bad guesses, you'll have to provide a little detail.
I've found the CSV Import to be a bit limited with regards to bad data. If you're having trouble with the 19th column, I would suggest figuring out why that column is failing. You can try and tell the import task's error conditions to Ignore Errors with data truncation, etc...but that may not fix the issue.
I have often switched complicated or error-prone CSV imports to simply use a SSIS Script Task, then just write my own code to parse out the CSV and handle bad data.
If it's not partitioning correctly, it might be something as trivial as one of your field values on row 19 containing a comma, thus throwing out the import by making that row seem to have more columns. If this is the case, I hope you can get a revised version of the CSV file - this time with a text qualifier set. If possible, use something like | rather than " as the qualifier so that it's less likely to appear in the field values.
Open the file in a text editor such as Notepad++ or TextPad and change the view to show control characters. You will probably find your culprit there.
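The checks suggested in these answers (an extra comma in a field, stray control characters) can also be automated. The Python sketch below reports the lines whose field count differs from what is expected or whose values contain control characters; the function name, the expected field count, and the sample data are assumptions for the example.

```python
import csv
import io


def suspicious_rows(src, expected_fields):
    """Return (line number, reasons) for every row that looks malformed."""
    problems = []
    reader = csv.reader(src)
    for row in reader:
        reasons = []
        if len(row) != expected_fields:
            reasons.append(f"{len(row)} fields")
        for value in row:
            # Code points below 32 are control characters (tab, VT, ...).
            if any(ord(ch) < 32 for ch in value):
                reasons.append(f"control character in {value!r}")
        if reasons:
            problems.append((reader.line_num, reasons))
    return problems


# Hypothetical sample: line 3 has an extra field, line 4 a vertical tab.
sample = "a,b,c\n1,2,3\n4,5,6,7\n8,\x0b9,10\n"
for line_number, reasons in suspicious_rows(io.StringIO(sample), 3):
    print(line_number, reasons)
```

For a real file, open it with `open(path, newline="")` so the csv module sees the original line endings; a file with lone carriage returns inside unquoted fields may make the reader raise an error, which is itself a useful clue.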
Nothing unusual. When I paste it into Excel as one column and convert text to columns, there is no problem. But in the SSIS preview I can see that the field value where the problem starts contains two square boxes followed by the data of the next row.
If anyone wants to see the file, let me know and I will e-mail it to you.