I am going round in circles, please can someone help with what I guess is a relatively easy problem.
I have a table with 1200 users.
One of the fields in table db_user_accounts is 'status'
I have a sub list of those users in a csv that I want to set the 'status' to '5'
the csv is ordered user, status
I found this -
<?php
if (($handle = fopen("input.csv", "r")) !== FALSE)
{
while (($data = fgetcsv($handle, 1000, ",")) !== FALSE)
{
mysql_query(UPDATE db-user_accounts SET status="{$data[1]}" WHERE user = "{$data[0]}");
}
fclose($handle);
}
?>
Im not sure what the 1000 is for or whether this will actually work.
Any advice gratefully recieved
thanks
This code seems good and should work. 1000 is the length of the line, as from the PHP manual (quoting below):
Ref:http://php.net/manual/en/function.fgetcsv.php
Must be greater than the longest line (in characters) to be found in
the CSV file (allowing for trailing line-end characters). Otherwise,
the line is split into chunks of length characters unless the split
would occur inside an enclosure.
Omitting this parameter (or setting it to 0 in PHP 5.1.0 and later)
the maximum line length is not limited, which is slightly slower.
What your code does is:
Open the CSV file in reading mode, if the file is opened successfully it will enter into the loop
It will then start reading line by line (considering line length up to 1000) till the end of the file. The third parameter , is a delimiter.
The output variable $data will contain the values read from the current line i.e. it will hold user account id and status.
Then you are running MySQL query to update the database.
And finally closes the opened CSV file.
Now, advice:
You are directly passing the values read from CSV file into plain SQL query, doing so may cause unwanted errors, worst SQL injection
What I will suggest is to perform some kind of input validation on the data read from the CSV file and then use parameterized SQL query.
Also, you are using MySQL functions which are deprecated now. Use either Mysqli or PDO.
Related
I have a spreadsheet which really has only one complicated table. I basically convert the spreadsheet to a cvs and use a groovy script to generate the INSERT scripts.
However, I cannot do this with a table that has 28 fields with data within some of the fields on the spreadsheet that make importing into the CVS even more complicated. So the fields in the new CVS are not differentiated properly or my script has not accounted for it.
Does anyone have any suggestions on a better approach to do this? Thanks.
Have a look at LOAD DATA INFILE statement. It will help you to import data from the CSV file into table.
This is a recurrent question on stackoverflow. Here is an updated answer.
There are actually several ways to import an excel file in to a MySQL database with varying degrees of complexity and success.
Excel2MySQL or Navicat utilities. Full disclosure, I am the author of Excel2MySQL. These 2 utilities aren't free, but they are the easiest option and have the fewest limitations. They also include additional features to help with importing Excel data into MySQL. For example, Excel2MySQL automatically creates your table and automatically optimizes field data types like dates, times, floats, etc. If your in a hurry or can't get the other options to work with your data then these utilities may suit your needs.
LOAD DATA INFILE: This popular option is perhaps the most technical and requires some understanding of MySQL command execution. You must manually create your table before loading and use appropriately sized VARCHAR field types. Therefore, your field data types are not optimized. LOAD DATA INFILE has trouble importing large files that exceed 'max_allowed_packet' size. Special attention is required to avoid problems importing special characters and foreign unicode characters. Here is a recent example I used to import a csv file named test.csv.
phpMyAdmin: Select your database first, then select the Import tab. phpMyAdmin will automatically create your table and size your VARCHAR fields, but it won't optimize the field types. phpMyAdmin has trouble importing large files that exceed 'max_allowed_packet' size.
MySQL for Excel: This is a free Excel Add-in from Oracle. This option is a bit tedious because it uses a wizard and the import is slow and buggy with large files, but this may be a good option for small files with VARCHAR data. Fields are not optimized.
For comma-separated values (CSV) files, the results view panel in Workbench has an "Import records from external file" option that imports CSV data directly into the result set. Execute that and click "Apply" to commit the changes.
For Excel files, consider using the official MySQL for Excel plugin.
A while back I answered a very similar question on the EE site, and offered the following block of Perl, as a quick and dirty example of how you could directly load an Excel sheet into MySQL. Bypassing the need to export / import via CSV and so hopefully preserving more of those special characters, and eliminating the need to worry about escaping the content.
#!/usr/bin/perl -w
# Purpose: Insert each Worksheet, in an Excel Workbook, into an existing MySQL DB, of the same name as the Excel(.xls).
# The worksheet names are mapped to the table names, and the column names to column names.
# Assumes each sheet is named and that the first ROW on each sheet contains the column(field) names.
#
use strict;
use Spreadsheet::ParseExcel;
use DBI;
use Tie::IxHash;
die "You must provide a filename to $0 to be parsed as an Excel file" unless #ARGV;
my $sDbName = $ARGV[0];
$sDbName =~ s/\.xls//i;
my $oExcel = new Spreadsheet::ParseExcel;
my $oBook = $oExcel->Parse($ARGV[0]);
my $dbh = DBI->connect("DBI:mysql:database=$sDbName;host=192.168.123.123","root", "xxxxxx", {'RaiseError' => 1,AutoCommit => 1});
my ($sTableName, %hNewDoc, $sFieldName, $iR, $iC, $oWkS, $oWkC, $sSql);
print "FILE: ", $oBook->{File} , "\n";
print "DB: $sDbName\n";
print "Collection Count: ", $oBook->{SheetCount} , "\n";
for(my $iSheet=0; $iSheet < $oBook->{SheetCount} ; $iSheet++)
{
$oWkS = $oBook->{Worksheet}[$iSheet];
$sTableName = $oWkS->{Name};
print "Table(WorkSheet name):", $sTableName, "\n";
for(my $iR = $oWkS->{MinRow} ; defined $oWkS->{MaxRow} && $iR <= $oWkS->{MaxRow} ; $iR++)
{
tie ( %hNewDoc, "Tie::IxHash");
for(my $iC = $oWkS->{MinCol} ; defined $oWkS->{MaxCol} && $iC <= $oWkS->{MaxCol} ; $iC++)
{
$sFieldName = $oWkS->{Cells}[$oWkS->{MinRow}][$iC]->Value;
$sFieldName =~ s/[^A-Z0-9]//gi; #Strip non alpha-numerics from the Column name
$oWkC = $oWkS->{Cells}[$iR][$iC];
$hNewDoc{$sFieldName} = $dbh->quote($oWkC->Value) if($oWkC && $sFieldName);
}
if ($iR == $oWkS->{MinRow}){
#eval { $dbh->do("DROP TABLE $sTableName") };
$sSql = "CREATE TABLE IF NOT EXISTS $sTableName (".(join " VARCHAR(512), ", keys (%hNewDoc))." VARCHAR(255))";
#print "$sSql \n\n";
$dbh->do("$sSql");
} else {
$sSql = "INSERT INTO $sTableName (".(join ", ",keys (%hNewDoc)).") VALUES (".(join ", ",values (%hNewDoc)).")\n";
#print "$sSql \n\n";
eval { $dbh->do("$sSql") };
}
}
print "Rows inserted(Rows):", ($oWkS->{MaxRow} - $oWkS->{MinRow}), "\n";
}
# Disconnect from the database.
$dbh->disconnect();
Note:
Change the connection ($oConn) string to suit, and if needed add a
user-id and password to the arguments.
If you need XLSX support a quick switch to Spreadsheet::XLSX is all
that's needed. Alternatively it only takes a few lines of code, to
detect the filetype and call the appropriate library.
The above is a simple hack, assumes everything in a cell is a string
/ scalar, if preserving type is important, a little function with a
few regexp can be used in conjunction with a few if statements to
ensure numbers / dates remain in the applicable format when written
to the DB
The above code is dependent on a number of CPAN modules, that you can install, assuming outbound ftp access is permitted, via a:
cpan YAML Data::Dumper Spreadsheet::ParseExcel Tie::IxHash Encode Scalar::Util File::Basename DBD::mysql
Should return something along the following lines (tis rather slow, due to the auto commit):
# ./Excel2mysql.pl test.xls
FILE: test.xls
DB: test
Collection Count: 1
Table(WorkSheet name):Sheet1
Rows inserted(Rows):9892
I'm having some difficulties creating a table in Google BigQuery using CSV data that we download from another system.
The goal is to have a bucket in the Google Cloud Platform that we will upload a 1 CSV file per month. This CSV files have around 3,000 - 10,000 rows of data, depending on the month.
The error I am getting from the job history in the Big Query API is:
Error while reading data, error message: CSV table encountered too
many errors, giving up. Rows: 2949; errors: 1. Please look into the
errors[] collection for more details.
When I am uploading the CSV files, I am selecting the following:
file format: csv
table type: native table
auto detect: tried automatic and manual
partitioning: no partitioning
write preference: WRITE_EMPTY (cannot change this)
number of errors allowed: 0
ignore unknown values: unchecked
field delimiter: comma
header rows to skip: 1 (also tried 0 and manually deleting the header rows from the csv files).
Any help would be greatly appreciated.
This usually points to the error in the structure of data source (in this case your CSV file). Since your CSV file is small, you can run a little validation script to see that the number of columns is exactly the same across all your rows in the CSV, before running the export.
Maybe something like:
cat myfile.csv | awk -F, '{ a[NF]++ } END { for (n in a) print n, "rows have",a[n],"columns" }'
Or, you can bind it to the condition (lets say if your number of columns should be 5):
ncols=$(cat myfile.csv | awk -F, 'x=0;{ a[NF]++ } END { for (n in a){print a[n]; x++; if (x==1){break}}}'); if [ $ncols==5 ]; then python myexportscript.py; else echo "number of columns invalid: ", $ncols; fi;
It's impossible to point out the error without seeing an example CSV file, but it's very likely that your file is incorrectly formatted. As a result, one typo confuses BQ into thinking there are thousands. Let's say you have the following csv file:
Sally Whittaker,2018,McCarren House,312,3.75
Belinda Jameson 2017,Cushing House,148,3.52 //Missing a comma after the name
Jeff Smith,2018,Prescott House,17-D,3.20
Sandy Allen,2019,Oliver House,108,3.48
With the following schema:
Name(String) Class(Int64) Dorm(String) Room(String) GPA(Float64)
Since the schema is missing a comma, everything is shifted one column over. If you have a large file, it results in thousands of errors as it attempts to inserts Strings into Ints/Floats.
I suggest you run your csv file through a csv validator before uploading it to BQ. It might find something that breaks it. It's even possible that one of your fields has a comma inside the value which breaks everything.
Another theory to investigate is to make sure that all required columns receive an appropriate (non-null) value. A common cause of this error is if you cast data incorrectly which returns a null value for a specific field in every row.
As mentioned by Scicrazed, this issue seems to be generated as some file rows has an incorrect format, in which case it is required to validate the content data in order to figure out the specific error that is leading this issue.
I recommend you to check the errors[] collection that might contains additional information about the aspects that can be making to fail the process. You can do this by using the Jobs: get method that returns detailed information about your BigQuery Job or refer to the additionalErrors field of the JobStatus Stackdriver logs that contains the same complete error data that is reported by the service.
I'm probably too late for this, but it seems the file has some errors (it can be a character that cannot be parsed or just a string in an int column) and BigQuery cannot upload it automatically.
You need to understand what the error is and fix it somehow. An easy way to do it is by running this command on the terminal:
bq --format=prettyjson show -j <JobID>
and you will be able to see additional logs for the error to help you understand the problem.
If the error happens only a few times you just can increase the number of errors allowed.
If it happens many times you will need to manipulate your CSV file before you upload it.
Hope it helps
I'm importing a csv file of contacts and where one parent has many children it leaves the duplicated values blank. I need to make sure that they are populated when they reach the database however.
Is there a way that I can implement the following when I'm importing a .csv file into Perl and then exporting into MySQL?
if (value is null)
value = value above.
Thanks!
Why don't you place the individual values you read from the CSV file into an array (e.g. #FIELD_DATA). Then when you encounter an empty field while iterating over a row (e.g. for column 4) you can write
unless (length($CSV_FIELD[4])) {
$CSV_FIELD[4] = $FIELD_DATA[4]
}
Not with an import statement afaik. You could, however, make use of triggers (http://dev.mysql.com/doc/refman/5.0/en/triggers.html). Keep in mind though, that this will seriously impact the performance of the import statement.
Also: if they are duplicate values you should have a critical look at your database model or your setup overall.
ICE Version: infobright-3.5.2-p1-win_32
I’m trying to load a large file but keep running into problems with errors such as:
Wrong data or column definition. Row: 989, field: 5.
This is row 989, field 5:
”(450)568-3***"
Note: The last 3 chars are numbers as well, but didn’t want to post somebodys phone number on here.
It’s really no different to any of the other entries in that field.
The datatype of that field is VARCHAR(255) NOT NULL
Also, if you upgrade to the current release 4.0.6, we now support row level error checking during LOAD and support a reject file.
To enable the reject file functionality, you must specify BH_REJECT_FILE_PATH and one of the associated parameters (BH_ABORT_ON_COUNT or BH_ABORT_ON_THRESHOLD). For example, if you want to load data from the file DATAFILE.csv to table T but you expects that 10 rows in this file might be wrongly formatted, you would run the following commands:
set #BH_REJECT_FILE_PATH = '/tmp/reject_file';
set #BH_ABORT_ON_COUNT = 10;
load data infile DATAFILE.csv into table T;
If less than 10 rows are rejected, a warning will be output, the load will succeed and all problematic rows will be output to the file /tmp/reject_file. If the Infobright Loader finds a tenth bad row, the load will terminate with an error and all bad rows found so far will be output to the file /tmp/reject_file.
I've run into this issue when the last line of the file is not terminated by the value of --lines-terminated-by="\n".
For example If I am importing a file with 9000 lines of data I have to make sure there is a new line at the end of the file.
Depending on the size of the file, you can just open it with a text editor and hit the return k
I have found this to be consistent with the '\r\n' vs. '\n' difference. Even when running on the loader on Windows, '\n' succeeds 100% of times (assuming you don't have real issues with your data vs. col. definition)
I'm using SSIS and trying to import data from Filelmaker into SQL Server. In the Solution Explorer, I right click on "SSIS Packages" and select SQL Server Import and Export Wizard". During the process, I use my DSN as the source, SQL Server as the destination, use a valid query to pull data from Filemaker, and set the mappings.
Each time I try to run the package, I receive the following message:
The "output column "LastNameFirst" (12)" has a length that is not valide. The length must be between 0 and 4000.
I do not understand this error exactly, but in the documentation for ODBC:
http://www.filemaker.com/downloads/pdf/fm9_odbc_jdbc_guide_en.pdf (page 47) it states:
"The maximum column length of text is 1 million characters, unless you specify a smaller Maximum number of characters for the text field in FileMaker. FileMaker returns empty strings as NULL."
I'm thinking that the data type is too large when trying to convert it to varchar. But even after using a query of SUBSTR(LastNameFirst, 1, 2000), I get the same error.
Any suggestions?
I had this problem, and don't know the cause but these are the steps I used to find the offending row:
-in filemaker, export the data to CSV
-open the CSV in excel
-double click on the LastNameFirst column to maximize its width
-scroll down until you see a column '#########' -the way excel indicates data that is too large to be displayed.
I'm sure theres a better way, and I'd love to hear it!
You should use this:
nvarchar (max)