Best way to gather, then import data into Drupal?

I am building my first database driven website with Drupal and I have a few questions.
I am currently populating a Google Docs spreadsheet with all of the data I want to eventually be able to query from the website (after it's imported). Is this the best way to start?
If this is not the best way to start what would you recommend?
My plan is to populate the spreadsheet, then import it as a CSV into the MySQL database via CCK nodes.
I've seen two ways to do this.
http://drupal.org/node/133705 (importing data into CCK nodes)
http://drupal.org/node/237574 (Inserting data using spreadsheet/csv instead of SQL insert statements)
Basically, my question is: what is the best way to gather and then import data into Drupal?
Thanks in advance for any help, suggestions.

There's a comparison of the available modules at http://groups.drupal.org/node/21338
In the past when I've done this, I've simply written code to do it on cron runs (see http://drupal.org/project/phorum for an example framework that you could strip down and build back up to do what you need).
If I were to do this now, I would probably use the http://drupal.org/project/migrate module, whose philosophy is "get it into MySQL, view the data, import via GUI."

There is a very good module for this, Node Import. It allows you to take your Google Docs spreadsheet and import it as a .csv file.
It's really easy to use: the module allows you to map your .csv columns to the node fields you want them to go to, so you don't have to worry about putting your columns in a particular order. Also, if there is an error on some records, it will spit out a .csv with the failed rows and what caused the error, but will still import all the good records.
I have imported up to 3000 nodes with this method.
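If you go the Google Docs to .csv route, it can save some frustration to sanity-check the exported file before handing it to the import module. A minimal PowerShell sketch, assuming made-up column names (title, body, country) standing in for whatever CCK fields you plan to map:

    # Quick sanity check of the exported CSV before the Drupal import.
    # Column names (title, body, country) are placeholders for your own CCK fields.
    $rows     = Import-Csv "C:\Data\export.csv"
    $required = @("title", "body", "country")

    $missing = $required | Where-Object { $_ -notin $rows[0].PSObject.Properties.Name }
    if ($missing) { Write-Warning ("Missing columns: " + ($missing -join ", ")) }

    $blankTitles = @($rows | Where-Object { [string]::IsNullOrWhiteSpace($_.title) }).Count
    Write-Host "$($rows.Count) rows, $blankTitles with an empty title"

Catching empty required fields up front means fewer rows ending up in the module's error .csv.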

Related

Bulk insert CSV file into Access database through PowerShell

We are trying to import a CSV file into an Access database via PowerShell. My input file size is 1 GB and it is getting difficult to iterate through each row and use an INSERT command. Any quick suggestions here are highly appreciated.
Thanks!!
As expressed by @AlbertD.Kallal - what is the reason to use PowerShell at all? ... I simply made an assumption that you sought something that would run automatically, daily, unattended, as that is a typical reason.
If that is the case, then it really breaks down to 2 parts:
1. Make the import work manually in Access, and then set up that import to fire automatically upon start/open of the Access file (autoexec).
2. Just use PowerShell to start/open the Access file daily (or whenever...).
Access is not designed to be open full-time and run unattended, so this is the typical approach for using it in that mode.
Ok, now having stated there is no need for PowerShell, there are cases in which IT folks are using PowerShell to automate processes. So it is not "bad" to consider PowerShell, especially if it is already being used.
I only wanted to point out that PowerShell will not help performance-wise, and will probably be slower.
What if you have (had) to, say, schedule an import to occur every 15 minutes or whatever?
Then I suggest setting up a VBA routine in a standard code module in Access to do the import. You then have PowerShell, or a Windows script, launch Access and call that import routine. So the first step is to set up that routine in Access, even if you are using some kind of batch system for scheduling that import routine to run.
So, you use the Windows Task Scheduler.
It would: launch Access, run the VBA sub, shut down Access.
And using the Windows Task Scheduler is quite robust. We don't need (or want) to keep Access running, but only launch it, run the import, and then shut down Access.
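For the scheduled-task side, a minimal PowerShell sketch of that launch/run/quit cycle, assuming you have already created a public VBA Sub (here called ImportDailyCsv, a made-up name) in a standard code module of the database:

    # Launch Access, run the import Sub, shut Access down again.
    # $dbPath and ImportDailyCsv are placeholders.
    $dbPath = "C:\Data\ImportJob.accdb"

    $access = New-Object -ComObject Access.Application
    try {
        $access.OpenCurrentDatabase($dbPath)
        $access.Run("ImportDailyCsv")   # the public VBA Sub that does the actual import
    }
    finally {
        $access.Quit()                  # make sure Access closes even if the import fails
        [System.Runtime.InteropServices.Marshal]::ReleaseComObject($access) | Out-Null
    }

Point the Windows Task Scheduler at that script and you get the launch, import, shutdown cycle without keeping Access open.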
Next up, if the import process is "huge" or rather large, then on startup a temp accDB file can be created, and we import into that. We can then take the import table and send it into the production data table (often column names are different, etc.). It is of course also much safer to import into that temp table, and better yet, we can delete that temp file afterwards, so we never suffer bloating or file-size problems (no need to compact + repair).
So, the first thing to do is manually import the csv file using the Access UI. This ALSO allows you to create + set up an import spec. That import spec can thus remember the data types (currency, or often date/time columns).
Once we have the import working and the import spec created?
Then we can write code to do the same steps as above, and THEN take the imported table and put that data into the production data table.
It is not clear whether you "stage" the imported csv into that temp table and then process that table into the real production data table, but I do suggest doing this.
(too dangerous to try and import directly into the production data).
You also don't share right now what kind of pre-processing, or what additional code, is required after you do the import of that csv (but, still, we assume now that such imports will go into a new temp table).
So, I would assume the steps are:
1. We import the csv file using the built-in import ability of Access.
2. We then send this data table to the production table, perhaps with some code processing each row before we send that temp table to the production table.
3. Once we have done the import, we dump + delete the temp accDB file we used for the import, and thus we eliminate the huge data-bloat issue.
For the next run, we then create that temp file again for a fresh import, and thus each time we start out with a nice empty database file.
So, the first question (and you can create a blank new database for this test): can you import the csv file using Access? You want to do this, since such imports are VERY fast. Even if the imported format is not 100% as you want, you do need to confirm whether you can import the csv file using the Access UI. If you can, then we can adopt VBA commands to do the same thing, but there is no use writing code if a simple csv import via the Access UI can't be used.
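Once the manual import works and the import spec is saved, the code version is mostly a TransferText call. A rough sketch (driven from PowerShell via COM here, though the same lines translate directly to VBA); the spec name, staging table and paths are placeholders:

    # Create a throw-away staging database, import the csv into it with the saved spec.
    $tempDb  = "C:\Data\ImportTemp.accdb"
    $csvFile = "C:\Data\DailyFeed.csv"

    $access = New-Object -ComObject Access.Application
    if (Test-Path $tempDb) { Remove-Item $tempDb }
    $access.NewCurrentDatabase($tempDb)

    # acImportDelim = 0; the saved import spec remembers delimiters and column data types.
    $access.DoCmd.TransferText(0, "CsvImportSpec", "tblStaging", $csvFile, $true)

    # ... here you would run the append/processing step that moves tblStaging
    #     into the production table ...

    $access.CloseCurrentDatabase()
    $access.Quit()
    Remove-Item $tempDb   # drop the staging file, so no bloat and no compact + repair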

How to Import a CSV file into an entity automatically?

Is there a way to import a CSV file into a CRM record automatically, say when the CSV file is created?
The plan is that this CSV file would contain some cost center hours and a job number which corresponds to a certain record already created in CRM.
And uploading this CSV would then update this record.
Please help me to solve this problem
You can import data using both the UI and code in D365.
There are also plenty of tools solving exactly the same problem: KingswaySoft SSIS, Scribe, etc.
But it looks like buying 3rd-party software might be overkill in your scenario. You can use the Windows Task Scheduler and write a few PowerShell scripts to implement it (a rough sketch follows the links below).
Where to start:
https://learn.microsoft.com/en-us/dynamics365/customerengagement/on-premises/developer/import-data
https://learn.microsoft.com/en-us/dynamics365/customerengagement/on-premises/developer/sample-import-data-complex-data-map
https://github.com/seanmcne/Microsoft.Xrm.Data.PowerShell
https://learn.microsoft.com/en-us/dynamics365/customerengagement/on-premises/developer/define-alternate-keys-entity
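If PowerShell is the route you take, the Microsoft.Xrm.Data.PowerShell module linked above covers the lookup-and-update part. A rough sketch, assuming made-up entity and field names (new_job, new_jobnumber, new_costcenterhours) and a CSV with JobNumber and Hours columns; substitute your own schema names:

    # Read the dropped CSV and update the matching CRM records by job number.
    Import-Module Microsoft.Xrm.Data.PowerShell

    $conn = Connect-CrmOnline -ServerUrl "https://yourorg.crm.dynamics.com" -Credential (Get-Credential)

    foreach ($row in Import-Csv "C:\Drops\costcenters.csv") {

        # Find the existing job record that matches this row's job number.
        $found = Get-CrmRecords -conn $conn -EntityLogicalName new_job `
                    -FilterAttribute new_jobnumber -FilterOperator eq -FilterValue $row.JobNumber `
                    -Fields new_jobid

        if ($found.CrmRecords.Count -eq 1) {
            # Update the record with the cost center hours from the CSV.
            Set-CrmRecord -conn $conn -EntityLogicalName new_job `
                -Id $found.CrmRecords[0].new_jobid `
                -Fields @{ new_costcenterhours = [decimal]$row.Hours }
        }
        else {
            Write-Warning "Job number $($row.JobNumber) matched $($found.CrmRecords.Count) records - skipped."
        }
    }

Wire that script to a Task Scheduler trigger (or a FileSystemWatcher) on the folder where the CSV lands, and the update runs whenever the file appears.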

Bulk uploading data to Parse.com

I have about 10 GB worth of data that I would like to import to Parse. The data is currently in JSON format, which is great for importing data using the Parse importer.
However, I have no unique identifier for these objects. They do have unique properties (e.g. a URL), and the IDs pointing to specific objects need to stay constant.
What would be the best way to edit the large amount of data, in bulk, on their server without running into request issues (as I'm currently on the free pricing model) and without taking too much time to alter the data?
Option 1
Import the data once and export the data in JSON with the newly assigned objectIds. Then edit them locally, matching on the URL, then replace the class with the newly edited data. Any new additions will receive a new objectId from Parse.
How much downtime between import and export will there be as I would need to delete the class and recreate it? Are there any other concerns with this methodology?
Option 2
Query for the URL or an array of URLs, then edit the data and re-save. This means the data will persist indefinitely, but since the edit will touch hundreds of thousands of objects, will this most likely overrun the request limit?
Option 3
Is there a better option I am missing?
The best option is to upload to Parse and then edit through their normal channels. Using various hacks it is possible to stay below the 30 requests/second offered as part of the free tier. You can iterate over the data using background jobs (written in JavaScript); you may need to slow down your processing so you don't hit limits. The super hacky way is to download from the table to a client (iOS/Android) app and then push back up to Parse. If you do this in batch (not a synchronous for loop, by the way), then the latency alone will keep you under the 30 requests/sec limit.
I'm not sure why you're worried about downtime. If the data isn't already uploaded to Parse, can't you upload it, pull it down and edit it, and re-upload it -- taking as long as you'd like? Do this in a separate table from any you are using in production, and you should be just fine.
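For what it's worth, the Option 2 style of edit doesn't have to run inside Parse at all; the (old parse.com) REST API can be driven from any script, throttled to stay under the free-tier limit. A rough PowerShell sketch, with a made-up class name (Item), made-up field names and a CSV of edits; the keys and paths are placeholders:

    # Query each object by its url property, then update it in place so the objectId never changes.
    $headers = @{
        "X-Parse-Application-Id" = "YOUR_APP_ID"
        "X-Parse-REST-API-Key"   = "YOUR_REST_KEY"
        "Content-Type"           = "application/json"
    }

    foreach ($edit in Import-Csv "C:\Data\edits.csv") {    # assumed columns: Url, NewValue

        $where  = [uri]::EscapeDataString((@{ url = $edit.Url } | ConvertTo-Json -Compress))
        $result = Invoke-RestMethod -Method Get -Headers $headers `
            -Uri "https://api.parse.com/1/classes/Item?where=$where&limit=1"

        if ($result.results.Count -eq 1) {
            $body = @{ someField = $edit.NewValue } | ConvertTo-Json -Compress
            Invoke-RestMethod -Method Put -Headers $headers `
                -Uri ("https://api.parse.com/1/classes/Item/" + $result.results[0].objectId) `
                -Body $body | Out-Null
        }

        Start-Sleep -Milliseconds 200   # crude throttle: two requests per loop, well under 30/sec
    }

At hundreds of thousands of objects this will take a while, but it never deletes the class, so the existing objectIds survive.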

Methods of populating a database with content automatically?

I want to populate a MySQL database with basic 'holiday resort' content, e.g. name of the resort, description, country. What methods can I use to populate it?
First, find the Web sites of one or more holiday companies who offer the destinations you're interested in. You're going to scrape these.
You don't say what language you're using for the implementation, but here is how you might do it in Perl:
1. Write a scraper using LWP::UserAgent and HTML::TreeBuilder to explore the site and extract the destination information.
2. Use DBI with the DBD::mysql driver to insert the data into your database.
Where is the content? You can use the LOAD DATA INFILE syntax, or import from a CSV file, or write a script in Java or C++ or C# to parse the file(s) holding the data and populate the database via INSERT statements. You could hire an intern and make him/her type it all up. If you don't have it in a file, you could write a web spider to go and get crap from Google and stuff it into the database.
But I can't help you until you tell me where the data is.
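If the content does end up in a file (scraped, exported, or typed up by that intern), one low-tech path is to turn the CSV into a batch of INSERT statements and feed it to the mysql client. A small PowerShell sketch, assuming a resorts.csv with name, description and country columns and a table named resorts (both made up to match the question):

    # Convert resorts.csv into a .sql file of INSERT statements for the mysql client.
    function Escape-Sql([string]$s) {
        # minimal escaping (backslashes and single quotes) for a one-off load
        ($s -replace '\\', '\\') -replace "'", "''"
    }

    $statements = foreach ($row in Import-Csv "C:\Data\resorts.csv") {
        "INSERT INTO resorts (name, description, country) VALUES ('{0}', '{1}', '{2}');" -f `
            (Escape-Sql $row.name), (Escape-Sql $row.description), (Escape-Sql $row.country)
    }

    $statements | Set-Content -Encoding UTF8 "C:\Data\load_resorts.sql"
    # then:  mysql -u youruser -p holidays < load_resorts.sql

LOAD DATA INFILE is faster for big files, but generated INSERTs are easier to eyeball before you run them.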
Ok, I know I'm late, but I created an application to (create and/or) auto-populate tables; check it out if you want.

SSIS Updating User Variables from a CSV file

I am fairly new to SSIS and I have been looking everywhere for the answer to this question and can't find it, which makes me think it's really simple and obvious, because I'm pretty sure this is a standard problem with SSIS.
I am building an SSIS package to automate the uploading of data.
We have a multi-instance environment across four servers and are using SQL Server 2005. I therefore have a user variable for the server name and instance name. The database and table will always remain the same. The data is held in an Excel file, but I will import the data using CSV.
Is there a way for me to update the user variables from the CSV file? Is T-SQL OPENROWSET the way forward?
I had previously been updating the variables from the table I had imported the data into, but then I realised that in a live situation I won't know where to import the data to, as the values will still be in the CSV file.
Please help! This is driving me crazy and I have a sinking feeling that the answer is really obvious, which is making it worse!!
Thank you!
Julie
There is a good example here of how to load a user variable from a flat file:
http://vsteamsystemcentral.com/cs/blogs/applied_team_system/archive/2007/01/10/247.aspx
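If it turns out to be awkward to read the values inside the package, one alternative worth considering is to read them outside and push them in as variable overrides when the package is launched, e.g. with dtexec /SET from a small PowerShell wrapper. A rough sketch; the package path, CSV layout and variable names (ServerName, InstanceName) are placeholders:

    # Read the target server/instance from the CSV, then pass them into the package.
    $config = Import-Csv "C:\Uploads\target.csv" | Select-Object -First 1

    & dtexec /File "C:\Packages\UploadData.dtsx" `
        /SET "\Package.Variables[User::ServerName].Properties[Value];$($config.ServerName)" `
        /SET "\Package.Variables[User::InstanceName].Properties[Value];$($config.InstanceName)"

That keeps the connection manager expressions based on the two variables, and the CSV only has to be parsed once, outside SSIS.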