I am still learning SQL Server.
The scenario is that I have a lot of .txt files with a name format like DIAGNOSIS.YYMMDDHHSS.txt, where only the YYMMDDHHSS part differs from file to file. They are all saved in the folder Z:\diagnosis.
How could I write a stored procedure to upload all .txt files with a name in the format of DIAGNOSIS.YYMMDDHHSS.txt in folder Z:\diagnosis? Files can only be loaded once.
Thank you
I would not do this with a stored procedure; I would use SSIS. It has a Foreach Loop container with a Foreach File enumerator that you can use. Once a file has been loaded, I would move it to an archive location so that it doesn't get processed the next time. Alternatively, you could create a table that stores the names of the files that were successfully processed and have the loop skip any file listed in that table. With that approach, though, the loop has to go through more and more files over time, so it is better to move processed files to a different location if you can.
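The load-then-archive loop described above can be sketched outside SSIS as well. This is only a minimal Python illustration of the idea, not an SSIS package: `load_file` is a hypothetical placeholder for the real import step, and the ten-digit name pattern is an assumption based on the file names in the question.

```python
import re
import shutil
from pathlib import Path

# Assumed name pattern: DIAGNOSIS.<10 digits>.txt, as described in the question.
NAME_PATTERN = re.compile(r"^DIAGNOSIS\.\d{10}\.txt$")


def load_file(path: Path) -> int:
    """Hypothetical loader: stands in for the real import and returns the line count."""
    with path.open(encoding="utf-8") as handle:
        return sum(1 for _ in handle)


def process_folder(source: Path, archive: Path) -> list[str]:
    """Load every matching file, then move it out of the source folder."""
    archive.mkdir(parents=True, exist_ok=True)
    processed = []
    for path in sorted(source.iterdir()):
        # Skip sub-folders and files that do not match the expected name.
        if not path.is_file() or not NAME_PATTERN.match(path.name):
            continue
        load_file(path)
        # Moving the file is what guarantees it is not picked up again.
        shutil.move(str(path), str(archive / path.name))
        processed.append(path.name)
    return processed
```

Calling `process_folder` a second time on the same folder returns an empty list, because the processed files are no longer in the source folder.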
Personally, I would also put the file data in a staging table before loading it into the final table. We use two of them, one for the raw data and one for the cleaned data. Then we transform the data into staging tables that match the relational tables in production, to make sure it will meet the needs there before we try to affect production, and we send records that can't be inserted for one reason or another to an exception table. Working in the health care environment, you will want to make sure your process meets any government regulations for the storage of patient records in the country you are in (see HIPAA in the US). You may have to load directly to production, or severely limit access to the staging tables and files.
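As a rough illustration of the raw / clean / exception split, here is a small Python sketch that uses an in-memory SQLite database as a stand-in for SQL Server. The table names, the two columns, and the validation rules are all hypothetical; a real process would apply whatever rules the production tables require.

```python
import sqlite3


def stage_rows(conn, raw_rows):
    """Load raw rows, then split them into a clean table and an exception table."""
    conn.executescript(
        """
        CREATE TABLE staging_raw (patient_id TEXT, code TEXT);
        CREATE TABLE staging_clean (patient_id INTEGER NOT NULL, code TEXT NOT NULL);
        CREATE TABLE load_exceptions (patient_id TEXT, code TEXT, reason TEXT);
        """
    )
    # Step 1: keep the data exactly as it arrived.
    conn.executemany("INSERT INTO staging_raw VALUES (?, ?)", raw_rows)
    rows = conn.execute("SELECT patient_id, code FROM staging_raw").fetchall()
    for patient_id, code in rows:
        pid = (patient_id or "").strip()
        cleaned_code = (code or "").strip().upper()
        # Step 2: rows that fail a check go to the exception table with a reason.
        if not pid.isdigit():
            reason = "patient_id is not numeric"
        elif not cleaned_code:
            reason = "code is empty"
        else:
            conn.execute(
                "INSERT INTO staging_clean VALUES (?, ?)", (int(pid), cleaned_code)
            )
            continue
        conn.execute(
            "INSERT INTO load_exceptions VALUES (?, ?, ?)", (patient_id, code, reason)
        )
    conn.commit()
```

Only rows in `staging_clean` would then be candidates for the production load, while `load_exceptions` keeps the rejected records together with the reason they were rejected.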
Related
I am loading multiple Excel files into multiple SQL Server tables.
These Excel files contain city, state, and similar data. Is it possible to create a single flat file connection? Currently, I am creating one connection for city, one for state, and one for each of the other sources. I have attached a screenshot for better understanding.
If all of the source files are structured exactly the same (number of columns, data types, header rows, etc.), it would be possible to re-use a single flat file connection.
In general, though, the best practice is to create a connection for each file. It makes the package easier to understand in a year or two when you, or someone else, has to open it up again to fix or update it. It will also make it much easier to fix later if one or more of the files ends up changing (new columns, etc.).
I have a front end MS Access Database that allows users to display records. This is based on a central .txt file that I have stored on a network drive. Each morning, I perform an ETL process that takes data from a variety of sources, combines it, and stores it in the .txt file. If someone is using the database and has one of the queries open that references this .txt file, I get an error that says the file is currently in use and I cannot make a change to it.
I've attempted writing a query that pulls the data from the .txt file and stores it in a table within the Access database so that it is not actively referencing the base file, but if multiple people attempt to run this query they get errors because you cannot replace a table that is open by someone else.
Does anyone have any experience in getting around an update issue like this? Would I be better off choosing a different front end other than Access to display queried records from a central table?
I have a client who received a zipped file containing all the data they had in the SaaS app they were using. We have a similar app, but our column names are different (obviously), and in some cases we have fewer columns. I now want to upload all this data to my database, but I am not sure how to do it.
I run phpmyadmin on the servers.
Extract the file on your desktop.
Login to your phpMyAdmin account.
Click the Import tab.
Select the file to import, the file format, etc., and click Go.
Browse through the structure of the imported database to the columns of interest. For each column, click the pencil icon to edit the column (i.e. rename it), or click the X icon to delete it.
To merge data sets after importing the tables, you would need to run your own query in the SQL tab.
Those are two different tasks in one question.
phpMyAdmin is able to import ZIP files directly – you don't need to extract them on your local machine. Also be aware of maximum upload sizes and maximum script execution times when importing huge database dumps.
Mapping an existing database to another structure involves a lot of manual work, such as renaming tables and columns and copying data from one table to another. I would suggest importing the old/original database into a "working copy" database and keeping your new database separate. That way you can use MySQL features (INSERT INTO new_db.YX … SELECT XY_a FROM old_db.XY) to copy the data where it should go.
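The INSERT INTO … SELECT pattern mentioned above can be tried out safely before touching a real server. The sketch below uses Python with two attached in-memory SQLite databases as stand-ins for `old_db` and `new_db`; the table names (`customers`, `clients`) and column names are made up for illustration, and the MySQL statement would have the same shape.

```python
import sqlite3


def build_demo():
    """Create a throwaway connection with hypothetical old and new schemas."""
    conn = sqlite3.connect(":memory:")
    conn.execute("ATTACH DATABASE ':memory:' AS old_db")
    conn.execute("ATTACH DATABASE ':memory:' AS new_db")
    # The old layout has different column names and one extra column (fax).
    conn.execute("CREATE TABLE old_db.customers (cust_name TEXT, cust_mail TEXT, fax TEXT)")
    conn.execute("CREATE TABLE new_db.clients (name TEXT, email TEXT)")
    conn.executemany(
        "INSERT INTO old_db.customers VALUES (?, ?, ?)",
        [("Ann", "ann@example.com", "555"), ("Bob", "bob@example.com", None)],
    )
    conn.commit()
    return conn


def copy_customers(conn):
    """Copy rows into the new layout, renaming columns and dropping the extra one."""
    conn.execute(
        "INSERT INTO new_db.clients (name, email) "
        "SELECT cust_name, cust_mail FROM old_db.customers"
    )
    conn.commit()
```

The column list after `INSERT INTO` and the column list after `SELECT` are matched by position, which is what makes the renaming work; columns that have no counterpart in the new table are simply left out of the `SELECT`.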
Well, first you need to take a look at the data files and see how the columns/tables differ. After you sort that out, you can go about figuring out how to insert the data. If the files are large and there are quite a few of them, I wouldn't use phpMyAdmin. I'd SSH into the box and use the command-line client, or set the DB up for remote access and use a local copy of the client.
If you're lucky, you won't have to do any processing on the data and can just map values from the old columns to the new columns as part of your LOAD DATA INFILE statement. Whatever you do, you'll want to test all this on a dummy DB first before you run it in a live environment.
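If the data does need some processing before it is loaded, the old-to-new column mapping can also be done on the file itself. The following is a small Python sketch of that idea, not a LOAD DATA INFILE example; the header names in `COLUMN_MAP` are hypothetical.

```python
import csv
import io

# Hypothetical mapping from the old export's headers to the new column names.
COLUMN_MAP = {"cust_name": "name", "cust_mail": "email"}


def remap_rows(csv_text, column_map):
    """Return one dict per CSV row, keeping only mapped columns under their new names."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{new: row[old] for old, new in column_map.items()} for row in reader]
```

Columns that are absent from the mapping (for example, ones the new schema does not have) are dropped, which matches the "fewer columns" situation in the question.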
I'm working on a membership site where users are able to upload a CSV file containing sales data. The file will then be read and parsed, and the data will be charted, which will allow me to create charts dynamically.
My question is how to handle this csv upload? Should it be uploaded to folder and stored for later or should it be directly inserted into a MySQL table?
It depends on how much processing needs to be done, I'd say. If it's "short" data and processing is quick, then your upload-handling script should be able to take care of it.
If it's a large file and you'd rather not tie up the user's browser/session while the data's parsed, then do the upload-now-and-deal-with-it-later option.
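The two options above could be combined in one upload handler, roughly as in the following Python sketch. The size cutoff, the function name, and the pending-folder layout are all assumptions made for illustration; the right threshold depends on how long parsing actually takes on your server.

```python
import csv
import io
from pathlib import Path

# Assumed cutoff: uploads up to this size are parsed during the request.
INLINE_LIMIT_BYTES = 64 * 1024


def handle_upload(name: str, data: bytes, pending_dir: Path):
    """Parse small uploads right away; park large ones for a later job."""
    if len(data) <= INLINE_LIMIT_BYTES:
        rows = list(csv.DictReader(io.StringIO(data.decode("utf-8"))))
        return ("parsed", rows)
    pending_dir.mkdir(parents=True, exist_ok=True)
    # Keep only the file name so a client-supplied path cannot escape the folder.
    target = pending_dir / Path(name).name
    target.write_bytes(data)
    return ("queued", target)
```

A separate scheduled job would then pick up the files in the pending folder, so the user's browser is not tied up while a large file is parsed.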
It depends on how you think the users will use this site.
What do you estimate the size of the files for these users to be?
How often (if ever) would they upload a file twice? Can they download the charts?
If the files are small and more for one-off use, you could upload them and process them on the fly. If the users require repeated access and analysis, then you will save them time by importing the data into the database.
The LOAD DATA INFILE command in MySQL handles uploads like that really nicely. If you create the table you want to load the data into and then use that command, it works well and is very quick in my experience. I've loaded several thousand rows of data in under 5 seconds using it.
http://dev.mysql.com/doc/refman/5.5/en/load-data.html
Is it possible to perform any sort of indirection in SSIS?
I have a series of jobs that perform an FTP download and loop through the files before trying to run another DTSX package on them. Currently this incurs a lot of repeated cruft for pulling down the files and for logging.
Is there any way of redesigning this so I only need one package rather than 6?
Based on your comment:
Effectively the 6 packages are really 2 x 3. 1st for each "group" is FTP pull
down and XML parsing to place into flat tables. Then 2nd then transforms and
loads that data.
Instead of downloading files using one package and inserting data into tables using another package, you can do that in a single package.
Here is a link containing an example which downloads files from FTP and saves it to local disk.
Here is a link containing an example to loop through CSV files in a given folder and inserts that data into database.
Since you are using XML files, here is a link that shows how to loop through XML files.
You can effectively combine the above examples into a single package by placing the control flow tasks one after the other.
Let me know if this is not what you are looking for.