I need to load the result of a specific query into Redshift daily. I've already created a table in Redshift to hold the results of this query, but now I'm a little stuck since I can't find a good way to solve this.
So far I've tried using Python, but I'm getting lots of headaches with character encodings and with line terminators embedded in fields that store free-text descriptions.
I know lots of programs that let you connect to a database and run queries also have an export-to-CSV option, but since I need to do this automatically every day, I don't think any of those would work for me.
Now I would like to know whether there are better-suited options so I can start looking into them. I'm not asking for a step-by-step how-to, just for tools/programs/etc. that I should start investigating.
You should look into MySQL's stored procedures and events -- using just MySQL, you can have it generate a file every day.
You can't dynamically rename the file, or overwrite it, though, so you'd need a second job which deletes the file -- this can be done with Python.
Whether you're running Windows or Linux, you should be able to schedule a batch file or Python script to execute once a day, and that would be an alternative way to do it.
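If you go the scheduled-script route, a minimal sketch of the daily job in Python might look like the following. It assumes the source is MySQL and that the results reach Redshift through S3 and a COPY statement (Redshift's usual bulk-load path); every host, bucket, table, and credential below is a placeholder. Quoting every field (csv.QUOTE_ALL) is what keeps embedded line terminators in description fields from breaking the file, and writing UTF-8 explicitly sidesteps most encoding headaches.

```python
# daily_export.py -- sketch only; all names and credentials are placeholders.
import csv
import boto3
import pymysql
import psycopg2

CSV_PATH = "/tmp/daily_export.csv"

def export_to_csv():
    conn = pymysql.connect(host="source-db", user="etl", password="...",
                           database="app", charset="utf8mb4")
    try:
        with open(CSV_PATH, "w", newline="", encoding="utf-8") as f:
            # QUOTE_ALL wraps every value in quotes, so newlines inside
            # description fields survive the trip through CSV.
            writer = csv.writer(f, quoting=csv.QUOTE_ALL)
            with conn.cursor() as cur:
                cur.execute("SELECT id, description, updated_at FROM source_table")
                for row in cur:
                    writer.writerow(row)
    finally:
        conn.close()

def load_into_redshift():
    boto3.client("s3").upload_file(CSV_PATH, "my-etl-bucket", "daily/export.csv")
    rs = psycopg2.connect(host="my-cluster.example.redshift.amazonaws.com",
                          port=5439, dbname="warehouse", user="etl",
                          password="...")
    with rs, rs.cursor() as cur:
        # With the CSV option, COPY understands quoted embedded newlines.
        cur.execute("""
            COPY target_table
            FROM 's3://my-etl-bucket/daily/export.csv'
            IAM_ROLE 'arn:aws:iam::111122223333:role/redshift-copy'
            CSV;
        """)
    rs.close()

if __name__ == "__main__":
    export_to_csv()
    load_into_redshift()
```

Scheduled from cron or the Windows Task Scheduler, that covers the "every day" part.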
Does this address your question?
I'm asking this because I don't even have a clue what to google for. I have a MariaDB database which I access through node.js's mysql module, and I write my code in TypeScript.
My problem is that the database I'm trying to access will collect millions of records over time, and querying it might take a while. I would like to find a way to walk through the database and serve each record as soon as it is found, instead of querying the whole database first and then sending one accumulated result.
Do you have any clue how I can solve this or what to google / YouTube search for?
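What is being described here is usually called streaming the result set, and that is a good term to search for; the mysql module itself supports it by emitting a 'result' event per row instead of buffering the whole result. Just to show the shape of the idea in a compact, runnable form, here is the same pattern sketched in Python with an unbuffered server-side cursor (pymysql's SSCursor); the table, columns, and handler are made up.

```python
# Streaming rows one at a time instead of accumulating the whole result.
# SSCursor is pymysql's unbuffered cursor: rows stay on the server until
# iterated, so memory use stays flat even with millions of records.
import pymysql
import pymysql.cursors

def handle(row):
    # placeholder: send the row onward as soon as it arrives
    print(row)

conn = pymysql.connect(host="localhost", user="app", password="...",
                       database="telemetry",
                       cursorclass=pymysql.cursors.SSCursor)
try:
    with conn.cursor() as cur:
        cur.execute("SELECT id, payload FROM datasets")
        for row in cur:   # each iteration fetches a single row
            handle(row)
finally:
    conn.close()
```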
I am using SSIS to move data from a local MSSQL Server table to a remote MySQL table (data flow, OLE DB source and ODBC destination). This works fine if I'm only moving 2 rows of data, but it is very slow with the table I actually want, which has 5,000 rows and fits into a CSV of about 3 MB: it currently takes about 3 minutes using SSIS's options, whereas performing the steps below takes 5 seconds at most.
I can export the data to a CSV file, copy it to the remote server, and then run a script to import it straight into the DB (sketched below), but this involves more steps than I would like, as I have multiple tables I want to do this for.
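For reference, here is roughly what that manual path looks like scripted end to end, as a sketch in Python; the driver string, servers, credentials, and table names are all placeholders.

```python
# Sketch of the fast manual path: bulk-export from MSSQL to CSV, then
# bulk-import into MySQL with a single LOAD DATA statement.
import csv
import pymysql
import pyodbc

CSV_PATH = "/tmp/table_dump.csv"

# 1. Export the table from the local MSSQL server.
src = pyodbc.connect("DRIVER={ODBC Driver 17 for SQL Server};"
                     "SERVER=localhost;DATABASE=staging;"
                     "Trusted_Connection=yes;")
with open(CSV_PATH, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    cur = src.cursor()
    cur.execute("SELECT * FROM dbo.my_table")
    for row in cur:
        writer.writerow(row)
src.close()

# 2. Bulk-load the file into the remote MySQL server.
#    (The MySQL server must also have local_infile enabled.)
dst = pymysql.connect(host="remote-mysql", user="etl", password="...",
                      database="prod", local_infile=True)
try:
    with dst.cursor() as cur:
        cur.execute("""
            LOAD DATA LOCAL INFILE %s
            INTO TABLE my_table
            FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
            LINES TERMINATED BY '\\r\\n'
        """, (CSV_PATH,))
    dst.commit()
finally:
    dst.close()
```

LOAD DATA is what makes this path fast: one round trip instead of one INSERT per row, which is likely also why the row-by-row ODBC inserts feel so slow in comparison.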
I have tried row-by-row and batch processing, but both are very slow in comparison.
I know I can use the above steps but I like using the SSIS GUI and would have thought there was a better way of tackling this.
I have googled multiple times but have not found anything that fits the bill, so I am calling on outside opinions.
I understand SSIS has its limitations, but I would hope there is a better and faster way of achieving what I am trying to do. If SSIS is really that bad, I may as well rewrite everything as a script and be done with it, but I like the look and feel of the GUI and would like to keep moving my data in this friendly, visual way.
Any suggestions or opinions would be appreciated.
Thank you for your time.
Edit: as mentioned above, I have tried the SSIS options, including a third-party option (CozyRoc), but that sent some data with errors now and again (the column delimiting seemed off), copied differing numbers of rows, and caused enough problems that I don't trust the data.
Good Day,
I am trying to learn how to save my data to a MySQL database using C, in real time.
I am using a Raspberry Pi and an external web server where the data will be saved. I am using C to read the data from the sensors and would like to save it to my external database, but I do not know how to proceed, as I am not that familiar with using C and MySQL together. My main concern is how to make sure my data is real-time: as soon as my sensors produce data, it should be saved to the database.
I'm thinking of running an infinite loop inside main, with an if statement that serves as a trigger: whenever there is data from the sensors, it saves it to the MySQL server.
But I am not sure that is the most efficient way of doing this, so any better ideas on how to retrieve my data in real time using C and save it to MySQL would be greatly appreciated.
In PHP I would simply have made a cron job for this, but since I will be doing this in C, I am at a loss on how to proceed, or whether my idea is even correct.
You are looking at two independent problems:
Retrieve the data at a fixed interval
Save the data to a database.
For the first, there are two common methods. One is polling, which means staying in a loop and constantly checking whether updates are available; the other is using interrupts. Choose whichever is more appropriate for your problem, but to begin with you can use polling and, once the program works, (maybe) move to interrupts.
For the second, install MySQL and the MySQL C connector: go to the MySQL site, then download and install it. The connection API is pretty simple and there are a lot of examples online, both for combining the two and for the syntax. A sketch of both pieces together is below.
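To make that concrete, here is the shape of the polling loop, sketched in Python for brevity; the C version has the same structure, just built on the C API calls (mysql_init, mysql_real_connect, mysql_query). The sensor-reading function and all connection details are placeholders.

```python
# Polling sketch: check the sensor at a fixed interval and insert any
# new reading into MySQL. read_sensor() stands in for whatever GPIO/I2C
# call the hardware actually needs.
import time
import pymysql

def read_sensor():
    # placeholder: return a new reading, or None if nothing new
    return None

conn = pymysql.connect(host="my-web-server", user="pi", password="...",
                       database="sensors")
try:
    while True:                      # the "infinite loop inside main"
        value = read_sensor()
        if value is not None:        # the "if that serves as a trigger"
            with conn.cursor() as cur:
                cur.execute(
                    "INSERT INTO readings (value, taken_at) VALUES (%s, NOW())",
                    (value,))
            conn.commit()
        time.sleep(0.1)              # poll interval: trades latency for CPU
finally:
    conn.close()
```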
An efficient way to do such things is the hardware interrupt. You should read the docs to check whether your hardware supports it.
If I wanted to create a PHP function that does the same thing as the Export tab in phpMyAdmin, how could I do it? I don't know whether there is a MySQL function that does this or whether phpMyAdmin just builds the export file (in SQL, that is) manually. Without shell access, just using PHP.
I tried the documentation for mysqldump, but that seemed to require using the shell. I'm not quite sure what that even is -- maybe my question is really: how do you use the shell?
My silly idea is to allow non-technical users to build a site on one server (say, localhost) using MySQL and then export the site, database and all, to another server (e.g. a remote one).
I think I'm pretty clear on the Import process.
You can check the phpMyAdmin source code (an advantage of open-source software): see the export.php script and the supporting functions in the libraries/export/sql.php file.
In summary, what phpMyAdmin does is:
get a list of the tables in the given database (SHOW TABLES FROM...),
get the create query for each table (SHOW CREATE TABLE...),
parse it and extract column definitions from it,
get all the data (SELECT * FROM...), and
build the INSERT queries according to the column data.
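A rough sketch of those steps, shown in Python just to keep it short (the PHP version issues the same queries and builds the same strings); it skips the column-definition parsing, and the connection details are placeholders. For big tables you would fetch in chunks rather than with fetchall(), in line with the memory advice below.

```python
# Minimal SQL export following the steps above: structure via
# SHOW CREATE TABLE, data via SELECT *, with per-value escaping.
import pymysql

conn = pymysql.connect(host="localhost", user="root", password="...",
                       database="mydb")
with conn.cursor() as cur, open("export.sql", "w", encoding="utf-8") as out:
    cur.execute("SHOW TABLES")
    tables = [row[0] for row in cur.fetchall()]
    for table in tables:
        cur.execute(f"SHOW CREATE TABLE `{table}`")
        out.write(cur.fetchone()[1] + ";\n")        # the CREATE statement
        cur.execute(f"SELECT * FROM `{table}`")
        for row in cur.fetchall():                   # one INSERT per row
            values = ", ".join(conn.escape(v) for v in row)
            out.write(f"INSERT INTO `{table}` VALUES ({values});\n")
conn.close()
```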
I've written similar code for my own apps (for backup purposes, when the GPL license of phpMyAdmin doesn't allow me to use it), though I use DESCRIBE to get the column definitions. I think they parse the SHOW CREATE TABLE output instead because it contains more information than the DESCRIBE output.
Generating SQL statements this way requires a bit of care with escaping, but it allows for some flexibility, as you can convert types, filter, or sanitize data, etc. It is also a lot slower than a tool like mysqldump, and you should take care not to consume all available memory (write soon, write often, don't keep everything in memory).
If you are going to implement a server-to-server migration process, it might be easier to do it with some shell scripting and by calling mysqldump directly, unless you have to do everything in PHP.
For security reasons, I'm in an environment where third-party apps can't access my DB. Because of this, I need some service/tool/script (I don't know what yet... I'm open to the best option, still reading to see what I'm going to do...)
which enables me to generate, on a regular basis (daily, weekly, monthly), a CSV file with all new/modified records for a certain application.
I should be able to automate this process and also export a new file at any time.
So it should keep track, for each application, of which records it still needs.
Each application will need the data in a different format (CSV/XLS/SQL), and some fields will be needed by one application but not by another... It should be fairly flexible...
What is the best option for me? Creating some custom tables for each application, and extracting the modified data based on those?
I think your best bet here, assuming you have enough access to the server to set this up, is to make a small command-line program that does the relatively simple task you need. Languages like Perl are good for this sort of thing, I believe.
Once you have that 'tool' made, you can schedule it through the server's OS to run at a set interval: a scheduled task on a Windows server or a cron job on a Linux server. A sketch of such a tool is below.
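As a sketch of what that small tool could look like (in Python rather than Perl; the watermark table, file naming, and connection details are all assumptions), keyed on a per-application "last exported" timestamp so each run only picks up new/modified records:

```python
# export_tool.py -- incremental CSV export, meant to be run from cron or
# a scheduled task. Tracks, per application, the last modification time
# already exported. Table and column names are placeholders.
import csv
import sys
import pymysql

def export_new_rows(app_name):
    conn = pymysql.connect(host="localhost", user="export", password="...",
                           database="mydb")
    try:
        with conn.cursor() as cur:
            # watermark: when did this application last get an export?
            # (assumes a state row was seeded for each application)
            cur.execute("SELECT last_exported FROM export_state WHERE app = %s",
                        (app_name,))
            (since,) = cur.fetchone()
            cur.execute("""SELECT id, field_a, field_b, modified_at
                           FROM records
                           WHERE modified_at > %s""", (since,))
            rows = cur.fetchall()
            if rows:
                with open(f"{app_name}_export.csv", "w", newline="",
                          encoding="utf-8") as f:
                    csv.writer(f, quoting=csv.QUOTE_ALL).writerows(rows)
                # advance the watermark to the newest row just exported
                cur.execute("""UPDATE export_state SET last_exported = %s
                               WHERE app = %s""",
                            (max(r[3] for r in rows), app_name))
        conn.commit()
    finally:
        conn.close()

if __name__ == "__main__":
    export_new_rows(sys.argv[1])   # e.g. python export_tool.py billing
```

Run from the scheduler it covers the regular exports, and run by hand (or via CGI, described next) it covers the export-at-any-time requirement.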
You can also (without having to set up the scheduled task, if you don't want to or can't) enable this small command-line application to be called via CGI, which is a way of letting applications on the server be executed on demand by a web user. If you do enable this, though, I suggest you add some sort of locking so that it can only be run every so often and cannot be run five times at once.
EDIT
You might also want to look into database replication or adding read-only users. This saves a whole lot of arsing around. Try to find a solution that does not split or duplicate data. You can set up users that are only able to access certain parts of the database system in certain ways, for example only being able to SELECT data.