I am aiming to have a ClickHouse table that is constantly updating rows and is easy to export. I am considering having the ClickHouse table reference a path to a CSV file, similar to how dictionaries can reference an absolute path to a file under their source.
Is there a way to have the table update according to a CSV file, instead of having to update rows all the time?
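For what it's worth, here is a sketch of what this could look like with ClickHouse's File table engine and the file() table function (the table and column names are made up):

```sql
-- A table backed by a single CSV file in the server's data directory.
-- Replacing that file changes what the table returns on the next SELECT.
CREATE TABLE csv_backed
(
    id   UInt64,
    name String
)
ENGINE = File(CSV);

-- Or query a CSV placed in the server's user_files_path ad hoc,
-- without creating a table at all:
SELECT *
FROM file('data.csv', 'CSV', 'id UInt64, name String');
```

The file() function route may be the closer fit here, since the CSV on disk stays the single source of truth and nothing has to be re-imported.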
I need some help with a problem of understanding how Hive references data. The situation: I have a CSV file data.csv imported into Hadoop. I have found many snippets that use an external table to create a schema on top of the CSV file. My question is, how does Hive know that the schema of the external table is connected to data.csv? In the examples I cannot find a reference to the CSV file.
Where is sample_1.csv referenced for usage in this Hive example, or how does Hive know that the table's data comes from sample_1.csv?
When creating an external table, we give the list of columns and an HDFS location. Hive stores only the column metadata (column name, datatype, ...) and the HDFS location.
When we execute a query against the external table, Hive fetches the metadata and then reads whatever files are available at that HDFS location.
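To illustrate with a sketch (the directory and column names here are made up): the external table points at a directory, not a file, and Hive reads every file it finds there.

```sql
CREATE EXTERNAL TABLE sample_1 (
    id   INT,
    name STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/user/hive/warehouse/sample_1';
```

Copying sample_1.csv into /user/hive/warehouse/sample_1 is what connects the file to the table; the file name itself never appears in the DDL.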
Now we've got the answer. The manual recommends storing one file per directory. When we then build an external table on top, the data is identified by the schema.
In my test case I imported 3 CSV files with one schema; 2 files matched the schema, and the third file had one extra column. If I run a query, the data of all three files is shown, but the additional column from the third file is missing.
Everything is fine now - thank you!
I have read access to a database that I want to query. I have a CSV file containing a list of ids, and I want to find all documents in the database whose ids are listed in the CSV file. The normal way to do this, I believe, would be to import the CSV file as a table or as a new column. However, I only have read permissions, and I don't believe I can get my permissions changed, so I cannot change any of the tables or add any new ones.
There are 1,000 or so lines in the CSV file, so typing all the ids into a query manually isn't really an option. But I also can't import the CSV file. Is there some solution that would allow me to find all relevant entries (entries whose id matches an id in the CSV file)?
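One read-only approach is to turn the CSV into a literal that is pasted into the query itself: either a plain IN list, or a VALUES-derived table (supported by SQL Server and PostgreSQL, among others). The table and column names below are made up:

```sql
-- Generated once from the CSV, e.g. with a text editor or a small script:
SELECT d.*
FROM documents AS d
WHERE d.id IN (101, 102, 103 /* ... the ~1000 ids from the CSV ... */);

-- Or, where VALUES can be used as a derived table:
SELECT d.*
FROM documents AS d
JOIN (VALUES (101), (102), (103)) AS ids(id)
  ON d.id = ids.id;
```

Generating the list is a one-time text transformation of the CSV, so it needs no write permission on the database at all.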
I have a scenario where my source can be on different versions of our database; as a result, the source file could have a different number of columns, while my destination has a defined number of columns.
What we are trying to do is: load data from the source to flat files, move them to a central server, and then load that data into a central database. But if any column is missing from a flat file, I need to add a derived column.
What is the best way to do this? How can I dynamically add derived columns?
You can either do this with BimlScript as others have suggested in the comments, or you can write a script task that reads the file, analyzes the contents, and imports it. Yet another option would be to bulk import the file as-is into a staging table (which would have to be dropped and re-created every time) and write a stored procedure that analyzes the DDL and contents and imports the data into the destination table.
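A sketch of the staging-table route for SQL Server (all object names, the file path, and the derived value are illustrative):

```sql
-- Re-created for every load; its shape matches the incoming file.
CREATE TABLE dbo.staging_orders (col1 VARCHAR(100), col2 VARCHAR(100));

BULK INSERT dbo.staging_orders
FROM '\\centralserver\share\orders.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2);

-- The stored procedure would supply any column the file is missing,
-- here sketched as a constant expression:
INSERT INTO dbo.orders (col1, col2, source_version)
SELECT col1, col2, 'legacy'
FROM dbo.staging_orders;
```

In the real procedure, the missing-column check would be driven by comparing the staging table's columns against the destination's, rather than hard-coding the expression as above.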
I know how to import a text file into MySQL database by using the command
LOAD DATA LOCAL INFILE '/home/admin/Desktop/data.txt' INTO TABLE data
The above command writes the records of the file "data.txt" into the MySQL database table. My question is that I want to erase the records from the .txt file once they are stored in the database.
For example: if there are 10 records and at the current point in time 4 of them have been written into the database table, I want those 4 records erased from data.txt at the same time. (In a way, the text file acts as a "queue".) How can I accomplish this? Can this be done in Java, or should a scripting language be used?
Automating this is not too difficult, but it is also not trivial. You'll need something (a program, a script, ...) that can:
1. Read the records from the original file,
2. Check whether they were inserted and, if they were not, copy them to another file,
3. Rename or delete the original file, and rename the new file to replace the original one.
There might be better ways of achieving what you want to do, but, that's not something I can comment on without knowing your goal.
I am trying to update one of my SQL tables with new columns in my source CSV file. The CSV records in this file are already in this SQL table, but this SQL table is lacking some of the new columns from this CSV file.
I already added the new columns to my SQL table structure via ALTER TABLE. But now I just need to import the data from this CSV file into the new columns. How can I do this? I am trying to use SSIS and SQL Server to accomplish this, but am pretty new to Excel.
This is probably too late to solve salvationishere's problem; though I'm posting this for future readers!
You could just generate the SQL INSERT/UPDATE/etc. commands by parsing the CSV file (a simple Python script will do).
You could alternatively use this online parser to generate your SQL commands:
http://www.convertcsv.com/csv-to-sql.htm
(Hoping that it's still available when you click!) The interface is extremely straightforward and it does the entire job in an awesome way.
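Either way, for the scenario above (new columns added to an existing table), the output would be a batch of UPDATE statements keyed on the rows already present. The table, column, and key names here are hypothetical:

```sql
-- One statement per CSV row, emitted by the script or the online tool:
UPDATE dbo.customers SET new_col1 = 'alpha', new_col2 = 'beta'  WHERE id = 1;
UPDATE dbo.customers SET new_col1 = 'gamma', new_col2 = 'delta' WHERE id = 2;
```

The generated batch can then be reviewed and run as a single script in Management Studio.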
You have several options:
If you are loading the data into a non-production system where you can edit the target tables, you could load the data into a new table, rename the old table to obsolete, and rename the new table to the old table name.
You can load the data into a staging table and then write a SQL statement to update the target table from the staging table.
You can open the CSV file in Excel, write a formula that generates an update statement, drag the formula down across all rows so that you get a separate update statement for each row, and then run those statements in Management Studio.
You can truncate the target table and update your existing SSIS package that imports the file to use the new columns, if you have the full history in your CSV file.
There are more options, but any of the above would probably be more than adequate solutions.
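For the second option above, the update from the staging table might look like this in SQL Server (all names are illustrative):

```sql
-- Copy the newly added columns onto rows that already exist in the target,
-- matching staging rows to target rows by key.
UPDATE t
SET    t.new_col1 = s.new_col1,
       t.new_col2 = s.new_col2
FROM   dbo.target_table AS t
JOIN   dbo.staging_table AS s
  ON   s.id = t.id;
```

The staging table itself can be loaded with the existing SSIS package or a plain BULK INSERT, since its shape can simply mirror the CSV.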