How to get hold of a pyarrow table schema (from a dataframe), copy it into code, and then apply it to new tables? - pyarrow

A common operation is to get a schema and use it to create tables and a ParquetWriter. But how do I actually get the schema, copy-paste it into the code, and then use it without manually writing a bunch of helper functions? Is there a from_json or something similar?
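(In case it helps, here is a rough sketch of one way to do this; the dataframes, field names, and output path below are placeholders for illustration, not anything from an actual schema.)

import pandas as pd
import pyarrow as pa
import pyarrow.parquet as pq

# Placeholder dataframe standing in for whatever you already have.
df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"], "value": [0.1, 0.2]})

# 1) Get the schema from a table built from the dataframe.
schema = pa.Table.from_pandas(df, preserve_index=False).schema
print(schema)  # readable listing you can translate into the pa.schema([...]) below

# 2) Re-declare the same schema explicitly in code instead of copy-pasting the repr.
schema = pa.schema([
    pa.field("id", pa.int64()),
    pa.field("name", pa.string()),
    pa.field("value", pa.float64()),
])

# 3) Apply it to new tables and to a ParquetWriter.
new_df = pd.DataFrame({"id": [3], "name": ["c"], "value": [0.3]})
new_table = pa.Table.from_pandas(new_df, schema=schema, preserve_index=False)
with pq.ParquetWriter("out.parquet", schema) as writer:
    writer.write_table(new_table)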

Related

Reading CSV file in stored procedure and compare with existing table in MYSQL

I have a CSV file, and would like to do a comparison with an existing table to make sure that the fields are updated correctly. (Someone else did the update of the table; I'm just checking that it was updated correctly.)
I understand that I can load the CSV into a table before doing the comparison, but would like to explore whether there's any other way to READ the CSV without loading it into a new table.
I'm planning to create a stored procedure to do the comparison, and would like to know if it's possible to READ the CSV in a stored procedure and compare it with the table directly.
Thank you!
No, you cannot read a CSV file in a MySQL stored procedure without loading it into a table using MySQL's standard functionality.
The only options you have are to write or find a compiled user-defined function that can do this, or to modify MySQL's source code.
These options are probably overkill, so you are better off doing this reconciliation check in an external programming language.
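For example, a rough sketch of that external check in Python, assuming the mysql-connector-python driver; the connection details, file name, and column names (id, col_a, col_b) are made up for illustration:

import csv
import mysql.connector  # assumes the mysql-connector-python package

# All connection details, file and column names below are placeholders.
conn = mysql.connector.connect(host="localhost", user="user",
                               password="secret", database="mydb")
cur = conn.cursor()

mismatches = []
with open("updates.csv", newline="") as f:
    for row in csv.DictReader(f):
        cur.execute("SELECT col_a, col_b FROM my_table WHERE id = %s", (row["id"],))
        db_row = cur.fetchone()
        # Compare as strings, since CSV values are read as strings.
        if db_row is None or [str(v) for v in db_row] != [row["col_a"], row["col_b"]]:
            mismatches.append(row["id"])

conn.close()
print("Rows that were not updated correctly:", mismatches)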

Redshift/S3 - Copy the contents of a Redshift table to S3 as JSON?

It's straightforward to copy JSON data on S3 into a Redshift table using the standard Redshift COPY command.
However, I'm also looking for the inverse operation: to copy the data contained within an existing Redshift table to JSON that is stored in S3, so that a subsequent Redshift COPY command can recreate the Redshift table exactly as it was originally.
I know about the Redshift UNLOAD command, but it doesn't seem to offer any option to store the data in S3 directly in JSON format.
I know that I can write per-table utilities to parse and reformat the output of UNLOAD for each table, but I'm looking for a generic solution which allows me to do this Redshift-to-S3-JSON extract on any specified Redshift table.
I couldn't find any existing utilities that will do this. Did I miss something?
Thank you in advance.
I think the only way is to UNLOAD to CSV and write a simple Lambda function that turns the input CSV into JSON, using the CSV header row as the keys and the values of each row as the values.
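A minimal sketch of that CSV-to-JSON step in Python (the file names are placeholders); it writes one JSON object per line, which a later Redshift COPY with the JSON 'auto' option should be able to load back:

import csv
import json

# Placeholder paths; the input is the CSV produced by UNLOAD.
def csv_to_json_lines(csv_path, json_path):
    with open(csv_path, newline="") as src, open(json_path, "w") as dst:
        for row in csv.DictReader(src):  # header row becomes the keys
            dst.write(json.dumps(row) + "\n")

csv_to_json_lines("unloaded_table.csv", "table.json")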
There is no built-in way to do this yet, so you might have to hack your query with some hardcoding:
https://sikandar89dubey.wordpress.com/2015/12/23/how-to-dump-data-from-redshift-to-json/

Migrate new database with exceeding old database value

I need to migrate the additional database values into the new one. I have two databases, test and test new. I created both databases with the same data. I made all the changes in test; now I need to migrate those changes into test new without affecting its existing values.
If the table schemas are different, how would I then go about doing this? In my previous job, what I did was import the data (in my case, from Access) into the destination (MySQL), keeping the table structures, then use SQL to select the data and manipulate it as required into the final destination tables.
In my case, I don't have documentation for the old database, and the columns are not named meaningfully, e.g. it uses 'field1', 'field2', etc., so I had to trace from the application code what the columns mean. Is there any better way? Also, some columns contain multiple values as delimited data; is reading the code the only way?
It sounds like you know what to do, but are just not keen to do it.
If there is no documentation then it makes sense that you will have to go to the code to figure out what it does. As for porting the data across, you will most likely have to write custom scripts that pull the data, manipulate it, and insert it into the new table based on the new structure.
There are some tools to generate migration scripts, i.e. scripts that generate INSERTs for all your data. I think MySQL Workbench does it, but it most likely won't be sufficient since your tables have different structures.
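As a rough illustration of such a custom script (assuming the mysql-connector-python driver; every database, table, and column name here is hypothetical):

import mysql.connector  # assumes the mysql-connector-python package

# Hypothetical names throughout; map the old unnamed columns onto the new structure.
src = mysql.connector.connect(database="test", user="user", password="secret")
dst = mysql.connector.connect(database="test_new", user="user", password="secret")
read, write = src.cursor(), dst.cursor()

read.execute("SELECT field1, field2, field3 FROM old_table")
for field1, field2, field3 in read:
    # INSERT IGNORE skips rows whose key already exists, so existing values stay untouched.
    write.execute(
        "INSERT IGNORE INTO new_table (customer_name, order_date, amount) "
        "VALUES (%s, %s, %s)",
        (field1, field2, field3),
    )
dst.commit()
src.close()
dst.close()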

infer table structure from file in MySql

Another posting said there is a way to infer the table columns from a data file using phpMyAdmin. I haven't found documentation on this; can you point me to it? Does it only use the header row, or does it also sample the data to infer the data types?
I'm trying to create several tables in MySQL from data files, which have roughly 100 columns each, so I don't want to write the SQL DDL to create the tables manually.
Thanks!

Can I import tab-separated files into MySQL without creating database tables first?

As the title says: I've got a bunch of tab-separated text files containing data.
I know that if I use 'CREATE TABLE' statements to set up all the tables manually, I can then import the files into the waiting tables using 'load data' or 'mysqlimport'.
But is there any way in MySQL to create tables automatically based on the tab files? Seems like there ought to be. (I know that MySQL might have to guess the data type of each column, but you could specify that in the first row of the tab files.)
No, there isn't. You need to CREATE a TABLE first in any case.
Automatically creating tables and guessing field types is not part of the DBMS's job. That is a task best left to an external tool or application (which then generates the necessary CREATE statements).
If you're willing to type the data types in the first row, why not type a proper CREATE TABLE statement?
Then you can export the Excel data as a txt file and use
LOAD DATA INFILE 'path/file.txt' INTO TABLE your_table;
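For what it's worth, a small sketch of such an external step in Python, assuming (as the question suggests) that the first row of the tab file holds the column names and the second row holds the desired MySQL types; the file and table names are placeholders:

import csv

def create_table_sql(tsv_path, table_name):
    # Assumes row 1 = column names, row 2 = MySQL types (e.g. INT, VARCHAR(50)).
    with open(tsv_path, newline="") as f:
        reader = csv.reader(f, delimiter="\t")
        names = next(reader)
        types = next(reader)
    cols = ", ".join(f"`{n}` {t}" for n, t in zip(names, types))
    return f"CREATE TABLE `{table_name}` ({cols});"

print(create_table_sql("path/file.txt", "your_table"))
# Then skip the two header rows when loading:
#   LOAD DATA INFILE 'path/file.txt' INTO TABLE your_table
#   FIELDS TERMINATED BY '\t' IGNORE 2 LINES;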