MySQL - is it good to store 1MB of JSON?

So I have a user-based application that around 500 users in total will be accessing.
Every user will have roughly 500KB of encoded JSON string in each column (10 columns in total), which makes 5MB of data per user.
For 500 users this will be 500 * 5MB, about 2.5GB of data in total.
I won't be doing any filtering or searching within the JSON. Moreover, the data will be completely different across all the JSONs, so I can't deduplicate and store only the unique content.
I cannot use NoSQL databases. Is this a good architecture, and if so, what should the column type be?

You may try using MySQL 5.7+ and the JSON column type; internally it is stored in an optimized binary format, and you can also query into it.
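A minimal sketch of what that could look like, using hypothetical table and column names (JSON_EXTRACT is available from MySQL 5.7):

-- One JSON column per document; MySQL validates the JSON and stores it in a binary format
CREATE TABLE user_documents (
    user_id INT UNSIGNED NOT NULL PRIMARY KEY,
    doc_1   JSON,
    doc_2   JSON
    -- ... up to doc_10
);

-- Even if you never plan to filter on the JSON, you can still reach into it later:
SELECT JSON_EXTRACT(doc_1, '$.some_key') FROM user_documents WHERE user_id = 42;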

Related

Storing a large amount of data as a single JSON field - extract important fields to their own fields?

I'm planning on storing a large amount of data from a user-submitted form (around 100 questions) in a JSON field.
I will only need to query two pieces of data from the form: name and type.
Would it be advisable (and more efficient) to extract name and type to their own fields for querying, or shall I just whack it all into one JSON field and query that field, since JSON searching is now supported?
If you are concerned about performance, then maintaining separate fields for the name and type is probably the way to go here. The reason is that if these two pieces of data exist as separate fields, it leaves open the possibility of adding indexes to those columns. While you can use MySQL's JSON API to query by name and type, it would most likely never be able to compete with an index lookup, at least not in terms of performance.
From a storage point of view, you would not pay much of a price to maintain two separate columns. The main price you would pay is that every time the JSON gets updated, you would also have to update the name and type columns.
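A rough sketch of that layout, using hypothetical table and column names:

-- name and type live in their own indexed columns; the full form stays in one JSON field
CREATE TABLE form_submissions (
    id      INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    name    VARCHAR(255) NOT NULL,
    type    VARCHAR(64)  NOT NULL,
    answers JSON,
    KEY idx_name (name),
    KEY idx_type (type)
);

-- Index lookups on the extracted columns; the JSON payload comes along for free
SELECT answers FROM form_submissions WHERE name = 'Alice' AND type = 'survey';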

How to store a big matrix (data frame) that can be subsetted easily later

I will generate a big matrix (data frame) in R whose size is about 1300000*10000, about 50 GB. I want to store this matrix in an appropriate format, so that later I can feed the data into Python or other programs for analysis. Of course I cannot feed the data in all at once, so I have to subset the matrix and feed it in little by little.
But I don't know how to store the matrix. I have thought of two ways, but I think neither is appropriate:
(1) plain text (including csv or an excel table), because it is very hard to subset (e.g. if I just want some columns and some rows of the data)
(2) a database; I have searched for information about mysql and sqlite, but it seems that the number of columns in a SQL database is limited (to 1024).
So I just want to know if there are any good strategies for storing the data, so that I can subset it by row/column indexes or names.
Have separate column(s) for each of the few columns you need to search/filter on. Then put the entire 10K columns into some data format that is convenient for the client code to parse. JSON is one common possibility.
So the table would have 1.3M rows and perhaps 3 columns: an id (auto_increment, primary key), the column you search on, and the JSON blob (as datatype JSON or TEXT, depending on software version) for the many data values.
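A minimal sketch of that shape, with hypothetical names and a single search column:

CREATE TABLE wide_rows (
    id         INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    search_key VARCHAR(64) NOT NULL,   -- the one value you filter on
    row_data   JSON,                   -- the remaining ~10K values, serialized by the client
    KEY idx_search_key (search_key)
);

-- Fetch only the rows you need; the client parses row_data and picks out its columns
SELECT row_data FROM wide_rows WHERE search_key = 'group_A';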

When to use JSON over key/value tables in Postgres for billions of rows

I am doing a project where I need to store billions of rows of unstructured history data in an SQL database (Postgres) for 2-3 years. The data/columns may change from day to day.
For example, on day one the user might save {"user_id": "2223", "website": "www.mywebsite.org", "webpage": "mysubpageName"}.
And the following day {"name": "username", "user_id": "2223", "bookclub_id": "1"}.
I did a project earlier where we used the classic entity key/value table model for this problem. We saved maybe up to 30 key/values per entity. But when we exceeded 70-100 million rows, the queries began to run slower and slower (too many inner joins).
Therefore I am wondering if I should switch to the JSON model in Postgres. After searching the web and reading blogs, I am really confused. What are the pros and cons of changing this to JSON in Postgres?
You can think about this in terms of query complexity. If you have an index on the JSON documents (maybe on user_id), you can do a simple index scan to access the whole JSON string very fast.
You then have to dissect it on the client side, or you can pass it to functions in Postgres if, for example, you want to extract only data for specific values.
One of the most important features of Postgres when dealing with JSON is functional (expression) indexes. In comparison to a "normal" index, which indexes the value of a column, a functional index applies a function or expression to the value of one (or more) columns and indexes the return value. In Postgres the ->> operator extracts a JSON field as text; say you want the users that have bookclub_id = 1. You can create an index like
CREATE INDEX idx_bookclub_id ON mytable ((jsonvalue ->> 'bookclub_id'));
Afterwards, queries like
SELECT * FROM mytable WHERE jsonvalue ->> 'bookclub_id' = '1';
are lightning fast.
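As an aside beyond the answer above: if the column is stored as jsonb rather than json, Postgres can also build a GIN index over the whole document, which supports containment queries. A rough sketch using the same hypothetical mytable/jsonvalue names, assuming a jsonb column (GIN indexes do not work on plain json):

-- One GIN index covers every key in the document rather than a single expression
CREATE INDEX idx_jsonvalue_gin ON mytable USING GIN (jsonvalue);

-- Containment query that can use the GIN index
SELECT * FROM mytable WHERE jsonvalue @> '{"bookclub_id": "1"}';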

JSON field type vs. one field for each key

I'm working on a website which has a database table with more than 100 fields.
The problem is that when the number of records gets large (like more than 10000), the response time gets very long and sometimes no answer is returned at all.
Now I want to optimize this table.
My question is: can we use the JSON type for fields to reduce the number of columns?
My limitation is that I want to search, change and maybe remove specific data stored in the JSON.
PS: I read this question: Storing JSON in database vs. having a new column for each key, but that was asked in 2013, and as we know, the JSON field type was added in MySQL 5.7.
Thanks for any guidance.
First of all, having a table with 100 columns may suggest you should rethink your architecture before proceeding; otherwise it will only become more and more painful in later stages.
Maybe you are storing data as separate columns that could instead be broken down and stored as separate rows.
I suspect the SQL query you are writing is something like SELECT * ..., in which case you may be fetching more columns than you require. Specify only the columns you need; it will definitely speed up the API response.
In my personal view, storing active data as JSON inside SQL is not useful. JSON should be used as a last resort, for metadata that does not mutate and does not need to be searched.
Please make your question more descriptive about your database schema and the query you are making for the API.
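That said, if you do end up needing to search inside a JSON column on MySQL 5.7+, one option is to expose the key as a generated column and index that. A sketch with hypothetical table and column names:

-- Expose one JSON key as a generated column so it can be indexed and searched normally
ALTER TABLE my_table
    ADD COLUMN status VARCHAR(32)
        GENERATED ALWAYS AS (JSON_UNQUOTE(JSON_EXTRACT(data, '$.status'))) VIRTUAL,
    ADD INDEX idx_status (status);

-- The optimizer can use idx_status for this filter
SELECT id FROM my_table WHERE status = 'active';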

Storing multiple values as binary in one field

I have a project where I need to store a large number of values.
The data is a dataset holding 1024 2-byte unsigned integer values. Currently I store one value per row, together with a timestamp and a unique ID.
This data is continuously stored based on a time trigger.
What I would like to do is store all 1024 values in one field. Would it be possible to write some routine that stores all 1024 2-byte integer values in one field as binary, maybe a BLOB field?
Thanks.
Br.
Enghoej
Yes. You can serialize your data into a byte array, and store it in a BLOB. 2048 bytes will be supported in a BLOB in most databases.
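A minimal MySQL-flavoured sketch of that layout, with hypothetical table and column names; the client packs the 1024 unsigned 16-bit values into 2048 bytes before inserting:

CREATE TABLE sensor_blocks (
    id          INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    recorded_at TIMESTAMP NOT NULL,
    samples     VARBINARY(2048) NOT NULL   -- 1024 x 2-byte unsigned ints, packed client-side
);

-- The packed byte array is bound as a single parameter from client code:
-- INSERT INTO sensor_blocks (recorded_at, samples) VALUES (NOW(), ?);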
One big question to ask yourself is "how will I need to retrieve this data?" Any reports or queries such as "what IDs have value X set to Y" will have to load all rows from the table and parse the data AFAIK. For instance, if this were user configuration data, you might need to know which users had a particular setting set incorrectly.
In SQL Server, I'd suggest considering using an XML data type and storing a known schema, since this can be queried with XPath. MySQL did not support this as of 2007, so that may not be an option for you.
I would definitely consider breaking out any data that you might possibly need to query in such a manner into separate columns.
Note also that you will be unable to interpret BLOB data without a client application.
You always want to consider reporting. Databases often end up with multiple clients over the years.