I'm currently working with GraphQL and PostgreSQL as the database.
As you know, GraphQL makes it easy to map data to schemas.
If I use a plain join query, I have to post-process the result set into the shape of the GraphQL schema; but if I build the JSON directly in the query, the query becomes much harder to read.
My question is: when should we build the JSON in the query, and when should we use a plain table join? Are there any particular concerns with either approach, and how does the performance compare between the two?
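To illustrate the two approaches the question contrasts, here is a minimal sketch in Postgres, assuming hypothetical users and posts tables. The first query returns flat rows that the application has to reshape; the second uses json_build_object / json_agg so the database already returns data shaped like the GraphQL type:

-- flat join: one row per post, reshaped on the application side
select u.id, u.name, p.id as post_id, p.title
from users u
join posts p on p.user_id = u.id;

-- JSON built in the query: one row per user, already nested like the GraphQL schema
select json_build_object(
         'id',    u.id,
         'name',  u.name,
         'posts', json_agg(json_build_object('id', p.id, 'title', p.title))
       ) as user_json
from users u
join posts p on p.user_id = u.id
group by u.id, u.name;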
Related
I was going through some past questions about inserting JSON data into a PostgreSQL table and learned about JSONB and JSONL. I couldn't understand the difference between the two, as both look the same to me: a list of JSON data, i.e.
[{"a":"b","c":"d"},{"a1":"b1","c1":"d1"}]
I am doing a project where I need to store billions of rows of unstructured history_data in an SQL database (Postgres) for 2-3 years. The data/columns may change from day to day.
So, for example, on day one a user might save {"user_id":"2223", "website":"www.mywebsite.org", "webpage":"mysubpageName"},
and the following day {"name":"username", "user_id":"2223", "bookclub_id":"1"}.
On an earlier project we used the classic entity key/value table model for this problem. We saved maybe up to 30 key/value pairs per entity. But once we exceeded 70-100 million rows, the queries ran slower and slower (too many inner joins).
Therefore I am wondering whether I should switch to the JSON model in Postgres. After searching the web and reading blogs, I am really confused. What are the pros and cons of changing this to JSON in Postgres?
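As a sketch of the two layouts the question compares: the entity key/value model spreads each attribute over its own row, while the JSON model keeps one jsonb document per entry. The table and column names below are hypothetical:

-- classic entity key/value layout: one row per attribute
create table history_eav (
    entity_id bigint not null,
    key       text   not null,
    value     text,
    primary key (entity_id, key)
);

-- jsonb layout: one row per entry, attributes inside the document
create table history_json (
    id       bigserial primary key,
    saved_at date not null default current_date,
    payload  jsonb not null
);

insert into history_json (payload)
values ('{"user_id": "2223", "website": "www.mywebsite.org", "webpage": "mysubpageName"}');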
You can think about this in terms of query complexity. If you have an index on the json documents (maybe on user_id) you can do a simple index scan to access the whole json document very quickly.
You then have to dissect it on the client side, or you can pass it to functions in Postgres if, for example, you only want to extract data for specific values.
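A minimal sketch of that first option (a plain index, with the whole document fetched and dissected on the client side), assuming a hypothetical table that keeps user_id as a normal column next to the jsonb document:

create table history_docs (
    id      bigserial primary key,
    user_id text not null,
    payload jsonb not null
);

create index idx_history_user on history_docs (user_id);

-- index scan on user_id; the whole json document comes back to the client
select payload from history_docs where user_id = '2223';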
One of the most important features of Postgres when dealing with json is the ability to create functional (expression) indexes. In contrast to a "normal" index, which indexes the value of a column, a functional index applies a function or expression to one (or more) column values and indexes the return value. In Postgres the ->> operator extracts a value from a json document as text, so if you want the users that have bookclub_id = 1, you can create an index like
create index idx_bookclub_id on mytable ((jsonvalue ->> 'bookclub_id'));
Afterwards queries like
select * from mytable where jsonvalue ->> 'bookclub_id' = '1'
are lightning fast.
I have a sharded table with one primary-key column and a text column. The text column holds an object in JSON format. I want to enable ad hoc business analytics by using Drill or Presto.
I have experimented with both, but I am unable to figure out how to parse the JSON and access its fields in a query.
For Drill I tried convert_from(features, 'JSON') and for Presto I tried json_parse(features). Both seem to convert the text column to JSON in a simple select, but I cannot access the object's fields in the same query.
Performance is important, so I need to avoid unnecessary I/O; I am open to options requiring development effort or hardware scaling.
I was able to analyze the JSON column in Presto by using json_extract_scalar on the output of json_parse, e.g. json_extract_scalar(json_parse(features), '$.r_id'). This returns a string, which I can then cast to the required data type.
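Putting that together, a Presto query along these lines works; the table name and the cast target below are assumptions, while the features column and the $.r_id path come from the question and answer above:

select cast(json_extract_scalar(json_parse(features), '$.r_id') as bigint) as r_id,
       count(*) as cnt
from my_sharded_table
group by 1
order by cnt desc;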
I'm working on a website which has a database table with more than 100 fields.
The problem is that when the number of records gets large (more than 10,000 or so), the response time grows so much that the query effectively never returns an answer.
Now I want to optimize this table.
My question is: can I use a JSON-typed field to reduce the number of columns?
My constraint is that I still want to be able to search, change, and maybe remove the specific data stored in the JSON.
PS: I read this question: Storing JSON in database vs. having a new column for each key, but that was asked in 2013, and as we know MySQL 5.7 added a JSON field type.
Thanks for any guidance.
First of all, having a table with 100 columns suggests you should rethink your architecture before proceeding. Otherwise it will only become more and more painful in later stages.
Maybe you are storing data as separate columns that could instead be broken down and stored as separate rows.
I suspect the SQL query you are writing is something like select * ..., which fetches more columns than you actually require. If you specify only the columns you need, it will definitely speed up the API response, as in the sketch below.
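A minimal illustration of that point, with hypothetical table and column names standing in for whatever the page actually needs:

-- fetches all 100+ columns on every request
select * from big_table where id = 42;

-- fetches only what the page renders (column names are hypothetical)
select id, title, status, updated_at from big_table where id = 42;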
In my personal view, storing active data as JSON inside SQL is not useful. JSON should be a last resort, reserved for metadata that does not mutate and does not need to be searched.
Please make your question more descriptive about your database schema and the query your API is making.
My database is MySQL, and I found that there isn't a JSON type, only text. But in my situation I need to store JSON data and index it by the keys of the JSON objects, so the 'text' type isn't the right way. I also can't separate the JSON data into different columns, because I don't know in advance what keys the JSON will contain.
What I really need is a way to search JSON data by its keys in MySQL using an index or something similar, so that the search speed is fast and acceptable.
If you can't normalize your data into an RDBMS-friendly format, you should not use MySQL.
A noSQL database might be the better approach.
MySQL's strength is working with relational data.
You are trying to fit a square object through a round hole :)
select concat(
    '{"objects":[',
    (select group_concat(
        concat('{"id":"', id, '","field":"', field, '"}')
        separator ',')
     from placeholder),
    ']}'
);
Anything more generic would require dynamic SQL.
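For example, if the hypothetical placeholder table held the two rows (1, 'a') and (2, 'b'), the query above would return a single string like:
{"objects":[{"id":"1","field":"a"},{"id":"2","field":"b"}]}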
I think your situation calls for the Levenshtein Distance algorithm. You basically need to do a text search of an unknown quantity (the haystack) with a text fragment (the needle). And you need it to be fast, because indexing is out of the question.
There is a little-known (or little-used, at any rate) capability of MySQL whereby you can create User-Defined Functions. The function can be written as a stored function or, for extra speed, compiled in C++ as a native UDF. UDFs are used natively in your queries as though they were part of regular MySQL syntax.
Here are the details for implementing a Levenshtein User-Defined Function in MySQL.
An example query might be...
SELECT json FROM my_table WHERE levenshtein('needle', json) < 5;
The 5 is the maximum 'edit distance', which allows near matches to be located.