I have a JSON file with data about employees and their skills. I need to model this data in a PostgreSQL database (for reasons related to the application we are developing).
The JSON file contains a lot of data that I don't really need for my application (at least for now); I only need a few columns: employee ID, name, and qualifications. The rest of the data should still be stored in the table, but only temporarily, as this is still a POC.
Data
{
    "employee": {
        "ID": 654534543,,
        "Name": "Max Mustermann",
        "Email": "max.mustermann#firma.de",
        "skills": [
            {"name": python, "level": 3},
            {"name": c, "level": 2},
            {"name": openCV, "level": 3}
        ],
    },
    "employee": {
        "ID": 3213213,,
        "Name": "Alex Mustermann",
        "Email": "alex.mustermann#firma.de",
        "skills": [
            {"name": Jira, "level": 3},
            {"name": Git, "level": 2},
            {"name": Tensorflow, "level": 3}
        ],
    }
};
I thought of creating a table with the following columns: the employee ID as primary key, a VARCHAR for the name, an array for the skills, and a JSONB column for the rest of the information about the employee.
TABLE
CREATE TABLE employee(
    id INT PRIMARY KEY,
    name VARCHAR(255) NOT NULL,
    position VARCHAR(255) NOT NULL,
    description VARCHAR(255),
    skills TEXT[],
    join_date DATE
);
Some factors to keep in mind: the data will be updated periodically (let's say once a month), and the application should use the database to query for one or more employee IDs covering a certain required skill set (and skill levels). So far we are not sure whether we will query the JSON fields, but it could become necessary in the near future.
Also, the real data is complicated and dense (what I attached above is merely a simplified sample), so I guess querying directly from a JSONB column would not be convenient (as mentioned in other similar questions).
My questions now are:
1- Would the proposed data model meet the requirements, given that we have a huge JSON data file: fast search for employee skills, scalability, and easy/fast querying and retrieval of employee data (e.g. by employee ID)?
2- What should be considered when developing a relational database schema?
3- Would there be advantages to splitting the data into multiple tables? E.g. one table for employee personal data with the employee ID as primary key, one table for skills with the employee ID as a foreign key and a text field for the skills, and one JSON table for the rest of the data.
I am using PostgreSQL 15.1 on Windows 10, and I am still getting familiar with PostgreSQL databases.
Many thanks!
Here is what I would do:
create table employee (
id bigint not null primary key,
name text not null,
email text not null
);
create table skill (
id bigint generated always as identity primary key,
skill_name text not null unique
);
create table employee_skill (
id bigint generated always as identity primary key,
employee_id bigint not null references employee(id),
skill_id bigint not null references skill(id),
skill_level int not null,
unique (employee_id, skill_id)
);
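Since the rest of the feed should be kept around temporarily, one option (my addition, not required by the schema above) is an extra jsonb column on employee:

-- hypothetical column to stash the not-yet-modeled parts of each
-- employee's JSON for the duration of the POC; drop it later
alter table employee add column raw_data jsonb;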
Then, to populate the schema (after correcting the errors with the JSON):
with indata as (
select '[
{
"ID": 654534543,
"Name": "Max Mustermann",
"Email": "max.mustermann#firma.de",
"skills": [
{"name": "python", "level": 3},
{"name": "c", "level": 2},
{"name": "openCV", "level": 3}
]
},
{
"ID": 3213213,
"Name": "Alex Mustermann",
"Email": "alex.mustermann#firma.de",
"skills":[
{"name": "Jira", "level": 3},
{"name": "Git", "level": 2},
{"name": "Tensorflow", "level": 3}
]
}
]'::jsonb as j
), expand as (
select emp, skill
from indata
cross join lateral jsonb_array_elements(j) as el(emp)
cross join lateral jsonb_array_elements(emp->'skills') as sk(skill)
), insemp as (
insert into employee (id, name, email)
select distinct (emp->>'ID')::bigint, emp->>'Name', emp->>'Email'
from expand
on conflict (id) do update
set name = excluded.name, email = excluded.email
returning *
), insskill as (
insert into skill (skill_name)
select distinct skill->>'name'
from expand
on conflict (skill_name) do nothing
returning *
), allemp as (
select * from insemp union select * from employee
), allskill as (
select * from insskill union select * from skill
), insempskill as (
insert into employee_skill (employee_id, skill_id, skill_level)
select e.id as employee_id, s.id as skill_id,
(i.skill->>'level')::int as skill_level
from expand i
join allemp e on e.id = (i.emp->>'ID')::bigint
join allskill s on s.skill_name = i.skill->>'name'
on conflict (employee_id, skill_id) do update
set skill_level = excluded.skill_level
returning *
)
-- remove skill assignments that are absent from the incoming (full) feed
delete from employee_skill
where (employee_id, skill_id) not in
(select employee_id, skill_id from insempskill);
See working fiddle
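With the data loaded, the skill-set requirement from the question becomes a plain relational query. Here is a sketch; the required skills and minimum levels in the values list are made-up examples:

-- employees covering ALL required skills at (at least) the required levels
select e.id, e.name
from employee e
join employee_skill es on es.employee_id = e.id
join skill s on s.id = es.skill_id
join (values ('python', 2), ('openCV', 3)) as req(skill_name, min_level)
  on req.skill_name = s.skill_name
 and es.skill_level >= req.min_level
group by e.id, e.name
having count(*) = 2;  -- must equal the number of required skills

The having clause enforces full coverage: an employee missing any one required skill, or holding it below the required level, drops out of the result.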
Related
I am looking for a way to find rows by a given element of the JSON column that matches a pattern.
Let's start with a MySQL table:
CREATE TABLE `person` (
`attributes` json DEFAULT NULL
);
INSERT INTO `person` (`attributes`)
VALUES ('[{"scores": 1, "name": "John"},{"scores": 1, "name": "Adam"}]');
INSERT INTO `person` (`attributes`)
VALUES ('[{"scores": 1, "name": "Johny"}]');
INSERT INTO `person` (`attributes`)
VALUES ('[{"scores": 1, "name": "Peter"}]');
How do I find all records where attributes[*].name matches the John* pattern?
In the John* case, the query should return 2 records (the ones with John and Johny).
SELECT DISTINCT person.*
FROM person
CROSS JOIN JSON_TABLE(person.attributes, '$[*]' COLUMNS (name TEXT PATH '$.name')) parsed
WHERE parsed.name LIKE 'John%';
https://sqlize.online/sql/mysql80/c9e4a3ffa159c4be8c761d696e06d946/
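If you are stuck on MySQL 5.7, which lacks JSON_TABLE, here is a sketch of an alternative using JSON_SEARCH (it accepts the same LIKE-style wildcards):

-- JSON_SEARCH returns the path of the first match, or NULL when
-- nothing under $[*].name matches the pattern
SELECT *
FROM person
WHERE JSON_SEARCH(attributes, 'one', 'John%', NULL, '$[*].name') IS NOT NULL;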
I'm a beginner, trying to insert JSON values into the database by following a tutorial.
I have created the table using the following command:
CREATE TABLE table_name(
    id character varying(50),
    data json NOT NULL,
    active boolean NOT NULL,
    created_at timestamp with time zone NOT NULL,
    updated_at timestamp with time zone NOT NULL,
    CONSTRAINT table_name_pkey PRIMARY KEY (id)
);
The table table_name is created successfully.
Now I am trying to insert the values into the database:
INSERT INTO table_name
SELECT id,data,active,created_at,updated_at
FROM json_populate_record (NULL::table_name,
'{
"id": "1",
"data":{
"key":"value"
},
"active":true,
"created_at": SELECT NOW(),
"updated_at": SELECT NOW()
}'
);
It throws the following error:
ERROR: Invalid input syntax for type JSON '{
Could anyone help me resolve this and insert the JSON values into the DB?
You can't include arbitrary SQL commands inside a JSON string. From a JSON "perspective", SELECT NOW() is an invalid value because it lacks the double quotes. But even if you wrote "select now()", it would be stored as that literal string; it would not be executed as a SQL query and replaced with the current timestamp.
But I don't understand why you are wrapping this in json_populate_record at all. The better solution (at least in my opinion) would be:
INSERT INTO table_name (id, data, active, created_at, updated_at)
VALUES ('1', '{"key": "value"}', true, now(), now());
If you really want to complicate things, you can build the JSON string with format():
SELECT id,data,active,created_at,updated_at
FROM json_populate_record (NULL::table_name,
format('{
"id": "1",
"data":{
"key":"value"
},
"active":true,
"created_at": "%s",
"updated_at": "%s"
}', now(), now())::json
);
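If the two timestamps should simply always be "now", a third option (my suggestion, not from the tutorial) is to let column defaults fill them in, so the JSON only carries the actual payload:

ALTER TABLE table_name
    ALTER COLUMN created_at SET DEFAULT now(),
    ALTER COLUMN updated_at SET DEFAULT now();

-- the JSON no longer needs any timestamps at all
INSERT INTO table_name (id, data, active)
SELECT id, data, active
FROM json_populate_record(NULL::table_name,
    '{"id": "1", "data": {"key": "value"}, "active": true}');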
Suppose I have a table user:
PK id: int
firstname: varchar
lastname: varchar
FK group_id: int
And a table group:
PK id: int
name: varchar
Now I want to be able to send a student JSON object to the server that contains:
{
"firstname": "john",
"lastname": "doe",
"group": "group_name"
}
How do I insert it into the tables? How can I ensure that a row with "group_name" exists, and if not, create it, and only then insert the student with the corresponding group_id FK?
What I would do is something like:
select id from group where name="group_name"
if not exists:
insert into group values ("group_name")
insert into user values("john", "doe", the_existing_or_newly_inserted_group_id)
But it seems a bit overkill in terms of number of requests.
In MySQL, you can do this with just two statements:
insert into groups (group_name)
select s.group_name
from (select :group_name as group_name) s
where not exists (select 1 from groups g where g.group_name = s.group_name);
insert into users (firstname, lastname, group_id)
select :firstname, :lastname, g.group_id
from groups g
where g.group_name = :group_name;
The first statement creates the new group if it does not yet exist. The second statement recovers the id of the group and inserts the user information.
For this to properly work, group_name must be a unique key in the groups table.
Notes:
You might want to wrap the statements in a single transaction so concurrency is properly managed
Values preceded by : represent query parameters that should be passed in from the application
Both user and group are language keywords, hence poor choices for object names; I changed the table names to users and groups.
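Since group_name has to be a unique key anyway, MySQL's INSERT IGNORE is a shorter (if blunter) way to write the first statement; a sketch:

-- skips the insert when the unique key on group_name already exists;
-- caveat: it also silently ignores any OTHER error on that row
INSERT IGNORE INTO groups (group_name) VALUES (:group_name);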
I have a table of data with an id column and a jsonb column with object data:
id (int)
data (jsonb)
I'm querying the data like this
select row_to_json(t)
from (
select id, data from accounts
) t;
which gets me data that looks like this:
{
  "id": 3,
  "data": {
    "emailAddress": "someone#gmail.com",
    "mobileNumbers": ["5559991212"]
  }
}
I would like to merge the data field into the main record set; basically, I want the keys of the data node in the main record:
{
  "id": 3,
  "emailAddress": "someone#gmail.com",
  "mobileNumbers": ["5559991212"]
}
You can use
SELECT jsonb_set(data, '{id}', to_jsonb(id))
FROM accounts;
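An equivalent spelling uses the jsonb concatenation operator; keys from the right-hand object win on collision, so an id key already present in data would be overwritten:

SELECT data || jsonb_build_object('id', id)
FROM accounts;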
I cannot help remarking that a table with just a primary key and a jsonb column looks like problematic database design to me.
I've found my answer here countless times, but not this time.
I have PostgreSQL 11.1 and table
CREATE TABLE public.bots_script_elements (
id integer,
icon text,
CONSTRAINT bots_script_elements_pri PRIMARY KEY (id)
);
Values:
ID   ICON
1    "begin"
2    "form"
3    "calendar"
How can I select the data as the JSON below?
{
"1": {"id":1, "icon":"begin"},
"2": {"id":2, "icon":"form"},
"3": {"id":3, "icon":"calendar"}
}
The JSON object keys 1, 2, and 3 are the values from the ID column.
Use the aggregate function jsonb_object_agg():
select jsonb_object_agg(id, to_jsonb(b))
from bots_script_elements b
Test it in rextester.
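If the table later grows columns that you do not want in the output, build the inner objects explicitly instead of serializing the whole row; a variation on the query above:

select jsonb_object_agg(id, jsonb_build_object('id', id, 'icon', icon))
from bots_script_elements;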