Power BI: grouping items in one column, with other columns' differing values appearing as several columns, without duplicates

I have a question about a table in Power BI. I have a first column that is the name or ID of a process, a second that is the number of the step within the process, and a third that is the time of the step. Example:
Column 1: [A, A, B, A, C, B, C]
Column 2: [1, 2, 1, 3, 1, 2, 2]
Column 3: [8, 9, 6, 10, 18, 7, 19]
I want it to appear as a table with the process in the first column (without duplicates), and new columns with the steps and associated hours, like:
Column 1: [A, B, C]
Column 2: [8, 6, 18] # hour of step 1 of each process
Column 3: [9, 7, 19] # hour of step 2 of each process
Column 4: [10, NaN, NaN] # hour of step 3 of each process
Is this possible to do directly in Power BI, or do I need to go through other tools such as Python?
Thank you,

You can achieve this with the following measures (put colA on the rows of a table or matrix visual and the three measures as values):
_1 = MINX(FILTER(tbl,tbl[colB]=1),tbl[colC])
_2 = MINX(FILTER(tbl,tbl[colB]=2),tbl[colC])
_3 = MINX(FILTER(tbl,tbl[colB]=3),tbl[colC])
A Power Query solution would be the following:
let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUTIEYgulWB0IzwiILeE8Y5C8AZjrBFVqBueBlJqDec5QOUMLOBckaQg0KBYA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [colA = _t, colB = _t, colC = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"colA", type text}, {"colB", Int64.Type}, {"colC", Int64.Type}}),
    #"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(#"Changed Type", {{"colB", type text}}, "en-US"), List.Distinct(Table.TransformColumnTypes(#"Changed Type", {{"colB", type text}}, "en-US")[colB]), "colB", "colC", List.Min)
in
    #"Pivoted Column"

Insert list of JSON objects into row based on other column values in dataframe

I have a dataframe with the following columns:
ID   A1   B1   C1   A2   B2   C2   A3   B3   C3
AA    1    3    6                    4    0    6
BB    5    5    4    6    7    9
CC    5    5    5
I want to create a new column called Z that takes each row, groups the values into a JSON list of records, and uses the column letter (without the number) as each key. After the JSON column is constructed, I want to drop all the other columns and keep only ID and Z.
Here is the desired output:
ID Z
AA [{"A":1, "B":3,"C":6},{"A":4, "B":0,"C":6}]
BB [{"A":5, "B":5,"C":4},{"A":6, "B":7,"C":9}]
CC [{"A":5, "B":5,"C":5}]
Here is my current attempt:
df2 = (df.groupby(['ID'])
         .apply(lambda x: x[['A1', 'B1', 'C1', 'A2', 'B2', 'C2',
                             'A3', 'B3', 'C3']].to_dict('records'))
         .to_frame('Z')
         .reset_index())
The problem is that I cannot rename the columns so that only the letter remains and the number is removed, as in the example above. The code above also does not split each group of three columns into its own object, so I get one big record per row instead of the two objects in my example list. I would like to accomplish this in pandas if possible. Any guidance is greatly appreciated.
Pandas solution
Convert the columns to a MultiIndex by splitting and expanding around a regex delimiter, then stack the dataframe to get a MultiIndex series, then group on level=0 and apply to_dict to create the records per ID:
s = df.set_index('ID')
# Split each column name just before its trailing digit: 'A1' -> ('A', '1')
s.columns = s.columns.str.split(r'(?=\d+$)', expand=True)
# Stack the digit level into rows, then build one list of records per ID
s.stack().groupby(level=0).apply(pd.DataFrame.to_dict, 'records').reset_index(name='Z')
Result
ID Z
0 AA [{'A': 1.0, 'B': 3.0, 'C': 6.0}, {'A': 4.0, 'B': 0.0, 'C': 6.0}]
1 BB [{'A': 5.0, 'B': 5.0, 'C': 4.0}, {'A': 6.0, 'B': 7.0, 'C': 9.0}]
2 CC [{'A': 5.0, 'B': 5.0, 'C': 5.0}]
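The regex is the key trick here: (?=\d+$) is a zero-width lookahead, so the split happens just before the trailing digit without consuming it. A quick check (hypothetical column names, single-digit suffixes as in the question):

import pandas as pd

cols = pd.Index(['A1', 'B2', 'C3'])
print(cols.str.split(r'(?=\d+$)', expand=True).tolist())
# [('A', '1'), ('B', '2'), ('C', '3')]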
Have you tried going row by row? I am not very good with pandas and Python, but I have this code. Hope it works for you.
import pandas as pd  # needed for pd.notna below

toAdd = []
for row in dataset.values:
    toAddLine = {}
    i = 0
    for data in row:
        if pd.notna(data):  # 'is not None' would keep NaN values, so test with pd.notna
            toAddLine["New Column Name " + dataset.columns[i]] = data
        i = i + 1
    toAdd.append(toAddLine)
dataset['Z'] = toAdd
dataset['Z']
import json

# Create a column-name map for renaming the related columns
columns = dataset.columns
columns_map = {}
for i in columns:
    columns_map[i] = f"new {i}"

def change_row_to_json(row):
    new_dict = {}
    for index, value in enumerate(row):
        new_dict[columns_map[columns[index]]] = value
    return json.dumps(new_dict, indent=4)

dataset.loc[:, 'Z'] = dataset.apply(change_row_to_json, axis=1)
dataset = dataset[["ID", "Z"]]
I just added a few lines to Shubham's code and it worked for me:
import pandas as pd
from numpy import nan

data = pd.DataFrame({
    'ID': {0: 'AA', 1: 'BB', 2: 'CC'},
    'A1': {0: 1, 1: 5, 2: 5},
    'B1': {0: 3, 1: 5, 2: 5},
    'C1': {0: 6, 1: 4, 2: 5},
    'A2': {0: nan, 1: 6.0, 2: nan},
    'B2': {0: nan, 1: 7.0, 2: nan},
    'C2': {0: nan, 1: 9.0, 2: nan},
    'A3': {0: 4.0, 1: nan, 2: nan},
    'B3': {0: 0.0, 1: nan, 2: nan},
    'C3': {0: 6.0, 1: nan, 2: nan},
})

# Move ID into the index so only the value columns get stacked
data.index = data.ID
data.drop(columns=['ID'], inplace=True)

data.columns = data.columns.str.split(r'(?=\d+$)', expand=True)
d = data.stack().groupby(level=0).apply(pd.DataFrame.to_dict, 'records').reset_index(name='Z')
d.index = d.ID
d.drop(columns=['ID'], inplace=True)
d.to_dict()['Z']
Now we can see we get the desired output. Thanks, @Shubham Sharma, for the answer; I think this might help.

MySQL: merge a JSON field with new data while removing duplicates, where the JSON values are simple scalar values

Suppose that I have a MySQL table with a JSON field that contains only numbers, like this (note: using MySQL 8):
CREATE TABLE my_table (
    id int,
    some_field json
);
Sample data:
id: 1
some_field: [1, 2, 3, 4, 5]
id: 2
some_field: [3, 6, 7]
id: 3
some_field: null
I would like to merge another array of data with the existing values of some_field, while removing duplicates. I was hoping that this might work, but it didn't:
update my_table set some_field = JSON_MERGE([1, 2, 3], some_field)
The desired result would be:
id: 1
some_field: [1, 2, 3, 4, 5]
id: 2
some_field: [1, 2, 3, 6, 7]
id: 3
some_field: [1, 2, 3]
Considering you have 3 records in your table and you want to merge records 1 and 2 as mentioned in your example, I hope JavaScript is suitable for you to follow along.
// Get both the records.
const records = db.execute("SELECT id, some_field FROM my_table WHERE id = 1 OR id = 2");

// You get both the rows. For merging, you can use the Set data structure if you're
// dealing with numbers as in your example, or loop with a map and use the spread
// operator for general JSON. Since your value is an array, I'll just show how to
// merge two arrays ('+' would concatenate them as strings, so spread them instead).
const merged = Array.from(new Set([...records[0].some_field, ...records[1].some_field]));
records[0].some_field = merged;
records[1].some_field = merged;

// Now update both the records in the database one by one.
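If it helps, here is a minimal sketch of the same read-merge-write idea in Python, assuming the mysql-connector-python driver and the table from the question (an outline, not a drop-in solution):

import json
import mysql.connector  # assumed driver; any DB-API connector works similarly

conn = mysql.connector.connect(user="...", password="...", database="...")  # fill in your credentials
cur = conn.cursor()

new_values = [1, 2, 3]

cur.execute("SELECT id, some_field FROM my_table")
for row_id, raw in cur.fetchall():
    # raw is None for SQL NULL; otherwise the JSON comes back as a string to parse
    current = json.loads(raw) if raw is not None else None
    merged = sorted(set(current or []) | set(new_values))
    cur.execute(
        "UPDATE my_table SET some_field = %s WHERE id = %s",
        (json.dumps(merged), row_id),
    )

conn.commit()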

Looking for a "sum and combine" function for JSON columns

In PostgreSQL I can't find in the docs a function that would allow me to combine n JSON entities, summing the values when a key/value pair already exists.
English not being my main language, I suspect I don't know the right terms to search for.
In other words, from a table with two columns
name      data
'didier'  {'vinyl': 2, 'cd': 3}
'Anne'    {'cd': 1, 'tape': 4}
'Pierre'  {'cd': 1, 'tape': 9, 'mp3': 2}
I want to produce the following result:
{'vinyl': 2, 'cd': 5, 'tape': 13, 'mp3': 2}
with what would be a "combine and sum" function.
Thanks in advance for any idea.
Didier
Using a the_table CTE for illustration: first "normalize" the data column, then sum per item type (k), and finally aggregate into a JSONB object.
with the_table("name", data) as
(
    values
    ('didier', '{"vinyl": 2, "cd": 3}'::jsonb),
    ('Anne', '{"cd" : 1, "tape" : 4}'),
    ('Pierre', '{"cd" : 1, "tape": 9, "mp3":2}')
)
select jsonb_object_agg(k, v) from
(
select lat.k, sum((lat.v)::integer) v
from the_table
cross join lateral jsonb_each(data) as lat(k, v)
group by lat.k
) t;
-- {"cd": 5, "mp3": 2, "tape": 13, "vinyl": 2}

How to make Lua table keys stay in order

My test code:
local jsonc = require "jsonc"

local x = {
    a = 1,
    b = 2,
    c = 3,
    d = 4,
    e = 5,
}

for k, v in pairs(x) do
    print(k, v)
end

print(jsonc.stringify(x))
output:
a 1
c 3
b 2
e 5
d 4
{"a":1,"c":3,"b":2,"e":5,"d":4}
Someone help: as the pairs output shows, Lua stores table keys in hash order. How can I change that?
I need this output: {"a":1,"b":2,"c":3,"d":4,"e":5}
Thanks.
Lua tables can't preserve the order of their keys. There are two possible solutions.
You can store the keys in a separate array and iterate through that whenever you need to iterate through the table:
local keys = {'a', 'b', 'c', 'd', 'e'}
Or, instead of a hash table, you can use an array of pairs:
local x = {
    {'a', 1},
    {'b', 2},
    {'c', 3},
    {'d', 4},
    {'e', 5},
}

Django: how can I get a list of Users including duplicates?

I have an array like this: [1, 1, 1, 2, 3]. How can I get the users including duplicates? For example, this query returns a list without duplicates:
list = User.objects.filter(id__in=[1, 1, 1, 2, 3])
For example, it will return users with IDs:
1,
2,
3
but I need a list of users like this:
1,
1,
1,
2,
3
users = []
for x in [1, 1, 1, 2, 3]:
    users.append(User.objects.get(id=x))  # one query per id
Is this what you mean? I don't quite understand the spacing.
Get your queryset sorted in the right order, .order_by('id') for ascending by id (which may be the default anyway). Then iterate over the queryset, repeating operations on the same object (or a copy thereof) as dictated by the list of IDs.
idlist = [1, 1, 1, 2, 3]
queryset = User.objects.filter(id__in=idlist).order_by('id')
for obj in queryset:
    for _ in range(idlist.count(obj.id)):
        do_something_with(obj)
Note, this is only one DB call (one queryset), unlike the accepted answer which does one DB query for each element in the id list. Not good.
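Another single-query variant, if you also want to keep the original order of the id list, is in_bulk; a sketch (do_something_with is the same placeholder as above):

idlist = [1, 1, 1, 2, 3]
# One query; returns a {id: User} dict for the distinct ids
users_by_id = User.objects.in_bulk(idlist)
# Rebuild the list with duplicates, in the order of idlist
users = [users_by_id[i] for i in idlist if i in users_by_id]
for user in users:
    do_something_with(user)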