Power BI: grouping items in one column, with other columns' differing values appearing as several columns, without duplicates

I have a question about a table in Power BI. I have a first column that is the name or ID of a process, a second that is the number of the step within the process, and a third that is the time of the step. Example:
Column 1: [A, A, B, A, C, B, C]
Column 2: [1, 2, 1, 3, 1, 2, 2]
Column 3: [8, 9, 6, 10, 18, 7, 19]
I want it to appear as a table with the process in the first column (without duplicates), and new columns with the steps and associated hours, like:
Column 1: [A, B, C]
Column 2: [8, 6, 18] # hour of step 1 of each process
Column 3: [9, 7, 19] # hour of step 2 of each process
Column 4: [10, NaN, NaN] # hour of step 3 of each process
Is this possible to do directly in Power BI, or do I need to go through other tools such as Python?
Thank you,

You can achieve this with the following measures (put colA on the rows of a table or matrix visual and the three measures as values):
_1 = MINX(FILTER(tbl,tbl[colB]=1),tbl[colC])
_2 = MINX(FILTER(tbl,tbl[colB]=2),tbl[colC])
_3 = MINX(FILTER(tbl,tbl[colB]=3),tbl[colC])
A Power Query solution would be the following:
let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUTIEYgulWB0IzwiILeE8Y5C8AZjrBFVqBueBlJqDec5QOUMLOBckaQg0KBYA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [colA = _t, colB = _t, colC = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"colA", type text}, {"colB", Int64.Type}, {"colC", Int64.Type}}),
    #"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(#"Changed Type", {{"colB", type text}}, "en-US"), List.Distinct(Table.TransformColumnTypes(#"Changed Type", {{"colB", type text}}, "en-US")[colB]), "colB", "colC", List.Min)
in
    #"Pivoted Column"

Insert list of JSON objects into row based on other column values in dataframe

I have a dataframe with the following columns:
ID   A1   B1   C1   A2   B2   C2   A3   B3   C3
AA    1    3    6                    4    0    6
BB    5    5    4    6    7    9
CC    5    5    5
I want to create a new column called Z that takes each row, groups the values into a JSON list of records, and uses the column letter (without the number) as each key. After the JSON column is constructed, I want to drop all the other columns and keep only ID and Z.
Here is the desired output:
ID Z
AA [{"A":1, "B":3,"C":6},{"A":4, "B":0,"C":6}]
BB [{"A":5, "B":5,"C":4},{"A":6, "B":7,"C":9}]
CC [{"A":5, "B":5,"C":5}]
Here is my current attempt:
df2 = (df.groupby(['ID'])
         .apply(lambda x: x[['A1', 'B1', 'C1', 'A2', 'B2', 'C2',
                             'A3', 'B3', 'C3']].to_dict('records'))
         .to_frame('Z')
         .reset_index())
The problem is that I cannot rename the columns so that only the letter remains and the number is removed, as in the example above. The code above also does not split each group of three columns into its own object, so I get one big record per row instead of the two objects in my example list. I would like to accomplish this in pandas if possible. Any guidance is greatly appreciated.
Pandas solution
Convert the columns to a MultiIndex by splitting and expanding around a regex delimiter, then stack the dataframe to get a MultiIndex series, then group on level=0 and apply to_dict to create the records per ID:
s = df.set_index('ID')
# Split each column name just before its trailing digit: 'A1' -> ('A', '1')
s.columns = s.columns.str.split(r'(?=\d+$)', expand=True)
# Stack the digit level into rows, then build one list of records per ID
s.stack().groupby(level=0).apply(pd.DataFrame.to_dict, 'records').reset_index(name='Z')
Result
ID Z
0 AA [{'A': 1.0, 'B': 3.0, 'C': 6.0}, {'A': 4.0, 'B': 0.0, 'C': 6.0}]
1 BB [{'A': 5.0, 'B': 5.0, 'C': 4.0}, {'A': 6.0, 'B': 7.0, 'C': 9.0}]
2 CC [{'A': 5.0, 'B': 5.0, 'C': 5.0}]
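The regex is the key trick here: (?=\d+$) is a zero-width lookahead, so the split happens just before the trailing digit without consuming it. A quick check (hypothetical column names, single-digit suffixes as in the question):

import pandas as pd

cols = pd.Index(['A1', 'B2', 'C3'])
print(cols.str.split(r'(?=\d+$)', expand=True).tolist())
# [('A', '1'), ('B', '2'), ('C', '3')]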
Have you tried going row by row? I am not very good with pandas and Python, but I have this code. Hope it works for you.
import pandas as pd  # needed for pd.notna below

toAdd = []
for row in dataset.values:
    toAddLine = {}
    i = 0
    for data in row:
        if pd.notna(data):  # 'is not None' would keep NaN values, so test with pd.notna
            toAddLine["New Column Name " + dataset.columns[i]] = data
        i = i + 1
    toAdd.append(toAddLine)
dataset['Z'] = toAdd
dataset['Z']
import json

# Create a column-name map for renaming the related columns
columns = dataset.columns
columns_map = {}
for i in columns:
    columns_map[i] = f"new {i}"

def change_row_to_json(row):
    new_dict = {}
    for index, value in enumerate(row):
        new_dict[columns_map[columns[index]]] = value
    return json.dumps(new_dict, indent=4)

dataset.loc[:, 'Z'] = dataset.apply(change_row_to_json, axis=1)
dataset = dataset[["ID", "Z"]]
I just added a few lines to Shubham's code and it worked for me:
import pandas as pd
from numpy import nan

data = pd.DataFrame({
    'ID': {0: 'AA', 1: 'BB', 2: 'CC'},
    'A1': {0: 1, 1: 5, 2: 5},
    'B1': {0: 3, 1: 5, 2: 5},
    'C1': {0: 6, 1: 4, 2: 5},
    'A2': {0: nan, 1: 6.0, 2: nan},
    'B2': {0: nan, 1: 7.0, 2: nan},
    'C2': {0: nan, 1: 9.0, 2: nan},
    'A3': {0: 4.0, 1: nan, 2: nan},
    'B3': {0: 0.0, 1: nan, 2: nan},
    'C3': {0: 6.0, 1: nan, 2: nan},
})

# Move ID into the index so only the value columns get stacked
data.index = data.ID
data.drop(columns=['ID'], inplace=True)

data.columns = data.columns.str.split(r'(?=\d+$)', expand=True)
d = data.stack().groupby(level=0).apply(pd.DataFrame.to_dict, 'records').reset_index(name='Z')
d.index = d.ID
d.drop(columns=['ID'], inplace=True)
d.to_dict()['Z']
Now we can see we get the desired output. Thanks, @Shubham Sharma, for the answer; I think this might help.

MySQL: merge a JSON field with new data while removing duplicates, where the JSON values are simple scalar values

Suppose that I have a MySQL table with a JSON field that contains only numbers, like this (note: using MySQL 8):
CREATE TABLE my_table (
    id int,
    some_field json
);
Sample data:
id: 1
some_field: [1, 2, 3, 4, 5]
id: 2
some_field: [3, 6, 7]
id: 3
some_field: null
I would like to merge another array of data with the existing values of some_field, while removing duplicates. I was hoping that this might work, but it didn't:
update my_table set some_field = JSON_MERGE([1, 2, 3], some_field)
The desired result would be:
id: 1
some_field: [1, 2, 3, 4, 5]
id: 2
some_field: [1, 2, 3, 6, 7]
id: 3
some_field: [1, 2, 3]
Considering you have 3 records in your table and you want to merge records 1 and 2 as mentioned in your example, I hope JavaScript is suitable for you to follow along.
// Get both the records.
const records = db.execute("SELECT id, some_field FROM my_table WHERE id = 1 OR id = 2");

// You get both the rows. For merging, you can use the Set data structure if you're
// dealing with numbers as in your example, or loop with a map and use the spread
// operator for general JSON. Since your value is an array, I'll just show how to
// merge two arrays ('+' would concatenate them as strings, so spread them instead).
const merged = Array.from(new Set([...records[0].some_field, ...records[1].some_field]));
records[0].some_field = merged;
records[1].some_field = merged;

// Now update both the records in the database one by one.
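If it helps, here is a minimal sketch of the same read-merge-write idea in Python, assuming the mysql-connector-python driver and the table from the question (an outline, not a drop-in solution):

import json
import mysql.connector  # assumed driver; any DB-API connector works similarly

conn = mysql.connector.connect(user="...", password="...", database="...")  # fill in your credentials
cur = conn.cursor()

new_values = [1, 2, 3]

cur.execute("SELECT id, some_field FROM my_table")
for row_id, raw in cur.fetchall():
    # raw is None for SQL NULL; otherwise the JSON comes back as a string to parse
    current = json.loads(raw) if raw is not None else None
    merged = sorted(set(current or []) | set(new_values))
    cur.execute(
        "UPDATE my_table SET some_field = %s WHERE id = %s",
        (json.dumps(merged), row_id),
    )

conn.commit()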

Looking for a "sum and combine" function for JSON columns

In PostgreSQL I can't find in the docs a function that would allow me to combine n JSON entities, summing the values when a key/value pair already exists.
English not being my main language, I suspect I don't know the right terms to search for.
In other words, from a table with two columns
name      data
'didier'  {'vinyl': 2, 'cd': 3}
'Anne'    {'cd': 1, 'tape': 4}
'Pierre'  {'cd': 1, 'tape': 9, 'mp3': 2}
I want to produce the following result:
{'vinyl': 2, 'cd': 5, 'tape': 13, 'mp3': 2}
with what would be a "combine and sum" function.
Thanks in advance for any idea.
Didier
Using a the_table CTE for illustration: first "normalize" the data column, then sum per item type (k), and finally aggregate into a JSONB object.
with the_table("name", data) as
(
    values
    ('didier', '{"vinyl": 2, "cd": 3}'::jsonb),
    ('Anne', '{"cd" : 1, "tape" : 4}'),
    ('Pierre', '{"cd" : 1, "tape": 9, "mp3":2}')
)
select jsonb_object_agg(k, v) from
(
select lat.k, sum((lat.v)::integer) v
from the_table
cross join lateral jsonb_each(data) as lat(k, v)
group by lat.k
) t;
-- {"cd": 5, "mp3": 2, "tape": 13, "vinyl": 2}

How to make Lua table keys stay in order

My test code:
local jsonc = require "jsonc"

local x = {
    a = 1,
    b = 2,
    c = 3,
    d = 4,
    e = 5,
}

for k, v in pairs(x) do
    print(k, v)
end

print(jsonc.stringify(x))
output:
a 1
c 3
b 2
e 5
d 4
{"a":1,"c":3,"b":2,"e":5,"d":4}
Someone help: as the pairs output shows, Lua stores table keys in hash order. How can I change that?
I need this output: {"a":1,"b":2,"c":3,"d":4,"e":5}
Thanks.
Lua tables can't preserve the order of their keys. There are two possible solutions.
You can store the keys in a separate array and iterate through that whenever you need to iterate through the table:
local keys = {'a', 'b', 'c', 'd', 'e'}
Or, instead of a hash table, you can use an array of pairs:
local x = {
    {'a', 1},
    {'b', 2},
    {'c', 3},
    {'d', 4},
    {'e', 5},
}

Django: how can I get a list of Users including duplicates?

I have an array like this: [1, 1, 1, 2, 3]. How can I get the users including duplicates? For example, this query returns a list without duplicates:
list = User.objects.filter(id__in=[1, 1, 1, 2, 3])
For example, it will return users with IDs:
1,
2,
3
but I need a list of users like this:
1,
1,
1,
2,
3
users = []
for x in [1, 1, 1, 2, 3]:
    users.append(User.objects.get(id=x))  # one query per id
Is this what you mean? I don't quite understand the spacing.
Get your queryset sorted in the right order, .order_by('id') for ascending by id (which may be the default anyway). Then iterate over the queryset, repeating operations on the same object (or a copy thereof) as dictated by the list of IDs.
idlist = [1, 1, 1, 2, 3]
queryset = User.objects.filter(id__in=idlist).order_by('id')
for obj in queryset:
    for _ in range(idlist.count(obj.id)):
        do_something_with(obj)
Note, this is only one DB call (one queryset), unlike the accepted answer which does one DB query for each element in the id list. Not good.
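Another single-query variant, if you also want to keep the original order of the id list, is in_bulk; a sketch (do_something_with is the same placeholder as above):

idlist = [1, 1, 1, 2, 3]
# One query; returns a {id: User} dict for the distinct ids
users_by_id = User.objects.in_bulk(idlist)
# Rebuild the list with duplicates, in the order of idlist
users = [users_by_id[i] for i in idlist if i in users_by_id]
for user in users:
    do_something_with(user)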