I need a DAX measure that returns, per date, only the count of duplicate values. For each date it should look at the previous day plus the current day. For example, on 2/2/2022 it should check both 2/1/2022 and 2/2/2022 and return only the duplicate count.
For the data set below, the duplicate count on 2/2/2022 should be 4 (the highlighted rows).
Total output will be:
I have already written a DAX measure, but it returns the distinct count rather than only the duplicates:
CALCULATE(DISTINCTCOUNT(data[Wrap-up]),DATESINPERIOD(data[Date],MAX(data[Date]),-2,DAY))
MyMeasure :=
VAR ThisDate =
    MAX( data[Date] )
VAR MyTable =
    SUMMARIZE(
        FILTER( ALL( data ), data[Date] <= ThisDate && data[Date] >= ThisDate - 1 ),
        data[Wrap-up],
        "MyCount", COUNTROWS( data )
    )
RETURN
    SUMX( MyTable, IF( [MyCount] > 1, [MyCount] ) )
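The measure's logic can be sanity-checked outside DAX. A minimal stdlib Python sketch of the same rule (two-day window, then sum the counts of Wrap-up values that occur more than once), using the sample rows from the question:

```python
from collections import Counter
from datetime import date, timedelta

# Sample (Date, Wrap-up) rows for 2/1/2022 and 2/2/2022 from the question
rows = [
    (date(2022, 2, 1), "Subscription_ACCBalance_QFTR"),
    (date(2022, 2, 1), "Package_AKASH Lite Plus_RFTR"),
    (date(2022, 2, 1), "Package_AKASH LITE_RFTR"),
    (date(2022, 2, 1), "Subscription_Service Act_QFTR"),
    (date(2022, 2, 1), "Package_AKASH Lite Plus_RFTR"),
    (date(2022, 2, 1), "Camp Offer_Referral Offer_RFTR"),
    (date(2022, 2, 2), "Content_Specefic Channel_CFTR"),
    (date(2022, 2, 2), "Subscription_ACCBalance_QFTR"),
    (date(2022, 2, 2), "Package_AKASH STANDARD_RFTR"),
    (date(2022, 2, 2), "Camp Offer_Referral Offer_QFTR"),
    (date(2022, 2, 2), "Error Code_E17-0_CFTR"),
]

def duplicate_count(rows, this_date):
    """Count rows whose Wrap-up value occurs more than once in the
    two-day window [this_date - 1 day, this_date]."""
    window = [w for d, w in rows if this_date - timedelta(days=1) <= d <= this_date]
    counts = Counter(window)
    return sum(c for c in counts.values() if c > 1)

print(duplicate_count(rows, date(2022, 2, 2)))  # 4, matching the expected output
```

Here Subscription_ACCBalance_QFTR (once each day) and Package_AKASH Lite Plus_RFTR (twice on 2/1) each contribute 2, giving 4.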
Date Wrap-up
2/1/2022 Subscription_ACCBalance_QFTR
2/1/2022 Package_AKASH Lite Plus_RFTR
2/1/2022 Package_AKASH LITE_RFTR
2/1/2022 Subscription_Service Act_QFTR
2/1/2022 Package_AKASH Lite Plus_RFTR
2/1/2022 Camp Offer_Referral Offer_RFTR
2/2/2022 Content_Specefic Channel_CFTR
2/2/2022 Subscription_ACCBalance_QFTR
2/2/2022 Package_AKASH STANDARD_RFTR
2/2/2022 Camp Offer_Referral Offer_QFTR
2/2/2022 Error Code_E17-0_CFTR
2/3/2022 Package_AKASH STANDARD_RFTR
2/3/2022 Package_AKASH STANDARD_QFTR
2/3/2022 Package_Package Info_QFTR
2/3/2022 Package_AKASH LITE_QFTR
2/4/2022 Subscription_ACCBalance_QFTR
2/4/2022 Content_Specefic Channel_QFTR
2/4/2022 Info Update_HelpPage_CFTR
2/4/2022 Purchase_General Process_QFTR
2/4/2022 Package_AKASH STANDARD_RFTR
2/4/2022 Recharge_bKash_QFTR
2/4/2022 Camp Offer_Feb Recharge_QFTR
2/4/2022 Subscription_Classific_QFTR
The table has 1000 rows, and I want to add a new column with a Row_number that repeats after a count of 3.
INPUT TABLE
ID  Name            CountryCode  District       Population
1   Kabul           Afg          Kabol          178000
2   Qandahar        Afg          Qandahar       237500
3   Herat           Afg          Herat          186800
4   Mazar-e-Sharif  Afg          Balkh          127800
5   Amsterdam       Nld          Noord-Holland  731200
6   Rotterdam       Nld          Zuid-Holland   593321
7   Haag            Nld          Zuid-Holland   440900
8   Utrecht         Nld          Utrecht        234323
9   Eindhoven       Nld          Noord-Brabant  201843
10  Tilburg         Nld          Noord-Brabant  193238
11  Groningen       Nld          Groningen      172701
OUTPUT TABLE
ID  Name            CountryCode  District       Population  Row_Number
1   Kabul           Afg          Kabol          178000      1
2   Qandahar        Afg          Qandahar       237500      2
3   Herat           Afg          Herat          186800      3
4   Mazar-e-Sharif  Afg          Balkh          127800      1
5   Amsterdam       Nld          Noord-Holland  731200      2
6   Rotterdam       Nld          Zuid-Holland   593321      3
7   Haag            Nld          Zuid-Holland   440900      1
8   Utrecht         Nld          Utrecht        234323      2
9   Eindhoven       Nld          Noord-Brabant  201843      3
10  Tilburg         Nld          Noord-Brabant  193238      1
11  Groningen       Nld          Groningen      172701      2
You should use ROW_NUMBER() to get a row number, then you can use various ways to turn it into a repeating 1,2,3 sequence, for instance modulus (note that ROW_NUMBER() requires an OVER clause):
CASE WHEN MOD(ROW_NUMBER() OVER (ORDER BY ID), 3) = 0 THEN 3 ELSE MOD(ROW_NUMBER() OVER (ORDER BY ID), 3) END AS row_number
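The modulus trick is easy to verify outside SQL; a minimal Python sketch of the same arithmetic (row positions 1..n mapped onto the 1,2,3 cycle):

```python
def repeating_row_number(n_rows, cycle=3):
    """Equivalent of:
    CASE WHEN MOD(ROW_NUMBER() OVER (ORDER BY ID), 3) = 0 THEN 3
         ELSE MOD(ROW_NUMBER() OVER (ORDER BY ID), 3) END
    i.e. 1,2,3,1,2,3,... for row positions 1..n_rows."""
    return [(i - 1) % cycle + 1 for i in range(1, n_rows + 1)]

print(repeating_row_number(11))  # [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2]
```

The 11-element output matches the Row_Number column in the OUTPUT TABLE above.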
I want to group by the JobId, StartTime & EndTime only for continuous days. If a specific row doesn't form part of a range it should be discarded. The Ids should also be pivoted into a column per grouping.
Id  Date        StartTime  EndTime   JobId
1   2021-08-23  08:30:00   19:00:00  1
2   2021-08-24  08:30:00   19:00:00  1
3   2021-08-24  12:30:00   14:30:00  2
4   2021-08-24  15:30:00   19:00:00  1
5   2021-08-25  08:30:00   19:00:00  1
6   2021-08-25  12:30:00   14:30:00  2
7   2021-08-25  15:45:00   19:00:00  1
8   2021-08-26  08:30:00   09:30:00  1
9   2021-08-26  15:30:00   19:00:00  1
10  2021-08-26  10:30:00   11:00:00  1
11  2021-08-26  12:00:00   14:30:00  1
12  2021-08-27  08:30:00   09:30:00  1
13  2021-08-27  11:00:00   11:15:00  1
14  2021-08-27  11:30:00   14:30:00  1
15  2021-08-28  08:30:00   09:30:00  1
Using the above sample data you can see 3 groupings that form such a continuous range:
Range 1 consists of Ids 1, 2 & 5 - 2021-08-23 to 2021-08-25, 08:30:00 to 19:00:00
Range 2 consists of Ids 3 & 6 - 2021-08-24 to 2021-08-25, 12:30:00 to 14:30:00
Range 3 consists of Ids 8, 12 & 15 - 2021-08-26 to 2021-08-28, 08:30:00 to 09:30:00
The end result should be:
JobId  StartDate   EndDate     StartTime  EndTime   Ids
1      2021-08-23  2021-08-25  08:30:00   19:00:00  1,2,5
2      2021-08-24  2021-08-25  12:30:00   14:30:00  3,6
1      2021-08-26  2021-08-28  08:30:00   09:30:00  8,12,15
MySQL 8.0.23
Assuming that the combination of JobId, `Date`, StartTime and EndTime is unique, you may use:
SELECT JobId,
MIN(`Date`) StartDate,
MAX(`Date`) EndDate,
StartTime,
EndTime,
GROUP_CONCAT(Id) Ids
FROM test
GROUP BY JobId,
StartTime,
EndTime
HAVING COUNT(*) > 1
AND DATEDIFF(EndDate, StartDate) = COUNT(*) - 1
ORDER BY StartDate, StartTime
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=fce8590f72ac1d50cd9e89add3ed01e7
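The HAVING clause encodes the "continuous days" rule: a group qualifies only if it has more than one row and its date span equals the row count minus one. A stdlib Python sketch of the same logic against the sample data (Ids 4 and 9 share a time slot but skip a day, so they are correctly dropped):

```python
from collections import defaultdict
from datetime import date

# (Id, Date, StartTime, EndTime, JobId) rows from the question
rows = [
    (1, date(2021, 8, 23), "08:30", "19:00", 1),
    (2, date(2021, 8, 24), "08:30", "19:00", 1),
    (3, date(2021, 8, 24), "12:30", "14:30", 2),
    (4, date(2021, 8, 24), "15:30", "19:00", 1),
    (5, date(2021, 8, 25), "08:30", "19:00", 1),
    (6, date(2021, 8, 25), "12:30", "14:30", 2),
    (7, date(2021, 8, 25), "15:45", "19:00", 1),
    (8, date(2021, 8, 26), "08:30", "09:30", 1),
    (9, date(2021, 8, 26), "15:30", "19:00", 1),
    (10, date(2021, 8, 26), "10:30", "11:00", 1),
    (11, date(2021, 8, 26), "12:00", "14:30", 1),
    (12, date(2021, 8, 27), "08:30", "09:30", 1),
    (13, date(2021, 8, 27), "11:00", "11:15", 1),
    (14, date(2021, 8, 27), "11:30", "14:30", 1),
    (15, date(2021, 8, 28), "08:30", "09:30", 1),
]

# GROUP BY JobId, StartTime, EndTime
groups = defaultdict(list)
for id_, d, st, et, job in rows:
    groups[(job, st, et)].append((d, id_))

result = []
for (job, st, et), members in sorted(groups.items()):
    members.sort()
    dates = [d for d, _ in members]
    # HAVING COUNT(*) > 1 AND DATEDIFF(EndDate, StartDate) = COUNT(*) - 1
    if len(members) > 1 and (dates[-1] - dates[0]).days == len(members) - 1:
        result.append((job, dates[0], dates[-1], st, et,
                       ",".join(str(i) for _, i in members)))

for row in result:
    print(row)
```

This yields exactly the three ranges in the expected output: "1,2,5", "3,6" and "8,12,15".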
I want to combine these 4 tables into one recap result. When I join the tables there should be a tanggal (date) field; for the name Novi Irawati there are two different dates, one in December and one in January. When I run my join, however, every Novi Irawati row comes out with the December date. Is there a solution for this? Please help.
Table laporan_totalgaji
nip nama jabatan tanggal tahun_masuk gaji_pokok total_tunjangan
10010 Muhammad Hayyi Pimpinan 2019-12-15 2014-03-07 1425000 6669000
10011 Rifyal Ainul Yaqin Ka Mantri 2019-12-16 2015-03-19 920000 4889200
10016 Novi Irawati Kasir 2019-12-19 2016-04-18 650000 3075000
10019 Abdul Muik Mantri 2019-12-20 2017-08-04 525000 4245000
10015 Alfan Ka Mantri 2019-12-20 2017-03-10 850000 4889200
10017 Romiatul Jamil Staff Admin 2019-12-21 2017-02-09 525000 2455000
10012 Misbahul Munir Ka Mantri 2019-12-21 2015-09-28 920000 4889200
10018 Fidatul Hasanah Staff Admin 2019-12-21 2017-03-12 525000 2455000
10013 Ari Arif Sholeh Ka Mantri 2019-12-21 2015-03-08 920000 4889200
10016 Novi Irawati Kasir 2020-01-31 2016-04-18 650000 3075000
table sum_tunjangan
nip nama total_tunjangan
10010 Muhammad Hayyi 6669000
10011 Rifyal Ainul Yaqin 4889200
10012 Misbahul Munir 4889200
10013 Ari Arif Sholeh 4889200
10014 Sopantoni Hendri C 4889200
10015 Alfan 4889200
10016 Novi Irawati 3075000
10017 Romiatul Jamil 2455000
10018 Fidatul Hasanah 2455000
10019 Abdul Muik 4245000
10020 Supyan Bariki 4245000
10021 Imam Baihaki 4245000
10022 Ahmad Andika 4245000
10023 Ahmad Jufri 4245000
10024 Sulaiman Ali Farizi 4245000
table sum_potongan
nip nama total_potongan
10010 Muhammad Hayyi 242500
10011 Rifyal Ainul Yaqin 234000
10012 Misbahul Munir 234000
10013 Ari Arif Sholeh 234000
10014 Sopantoni Hendri C 234000
10015 Alfan 234000
10016 Novi Irawati 230500
10017 Romiatul Jamil 228500
10018 Fidatul Hasanah 228500
10019 Abdul Muik 227500
10020 Supyan Bariki 227500
10021 Imam Baihaki 227500
10022 Ahmad Andika 227500
10023 Ahmad Jufri 227500
10024 Sulaiman Ali Farizi 227500
10025 Andy Rachman 234000
10027 Tony Stark 234000
10028 Natasha 228500
table sum_potonganabsen
nip nama tanggal total_denda
10010 Muhammad Hayyi 2019-12-15 0
10011 Rifyal Ainul Yaqin 2019-12-16 0
10012 Misbahul Munir 2019-12-20 0
10013 Ari Arif Sholeh 2019-12-20 0
10014 Sopantoni Hendri C 2019-12-20 0
10015 Alfan 2019-12-20 0
10016 Novi Irawati 2020-01-03 37500
10016 Novi Irawati 2019-12-19 100000
10017 Romiatul Jamil 2019-12-20 0
10018 Fidatul Hasanah 2019-12-20 0
10019 Abdul Muik 2019-12-20 0
my query
select laporan_totalgaji.nip,
laporan_totalgaji.nama,
laporan_totalgaji.jabatan,
laporan_totalgaji.tanggal,
laporan_totalgaji.gaji_pokok,
laporan_totalgaji.total_tunjangan,
(laporan_totalgaji.gaji_pokok + laporan_totalgaji.total_tunjangan) as gaji_kotor,
(sp.total_potongan) as total_potongan,s.total_potonganabsen,((laporan_totalgaji.gaji_pokok +
laporan_totalgaji.total_tunjangan)-(sp.total_potongan+ s.total_potonganabsen)) as gaji_bersih
from laporan_totalgaji
inner join sum_tunjangan st on laporan_totalgaji.nip = st.nip
inner join sum_potongan sp on laporan_totalgaji.nip = sp.nip
inner join sum_potonganabsen s on laporan_totalgaji.nip = s.nip
group by laporan_totalgaji.nip,
laporan_totalgaji.nama,
laporan_totalgaji.jabatan,
laporan_totalgaji.gaji_pokok,
laporan_totalgaji.total_tunjangan,
(laporan_totalgaji.gaji_pokok + laporan_totalgaji.total_tunjangan),
(sp.total_potongan),
s.total_potonganabsen,((laporan_totalgaji.gaji_pokok + laporan_totalgaji.total_tunjangan)-
(sp.total_potongan+s.total_potonganabsen))
query result: [screenshot omitted]
result I want: [screenshot omitted]
For the last join, try:
inner join sum_potonganabsen s on laporan_totalgaji.nip = s.nip
and laporan_totalgaji.tanggal = s.tanggal
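A minimal sketch of why the extra join condition helps, using the nip and tanggal values from the sample data. Joining on nip alone cross-matches every salary row with every penalty row for the same employee; adding tanggal keeps the dates aligned:

```python
# Novi Irawati's rows (nip, tanggal) from the sample tables
gaji = [("10016", "2019-12-19"), ("10016", "2020-01-31")]   # laporan_totalgaji
denda = [("10016", "2019-12-19"), ("10016", "2020-01-03")]  # sum_potonganabsen

# join on nip only: every salary row pairs with every penalty row
nip_only = [(g, d) for g in gaji for d in denda if g[0] == d[0]]

# join on nip AND tanggal: only matching dates survive
nip_and_tanggal = [(g, d) for g in gaji for d in denda
                   if g[0] == d[0] and g[1] == d[1]]

print(len(nip_only))         # 4 pairings
print(len(nip_and_tanggal))  # 1 pairing (the December rows)
```

Note that in this sample the January dates differ (2020-01-31 vs 2020-01-03), so an exact-date join only matches December; if one row per month is intended, the join may need to compare year and month rather than the full date.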
I want to return only the price being shown on a grocery retailers website.
I have web scraped the table on the website but I want to only have the price for delivery in each cell in the dataframe. My idea is to filter each cell and return a regex match for a price within the string in the cell. I'm not sure if there's a simpler way I can do this, perhaps with pd.read_html?
import requests
import pandas as pd
from bs4 import BeautifulSoup
postcode = 'l4 0th'
payload = {'postcode': postcode}
putUrl = 'https://www.sainsburys.co.uk/gol-api/v1/customer/postcode'
Sains_url = 'https://www.sainsburys.co.uk/shop/PostCodeCheckSuccessView'
Sains_url2 = 'https://www.sainsburys.co.uk/shop/BookingDeliverySlotDisplayView'
client = requests.Session()
PutReq = client.put(putUrl, data=payload)
rget = client.get(Sains_url)
r2 = client.get(Sains_url2)
soup = BeautifulSoup(r2.content,'lxml')
table = soup.find_all('table')[0]
df = pd.read_html(str(table), skiprows=([1]))[0]
df = df[~df.Time.str.contains("Afternoon delivery")]
df = df[~df.Time.str.contains("Evening delivery")]
My dataframe should look like this:
+-------------+----------------+-------------+-------------+
| Time | Today | Wed 26 June | Thu 27 June |
+-------------+----------------+-------------+-------------+
| 7.30-8:30am | Not Available | £3 | £5 |
+-------------+----------------+-------------+-------------+
IIUC, you could do some post-processing with regex and applymap:
import re
pat = re.compile(r'£\S+')
# This regex extracts '£' and every following character
# up to the next whitespace
df.applymap(lambda x: re.findall(pat, str(x))[0] if '£' in str(x) else x)
[out]
Time Today Wed 26 Jun Thu 27 Jun Fri 28 Jun \
0 7:30am - 8:30am Not Available Not Available £4.50 £7
1 8:00am - 9:00am Not Available £3 £5.50 £6
2 8:30am - 9:30am Not Available £3 £5.50 £6
3 9:00am - 10:00am Not Available £3 £4.50 £6
4 9:30am - 10:30am Not Available £3 £4.50 £6
5 10:00am - 11:00am Not Available £2.50 £3.50 £5
6 11:00am - 12:00pm Not Available £1.50 £2.50 £4
8 12:00pm - 1:00pm Not Available £1 £2 £3
9 1:00pm - 2:00pm Not Available £0.50 £2 £2.50
10 2:00pm - 3:00pm Not Available £0.50 £3 £2.50
11 3:00pm - 4:00pm Not Available £0.50 £3 £3.50
12 4:00pm - 5:00pm Not Available £1 £3 £4.50
13 4:30pm - 5:30pm Not Available £1 £3 £4.50
15 5:00pm - 6:00pm Not Available £1 £3.50 £4.50
16 5:30pm - 6:30pm Not Available £1 £3.50 £4.50
17 6:00pm - 7:00pm Not Available Not Available £2.50 £4
18 6:30pm - 7:30pm Not Available Not Available £2.50 £4
19 7:00pm - 8:00pm Not Available Not Available £2.50 £4
20 7:30pm - 8:30pm Not Available Not Available £2.50 £4
21 8:00pm - 9:00pm Not Available Not Available £1.50 £2
22 9:00pm - 10:00pm Not Available £1.50 £1 £1.50
23 10:00pm - 11:00pm Not Available £1 £0.50 £1.50
Sat 29 Jun Sun 30 Jun Mon 1 Jul
0 £6.50 Not Available £5.50
1 £7 £7 £5.50
2 £7 £7 £5.50
3 £7 £7 £5
4 £7 £7 £5
5 £5.50 £5.50 £4.50
6 £5.50 £5 £2.50
8 £3.50 £3.50 £2
9 £3 £3.50 £1.50
10 £3 £2.50 £3
11 £3.50 £3 £2.50
12 £3.50 £3.50 £4
13 £3.50 £3.50 £4
15 £3 £2.50 £4
16 £3 £2.50 £4
17 £3 £3 £3
18 £3 £3 £3
19 £3 £3 £3
20 £3 £3 £3
21 £2 £2 £1
22 £2 £2 £1
23 Not Available Not Available £0.50
If lambdas aren't your thing, this would be akin to the more explicit:
def extract_cost(string):
    if '£' in string:
        return re.findall(r'£\S+', string)[0]
    else:
        return string
df.applymap(extract_cost)
Here applymap is just 'applying' the function extract_cost to every value in the DataFrame.
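Either form can be sanity-checked on plain strings before touching the DataFrame; a stdlib-only version (the cell text below is illustrative, not scraped):

```python
import re

def extract_cost(value):
    # match '£' followed by non-whitespace characters, e.g. '£3.50'
    match = re.search(r'£\S+', str(value))
    return match.group(0) if match else value

print(extract_cost("Book delivery £3.50 before 10pm"))  # £3.50
print(extract_cost("Not Available"))                    # Not Available
```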
I can't find a solution for how to combine data from three datasets. I have one static dataset and two matrix tables which I want to connect in one report. Every table has the same ID which I can use to connect them (and the same number of rows as well), but I don't know how to do this. Is it possible to connect several datasets?
table1:
N ID St From To
1 541 7727549 08:30:00 14:00:00
2 631 7727575 07:00:00 15:00:00
3 668 7727552 09:00:00 17:00:00
4 679 18:00:00 00:00:00
5 721 17:00:00 00:00:00
table2:
ID P1 P2 P3 P4
541 12:00:00 - 12:10:00
631 08:45:00 - 08:55:00 11:30:00 - 11:40:00 13:00:00 - 13:15:00
668 12:05:00 - 12:15:00 13:45:00 - 13:55:00 14:55:00 - 15:10:00
679 21:15:00 - 21:30:00
721 20:40:00 - 20:50:00 21:50:00 - 22:05:00
table3:
ID W1 W2 W3
541 11:28:58 - 11:39:13
631 08:46:54 - 08:58:43 11:07:04 - 11:17:05
668 11:26:11 - 11:41:44
679
721 11:07:19 - 11:17:06
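Outside the reporting tool itself, the merge-by-ID idea can be sketched with plain dictionaries (column names taken from the samples; this is an illustration of the join, not a report-engine solution):

```python
# One dict per source table, keyed by the shared ID (two sample IDs shown)
table1 = {541: {"St": "7727549", "From": "08:30:00", "To": "14:00:00"},
          631: {"St": "7727575", "From": "07:00:00", "To": "15:00:00"}}
table2 = {541: {"P1": "12:00:00 - 12:10:00"},
          631: {"P1": "08:45:00 - 08:55:00", "P2": "11:30:00 - 11:40:00"}}
table3 = {541: {"W1": "11:28:58 - 11:39:13"},
          631: {"W1": "08:46:54 - 08:58:43", "W2": "11:07:04 - 11:17:05"}}

merged = {}
for id_ in table1:
    row = {}
    # the column names are disjoint across tables, so update() never clobbers
    for source in (table1, table2, table3):
        row.update(source.get(id_, {}))
    merged[id_] = row

print(merged[541]["P1"])  # 12:00:00 - 12:10:00
```

In a reporting tool the equivalent is usually a lookup from one dataset into another on the ID field.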