How to query list of Id's in Database using dplyr in r - mysql

I'm new to using R to manipulate data from a database.
I want to know how to query a list of IDs in a database table.
I want the query to return all records for the IDs that are found.
Until now I have queried just one ID with the code below:
start_1<-tbl(connect, "accountbits")%>%
filter(Tranx_id == "2022011813250866101336997")%>%
collect()
This returns the details attached to that ID.
I want to query many IDs, as in the example below:
start_2<-tbl(connect, "accountbits")%>%
filter(Tranx_id = c("2022011813250866101336997","20220115675250866101336997"
"202201181325086610143246997","2022015433250866101336997")%>%
collect()
I want it to return all records attached to these IDs in the database.
Thank you

The R operator you are looking for is %in%. It tests, for each element on the left, whether it is a member of the set on the right:
c(1,3,5) %in% c(1,2,3,4)
# = (TRUE, TRUE, FALSE)
because 1 and 3 are in c(1,2,3,4).
You can type ?`%in%` at the console for help info about this operator (` is the backtick, located next to the number 1 in the top left corner of most keyboards).
There are dbplyr translations defined for %in% so a command like:
start_2 <- tbl(connect, "accountbits")%>%
filter(Tranx_id %in% c("1234","2345","3456"))
will translate into SQL like:
SELECT *
FROM accountbits
WHERE Tranx_id IN ('1234', '2345', '3456')
and collect() will pull those values into local R memory as expected.
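For comparison, the same IN filter can be sketched outside dplyr. This is a minimal Python illustration using the standard-library sqlite3 module; the in-memory table, the amount column, and the sample rows are hypothetical stand-ins for the real accountbits table, not part of the original question.

```python
import sqlite3

# Hypothetical in-memory stand-in for the "accountbits" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accountbits (Tranx_id TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO accountbits VALUES (?, ?)",
    [("1234", 10.0), ("2345", 20.0), ("9999", 30.0)],
)

wanted_ids = ["1234", "2345", "3456"]

# One "?" placeholder per id, so the driver handles the quoting of each value.
placeholders = ", ".join("?" for _ in wanted_ids)
sql = (
    "SELECT Tranx_id, amount FROM accountbits "
    f"WHERE Tranx_id IN ({placeholders}) ORDER BY Tranx_id"
)
rows = conn.execute(sql, wanted_ids).fetchall()
print(rows)
```

Ids that are not in the table (here "3456") are simply absent from the result; the query does not fail.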

Related

Include Multiple Search Terms in an HTML Request

I found this post that shows how to search for news articles on Google using R: Scraping Google News with Rvest for Keywords
That post shows how to search for a single term, for example: keyword <- "https://news.google.com/rss/search?q=apple&hl=en-IN&gl=IN&ceid=IN:en"
Can the code above be modified to search for multiple terms? For example, suppose I want to search for news articles that contain BOTH "iphone" and "covid".
Could I write the query like this?
library(tidyRSS)
#I have feeling that "IN" stands for "India" - if I want to change this to "Canada", I think I need to replace "IN" with "CAN"?
keyword <- "https://news.google.com/rss/search?q=apple&q=covid&hl=en-IN&gl=IN&ceid=IN:en"
# From the package vignette
google_news <- tidyfeed(
keyword,
clean_tags = TRUE,
parse_dates = TRUE
)
Is this correct?
Thank you!
PS: I wonder if there is a way to restrict the dates between which the search will be performed?
For multiple terms, use OR if either one may be present, or AND if both need to be present. The hl parameter stands for language and gl for country. For date ranges, add the before: and after: keywords to the query.
library(tidyRSS)
keyword <- "https://news.google.com/rss/search?q=apple%20AND%20covid+after:2022-07-01+before:2022-08-02&hl=en-US&gl=US&ceid=US:en"
google_news <- tidyfeed(
keyword,
clean_tags = TRUE,
parse_dates = TRUE
)
Checking the date range:
library(dplyr)
> all(between(as.Date(google_news$feed_pub_date),
as.Date("2022-07-01"), as.Date("2022-08-02")))
[1] TRUE
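Writing the query string by hand makes it easy to get the escaping wrong. The sketch below builds the same kind of URL in Python with the standard library; the search text and the parameter values are taken from the answer above, while the variable names are illustrative only. Note that this encoder also escapes the colons as %3A, which decodes to the same query.

```python
from urllib.parse import urlencode, quote, urlsplit, parse_qs

# Search expression: both terms required, restricted to a date window.
search = "apple AND covid after:2022-07-01 before:2022-08-02"

params = {"q": search, "hl": "en-US", "gl": "US", "ceid": "US:en"}

# quote_via=quote encodes spaces as %20 instead of "+".
feed_url = "https://news.google.com/rss/search?" + urlencode(params, quote_via=quote)
print(feed_url)

# Round trip: decoding the URL gives back the original search expression.
decoded = parse_qs(urlsplit(feed_url).query)["q"][0]
```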

Check whether any values in array match any values in json column

I have a problem checking whether any values in an array match any values in a JSON column that contains a named array.
Suppose I have an array [25,36,45,52] and the JSON column is {"values": [25,24,15]}.
I want to check whether any values in the array match any of the values in the JSON column in XAMPP MySQL. Please suggest a good way of doing this. This image shows the table structure of my database.
I have 4 tables:
user
profile
skill
jobs
user table (id,userid)
jobs table (id,user_id,skill_id)
skill table (id,job_id)
profile table (id,user_id)
Now I want to search for all jobs that match at least one of the skills.
I have tried this, but it returns all jobs without filtering by skills.
$jobs = Job::with(['user','profile'])->with(['skills' => function($query){
$query->whereJsonContains('skills->skills',[35]);
}])->where('jobs.is_completed',0);
Please help me.
You can use a where clause for this. For example, to get the rows whose skills array contains both 35 and 54:
$users = DB::table('table')
-> whereJsonContains('skills->skills', [35,54])
->get();
For more details about querying JSON columns, check the official docs:
https://laravel.com/docs/5.8/queries#json-where-clauses
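The difference between "any value matches" and "all values match" is easy to miss, so here is a small language-neutral sketch of the two checks in plain Python, using the array and JSON value from the question. It only illustrates the logic; it does not touch MySQL or Laravel.

```python
import json

wanted = [25, 36, 45, 52]                # the array from the question
column_value = '{"values": [25, 24, 15]}'  # the JSON column content

stored = json.loads(column_value)["values"]

# Values that appear in both the array and the JSON column.
shared = sorted(set(wanted) & set(stored))

any_match = bool(shared)                 # at least one value in common
all_match = set(wanted) <= set(stored)   # every wanted value is stored
print(shared, any_match, all_match)
```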

Downsampling analytics data in MySQL or in R

I am storing analytics data in a MySQL database as a table with a timestamp and some data. I want to downsample this data (i.e. group it into time ranges and count the number of entries in each) for display on an admin console. Would it be more efficient to select the data and downsample it with an R script, or would it be better to use
GROUP BY UNIX_TIMESTAMP(timestamp) DIV <some time>
and do it on the database layer. Any other tips would also be appreciated.
If you can use dplyr, you could do it with something like the following:
library(dplyr)
yay <-
# Specify username and password in my.cnf
src_mysql(host = "blah.com") %>%
tbl("some_table") %>%
# You will need to compute a grouping variable
mutate(group = unix_timestamp(timestamp)) %>%
group_by(group) %>%
# This will return the number of rows in each group
summarise(n = n()) %>%
# This will execute the query and return a data.frame
collect
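To make the database-side option concrete, here is a minimal sketch of bucketing by integer division in Python with the standard-library sqlite3 module. The events table, its integer ts column (seconds since the epoch), and the 60-second bucket size are assumptions for illustration; in MySQL the grouping expression would be the UNIX_TIMESTAMP(timestamp) DIV form from the question.

```python
import sqlite3

BUCKET_SECONDS = 60  # hypothetical bucket width

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (ts INTEGER, payload TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(0, "a"), (10, "b"), (59, "c"), (60, "d"), (125, "e")],
)

# Integer division maps every timestamp to the start of its bucket,
# so the database returns one row per bucket rather than one per event.
rows = conn.execute(
    "SELECT (ts / ?) * ? AS bucket_start, COUNT(*) AS n "
    "FROM events GROUP BY bucket_start ORDER BY bucket_start",
    (BUCKET_SECONDS, BUCKET_SECONDS),
).fetchall()
print(rows)
```

Grouping in the database means only the per-bucket counts cross the network, which is usually the smaller transfer.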

correctly fetch nested list in SQL

I have a design problem with SQL request:
I need to return data looking like:
listChannels:
  - idChannel
    name
    listItems:
      - data
      - data
  - idChannel
    name
    listItems:
      - data
      - data
The solution I have now is to send a first request:
"SELECT * FROM Channel WHERE idUser = ..."
and then, in the loop fetching the result, I send another request for each row to fill the nested list:
"SELECT data FROM Item WHERE idChannel = ..."
It's going to kill the app and obviously not the way to go.
I know how to use the join keyword, but it's not exactly what I want as it would return a row for each data of each listChannels with all the information of the channels.
How to solve this common problem in a clean and efficient way ?
The "SQL" way of doing this produces a table with the columns idchannel, channelname, and the columns for item.
select c.idchannel, c.channelname, i.data
from channel c join
item i
on c.idchannel = i.idchannel
order by c.idchannel, i.item;
Remember that a SQL query returns a result set in the form of a table. That means that all the rows have the same columns. If you want the items of each channel as a list in a single column, you can do an aggregation and put the items in a list:
select c.idchannel, c.channelname, group_concat(i.data) as items
from channel c join
item i
on c.idchannel = i.idchannel
group by c.idchannel, c.channelname;
The above uses MySQL syntax, but most databases support similar functionality.
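The flat result of the first join can be regrouped into the nested shape on the client in a single pass. The following is a minimal sketch in Python with the standard-library sqlite3 module; the table contents and the iditem ordering column are hypothetical, and the key names follow the structure shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE channel (idchannel INTEGER PRIMARY KEY, channelname TEXT);
CREATE TABLE item (iditem INTEGER PRIMARY KEY, idchannel INTEGER, data TEXT);
INSERT INTO channel VALUES (1, 'news'), (2, 'sports');
INSERT INTO item VALUES (1, 1, 'a'), (2, 1, 'b'), (3, 2, 'c');
""")

# One query for everything: one row per (channel, item) pair.
rows = conn.execute("""
SELECT c.idchannel, c.channelname, i.data
FROM channel c JOIN item i ON c.idchannel = i.idchannel
ORDER BY c.idchannel, i.iditem
""").fetchall()

# Fold the flat rows into one entry per channel with a nested item list.
channels = {}
for idchannel, name, data in rows:
    entry = channels.setdefault(
        idchannel, {"idChannel": idchannel, "name": name, "listItems": []}
    )
    entry["listItems"].append(data)

list_channels = list(channels.values())
print(list_channels)
```

Because this uses an inner join, a channel with no items would not appear in the output; a LEFT JOIN would be needed to keep it.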
SQL is made for accessing two-dimensional data tables. (There are more possibilities, but they are very complex and maybe not standardized)
So the best way to solve your problem is to use multiple requests. Please also consider using transactions, if possible.

SQL Alchemy return list of ids

I am using SQL Alchemy and I want to return a list of Document Ids. The Ids are the primary key in the documents table. My current query returns a list of tuples.
userDocs = session.query(Document.idDocument).filter(Document.User_idUser == user.idUser).all()
The reason I want a list of ids is so that I can search another table using in_(userDocs).
So another solution would be to be able to search using tuples. I am currently returning nothing from my second query using userDocs.
Thank you!!
You don't need to do an intermediate query; you can do this all in one shot:
things = session.query(Thing) \
.join(Thing.documents) \
.filter(Document.User_idUser==user.idUser)
You just query on the properties of the Document through its relationship() on the intended entity.
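The two approaches can be compared without SQLAlchemy. The sketch below uses the standard-library sqlite3 module rather than the SQLAlchemy API; the thing table and the sample rows are hypothetical. It shows why the first query yields one-element tuples, how to flatten them into plain ids, and how a single join returns the related rows directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE document (idDocument INTEGER PRIMARY KEY, User_idUser INTEGER);
CREATE TABLE thing (idThing INTEGER PRIMARY KEY, idDocument INTEGER);
INSERT INTO document VALUES (1, 7), (2, 7), (3, 8);
INSERT INTO thing VALUES (10, 1), (11, 3), (12, 2);
""")
user_id = 7  # hypothetical user

# Selecting a single column still returns one tuple per row.
rows = conn.execute(
    "SELECT idDocument FROM document WHERE User_idUser = ? ORDER BY idDocument",
    (user_id,),
).fetchall()

# Flatten the one-element tuples into a plain list of ids.
doc_ids = [row[0] for row in rows]

# Alternative: skip the intermediate list and join in one query.
joined = conn.execute(
    "SELECT t.idThing FROM thing t "
    "JOIN document d ON d.idDocument = t.idDocument "
    "WHERE d.User_idUser = ? ORDER BY t.idThing",
    (user_id,),
).fetchall()
thing_ids = [row[0] for row in joined]
print(doc_ids, thing_ids)
```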