Proper way to convert SQL results to JSON - mysql

I am creating an API using Node.js. The API takes requests and responds in JSON.
For example:
I have a table QUESTION in my database so a GET request to endpoint http://localhost/table/question will output the table in JSON format.
However, there is a problem when performing JOINs.
Consider the tables QUESTION and CHOICE. A question has many choices (answers), so their join will be
Table:
I am trying to convert to something like this
{
"0":{
"QUESTION":"If Size of integer pointer is 4 Bytes what is size of float pointer ?",
"OPTION":{
"A":"3 Bytes",
"B":"32 Bits",
"C":"64 Bits",
"D":"12 Bytes"
}
},
"1":{
"QUESTION":"Which one is not a SFR",
"OPTION":{
"A":"PC",
"B":"R1",
"C":"SBUF"
}
},
"2":{
"QUESTION":"What is Size of DPTR in 8051",
"OPTION":{
"A":"16 Bits",
"B":"8 Bytes",
"C":"8 Bits"
}
},
"3":{
"QUESTION":"Is to_string() is valid builtin function prior to c++11 ? ",
"OPTION":{
"A":"Yes",
"B":"No"
}
}
}
The obvious solution is to query using a JOIN, parse the result, and convert it to JSON.
Is there any more efficient way to do it?

In MySQL you can achieve this with group_concat
tablenames, fieldnames, etc are pure fantasy :-)
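select
  q.text as question,
  -- SEPARATOR sets the delimiter; ordering both lists the same way keeps labels and answers aligned
  group_concat(a.label order by a.label separator ';;;') as labels,
  group_concat(a.text order by a.label separator ';;;') as answers
from
  question as q
  join answer as a on a.question = q.id
group by
  q.text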
And then in your application (nodejs)
let resultRows = callSomeFunctionsThatReturnesAllRowsAsArray();
let properStructure = resultRows.map(row => {
  let texts = row.answers.split(';;;');
  return {
    question: row.question,
    // Parentheses are needed so the arrow function returns an object literal
    options: row.labels.split(';;;').map((label, index) => ({label: label, answer: texts[index]}))
  };
});
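For comparison, the same post-processing step can be sketched in Python. This is only an illustration: the sample row and the helper name rows_to_questions are hypothetical, and it assumes the labels and answers columns were concatenated in the same order with ';;;' as the delimiter.

```python
import json

SEPARATOR = ";;;"  # assumed delimiter between concatenated values

def rows_to_questions(rows):
    """Turn GROUP_CONCAT-style rows into a list of nested question dicts."""
    result = []
    for row in rows:
        labels = row["labels"].split(SEPARATOR)
        texts = row["answers"].split(SEPARATOR)
        result.append({
            "question": row["question"],
            # Pair each label with the answer text at the same position
            "options": dict(zip(labels, texts)),
        })
    return result

# Hypothetical row, shaped like one result row of the grouped query
sample_rows = [
    {"question": "Which one is not a SFR",
     "labels": "A;;;B;;;C",
     "answers": "PC;;;R1;;;SBUF"},
]
print(json.dumps(rows_to_questions(sample_rows), indent=2))
```

Using a dict keyed by label gives the "OPTION": {"A": ..., "B": ...} shape from the question; a list of pairs would work just as well if the order matters.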

Related

Python glom with list of records group common unique client_ids together as key

I just discovered glom and the tutorial makes sense, but I can't figure out the right spec to use for chrome BrowserHistory.json entries to create a data structure grouped by client_id or if this is even the right use of glom. I think I can accomplish this using other methods by looping over the json, but was hoping to learn more about glom and its capabilities.
The json has Browser_History with a list for each history entry as follows:
{
"Browser_History": [
{
"favicon_url": "https://www.google.com/favicon.ico",
"page_transition": "LINK",
"title": "Google Takeout",
"url": "https://takeout.google.com",
"client_id": "abcd1234",
"time_usec": 1424794867875291
},
...
I'd like a data structure where everything is grouped by the client_id, like with the client_id as the key to a list of dicts, something like:
{ 'client_ids' : {
'abcd1234' : [ {
"title" : "Google Takeout",
"url" : "https://takeout.google.com",
...
},
...
],
'wxyz9876' : [ {
"title" : "Google",
"url" : "https://www.google.com",
...
},
...
}
}
Is this something glom is suited for? I've been playing around with it and reading, but I can't seem to get the spec correct to accomplish what I need. Best I've got without error is:
with open(history_json) as f:
    history_list = json.load(f)['Browser_History']
spec = {
    'client_ids': ['client_id']
}
pprint(glom(history_list, spec))
which gets me a list of all the client_ids, but I can't figure out how to group them together as keys rather than have them as a big list. Any help would be appreciated, thanks!
This should do the trick although I'm not sure if this is the most "glom"-ic way to achieve this.
import glom

grouping_key = "client_id"

def group_combine(existing, incoming):
    # existing is the dictionary used for accumulating the data
    # incoming is each item in the list (your input)
    # Entries without the grouping key are skipped
    if grouping_key in incoming:
        existing.setdefault(incoming[grouping_key], []).append(incoming)
    return existing

data = {'Browser_History': [{}]}  # your data structure
fold_spec = glom.Fold(glom.T, init=dict, op=group_combine)
results = glom.glom(data["Browser_History"], {"client_ids": fold_spec})
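If glom turns out not to be the right tool, the plain-loop approach mentioned in the question can be sketched with the standard library only. The function name group_by_client_id and the sample entries are hypothetical; the sketch assumes each entry is a dict that may or may not carry a client_id.

```python
from collections import defaultdict

def group_by_client_id(entries):
    """Group history entries into {'client_ids': {client_id: [entry, ...]}}."""
    groups = defaultdict(list)
    for entry in entries:
        client_id = entry.get("client_id")
        if client_id is None:
            continue  # skip entries without a client_id
        groups[client_id].append(entry)
    return {"client_ids": dict(groups)}

# Hypothetical entries shaped like Browser_History items
history = [
    {"title": "Google Takeout", "url": "https://takeout.google.com", "client_id": "abcd1234"},
    {"title": "Google", "url": "https://www.google.com", "client_id": "wxyz9876"},
    {"title": "No client"},
]
grouped = group_by_client_id(history)
```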

How can I load the following JSON (deeply nested) to a DataFrame?

A sample of the JSON is as shown below:
{
"AN": {
"dates": {
"2020-03-26": {
"delta": {
"confirmed": 1
},
"total": {
"confirmed": 1
}
}
}
},
"KA": {
"dates": {
"2020-03-09": {
"delta": {
"confirmed": 1
},
"total": {
"confirmed": 1
}
},
"2020-03-10": {
"delta": {
"confirmed": 3
},
"total": {
"confirmed": 4
}
}
}
}
}
I would like to load it into a DataFrame, such that the state names (AN, KA) are represented as Row names, and the dates and nested entries are present as Columns.
Any tips to achieve this would be very much appreciated. [I am aware of json_normalize, however I haven't figured out how to work it out yet.]
The output I am expecting, is roughly as shown below:
Can you update your post with the DataFrame you have in mind ? It'll be easier to understand what you want.
Also sometimes it's better to reshape your data if you can't make it work the way they are now.
Update:
Following your update here's what you can do.
You need to reshape your data; as I said, when you can't achieve what you want it is best to look at the problem from another point of view. For instance (and from the sample you shared), the 'dates' key is meaningless, as the keys below it are already dates and there are no other keys at the same level.
A way to achieve what you want would be to use a MultiIndex, which will help you group your data the way you want. To use it you can, for instance, create all the index entries you need and store the associated values in a dictionary.
Example:
If the only index entry you have is ('2020-03-26', 'delta', 'confirmed'), you should have values = {'AN': [1], 'KA': [None]}
Then you only need to create your DataFrame and transpose it.
I gave it a quick try and came up with a piece of code that should work. If you're looking for performance I don't think this will do the trick.
import pandas as pd

# d is the sample you shared
methods = ['delta', 'total']
steps = ['confirmed', 'recovered', 'tested']
# Get all the dates, sorted and without duplicates
dates = sorted({date for c in d for date in d[c]['dates']})
# One (date, method, step) index entry per row
index = pd.MultiIndex.from_product([dates, methods, steps])
values = {}
for country in d:
    # For each country we create a list containing all 6 values for each date
    # (missing values as None)
    values[country] = []
    for date in dates:
        # When country does not have a date this is an empty dict,
        # so every lookup below falls back to None
        entry = d[country]['dates'].get(date, {})
        for method in methods:
            for step in steps:
                values[country].append(entry.get(method, {}).get(step))
df = pd.DataFrame(values, index=index).T
Here is what I have for the transposed data frame of my output :
Hope this can help you
You clearly need to reshape your JSON data before loading it into a DataFrame.
Have you tried loading your JSON as a dict?
dataframe = pd.DataFrame.from_dict(JsonDict, orient="index")
The “orient” of the data. If the keys of the passed dict should be the columns of the resulting DataFrame, pass ‘columns’ (default). Otherwise if the keys should be rows, pass ‘index’.
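The reshaping step itself can be sketched without pandas. The helper name flatten_states is hypothetical, and the sample is a shortened version of the JSON from the question; the idea is to collapse each state's nested entries into one flat dict keyed by (date, method, step) tuples, which should then be a suitable input for from_dict with orient="index".

```python
def flatten_states(data):
    """Return {state: {(date, method, step): value}} from the nested JSON."""
    flat = {}
    for state, payload in data.items():
        row = {}
        for date, methods in payload.get("dates", {}).items():
            for method, steps in methods.items():
                for step, value in steps.items():
                    row[(date, method, step)] = value
        flat[state] = row
    return flat

sample = {
    "AN": {"dates": {"2020-03-26": {"delta": {"confirmed": 1}, "total": {"confirmed": 1}}}},
    "KA": {"dates": {"2020-03-10": {"delta": {"confirmed": 3}, "total": {"confirmed": 4}}}},
}
flat = flatten_states(sample)
# flat could then be handed to pd.DataFrame.from_dict(flat, orient="index")
```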

How to query multiple fields with one value in Firebase Realtime Database? [duplicate]

{
"movies": {
"movie1": {
"genre": "comedy",
"name": "As good as it gets",
"lead": "Jack Nicholson"
},
"movie2": {
"genre": "Horror",
"name": "The Shining",
"lead": "Jack Nicholson"
},
"movie3": {
"genre": "comedy",
"name": "The Mask",
"lead": "Jim Carrey"
}
}
}
I am a Firebase newbie. How can I retrieve a result from the data above where genre = 'comedy' AND lead = 'Jack Nicholson'?
What options do I have?
Using Firebase's Query API, you might be tempted to try this:
// !!! THIS WILL NOT WORK !!!
ref
  .orderByChild('genre')
  .startAt('comedy').endAt('comedy')
  .orderByChild('lead') // !!! THIS LINE WILL RAISE AN ERROR !!!
  .startAt('Jack Nicholson').endAt('Jack Nicholson')
  .on('value', function(snapshot) {
    console.log(snapshot.val());
  });
But as @RobDiMarco from Firebase says in the comments:
multiple orderBy() calls will throw an error
So my code above will not work.
I know of three approaches that will work.
1. filter most on the server, do the rest on the client
What you can do is execute one orderByChild().startAt().endAt() on the server, pull down the remaining data, and filter that in JavaScript code on your client.
ref
  .orderByChild('genre')
  .equalTo('comedy')
  .on('child_added', function(snapshot) {
    var movie = snapshot.val();
    if (movie.lead == 'Jack Nicholson') {
      console.log(movie);
    }
  });
2. add a property that combines the values that you want to filter on
If that isn't good enough, you should consider modifying/expanding your data to allow your use-case. For example: you could stuff genre+lead into a single property that you just use for this filter.
"movie1": {
"genre": "comedy",
"name": "As good as it gets",
"lead": "Jack Nicholson",
"genre_lead": "comedy_Jack Nicholson"
}, //...
You're essentially building your own multi-column index that way and can query it with:
ref
  .orderByChild('genre_lead')
  .equalTo('comedy_Jack Nicholson')
  .on('child_added', function(snapshot) {
    var movie = snapshot.val();
    console.log(movie);
  });
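How such a combined property might be generated can be sketched in Python. The helper with_composite_key is hypothetical, and the final list comprehension is only a local stand-in for the equality query on the server, not a Firebase call.

```python
def with_composite_key(movie, fields=("genre", "lead"), separator="_"):
    """Return a copy of movie with a combined property such as genre_lead."""
    key_name = separator.join(fields)
    key_value = separator.join(str(movie[f]) for f in fields)
    return {**movie, key_name: key_value}

movies = {
    "movie1": {"genre": "comedy", "name": "As good as it gets", "lead": "Jack Nicholson"},
    "movie2": {"genre": "Horror", "name": "The Shining", "lead": "Jack Nicholson"},
    "movie3": {"genre": "comedy", "name": "The Mask", "lead": "Jim Carrey"},
}
indexed = {key: with_composite_key(m) for key, m in movies.items()}

# Local stand-in for an equality query on the combined property
matches = [k for k, m in indexed.items() if m["genre_lead"] == "comedy_Jack Nicholson"]
```

The combined property has to be rewritten whenever genre or lead changes, which is the main maintenance cost of this approach.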
David East has written a library called QueryBase that helps with generating such properties.
You could even do relative/range queries, let's say that you want to allow querying movies by category and year. You'd use this data structure:
"movie1": {
"genre": "comedy",
"name": "As good as it gets",
"lead": "Jack Nicholson",
"genre_year": "comedy_1997"
}, //...
And then query for comedies of the 90s with:
ref
  .orderByChild('genre_year')
  .startAt('comedy_1990')
  .endAt('comedy_2000')
  .on('child_added', function(snapshot) {
    var movie = snapshot.val();
    console.log(movie);
  });
If you need to filter on more than just the year, make sure to add the other date parts in descending order, e.g. "comedy_1997-12-25". This way the lexicographical ordering that Firebase does on string values will be the same as the chronological ordering.
This combining of values in a property can work with more than two values, but you can only do a range filter on the last value in the composite property.
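The ordering argument can be checked with a small Python sketch. The dates are hypothetical sample values and genre_date_key is a made-up helper; the string comparison at the end only imitates a startAt/endAt range locally.

```python
from datetime import date

def genre_date_key(genre, day):
    """Build a key like 'comedy_1997-12-25' (year, then month, then day)."""
    return f"{genre}_{day.isoformat()}"

# Hypothetical sample dates
days = [date(1997, 12, 25), date(1994, 7, 29), date(2003, 5, 1)]
keys = [genre_date_key("comedy", d) for d in days]

# Plain string sorting gives the same order as sorting the dates themselves
assert sorted(keys) == [genre_date_key("comedy", d) for d in sorted(days)]

# Local stand-in for a range between 'comedy_1990' and 'comedy_2000'
nineties = [k for k in sorted(keys) if "comedy_1990" <= k <= "comedy_2000"]
```

This only holds because every date part is zero-padded to a fixed width; a key such as comedy_1997-1-5 would sort incorrectly.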
A very special variant of this is implemented by the GeoFire library for Firebase. This library combines the latitude and longitude of a location into a so-called Geohash, which can then be used to do realtime range queries on Firebase.
3. create a custom index programmatically
Yet another alternative is to do what we've all done before this new Query API was added: create an index in a different node:
"movies"
    // the same structure you have today
"by_genre"
    "comedy"
        "by_lead"
            "Jack Nicholson"
                "movie1"
            "Jim Carrey"
                "movie3"
    "Horror"
        "by_lead"
            "Jack Nicholson"
                "movie2"
There are probably more approaches. For example, this answer highlights an alternative tree-shaped custom index: https://stackoverflow.com/a/34105063
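As an illustration of the custom index from approach 3, the by_genre node could be derived from the movies node roughly like this. The function build_index is hypothetical, and storing True as the leaf value is an assumption; any placeholder value would do.

```python
def build_index(movies):
    """Build {'by_genre': {genre: {'by_lead': {lead: {movie_key: True}}}}}."""
    index = {"by_genre": {}}
    for key, movie in movies.items():
        genre_node = index["by_genre"].setdefault(movie["genre"], {"by_lead": {}})
        # Leaf value True is just a marker; the movie key carries the information
        genre_node["by_lead"].setdefault(movie["lead"], {})[key] = True
    return index

movies = {
    "movie1": {"genre": "comedy", "name": "As good as it gets", "lead": "Jack Nicholson"},
    "movie2": {"genre": "Horror", "name": "The Shining", "lead": "Jack Nicholson"},
    "movie3": {"genre": "comedy", "name": "The Mask", "lead": "Jim Carrey"},
}
index = build_index(movies)

# Looking up comedies with Jack Nicholson becomes a direct path read
comedy_nicholson = sorted(index["by_genre"]["comedy"]["by_lead"]["Jack Nicholson"])
```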
If none of these options work for you, but you still want to store your data in Firebase, you can also consider using its Cloud Firestore database.
Cloud Firestore can handle multiple equality filters in a single query, but only one range filter. Under the hood it essentially uses the same query model, but it's like it auto-generates the composite properties for you. See Firestore's documentation on compound queries.
I've written a personal library that allows you to order by multiple values, with all the ordering done on the server.
Meet Querybase!
Querybase takes in a Firebase Database Reference and an array of fields you wish to index on. When you create new records it will automatically handle the generation of keys that allow for multiple querying. The caveat is that it only supports straight equivalence (no less than or greater than).
const databaseRef = firebase.database().ref().child('people');
const querybaseRef = querybase.ref(databaseRef, ['name', 'age', 'location']);
// Automatically handles composite keys
querybaseRef.push({
name: 'David',
age: 27,
location: 'SF'
});
// Find records by multiple fields
// returns a Firebase Database ref
const queriedDbRef = querybaseRef
.where({
name: 'David',
age: 27
});
// Listen for realtime updates
queriedDbRef.on('value', snap => console.log(snap));
Firebase ref = new Firebase("https://your.firebaseio.com/");
Query query = ref.orderByChild("genre").equalTo("comedy");
query.addValueEventListener(new ValueEventListener() {
    @Override
    public void onDataChange(DataSnapshot dataSnapshot) {
        for (DataSnapshot movieSnapshot : dataSnapshot.getChildren()) {
            // Read each child, not the parent snapshot
            Movie movie = movieSnapshot.getValue(Movie.class);
            if (movie.getLead().equals("Jack Nicholson")) {
                System.out.println(movieSnapshot.getKey());
            }
        }
    }
    @Override
    public void onCancelled(FirebaseError firebaseError) {
    }
});
Frank's answer is good, but Firestore recently introduced array-contains, which makes it easier to do AND queries.
You can create a filters field to hold your filters, and add as many values as you need. For example, to filter by comedy and Jack Nicholson you can add the value comedy_Jack Nicholson, and if you also want to filter by comedy and 2014 you can add the value comedy_2014 without creating more fields.
{
"movies": {
"movie1": {
"genre": "comedy",
"name": "As good as it gets",
"lead": "Jack Nicholson",
"year": 2014,
"filters": [
"comedy_Jack Nicholson",
"comedy_2014"
]
}
}
}
For Cloud Firestore
https://firebase.google.com/docs/firestore/query-data/queries#compound_queries
Compound queries
You can chain multiple equality operators (== or array-contains) methods to create more specific queries (logical AND). However, you must create a composite index to combine equality operators with the inequality operators, <, <=, >, and !=.
citiesRef.where('state', '==', 'CO').where('name', '==', 'Denver');
citiesRef.where('state', '==', 'CA').where('population', '<', 1000000);
You can perform range (<, <=, >, >=) or not equals (!=) comparisons only on a single field, and you can include at most one array-contains or array-contains-any clause in a compound query:
Firebase doesn't allow querying with multiple conditions.
However, I did find a way around for this:
We need to download the initial filtered data from the database and store it in an array list.
Query query = databaseReference.orderByChild("genre").equalTo("comedy");
// Attach the listener to the filtered query, not to the unfiltered reference
query.addValueEventListener(new ValueEventListener() {
    @Override
    public void onDataChange(@NonNull DataSnapshot dataSnapshot) {
        ArrayList<Movie> movies = new ArrayList<>();
        for (DataSnapshot dataSnapshot1 : dataSnapshot.getChildren()) {
            String lead = dataSnapshot1.child("lead").getValue(String.class);
            String genre = dataSnapshot1.child("genre").getValue(String.class);
            Movie movie = new Movie(lead, genre);
            movies.add(movie);
        }
        filterResults(movies, "Jack Nicholson");
    }
    @Override
    public void onCancelled(@NonNull DatabaseError databaseError) {
    }
});
Once we obtain the initial filtered data from the database, we need to filter it further in our own code.
public void filterResults(final List<Movie> list, final String lead) {
    // Keep only the movies whose lead matches
    List<Movie> movies = list.stream().filter(o -> o.getLead().equals(lead)).collect(Collectors.toList());
    System.out.println(movies);
    movies.forEach(movie -> System.out.println(movie.getLead()));
}
The data from Firebase Realtime Database arrives as an _InternalLinkedHashMap<dynamic, dynamic>.
You can simply convert it to your own map and query it very easily.
For example, I have a chat app and I use the Realtime Database to store the uid of each user and a bool value indicating whether the user is online, as in the picture below.
Now, I have a class RealtimeDatabase and a static method getOnlineUsersUID().
static getOnlineUsersUID() {
  var dbRef = FirebaseDatabase.instance;
  DatabaseReference reference = dbRef.reference().child("Online");
  reference.once().then((value) {
    Map<String, bool> map = Map<String, bool>.from(value.value);
    List users = [];
    // Keep only the uids whose online flag is true
    map.forEach((key, value) {
      if (value) {
        users.add(key);
      }
    });
    print(users);
  });
}
It will print [NOraDTGaQSZbIEszidCujw1AEym2]
I am new to Flutter. If you know more, please update the answer.
ref.orderByChild("lead").startAt("Jack Nicholson").endAt("Jack Nicholson").listner....
This will work.

Query for nested JSON property in azure CosmosDb

I am having some difficulty crafting a query for nested data in cosmosDB.
Say I have data stored in this structure:
{
id:"1234",
data:{
people:{
"a826bbc5-add9-42d8-ba52-f5de52973556":{
first_name: "Kyle"
},
"efb119d-9f12-4d11-a7e1-38e4719a699c":{
first_name: "Bob"
},
"b402faac-d1ba-4317-9ba6-673c76a8fc37":{
first_name: "Jane"
}
}
}
}
Now I want to write a query that would return all of the people with the first name of "Bob"
I need something like:
Select * from c where c.data.people[*].first_name = "Bob";
Notice that the "people" object is an actual JSON object not a JSON array, so no array_contains, I need basically the JSON obj equivalent.
I've looked around and can't seem to find the appropriate query for this common use-case.
Anyone know how I can accomplish this query?
Since the keys of the people object are random, I'm afraid you can't query it with normal SQL. I tried to implement what you need with a UDF in Cosmos DB.
Udf code:
function userDefinedFunction(peopleObj){
var returnArray = [];
for(var key in peopleObj){
if (peopleObj[key].first_name == "Bob"){
var map = {};
map[key] = peopleObj[key];
returnArray.push(map);
}
}
return returnArray;
}
Sql:
SELECT udf.test(c.data.people) as BobPeople FROM c
Marked Jay's answer as the accepted answer as I ended up using udfs - I'll post the function I ended up using and the query for anyone looking for something a little more generic.
function userDefinedFunction(properties, fieldName, fieldValue){
    // Return true as soon as any nested object has the requested field value
    for(var k in properties){
        if(properties[k][fieldName] && properties[k][fieldName] == fieldValue)
            return true;
    }
    return false;
}
with a query of:
select * from c where udf.hasValue(c.data.people,"first_name","Bob")
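The predicate behind this query can be tried out locally with a small Python sketch of the same logic. The function has_value and the documents list are hypothetical and do not touch Cosmos DB; they only mirror what the UDF checks for each document.

```python
def has_value(properties, field_name, field_value):
    """Return True if any nested object has field_name equal to field_value."""
    return any(
        isinstance(item, dict) and item.get(field_name) == field_value
        for item in properties.values()
    )

# Hypothetical documents shaped like the one in the question
documents = [
    {"id": "1234", "data": {"people": {
        "a826bbc5-add9-42d8-ba52-f5de52973556": {"first_name": "Kyle"},
        "b402faac-d1ba-4317-9ba6-673c76a8fc37": {"first_name": "Bob"},
    }}},
    {"id": "5678", "data": {"people": {
        "0f0f0f0f-0000-0000-0000-000000000000": {"first_name": "Jane"},
    }}},
]

# Local equivalent of: where has_value(c.data.people, "first_name", "Bob")
bob_docs = [d["id"] for d in documents if has_value(d["data"]["people"], "first_name", "Bob")]
```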

MySQL to MongoDB query translation

I need to convert following mysql query to mongo.
Any help will be highly appreciated.
SELECT cr.*, COUNT(cj.job_id) AS finished_chunks FROM `checks_reports_df8` cr
LEFT JOIN `checks_jobs_df8` cj ON cr.id = cj.report_id
WHERE cr.started IS NOT NULL AND cr.finished IS NULL AND cj.is_done = 1
MongoDB doesn't do JOINs. So you will have to query both collections and do the JOIN on the application layer. How to do this exactly depends on which programming language you use to develop your application. You don't say which one you use, so I will just give you an example in JavaScript. If you use a different language, the second snippet is just a simple for loop.
These are the MongoDB queries you would use. I don't have access to your data, so I can not guarantee correctness.
var reports = db.checks_reports_df8.find({
"started": {$exists: 1 },
"finished": {$exists: 0 }
});
This query assumes that your null-values are represented by missing fields which is normal practice in MongoDB. When you have actual null values, use "started": { $ne: null } and "finished": null.
Then iterate over the array of documents you get. For each RESULT perform this query:
reports.forEach(function(report) {
    var job_count = db.checks_jobs_df8.aggregate([
        {$match: {
            "report_id": report.id,
            "is_done": 1
        }},
        // Group all matching jobs of this report together to get a single count
        {$group: {
            _id: "$report_id",
            "count": { $sum: 1 }
        }}
    ]);
    // output the data from report and job_count here
});
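The application-level join described above can be sketched in Python over plain lists of dicts. The function finished_chunks and the sample documents are hypothetical; the sketch assumes missing fields stand for NULL, and unlike the SQL (where the is_done condition sits in the WHERE clause) it keeps reports that have no finished jobs, with a count of 0.

```python
from collections import Counter

def finished_chunks(reports, jobs):
    """Attach a finished_chunks count to each started, unfinished report."""
    done_per_report = Counter(
        job["report_id"] for job in jobs if job.get("is_done") == 1
    )
    result = []
    for report in reports:
        # Mirrors: started IS NOT NULL AND finished IS NULL
        if report.get("started") is None or report.get("finished") is not None:
            continue
        result.append({**report, "finished_chunks": done_per_report[report["id"]]})
    return result

# Hypothetical documents from the two collections
reports = [
    {"id": 1, "started": "2020-01-01"},
    {"id": 2, "started": "2020-01-02", "finished": "2020-01-03"},
]
jobs = [
    {"job_id": 10, "report_id": 1, "is_done": 1},
    {"job_id": 11, "report_id": 1, "is_done": 0},
    {"job_id": 12, "report_id": 1, "is_done": 1},
]
joined = finished_chunks(reports, jobs)
```

Counting all finished jobs in one pass avoids running a separate query per report, at the cost of loading the jobs into memory.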