REST API design to retrieve summary information - JSON

I have a scenario in which I have REST API which manages a Resource which we will call Group.
A Group is similar in concept to a discussion forum in Google Groups.
Now I have two GET access methods which I believe need separate representations.
The first GET access method retrieves a minimal amount of information about a Group. Given a group_id, it should return something like:
{
group_id: "5t7yu8i9io0op",
group_name: "Android Developers",
is_moderated: true,
number_of_users: 34,
new_messages: 5,
icon: "http://boo.com/pic.png"
}
The second GET access method retrieves summary information which is more statistical in nature, like:
{
group_id: "5t7yu8i9io0op",
top_ranking_users: [
{ user: "george", posts: 789, rank: 1 },
{ user: "joel", posts: 560, rank: 2 }, ...
],
popular_topics: [ ... ]
}
I want to separate these data access methods and I'm currently planning on this design:
GET /group/:group_id/
GET /group/:group_id/stat
Only the latter will return the statistical information about the group. What do you think about this?

I don't see a problem with your approach. Since the statistics are basically separate data, you could treat the stats as a separate resource, too, providing a URI like
GET /stat/:group_id
Additionally, you can cross-reference your resources (meaning a group links to the corresponding stats resource and vice versa):
GET /group/5t7yu8i9io0op
{
group_id: "5t7yu8i9io0op",
group_name: "Android Developers",
is_moderated: true,
number_of_users: 34,
new_messages: 5,
icon: "http://boo.com/pic.png",
stats: "http://mydomain.com/stat/5t7yu8i9io0op"
}
GET /stat/5t7yu8i9io0op
{
group: "http://mydomain.com/group/5t7yu8i9io0op",
top_ranking_users: [
{ user: "george", posts: 789, rank: 1 },
{ user: "joel", posts: 560, rank: 2 }, ...
],
popular_topics: [ ... ]
}

What would be even better would be if you embedded the link to the statistics in the group summary:
{
group_id: "5t7yu8i9io0op",
group_name: "Android Developers",
is_moderated: true,
number_of_users: 34,
new_messages: 5,
icon: "http://boo.com/pic.png",
stats_link: "http://whatever.who/cares"
}
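A minimal Python sketch of this cross-linking idea (the base URL, in-memory store, and function names are assumptions for illustration, not tied to any particular web framework):

```python
# Hypothetical in-memory store; in a real API these would come from a database.
BASE_URL = "http://mydomain.com"

GROUPS = {
    "5t7yu8i9io0op": {
        "group_name": "Android Developers",
        "is_moderated": True,
        "number_of_users": 34,
        "new_messages": 5,
        "icon": "http://boo.com/pic.png",
    }
}

STATS = {
    "5t7yu8i9io0op": {
        "top_ranking_users": [
            {"user": "george", "posts": 789, "rank": 1},
            {"user": "joel", "posts": 560, "rank": 2},
        ],
        "popular_topics": [],
    }
}

def group_representation(group_id):
    """Minimal group representation, cross-linking to the stats resource."""
    body = {"group_id": group_id, **GROUPS[group_id]}
    body["stats"] = f"{BASE_URL}/stat/{group_id}"
    return body

def stats_representation(group_id):
    """Statistical representation, linking back to the group resource."""
    return {"group": f"{BASE_URL}/group/{group_id}", **STATS[group_id]}
```

Each handler would serialize the returned dict as the JSON response body, so the two representations stay independent while clients can navigate between them via the embedded links.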

Related

Fetch users data with related clients data in Yii2 API

I need to get an API response where each client row has its associated users' data in it, like this:
clients: [
{
id: 1,
name: 'client_1',
users: [{ id: 1, name: 'user_1', clients_id: 1 }, { id: 2, name: 'user_2', clients_id: 1 }],
},
{
id: 2,
name: 'client_2',
users: [{ id: 3, name: 'user_3', clients_id: 2 }, { id: 4, name: 'user_4', clients_id: 2 }],
},
]
How can I write Yii2 code for this result? Does the query builder work here, or do I need to use raw MySQL syntax?
I believe you have two tables that are related somehow. You can achieve it by defining a client relation in the users model. Once you have defined the relation, define the extraFields() function and add the relation there.
In the URL, use the expand query parameter and mention your relation there.
You are done!
Sample URL: http://localhost/users?fields=id,email&expand=profile
Check the documentation

JSON Database doesn't work correctly as per REST technique

I created a JSON server and this is the data I'm using. However, when I try to query the exam list and relate it to the students (I'd like to receive the students based on their IDs; the picture below shows the REST query, where I'm using ?_expand=student), it won't display them. The code shows as correct per JSON validators, but my goal is to have the relation working.
The way my data is organized, the examlist "table" won't display the students because, apparently, it cannot read their IDs. This database will be used for HTTP requests, hence I need it fully working.
I'll upload another image so that you can visualize my code.
At the moment, instead of my student IDs, it's showing some arbitrary 0, 1 indices, and the student IDs are being pushed one level down the nested tree.
(Just the examlist "table")
It's an M:M relationship (relational database) and how I want it structured is:
A "table" students that contains information about the students;
A "table" exams that contains information about the exams;
And then another "table" examlist which contains information about the exam (examId) and the students enrolled in it (basically relating the two abovementioned tables).
When I try querying the students through the "examlist" table, it won't work. However, the other "table", exams, does work.
My assumption is that the way I have organized the students in the examlist "table" is not good; however, given my very little experience, I cannot seem to see where the issue is.
I hope I've made it clear for most of you! Sorry for any inconvenience caused.
{
"students": [
{
"id": 3021,
"nume": "Ionut",
"prenume": "Grigorescu",
"an": 3,
"departament": "IE"
},
{
"id": 3061,
"nume": "Nadina",
"prenume": "Pop",
"an":3,
"departament": "MG"
},
{
"id": 3051,
"nume": "Ionut",
"prenume": "Casca",
"an": 3,
"departament": "IE"
}
],
"exams": [
{
"id": 1,
"subiect": "Web Semantic",
"profesor": {
"Nume": "Robert Buchman"
}
},
{
"id": 2,
"subiect": "Programare Web",
"profesor": {
"Nume": "Mario Cretu"
}
},
{
"id": 3,
"subiect": "Medii de Programare",
"profesor": {
"Nume": "Valentin Stinga"
}
}
],
"listaexamene": [
{
"examId":1,
"Data Examen":"02/06/2022 12:00",
"studentId":
[
{
"id":3021
},
{
"id":3051
}
]
},
{
"examId":2,
"Data Examen":"27/05/2022 10:00",
"studentId":
[
{
"id":3021
},
{
"id":3051
}
]
},
{
"examId":1,
"Data Examen":"04/06/2022 10:00",
"studentId":
[
{
"id":3021
},
{
"id":3051
},
{
"id":3061
}
]
}
]
}
I had to repost with more information after my first question got closed.
I think I finally got the answer. The problem lies in the JSON server: apparently, it cannot obtain information from further down the nested tree, only from the first level.
Thank you all for your input on the previous post!
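For reference, json-server's _expand only resolves top-level foreign keys named <resource>Id, so one way to make the relation work would be to flatten the join "table" into one row per exam/student pair (a hypothetical restructuring; the row ids and the dataExamen key name are assumptions):

```json
{
  "examlist": [
    { "id": 1, "examId": 1, "studentId": 3021, "dataExamen": "02/06/2022 12:00" },
    { "id": 2, "examId": 1, "studentId": 3051, "dataExamen": "02/06/2022 12:00" },
    { "id": 3, "examId": 2, "studentId": 3021, "dataExamen": "27/05/2022 10:00" }
  ]
}
```

With that shape, a request like /examlist?_expand=student&_expand=exam should be able to embed both related records, since each foreign key now sits at the first level of the row.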

Google Analytics API - RemarketingAudiences.insert only working when linkedAdAccounts is AD_WORDS

I'm writing a Google Apps Script for creating audiences in Google Analytics. I keep getting the very unhelpful error message "There was an internal error."
As per this guide, I am able to insert new audiences with the type AD_WORDS without issue. However, my current task involves duplicating audiences of type ANALYTICS.
It seems that the linkedAdAccounts attribute of the resource I submit is incorrect. I can see that the official docs mention four possible options for the type: ADWORDS_LINKS, DBM_LINKS, MCC_LINKS, or OPTIMIZE. Unfortunately, no detailed explanation is given for how these work, other than for ADWORDS_LINKS.
Here is the payload which is being rejected:
{
name: "newName",
linkedViews: ["123445677"],
linkedAdAccounts: [
{
kind: "analytics#linkedForeignAccount",
internalWebPropertyId: "12345678",
status: "OPEN",
remarketingAudienceId: "aaaaaaaaaaaaaaaaaaaaa",
id: "xxxxxxxxxxxxxxxxxxxxx",
webPropertyId: "UA-1234567-1",
type: "ANALYTICS",
accountId: "12345678",
},
],
audienceType: "SIMPLE",
audienceDefinition: {
includeConditions: {
daysToLookBack: 7,
segment: "users::condition::ga:sessionDuration>60",
membershipDurationDays: 30,
isSmartList: false,
},
},
}
It turns out you can't add an id for an ANALYTICS linkedAdAccount. Just adding the following is sufficient:
linkedAdAccounts: [
{
type: "ANALYTICS",
},
],

Efficient MongoDB queries to get reports from one collection and insert into another

I would like to consult about how to solve a specific task using MongoDB. I will try my best to explain the whole picture so there won't be an XY problem. It's going to be a bit long, so I appreciate all of you who get to the end of the topic. I have a collection (let's call it Cars) that contains reports. All of the reports contain three main fields:
name.
color.
timestamp.
Those reports contain other fields as well, but they are irrelevant to my question. There is only one more field I would like to explain: new_start. If new_start is present in a report (meaning new_start: 1), then I ignore all of the reports that have the same name and color but are older than the report that contains new_start (meaning their timestamp is less than the wanted report's). I'll try to explain with an example. Please consider the following reports:
report1 - name: ABC, color: black, timestamp: 1581946973
report2 - name: ABC, color: black, timestamp: 1581946963
report3 - name: ABC, color: black, timestamp: 1581946953, new_start: 1
report4 - name: ABC, color: black, timestamp: 1581946943
report5 - name: ABC, color: black, timestamp: 1581946933, new_start: 1
report6 - name: ABC, color: black, timestamp: 1581946923
Those reports are sorted by timestamp (from newest to oldest) and all have the same name and color. So the reports that interest me are:
report1 - name: ABC, color: black, timestamp: 1581946973
report2 - name: ABC, color: black, timestamp: 1581946963
report3 - name: ABC, color: black, timestamp: 1581946953, new_start: 1
Note that if there were no reports with new_start then I would handle all of them.
I tried to write a query/code that does the following logic for me: for all of the reports that share the same name and color, get all of the reports. If one of the reports contains new_start, then it should return the reports from the newest down to that report.
What I tried (using python and pymongo):
Get all of the reports:
records = db.query(collection_name="cars", query={})
Iterate through all of the reports and, for each one, perform changes.
for record in records:
other_line_records = db.query(collection_name="cars", query={'name': record['name'], 'color': record['color'], '_id': {'$ne': record['_id']}})
# changes
But the problem is that I just get all of the reports and then the code iterates through them, which could take a while because there are a lot of reports, and this way I will iterate over the same report multiple times.
Here comes the purpose of this whole operation: I would like to merge those reports into one main report and insert it into another collection, merged_cars. The merge logic I'll handle myself after I get the needed reports, but I'd be glad to get help with the following questions:
In my suggested way, it will merge those reports in an infinite loop. This means that merged_cars will have the same reports over and over. I need somehow to keep track of the merged reports. I thought of creating a field merged_ids that contains an array of all of the merged ids. That way I would know if there is a new report I should add to the merge. But how should I efficiently check if a report is already merged? Also, is this a valid solution to the problem? It feels a bit odd to save those ids.
Currently, I just iterate over all of the reports without actually using the power of MongoDB aggregation. I'm sure there is a smarter and more efficient way that avoids iterating over all of the merged reports over and over again, but I can't seem to work out how to do it.
How should I take new_start into account?
To summarize: due to my lack of experience with MongoDB aggregation, I can't seem to figure out an efficient way of solving this problem. I'd be glad to see some suggestions (please provide examples so it will be easier to understand) on how to approach it. As you can see, my main problem is figuring out what those queries should look like.
This can be achieved with MongoDB aggregation.
Explanation
We $group all records with the same name and color and store the root documents in a temporary field named data.
From data, we find all documents with new_start and, with $reduce, return the greatest such timestamp.
With $filter we keep all records where max_timestamp <= the item's timestamp.
With $unwind we flatten the filtered data.
$replaceRoot helps us replace the root structure with the data sub-document.
db.Cars.aggregate([
  {
    $group: {
      _id: { name: "$name", color: "$color" },
      data: { $push: "$$ROOT" }
    }
  },
  {
    $addFields: {
      max_timestamp: {
        $reduce: {
          input: "$data",
          initialValue: 0,
          in: {
            $cond: [
              { $and: [ { $eq: [ "$$this.new_start", 1 ] }, { $gt: [ "$$this.timestamp", "$$value" ] } ] },
              "$$this.timestamp",
              "$$value"
            ]
          }
        }
      }
    }
  },
  {
    $addFields: {
      data: { $filter: { input: "$data", cond: { $lte: [ "$max_timestamp", "$$this.timestamp" ] } } }
    }
  },
  { $unwind: "$data" },
  { $replaceRoot: { newRoot: "$data" } }
])
MongoPlayground
If you add the $merge operator as the last step, the reports will be inserted into the merged_cars collection:
{
$merge: {
into: "merged_cars",
on: "_id",
whenMatched: "replace",
whenNotMatched: "insert"
}
}
Pymongo
from pymongo import MongoClient
db = MongoClient('mongodb://localhost:27017').test
pipeline = [
    {
        '$group': {
            '_id': { 'name': '$name', 'color': '$color' },
            'data': { '$push': '$$ROOT' }
        }
    },
    {
        '$addFields': {
            'max_timestamp': {
                '$reduce': {
                    'input': '$data',
                    'initialValue': 0,
                    'in': {
                        '$cond': [
                            { '$and': [ { '$eq': [ '$$this.new_start', 1 ] }, { '$gt': [ '$$this.timestamp', '$$value' ] } ] },
                            '$$this.timestamp',
                            '$$value'
                        ]
                    }
                }
            }
        }
    },
    {
        '$addFields': {
            'data': { '$filter': { 'input': '$data', 'cond': { '$lte': [ '$max_timestamp', '$$this.timestamp' ] } } }
        }
    },
    { '$unwind': '$data' },
    { '$replaceRoot': { 'newRoot': '$data' } }
]
print(list(db.cars.aggregate(pipeline)))
[{'_id': ObjectId('5e658bb6fd9da8cfcc2f5a08'), 'name': 'ABC', 'color': 'black', 'timestamp': 1581946973}, {'_id': ObjectId('5e658bb6fd9da8cfcc2f5a09'), 'name': 'ABC', 'color': 'black', 'timestamp': 1581946963}, {'_id': ObjectId('5e658bb6fd9da8cfcc2f5a0a'), 'name': 'ABC', 'color': 'black', 'timestamp': 1581946953, 'new_start': 1}]
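The $merge step can be appended from pymongo the same way, since aggregation stages are plain dicts. A minimal sketch (the with_merge helper name is made up for illustration; the stage itself mirrors the shell version above):

```python
# The $merge stage as a plain Python dict, appended as the final pipeline step.
merge_stage = {
    '$merge': {
        'into': 'merged_cars',       # target collection
        'on': '_id',                 # match documents by _id
        'whenMatched': 'replace',    # re-running the pipeline overwrites, not duplicates
        'whenNotMatched': 'insert',  # new reports are inserted
    }
}

def with_merge(pipeline):
    """Return a copy of the pipeline with $merge appended as the last stage."""
    return pipeline + [merge_stage]
```

Because whenMatched is "replace", re-running the aggregation keeps merged_cars in sync instead of accumulating duplicates, which also sidesteps the need for a manual merged_ids tracking field.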

Solr grouped query pagination not working properly [Solr, Lucene]

I have grouped my Solr documents by a field, family.
The Solr query for getting the first 20 groups is as follows:
/select?q=*:*&group=true&group.field=family&group.ngroups=true&start=0&group.limit=1
The result of this query is 20 groups, as follows:
responseHeader: {
zkConnected: true,
status: 0,
QTime: 1260,
params: {
q: "*:*",
group.limit: "1",
start: "0",
group.ngroups: "true",
group.field: "family",
group: "true"
}
},
grouped: {
family: {
matches: 464779,
ngroups: 396324,
groups: [
{
groupValue: "__fam__ME.EA.HE.728928",
doclist: {
numFound: 1,
start: 0,
maxScore: 1,
docs: [
{
sku: "ME.EA.HE.728928",
title: "Rexton Pocket Family Hearing Instrument Fusion",
family: "__fam__ME.EA.HE.728928",
brand: "Rexton",
brandId: "6739",
inStock: false,
bulkDiscount: false,
quoteOnly: false,
cats: [
"Hearing Machine & Components",
"Health & Personal Care",
"Medical Supplies & Equipment"
],
leafCatIds: [
"6038"
],
parentCatIds: [
"6259",
"4913"
],
Type__attr__: "Pocket Family",
Type of Products__attr__: "Hearing Instrument",
price: 3790,
discount: 40,
createdAt: "2016-02-18T04:51:36Z",
moq: 1,
offerPrice: 2255,
suggestKeywords: [
"Rexton",
"Pocket Family",
"Rexton Pocket Family"
],
suggestPayload: "6038,Hearing Machine & Components",
_version_: 1548082328946868200
}
]
}
},
The thing to notice in this result is the value of ngroups, which is 396324.
But when I want to get the data of the last pages, I hit this query on Solr:
select?q=*:*&group=true&group.field=family&group.ngroups=true&start=396320&group.limit=1
{
responseHeader: {
zkConnected: true,
status: 0,
QTime: 5238,
params: {
q: "*:*",
group.limit: "1",
start: "396320",
group.ngroups: "true",
group.field: "family",
group: "true"
}
},
grouped: {
family: {
matches: 464779,
ngroups: 396324,
groups: [ ]
}
}
}
I get 0 results when I set start to 396320. There must be 5 documents in the result. The actual number of groups is 386887. Why is ngroups incorrect?
By the way, this issue is not present in the local Solr server I have set up; it only shows up in SolrCloud on the test environment.
This is a known issue with grouping across distributed nodes (which is what happens in SolrCloud mode):
Grouping is supported for distributed searches, with some caveats:
Currently group.func is not supported in any distributed searches
group.ngroups and group.facet require that all documents in each group must be co-located on the same shard in order for accurate counts to be returned. Document routing via composite keys can be a useful solution in many situations.
The most direct solution is to use the family as part of the routing key, ensuring that all identical family values end up on the same node. As the number of distinct family values seems very high compared to the number of nodes, this should still give you a good distribution of documents across nodes.
Depending on what you're actually trying to do, there might be other alternative solutions as well (if you just want a count, using a JSON facet might be a good solution).
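For example, if only the group count is needed, the JSON Facet API can compute the number of distinct family values without grouping at all (a hypothetical request body; note that unique() is exact only up to roughly 100 distinct values per shard and approximates beyond that, with hll() available for a better estimate):

```json
{
  "query": "*:*",
  "limit": 0,
  "facet": {
    "family_count": "unique(family)"
  }
}
```

POSTing this to the collection's /select handler returns the distinct-family count in the facets section of the response, with no paging through grouped results.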