Map Reduce with Relational Databases - mysql

I have 2 relational tables
Table A (Person 1, Title of Book Read)
Table B (Book Title, Author Name)
I'm creating a map-reduce job that counts, for every person in Table A, how many books they have read by each author.
This means that if there were 2 books by the same author and the person read both, then the map-reduce would yield:
(Person1, Author 1, 2);
My map function (at the Meta-level) is:
map {
emit(TableB.BookTitle, 1)
}
and my reduce function is:
reduce function (title,values)
{
while(values.hasNext())
{
if(title == tableA.bookRead)
sum+=values
}
output.collect(tableA.person1, tableB.author, sum)
}
I know there are some holes to be filled in how each person is linked to the books they read, but I'm not quite sure how to approach it. Also, would I have to run this query for every person in Table A?

We can break the given problem into two jobs:
1) In the first part, create a map-reduce job with two mappers and only one reducer. Table A is the input for the first mapper (Mapper A), and Table B is the input for the second mapper (Mapper B).
Mapper A emits "Book Title" as the key and the person's name, tagged with its source, as the value: "person_book#Person Name".
Mapper B emits "Book Title" as the key and the author's name, tagged with its source, as the value: "auth_book#Author Name".
Since in map-reduce all records for one key go to the same reducer, and this job has just one reducer, the records will arrive there like
{Book Title,
Then you need to implement logic at the reducer end to extract the person names and the author name. The reducer will emit its output as:
Book Title %Author Name%PersonName
For example:
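// Body of the reducer; key is the book title
String author_name = "";
StringBuilder person_names = new StringBuilder();
while (values.hasNext())
{
    String line = values.next().toString();
    // Each value is tagged by the mapper that produced it: "<tag>#<name>"
    String[] det_array = line.split("#");
    if (det_array[0].equals("person_book"))
    {
        // From Mapper A: a person who read this book
        if (person_names.length() > 0) person_names.append(",");
        person_names.append(det_array[1]);
    }
    else if (det_array[0].equals("auth_book"))
    {
        // From Mapper B: the author of this book
        author_name = det_array[1];
    }
}
// Build "%Author%Person1,Person2" after the loop so the format does not depend on value order
output.collect(new Text(key), new Text("%" + author_name + "%" + person_names));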
Then your final output file will look like:
Book Title %Author_Name%Person Name
2) In the second map-reduce job, code just one mapper and one reducer. The input for this job has the format:
Book Title %Author_Name%Person Name1,PersonName2 etc.
For each person on a line, the mapper emits Author_Name+Person as the key and 1 as the value.
Since at this stage each reducer call receives one combination of Author_Name and Person, you just need to sum the 1s and emit the output as person name, author name and total count.
Please let me know if this is not clear to you or if you would like to see actual Java code.
Thanks!
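The two jobs described above can be sketched in plain Python to check the logic before writing any Hadoop code. This is only an in-memory illustration, not a Hadoop job: the function names join_job and count_job and the sample rows are made up for the example, and each table is assumed to be a list of tuples.

```python
from collections import defaultdict

def join_job(table_a, table_b):
    """Job 1: group both tables by book title, like the single reducer."""
    grouped = defaultdict(lambda: {"author": None, "persons": []})
    for person, title in table_a:      # Mapper A: key = title, value = person
        grouped[title]["persons"].append(person)
    for title, author in table_b:      # Mapper B: key = title, value = author
        grouped[title]["author"] = author
    # Reducer output: one (title, author, persons) record per book
    return [(t, g["author"], g["persons"]) for t, g in grouped.items()]

def count_job(joined):
    """Job 2: emit ((person, author), 1) per reader, then sum per key."""
    counts = defaultdict(int)
    for _title, author, persons in joined:
        if author is None:             # book missing from Table B: skip it
            continue
        for person in persons:
            counts[(person, author)] += 1
    return dict(counts)

# Hypothetical sample data
table_a = [("Person1", "Book X"), ("Person1", "Book Y"), ("Person2", "Book X")]
table_b = [("Book X", "Author 1"), ("Book Y", "Author 1")]

result = count_job(join_job(table_a, table_b))
print(result)
```

With this sample data, Person1 has read two books by Author 1 and Person2 has read one, so neither job has to be run once per person.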

Related

Is there any way to match name and number?

For each client we have, we associate their name with an ID number in a database. However, we sign people in by name. I am trying to convert the names into their ID number in a spreadsheet.
I have a list of all the names and corresponding IDs. I realize that I could hard code it so that it would look something like:
for (i=0; i < 31; i++) {
if name = 'john doe'
id = 256589
elseif name = 'jane doe'
id = 248352...}
and repeat that for each client. I've tested with a couple names and this solution does work. Since we don't have that many individuals come in it wouldn't be impossible to just repeat it. However, I would like to know if there are any shortcuts available.
It depends on where you're doing this lookup.
This looks like a script, so you could use an object with the names as keys:
function getIdFromName(name) {
// list of all employees and ids
let employees = {
"john doe": 256589,
"jane doe": 248352
}
if (employees[name]) {
return employees[name]
} else {
// this covers the case if name not found
return false
}
}
// in rest of your code
var id = getIdFromName(name)
If you want to do the lookup in the sheet, you can use a lookup table containing names and IDs, then use VLOOKUP or INDEX(MATCH()) to find the corresponding ID.

Select a random item from json imported dictionary

I'm trying to select a random player from an imported json file.
data = json.loads(source)
randPlayer = data['areas']['homes']
randP = random.choice(randPlayer)
print(randP)
Here is the code I tried. Basically, in 'homes' I have a list of player names and I want to select one at random.
Err Output
Source Code Example:
{'Player1': {'lvl': 192}, 'Player2': {'lvl': 182}}
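This should work. random.choice needs a sequence it can index by position, and a dict is not one; converting the dict to a list gives a list of its keys: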
randP = random.choice(list(randPlayer))
Here is an example I found elsewhere and have checked myself. It picks a random key from a dictionary and then looks up the matching value.
Example code:
import random
weight_dict = {
"Kelly": 50,
"Red": 68,
"Jhon": 70,
"Emma" :40
}
key = random.choice(list(weight_dict))
print(f"Random key-value pair from dictionary is {key} - {weight_dict[key]}")
Output (the key varies from run to run), for example:
Random key-value pair from dictionary is Jhon - 70
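Applied to the question's data, the same idea can also return the player's stats along with the name. This is a minimal sketch: the JSON string below is an assumption, built by wrapping the example dictionary in the 'areas' and 'homes' keys used in the question, and pick_random_player is a made-up helper name.

```python
import json
import random

# Assumed shape of the imported file, based on the question's example
source = '{"areas": {"homes": {"Player1": {"lvl": 192}, "Player2": {"lvl": 182}}}}'
data = json.loads(source)
homes = data["areas"]["homes"]

def pick_random_player(players):
    # random.choice needs a sequence, so choose from the list of (name, stats) pairs
    name, stats = random.choice(list(players.items()))
    return name, stats

name, stats = pick_random_player(homes)
print(name, stats["lvl"])
```

Choosing from items() instead of the keys avoids a second lookup when the level is needed as well.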

How to access the elements in the set returned by an Alloy function?

I have an Alloy function in my model like:
fun whichFieldIs[p:Program, fId:FieldId, c:Class] : Field{
{f:Field | f in c.*(extend.(p.classDeclarations)).fields && f.id = fId}
}
This function is working in my model and can return a set of elements such as:
{Field$0, Field$1}
although the function's return type is not "set Field". I have already seen this in the Alloy evaluator tool (available in alloy4.2.jar). What I am trying to do is get the first element of this set in another predicate, for instance:
pred expVarTypeIsOfA[p:Program, exprName:FieldId, mClass:Class, a:ClassId]{
let field = whichFieldIs[p, exprName, mClass],
fieldType = field[0].type
{
...
}
}
Even when I change the return type of the function to "set Field", the error "This expression failed to be typechecked" appears. I only want to get the first element of a set returned by a function. Any help?
Does the order really matter in that case? If so, you should take a look at seq:
In the following example, for each person p, "p.books" is a sequence of Book:
sig Book { }
sig Person {
books: seq Book
}
...So if s is a sequence of Book, then the first element is s[0]...
seq is now a reserved word, but is nothing more than a relation Int -> Elem.
If the order does not matter, you could use an appropriate quantifier, e.g.:
pred expVarTypeIsOfA[p:Program, exprName:FieldId, mClass:Class, a:ClassId]{
some field: whichFieldIs[p, exprName, mClass] | {
field.type ...
}
}

How to get ordered results from couchbase

I have in my bucket a document containing a list of ID (childList).
I would like to query over this list and keep the result ordered as in my JSON. My query looks like this (using the Java SDK):
String query = new StringBuilder().append("SELECT B.name, META(B).id as id ")
.append("FROM " + bucket.name() + " A ")
.append("USE KEYS $id ")
.append("JOIN " + bucket.name() + " B ON KEYS ARRAY i FOR i IN A.childList END;").toString();
This query will return rows that I will transform into my domain object and create a list like this :
n1qlQueryResult.allRows().forEach(n1qlQueryRow -> (add to return list ) ...);
The problem is the output order is important.
Any ideas?
Thank you.
Here is a rough idea of a solution without N1QL, provided you always start from a single A document:
List<JsonDocument> listOfBs = bucket
.async()
.get(idOfA)
.flatMap(doc -> Observable.from(doc.content().getArray("childList")))
.concatMapEager(id -> bucket.async().get(id))
.toList()
.toBlocking().first();
You might want another map before the toList to extract the name and id, or even to perform your domain object transformation.
The steps are:
use the async API
get the A document
extract the list of children and stream these ids
asynchronously fetch each child document and stream them but keeping them in original order
collect all into a List<JsonDocument>
block until the list is ready and return that List.
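The key point of these steps, fetching the children concurrently while still returning them in the order of childList, can be illustrated outside the SDK. The sketch below is not Couchbase code: fake_bucket and get_doc are hypothetical stand-ins for the bucket and its get call, and a thread pool plays the role of the async API.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-in for the bucket
fake_bucket = {
    "A": {"childList": ["b3", "b1", "b2"]},
    "b1": {"name": "first"},
    "b2": {"name": "second"},
    "b3": {"name": "third"},
}

def get_doc(doc_id):
    # Stand-in for a blocking key lookup
    return fake_bucket[doc_id]

def fetch_children_in_order(id_of_a):
    child_ids = get_doc(id_of_a)["childList"]
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Executor.map runs the lookups concurrently but yields results in input order
        docs = list(pool.map(get_doc, child_ids))
    return [{"id": cid, "name": doc["name"]} for cid, doc in zip(child_ids, docs)]

children = fetch_children_in_order("A")
print(children)
```

As with concatMapEager, the lookups may complete in any order, but the collected list follows childList.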

Limiting and sorting by different properties on couchbase

Given a JSON document in Couchbase, for example one from a milestones collection, similar to this:
{
"milestoneDate" : /Date(1335191824495+0100)/,
"companyId" : 43,
"ownerUserId": 475,
"participants" : [
{
"userId": 2,
"docId" : "132546"
},
{
"userId": 67,
"docId" : "153"
}
]
}
If I wanted to select all the milestones of company 43 and order them latest first, my view in Couchbase would be something similar to this:
function (doc, meta) {
if(doc.companyId && doc.milestoneDate)
{
//key made up of date particles + company id
var eventKey = dateToArray(new Date(parseInt(doc.milestoneDate.substr(6))));
eventKey.push(doc.companyId);
emit(eventKey, null);
}
}
I do get both the dates and the company id in the REST URLs. However, being quite new to Couchbase, I am unable to work out how to restrict the view to return only the milestones of company 43.
The return key is similar to this:
"key":[2013,6,19,16,11,25,14]
where the last element (14) is the company id, which is quite obviously wrong.
The query parameters that I have tried are:
&descending=true&startkey=[{},43]
&descending=true&startkey=[{},43]&endKey=[{},43]
I tried adding companyId to the value, but couldn't restrict the returned results by value.
And according to the Couchbase documentation, I need the date parts at the beginning to sort by them. How do I restrict the results by company id now, please?
Thanks.
Put the company id at the start of the array. Couchbase then sorts by company id first and by the date array second, so when you limit the query by company id you will only ever get that one company's milestone documents.
I'd modify the view to emit
emit([doc.companyId, eventKey], null);
and then you can query the view with
&descending=true&startkey=[43,{}]
This is what worked for me previously.
I went back and tried it with an end key, and this seems to work; it restricts and orders as required:
&descending=true&startkey=[43,{}]&endkey=[42,{}]
or
&descending=true&startkey=[43,{}]&endkey=[43]
Either specify the next decremented value with the end key (the next incremented value if descending is not set), or use the company id on its own as the end key, since [43] sorts before every [43, date] key. Using the same endkey as the startkey ([43,{}]) with inclusive_end=true would not work, because the range would then contain only that single key.
I only tested the one with endkey=42.
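To see why putting the company id first makes the restriction possible, the key ordering can be mimicked in memory. This is only an illustration under assumptions: the sample milestones, their ids, and the helper names are made up, the date is assumed to be already split into parts, and Python's list comparison stands in for the view's key ordering (which it matches for numeric arrays like these).

```python
# Hypothetical sample documents with the date already split into parts
milestones = [
    {"id": "m1", "companyId": 43, "date": [2013, 6, 19, 16, 11, 25]},
    {"id": "m2", "companyId": 14, "date": [2013, 7, 1, 9, 0, 0]},
    {"id": "m3", "companyId": 43, "date": [2013, 8, 2, 10, 30, 0]},
    {"id": "m4", "companyId": 44, "date": [2012, 1, 1, 0, 0, 0]},
]

def emit_keys(docs):
    # Mirrors emit([doc.companyId, eventKey], null): company id first, date parts second
    return [([d["companyId"], d["date"]], d["id"]) for d in docs]

def query_company_latest_first(docs, company_id):
    # Keep only the rows whose key starts with the requested company id
    rows = [(key, doc_id) for key, doc_id in emit_keys(docs) if key[0] == company_id]
    # Lists compare element by element, so rows sort by company id, then by date
    return [doc_id for key, doc_id in sorted(rows, reverse=True)]

latest_first = query_company_latest_first(milestones, 43)
print(latest_first)
```

Because all keys for one company are adjacent in the sorted order, a single start/end key range is enough to select them.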