Endeca need to return all its values under one dimension - json

I need to return all values under one dimension (e.g. Product.category) in Endeca and return all its values as JSON object to content assembler. Can someone provide an optimal way to achieve this feature?

This is a tricky one, particularly because I'm assuming the product.category is a hierarchical dimension.
With a regular navigation query (such as a search results page), there's no way to bring back every level of a hierarchical dimension at once. However, using a Dimension search (and if you have --compoundDimSearch turned OFF), you can make a query like this: D=*&Dn=0&Di=10001 (where 10001 might be the dimension ID for product.category).
That will bring back every value in the dimension.
What you could do is maybe make / extend the DimensionSearchResultsHandler to help you out. In the preprocess() method, you would construct a query like the one above.
Then in the process method, you'd do something like:
ENEQueryResults results = executeMdexRequest(mMdexRequest);
NavigationState navigationState = getNavigationState();
navigationState.inform(results);
DimensionSearchResults dimensionSearchResults = new DimensionSearchResults(cartridgeConfig);
DimensionSearchResultsBuilder.build(
getActionPathProvider(),
dimensionSearchResults,
navigationState,
results.getDimensionSearch(),
cartridgeConfig.getDimensionList(),
cartridgeConfig.getMaxResults(),
cartridgeConfig.isShowCountsEnabled());
return dimensionSearchResults;
That will help you build out the Assembler objects for the results. Then if you made an Assembler query that returns JSON, these results would be returned as well.
One big caveat: The results above aren't nicely formatted. What I mean is that this will bring back every leaf value and its ancestors. If you wanted to create a nice hierarchical display, you'd have to do a bunch of formatting yourself.

Related

How to fix a query in functions within foundry which is hiting ObjectSet:PagingAboveConfiguredLimitNotAllowed?

I have phonorgraph object with billions of rows and we are querying it through object set service
for example, I want to get all DriverLicences from certain city.
#Function()
public getDriverLicences(city: string): ObjectSet<DriverLicences> {
let drivers = Objects.search().DriverLicences().filter(row => row.city.exactMatch(city));
return drivers ;
}
I am facing this error when I am trying query it from slate:
ERROR 400: {"errorCode":"INVALID_ARGUMENT","errorName":"ObjectSet:PagingAboveConfiguredLimitNotAllowed","errorInstanceId":"0000-000","parameters":{}}
I understand that I am probably retrieving more than 100 000 results but I need all the results because of the implemented logic in the front is a complex slate dashboard built by another team that we cannot re-factor.
The issue here is that, specifically in the Slate <> Function connector, there is a "translation layer" that serializes the contents of the object set and provides a response data structure that materializes the property:value pairs for each object in the set.
This clearly doesn't work for large object sets where throwing so much data into the browser is likely to overwhelm the resources allocated to the tab.
From context it seems like you might be migrating an existing Slate app over to Functions; in the current version, how is the query limiting the number of results returned? It certainly must not be returning several 100 thousand results for further processing on the front end? (And if so, that might be an anti-pattern to consider addressing).
As for options that you could currently explore, you can sort your object set and then specify a smaller limit to return:
Objects.search().DriverLicences().filter(row => row.city.exactMatch(city)).orderBy(date_of_issue).take(100)
You'll find a few more details in the Functions documentation Reference entry on Ontology API: Object Sets in the section on Ordering and limiting.
You can even make a work around for the (current) lack of paging when return an ObjectSet to Slate by using the last value from the property ordered on (i.e. date_of_issue) as a filter in the subsequent request and return the next N objects.
This can work if you need a Slate table or HTML widget that renders on set of results then, on a user action, gets the next page.

Grails criteria projections - return whole table AND the projections

I want to know if I can have a single createCriteria() call, that returns me the whole table, and some specified joined columns.
Something like this:
SELECT table1.*, table2.property1,table2.property2 FROM table1 WHERE ... INNER JOIN table2.
I have a code similar to this:
MyDomainClass.createCriteria().list{
createAlias("relationedObject", "relationedObjectAlias")
condition1(...)
condition2(...)
condition3(...)
projections{
property("relationedObjectAlias.nestedProperty")
property("someProperty")
property("anotherProperty")
}
}
It returns me an array of arrays, containing these 3 properties listed inside the projections closure. But what should I do to receive the whole MyDomainClass object row, AND the projections?
What I really need, actually, is an array containing the whole MyDomainClass object, and the nestedProperty from the relationedObject.
I know I could just do another createCriteria() call, without specifying the projections, and manually "join" them in code, but this looks ugly to me... any ideas?
I'm using grails 2.5.5
I don't think there is a way in Hibernate to accomplish what you are doing so (nothing in the documentation that I've seen) and since you are using a HibernateCriteriaBuilder, I would say no.
I think your alternative would be to have all of your domain class's properties defined within your projection, depending on how many properties are involved you could do this manually or with some help:
import org.codehaus.groovy.grails.commons.DefaultGrailsDomainClass
import org.hibernate.criterion.CriteriaSpecification
...
def propertyNames = new DefaultGrailsDomainClass(MyDomainClass.class).
getPersistentProperties().
findAll{ p -> !p.isOneToMany() }*.
name
MyDomainClass.createCriteria().list{
createAlias("relationedObject", "relationedObjectAlias")
condition1(...)
condition2(...)
condition3(...)
resultTransformer(CriteriaSpecification.ALIAS_TO_ENTITY_MAP)
projections{
property("relationedObjectAlias.nestedProperty")
propertyNames.each{ pn ->
property(pn, pn)
}
}
}
I would not call it pretty but it may work for your situation; I tested it on several of my domain objects and it worked successfully. I'm using DefaultGrailsDomainClass because getPersistentProperties() is a method on a non-static method and I don't want to rely on any particular instance. I'm excluding any collections based on my own testing.
Rather than relying on an returned array and the position of properties within that array, I'm using the ALIAS_TO_ENTITY_MAP result transformer to return a map. I think this is generally a good idea anyways, especially when dealing with larger result sets; and I think it's absolutely critical if gathering the properties in an automated fashion. This does require the property(<String>, <String>) method call as opposed to just the `property()', with the 2nd argument being the map key.

Complex Gremlin queries to output nodes/edges

I am trying to implement a query and graph visualisation framework that allows a user to enter a Gremlin query, returning a D3 graph of results. The D3 graph is built using a JSON - this is created using separate vertices and edges outputs from the Gremlin query. For simple queries such as:
g.V.filter{it.attr_a == "foo"}
this works fine. However, when I try to perform a more complicated query such as the following:
g.E.filter{it.attr_a == 'foo'}.groupBy{it.attr_b}{it.outV.value}.cap.next().findAll{k,e->e.size()<=3}
- Find all instances of *value*
- Grouped by unique *attr_b*
- Where *attr_a* = foo
- And *attr_b* is paired with no more than 2 other instances of *value*
Instead, the output is of the following form:
attr_b1: {value1, value2, value3}
attr_b2: {value4}
attr_b3: {value6, value7}
I would like to know if there is a way for Gremlin to output the results as a list of nodes and edges so I can display the results as a graph. I am aware that I could edit my D3 code to take in this new output but there are currently no restrictions to the type/complexity of the query, so the key/value pairs will no necessarily be the same every time.
Thanks.
You've hit what I consider one of the key problems with visualizing Gremlin results. They can be anything. Gremlin results might not just be a list of vertices and edges. There is no way to really control this that I can think of. At the end of the day, you can really only visualize results that match a pattern that D3 expects. I'd start by trying to detect that pattern and visualize only in those cases (simply display non-recognized patterns as JSON perhaps).
Thinking of your specific example that results like this:
attr_b1: {value1, value2, value3}
attr_b2: {value4}
attr_b3: {value6, value7}
What would you want D3 to visualize there? The vertices/edges that were traversed over to get that result? If so, you might be stuck. Gremlin doesn't give you a way to introspect the pipeline to see what's passing through it. In other words, unless the user explicitly gathers vertices and edges within the pipeline that were touched you won't have access to them. It would be nice to be able to "spy" on a pipeline in that way, but at the moment it doesn't do that. There's been internal discussion within TinkerPop to create a new kind of pipeline implementation that would help with that, but at the moment, it doesn't exist.
So, without the "spying" capability, I think your only workarounds would be to:
detect vertex/edge list on your client side and only render those with d3. this would force users to always write gremlin that returned data in such a format, if they wanted visualization. put it in the users hands.
perhaps supply server-side bindings for a list of vertices/edges that a user could explicitly side-effect their vertices/edges into if their results did not conform to those expected by your visualization engine. again, this would force users to write their gremlin appropriately for your needs if they want visualization.

Using $skip with the SharePoint 2013 REST API

Forgive me, I'm very new to using REST.
Currently I'm using SP2013 Odata (_api/web/lists/getbytitle('<list_name>')/items?) to get the contents of a list. The list has 199 items in it so I need to call it twice and each time ask for a different set of items. I figured I could do this by calling:
_api/web/lists/getbytitle('<list_name>')/items?$skip=100&$top=100
each time changing however many I need to skip. The problem is this only ever returns the first 100 items. Is there something I'm doing wrong or is $skip broken in the OData service?
Is there a better way to iterate through REST calls, assuming this way doesn't work or isn't practical?
I'm using the JSon protocol with the Accept Header equaling application/json;odata=verbose
I suppose the $top=100 isn't really necessary
Edit: I've looked it up more and, I'm not entirely sure of the terms here, but using $skip works fine if you're using the method introduced with SharePoint 2010, i.e., _vti_bin/ListData.svc/<list_name>?$skip=100
Actually, funny enough, the old way doesn't set a 100 item limit on returns. So skip isn't even necessary. But, if you'd like to only return a certain segment of data, you'd have to do something like:
_vti_bin/ListData.svc/<list_name>?$skip=x&$top=(x+y)
where each time through the loop you would have something like x+=y
You can either use the old method which I described above, or check out my answer below for an explanation of how to do this using SP2013 OData
Alright, I've figured it out. $skip isn't a command which is meant to be used at the items? level. It works only at the lists? level. But, there's a way to do this, actually much easier than what I wanted to do.
If you just want all the data
In the returned data, assuming the list you are calling holds more than 100 items, there will be a __next at d/__next (assuming you are using json). This __next (it is a double underscorce, keep that in mind. I had a few problems at first because I was trying to get d/_next which never returned anything) is the right URL to get the next set of items. __next will only ever be a value if there is another set of items available to get.
I ended up creating a RequestURL variable which was initially set to to original request, but was changed to d/__next at the end of the loop. Then the loop went and checked if the RequestURL was not empty before going inside the loop.
Forgive my lack of code, I'm using SharePoint Designer 2013 to make this, and the syntax isn't horribly descriptive.
If you'd only like a small set of data
There's probably a few situations where you would only want x amount of rows from your list each time you go through the loop and that's real easy to do as well.
if you just add a $top=x parameter to your request, the __next URL that comes back with the response will give you the next x rows from your list. Eventually when there are no rows left to return __next won't be returned with the response.
Don't forget that in order to use __next you need to have a
$skiptoken=Paged=TRUE
in the url as well.

How do you return two values from a single method?

When your in a situation where you need to return two things in a single method, what is the best approach?
I understand the philosophy that a method should do one thing only, but say you have a method that runs a database select and you need to pull two columns. I'm assuming you only want to traverse through the database result set once, but you want to return two columns worth of data.
The options I have come up with:
Use global variables to hold returns. I personally try and avoid globals where I can.
Pass in two empty variables as parameters then assign the variables inside the method, which now is a void. I don't like the idea of methods that have a side effects.
Return a collection that contains two variables. This can lead to confusing code.
Build a container class to hold the double return. This is more self-documenting then a collection containing other collections, but it seems like it might be confusing to create a class just for the purpose of a return.
This is not entirely language-agnostic: in Lisp, you can actually return any number of values from a function, including (but not limited to) none, one, two, ...
(defun returns-two-values ()
(values 1 2))
The same thing holds for Scheme and Dylan. In Python, I would actually use a tuple containing 2 values like
def returns_two_values():
return (1, 2)
As others have pointed out, you can return multiple values using the out parameters in C#. In C++, you would use references.
void
returns_two_values(int& v1, int& v2)
{
v1 = 1; v2 = 2;
}
In C, your method would take pointers to locations, where your function should store the result values.
void
returns_two_values(int* v1, int* v2)
{
*v1 = 1; *v2 = 2;
}
For Java, I usually use either a dedicated class, or a pretty generic little helper (currently, there are two in my private "commons" library: Pair<F,S> and Triple<F,S,T>, both nothing more than simple immutable containers for 2 resp. 3 values)
I would create data transfer objects. If it is a group of information (first and last name) I would make a Name class and return that. #4 is the way to go. It seems like more work up front (which it is), but makes it up in clarity later.
If it is a list of records (rows in a database) I would return a Collection of some sort.
I would never use globals unless the app is trivial.
Not my own thoughts (Uncle Bob's):
If there's cohesion between those two variables - I've heard him say, you're missing a class where those two are fields. (He said the same thing about functions with long parameter lists.)
On the other hand, if there is no cohesion, then the function does more than one thing.
I think the most preferred approach is to build a container (may it be a class or a struct - if you don't want to create a separate class for this, struct is the way to go) that will hold all the parameters to be returned.
In the C/C++ world it would actually be quite common to pass two variables by reference (an example, your no. 2).
I think it all depends on the scenario.
Thinking from a C# mentality:
1: I would avoid globals as a solution to this problem, as it is accepted as bad practice.
4: If the two return values are uniquely tied together in some way or form that it could exist as its own object, then you can return a single object that holds the two values. If this object is only being designed and used for this method's return type, then it likely isn't the best solution.
3: A collection is a great option if the returned values are the same type and can be thought of as a collection. However, if the specific example needs 2 items, and each item is it's 'own' thing -> maybe one represents the beginning of something, and the other represents the end, and the returned items are not being used interchangably, then this may not be the best option.
2: I like this option the best, if 4, and 3 do not make sense for your scenario. As stated in 3, if you wanted to get two objects that represent the beginning and end items of something. Then I would use parameters by reference (or out parameters, again, depending on how it's all being used). This way your parameters can explicitly define their purpose: MethodCall(ref object StartObject, ref object EndObject)
Personally I try to use languages that allow functions to return something more than a simple integer value.
First, you should distinguish what you want: an arbitrary-length return or fixed-length return.
If you want your method to return an arbitrary number of arguments, you should stick to collection returns. Because the collections--whatever your language is--are specifically tied to fulfill such a task.
But sometimes you just need to return two values. How does returning two values--when you're sure it's always two values--differ from returning one value? No way it differs, I say! And modern languages, including perl, ruby, C++, python, ocaml etc allow function to return tuples, either built-in or as a third-party syntactic sugar (yes, I'm talking about boost::tuple). It looks like that:
tuple<int, int, double> add_multiply_divide(int a, int b) {
return make_tuple(a+b, a*b, double(a)/double(b));
}
Specifying an "out parameter", in my opinion, is overused due to the limitations of older languages and paradigms learned those days. But there still are many cases when it's usable (if your method needs to modify an object passed as parameter, that object being not the class that contains a method).
The conclusion is that there's no generic answer--each situation has its own solution. But one common thing there is: it's not violation of any paradigm that function returns several items. That's a language limitation later somehow transferred to human mind.
Python (like Lisp) also allows you to return any number of
values from a function, including (but not limited to)
none, one, two
def quadcube (x):
return x**2, x**3
a, b = quadcube(3)
Some languages make doing #3 native and easy. Example: Perl. "return ($a, $b);". Ditto Lisp.
Barring that, check if your language has a collection suited to the task, ala pair/tuple in C++
Barring that, create a pair/tuple class and/or collection and re-use it, especially if your language supports templating.
If your function has return value(s), it's presumably returning it/them for assignment to either a variable or an implied variable (to perform operations on, for instance.) Anything you can usefully express as a variable (or a testable value) should be fair game, and should dictate what you return.
Your example mentions a row or a set of rows from a SQL query. Then you reasonably should be ready to deal with those as objects or arrays, which suggests an appropriate answer to your question.
When your in a situation where you
need to return two things in a single
method, what is the best approach?
It depends on WHY you are returning two things.
Basically, as everyone here seems to agree, #2 and #4 are the two best answers...
I understand the philosophy that a
method should do one thing only, but
say you have a method that runs a
database select and you need to pull
two columns. I'm assuming you only
want to traverse through the database
result set once, but you want to
return two columns worth of data.
If the two pieces of data from the database are related, such as a customer's First Name and Last Name, I would indeed still consider this to be doing "one thing."
On the other hand, suppose you have come up with a strange SELECT statement that returns your company's gross sales total for a given date, and also reads the name of the customer that placed the first sale for today's date. Here you're doing two unrelated things!
If it's really true that performance of this strange SELECT statement is much better than doing two SELECT statements for the two different pieces of data, and both pieces of data really are needed on a frequent basis (so that the entire application would be slower if you didn't do it that way), then using this strange SELECT might be a good idea - but you better be prepared to demonstrate why your way really makes a difference in perceived response time.
The options I have come up with:
1 Use global variables to hold returns. I personally try and avoid
globals where I can.
There are some situations where creating a global is the right thing to do. But "returning two things from a function" is not one of those situations. Doing it for this purpose is just a Bad Idea.
2 Pass in two empty variables as parameters then assign the variables
inside the method, which now is a
void.
Yes, that's usually the best idea. This is exactly why "by reference" (or "output", depending on which language you're using) parameters exist.
I don't like the idea of methods that have a side effects.
Good theory, but you can take it too far. What would be the point of calling SaveCustomer() if that method didn't have a side-effect of saving the customer's data?
By Reference parameters are understood to be parameters that contain returned data.
3 Return a collection that contains two variables. This can lead to confusing code.
True. It wouldn't make sense, for instance, to return an array where element 0 was the first name and element 1 was the last name. This would be a Bad Idea.
4 Build a container class to hold the double return. This is more self-documenting then a collection containing other collections, but it seems like it might be confusing to create a class just for the purpose of a return.
Yes and no. As you say, I wouldn't want to create an object called FirstAndLastNames just to be used by one method. But if there was already an object which had basically this information, then it would make perfect sense to use it here.
If I was returning two of the exact same thing, a collection might be appropriate, but in general I would usually build a specialized class to hold exactly what I needed.
And if if you are returning two things today from those two columns, tomorrow you might want a third. Maintaining a custom object is going to be a lot easier than any of the other options.
Use var/out parameters or pass variables by reference, not by value. In Delphi:
function ReturnTwoValues(out Param1: Integer):Integer;
begin
Param1 := 10;
Result := 20;
end;
If you use var instead of out, you can pre-initialize the parameter.
With databases, you could have an out parameter per column and the result of the function would be a boolean indicating if the record is retrieved correctly or not. (Although I would use a single record class to hold the column values.)
As much as it pains me to do it, I find the most readable way to return multiple values in PHP (which is what I work with, mostly) is using a (multi-dimensional) array, like this:
function doStuff($someThing)
{
// do stuff
$status = 1;
$message = 'it worked, good job';
return array('status' => $status, 'message' => $message);
}
Not pretty, but it works and it's not terribly difficult to figure out what's going on.
I generally use tuples. I mainly work in C# and its very easy to design generic tuple constructs. I assume it would be very similar for most languages which have generics. As an aside, 1 is a terrible idea, and 3 only works when you are getting two returns that are the same type unless you work in a language where everything derives from the same basic type (i.e. object). 2 and 4 are also good choices. 2 doesn't introduce any side effects a priori, its just unwieldy.
Use std::vector, QList, or some managed library container to hold however many X you want to return:
QList<X> getMultipleItems()
{
QList<X> returnValue;
for (int i = 0; i < countOfItems; ++i)
{
returnValue.push_back(<your data here>);
}
return returnValue;
}
For the situation you described, pulling two fields from a single table, the appropriate answer is #4 given that two properties (fields) of the same entity (table) will exhibit strong cohesion.
Your concern that "it might be confusing to create a class just for the purpose of a return" is probably not that realistic. If your application is non-trivial you are likely going to need to re-use that class/object elsewhere anyway.
You should also consider whether the design of your method is primarily returning a single value, and you are getting another value for reference along with it, or if you really have a single returnable thing like first name - last name.
For instance, you might have an inventory module that queries the number of widgets you have in inventory. The return value you want to give is the actual number of widgets.. However, you may also want to record how often someone is querying inventory and return the number of queries so far. In that case it can be tempting to return both values together. However, remember that you have class vars availabe for storing data, so you can store an internal query count, and not return it every time, then use a second method call to retrieve the related value. Only group the two values together if they are truly related. If they are not, use separate methods to retrieve them separately.
Haskell also allows multiple return values using built in tuples:
sumAndDifference :: Int -> Int -> (Int, Int)
sumAndDifference x y = (x + y, x - y)
> let (s, d) = sumAndDifference 3 5 in s * d
-16
Being a pure language, options 1 and 2 are not allowed.
Even using a state monad, the return value contains (at least conceptually) a bag of all relevant state, including any changes the function just made. It's just a fancy convention for passing that state through a sequence of operations.
I will usually opt for approach #4 as I prefer the clarity of knowing what the function produces or calculate is it's return value (rather than byref parameters). Also, it lends to a rather "functional" style in program flow.
The disadvantage of option #4 with generic tuple classes is it isn't much better than returning a collection (the only gain is type safety).
public IList CalculateStuffCollection(int arg1, int arg2)
public Tuple<int, int> CalculateStuffType(int arg1, int arg2)
var resultCollection = CalculateStuffCollection(1,2);
var resultTuple = CalculateStuffTuple(1,2);
resultCollection[0] // Was it index 0 or 1 I wanted?
resultTuple.A // Was it A or B I wanted?
I would like a language that allowed me to return an immutable tuple of named variables (similar to a dictionary, but immutable, typesafe and statically checked). But, sadly, such an option isn't available to me in the world of VB.NET, it may be elsewhere.
I dislike option #2 because it breaks that "functional" style and forces you back into a procedural world (when often I don't want to do that just to call a simple method like TryParse).
I have sometimes used continuation-passing style to work around this, passing a function value as an argument, and returning that function call passing the multiple values.
Objects in place of function values in languages without first-class functions.
My choice is #4. Define a reference parameter in your function. That pointer references to a Value Object.
In PHP:
class TwoValuesVO {
public $expectedOne;
public $expectedTwo;
}
/* parameter $_vo references to a TwoValuesVO instance */
function twoValues( & $_vo ) {
$vo->expectedOne = 1;
$vo->expectedTwo = 2;
}
In Java:
class TwoValuesVO {
public int expectedOne;
public int expectedTwo;
}
class TwoValuesTest {
void twoValues( TwoValuesVO vo ) {
vo.expectedOne = 1;
vo.expectedTwo = 2;
}
}