SSRS Linear Regression Line for data in SSAS Cube - reporting-services

End goal: create a scatter plot with actual data (coming from an SSAS cube) and a best-fit line using basic least-squares regression.
At present, my MDX looks like this:
SELECT NONEMPTY({[Measures].[Invoice Total]}) ON COLUMNS,
NONEMPTY( { [Billed Date].[Date].ALLMEMBERS}) ON ROWS
FROM
(
SELECT NONEMPTY(StrToMember(#StartDate,CONSTRAINED):StrToMember(#EndDate,CONSTRAINED)) ON COLUMNS,
NONEMPTY( STRTOSET(#Requestor)) ON ROWS
FROM [Task Billing]
WHERE STRTOSET(#Project)
)
WHERE STRTOSET(#Division)
As you can see, a large number of parameters are used to filter which data should be included in the regression. I was thinking of using LinRegPoint, but I cannot really figure it out, since I am so new to MDX.
I am TOTALLY open to workarounds.
Any ideas on how to accomplish this? Surely it is a common issue...

You're new to MDX... and I've forgotten all the advanced stuff I once knew! Not a great combination, sorry. All I can offer is the actual MDX I once used to show a trend line among real data points.
with
member [Measures].[X]
as 'Rank([Time], [Time].[Week].members)'
member [Measures].[Trend]
as 'LinRegPoint(X, [Time].[Week].members, [Measures].[Gross], X)'
select
{[Time].[Week].members} on rows,
{[Measures].[Gross], Trend} on columns
from [Sales]
If you can get a static example working on your cube, using the bare bones I give above, you can plug the #parameters in later. I hope that helps in some way. Feel free to comment and I'll try to advise, but I am veeeeery rusty.
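Since workarounds are welcome, the trend values can also be computed outside MDX once the query has returned its rows. Below is a minimal sketch in Python, assuming the rows arrive as (x, y) pairs where x is the rank of the date (like the X measure above) and y is the measure value. The function names and the sample numbers are illustrative only and do not come from the cube.

```python
def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sum of squared deviations in x, and of cross deviations in x and y.
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def trend(points):
    """Return the fitted y value at each x, ready to plot as a second series."""
    slope, intercept = fit_line(points)
    return [(x, slope * x + intercept) for x, _ in points]


# Hypothetical sample: rank of the date and an invoice total.
sample = [(1, 3.0), (2, 5.0), (3, 7.0)]
print(fit_line(sample))
print(trend(sample))
```

The fitted points can then be added to the scatter plot as a second series next to the actual data.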

ParagraphVectors in deeplearning4j

I am new to deeplearning4j. I am running the ParagraphVectors classifier on a dataset that includes labeled and unlabeled data, and I got a result. When I run it again on the same dataset with the same configuration, I get different results! The new results are close to the previous ones, but why are they slightly different? By slightly different I mean that in the first run it assigns two test samples to the first class, and in the second run it assigns those two samples, or perhaps one of them, to another class. This normally happens for just one, two, or maybe three samples. I should mention that we have three classes, all related to types of cancer.
Any hint, help, or advice would be highly appreciated.
I use the following configuration:
paragraphVectors = new ParagraphVectors.Builder()
.learningRate(0.2)
.minLearningRate(0.001)
.windowSize(2)
.iterations(3)
.batchSize(500)
.workers(4)
.stopWords(stopWords())
.minWordFrequency(10)
.layerSize(100)
.epochs(1)
.iterate(iterator)
.trainWordVectors(true)
.tokenizerFactory(tokenizerFactory)
.build();
The problem turned out to be bad input with the tokenizer.

Calculating in Google Spreadsheet

I'm very new to Google Spreadsheet and I've played around with functions, but I'm not too sure how to create what I'm looking for.
I'm trying to make a calculator for a game where people can input their settings, and based on those settings and the stats they want their hero to have by the end, it will spit out results.
The problem I'm having is that I'm not sure which formula to use. Here is the document:
https://docs.google.com/spreadsheets/d/1X5Mb24c8C4SblrsSBSoFuf8sB3fomM_X20isJO8KHLI/edit?usp=sharing
I was thinking IF D4 is 2★ and they want to get to +5 then it would follow I16's Rules.
Not sure if this makes a whole lot of sense, but basically I'm trying to get:
IF D4 = 2★ Then Results.
BUT if D4 = 3★ Then Different Results.
Not sure how to do this in one cell. If further explanation is needed, let me know. The game is based on 7knights, if that helps at all.
I'm still a little confused about what your IF statements are to contain and what the dependent results should be, but from the last statement in your question it sounds like you just want an IF statement to handle the varying conditions?
i.e.:
=IF(D4="★★","RESULTS",IF(D4="★★★","OTHER RESULTS","OTHERWISE DEFAULT"))
I can easily modify this with real results/examples if you help me to understand what you actually want the end result to look like.

Endeca need to return all its values under one dimension

I need to return all values under one dimension (e.g. Product.category) in Endeca and return them as a JSON object to the content Assembler. Can someone provide an optimal way to achieve this?
This is a tricky one, particularly because I'm assuming the product.category is a hierarchical dimension.
With a regular navigation query (such as a search results page), there's no way to bring back every level of a hierarchical dimension at once. However, using a Dimension search (and if you have --compoundDimSearch turned OFF), you can make a query like this: D=*&Dn=0&Di=10001 (where 10001 might be the dimension ID for product.category).
That will bring back every value in the dimension.
What you could do is maybe make / extend the DimensionSearchResultsHandler to help you out. In the preprocess() method, you would construct a query like the one above.
Then in the process() method, you'd do something like:
ENEQueryResults results = executeMdexRequest(mMdexRequest);
NavigationState navigationState = getNavigationState();
navigationState.inform(results);
DimensionSearchResults dimensionSearchResults = new DimensionSearchResults(cartridgeConfig);
DimensionSearchResultsBuilder.build(
getActionPathProvider(),
dimensionSearchResults,
navigationState,
results.getDimensionSearch(),
cartridgeConfig.getDimensionList(),
cartridgeConfig.getMaxResults(),
cartridgeConfig.isShowCountsEnabled());
return dimensionSearchResults;
That will help you build out the Assembler objects for the results. Then if you made an Assembler query that returns JSON, these results would be returned as well.
One big caveat: The results above aren't nicely formatted. What I mean is that this will bring back every leaf value and its ancestors. If you wanted to create a nice hierarchical display, you'd have to do a bunch of formatting yourself.
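The formatting step mentioned in that caveat can be sketched as follows. This is plain Python rather than Assembler code, and it assumes each result has already been flattened into a list of names running from the top-level ancestor down to the leaf. That input shape, the function name, and the sample category names are all assumptions for illustration, not part of the Endeca API.

```python
import json


def build_tree(paths):
    """Merge ancestor-to-leaf paths into one nested dictionary."""
    tree = {}
    for path in paths:
        node = tree
        for name in path:
            # Reuse the child if an earlier path already created it.
            node = node.setdefault(name, {})
    return tree


# Hypothetical dimension values, each listed with its ancestors.
paths = [
    ["Electronics", "Cameras", "DSLR"],
    ["Electronics", "Cameras", "Compact"],
    ["Electronics", "Phones"],
]
print(json.dumps(build_tree(paths), indent=2))
```

Paths that share ancestors end up under the same parent node, which gives a structure that is easier to render as a hierarchy than the flat leaf-plus-ancestors list.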

Complex Gremlin queries to output nodes/edges

I am trying to implement a query and graph visualisation framework that allows a user to enter a Gremlin query and get back a D3 graph of the results. The D3 graph is built from JSON, which is created from separate vertex and edge outputs of the Gremlin query. For simple queries such as:
g.V.filter{it.attr_a == "foo"}
this works fine. However, when I try to perform a more complicated query such as the following:
g.E.filter{it.attr_a == 'foo'}.groupBy{it.attr_b}{it.outV.value}.cap.next().findAll{k,e->e.size()<=3}
This query is meant to:
- Find all instances of *value*
- Grouped by unique *attr_b*
- Where *attr_a* = foo
- And *attr_b* is paired with no more than 2 other instances of *value*
With this query, the output is not a list of vertices and edges. Instead, it has the following form:
attr_b1: {value1, value2, value3}
attr_b2: {value4}
attr_b3: {value6, value7}
I would like to know if there is a way for Gremlin to output the results as a list of nodes and edges so I can display the results as a graph. I am aware that I could edit my D3 code to take in this new output, but there are currently no restrictions on the type or complexity of the query, so the key/value pairs will not necessarily be the same every time.
Thanks.
You've hit what I consider one of the key problems with visualizing Gremlin results. They can be anything. Gremlin results might not just be a list of vertices and edges. There is no way to really control this that I can think of. At the end of the day, you can really only visualize results that match a pattern that D3 expects. I'd start by trying to detect that pattern and visualize only in those cases (simply display non-recognized patterns as JSON perhaps).
Thinking of your specific example, which gives results like this:
attr_b1: {value1, value2, value3}
attr_b2: {value4}
attr_b3: {value6, value7}
What would you want D3 to visualize there? The vertices/edges that were traversed over to get that result? If so, you might be stuck. Gremlin doesn't give you a way to introspect the pipeline to see what's passing through it. In other words, unless the user explicitly gathers vertices and edges within the pipeline that were touched you won't have access to them. It would be nice to be able to "spy" on a pipeline in that way, but at the moment it doesn't do that. There's been internal discussion within TinkerPop to create a new kind of pipeline implementation that would help with that, but at the moment, it doesn't exist.
So, without the "spying" capability, I think your only workarounds would be to:
- Detect a vertex/edge list on your client side and only render those with D3. This would force users to always write Gremlin that returns data in such a format if they want visualization. Put it in the users' hands.
- Perhaps supply server-side bindings for a list of vertices/edges that users could explicitly side-effect their vertices/edges into if their results do not conform to what your visualization engine expects. Again, this would force users to write their Gremlin appropriately for your needs if they want visualization.
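The first workaround, detecting the pattern before handing anything to D3, could look roughly like the sketch below. It is written in Python for brevity and assumes the results have already been parsed from JSON, with each graph element carrying a "_type" field set to "vertex" or "edge". That field name and the function name are assumptions, so adjust them to whatever your server actually returns.

```python
def split_graph_elements(results):
    """Return (vertices, edges) if results is a flat list of graph
    elements, or None if the shape is not recognized."""
    if not isinstance(results, list):
        return None
    vertices, edges = [], []
    for item in results:
        if not isinstance(item, dict):
            return None
        kind = item.get("_type")  # assumed marker field
        if kind == "vertex":
            vertices.append(item)
        elif kind == "edge":
            edges.append(item)
        else:
            return None
    return vertices, edges


# A grouped result like the one in the question is not recognized.
print(split_graph_elements({"attr_b1": ["value1", "value2", "value3"]}))
```

When the function returns None, the caller can fall back to showing the raw JSON instead of a graph.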

Where can I find a decent, representative sample dataset in json for test purposes?

I'm looking for a set of data that contains both numbers and strings (name/address, maybe), with a decent variety of data, around 1000 records, to test a jQuery UI widget I'm developing. Does anyone know of such a dataset? Is there something floating around out there I could use?
Try this ... http://www.generatedata.com/#generator ... amazing tool to create whatever dataset you'd like
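If you would rather generate the data locally, a short script can do much the same thing. This is only a sketch: the field names, the value lists, and the ranges below are made up for illustration, and the fixed seed just makes the output repeatable between runs.

```python
import json
import random

# Hypothetical value pools; extend these for more variety.
FIRST_NAMES = ["Alice", "Bob", "Carol", "David", "Eve", "Frank"]
LAST_NAMES = ["Smith", "Jones", "Taylor", "Brown", "Wilson"]
STREETS = ["Main St", "High St", "Park Ave", "Oak Rd", "Mill Ln"]


def make_records(count=1000, seed=0):
    """Build a list of dicts mixing string and numeric fields."""
    rng = random.Random(seed)
    records = []
    for i in range(count):
        records.append({
            "id": i + 1,
            "name": f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}",
            "address": f"{rng.randint(1, 999)} {rng.choice(STREETS)}",
            "age": rng.randint(18, 90),
            "balance": round(rng.uniform(0, 10000), 2),
        })
    return records


# Serialize to a JSON string that the widget can load.
data = json.dumps(make_records(), indent=2)
print(len(data))
```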