In icCube, how to link 2 many-to-many relations?

People stay in a hospital; during a stay they can go through different Medical Units and receive Medical Acts.
Each Medical Act is related to a Medical Unit.
Using « single » many-to-many relationships, we are able to filter and break down everything between:
Medical Units & People Stays
Medical Acts & People Stays
However, we are unable to filter and break down correctly to show the number of Acts performed in a given Medical Unit.
For example, Roger (Stay Nbr 2) is a man, is 20 years old, stayed 1 day, for an amount of 10.
He went through 2 Medical Units (Radiologie, duration 1, and Orthopédie, duration 0).
In Radiologie, he received the Act Examen 10000 times.
We understand that this certainly calls for a « cascading » many-to-many relationship. But how do we define it inside a cube?
Here is a sample of data from the 3 source tables.
<table><tbody><tr><th>People stays</th><th> </th><th> </th><th> </th><th> </th><th> </th><th> </th><th> </th></tr><tr><td>Stay Nbr</td><td>People Name</td><td>sex</td><td>Age</td><td>illness</td><td>Geo-Code</td><td>Stay Duration</td><td>Amount</td></tr><tr><td>1</td><td>Robert</td><td>Homme</td><td>30</td><td>Rhume</td><td>55555</td><td>10</td><td>1</td></tr><tr><td>2</td><td>Roger</td><td>Homme</td><td>20</td><td>Grippe</td><td>44444</td><td>1</td><td>10</td></tr><tr><td>3</td><td>Marguerite</td><td>Femme</td><td>30</td><td>Rhume</td><td>55555</td><td>10</td><td>100</td></tr><tr><td>4</td><td>Carole</td><td>Femme</td><td>20</td><td>Rougeole</td><td>44444</td><td>100</td><td>1000</td></tr><tr><td>5</td><td>Anne</td><td>Femme</td><td>50</td><td>Rougeole</td><td>99999</td><td>1000</td><td>100000</td></tr></tbody></table>
<table><tbody><tr><th>Medical Units</th><th> </th><th> </th><th> </th></tr><tr><td>Stay Nbr</td><td>Order of Medical Unit within the Stay</td><td>Medical Unit</td><td>Medical Unit Duration</td></tr><tr><td>1</td><td>1</td><td>Pneumologie</td><td>0</td></tr><tr><td>1</td><td>2</td><td>Radiologie</td><td>5</td></tr><tr><td>1</td><td>3</td><td>Pneumologie</td><td>5</td></tr><tr><td>1</td><td>4</td><td>Orthopédie</td><td>0</td></tr><tr><td>1</td><td>5</td><td>Psychiatrie</td><td>0</td></tr><tr><td>2</td><td>1</td><td>Radiologie</td><td>1</td></tr><tr><td>2</td><td>1</td><td>Orthopédie</td><td>0</td></tr><tr><td>3</td><td>1</td><td>Orthopédie</td><td>10</td></tr><tr><td>4</td><td>1</td><td>Psychiatrie</td><td>100</td></tr><tr><td>5</td><td>1</td><td>Traumatologie</td><td>1000</td></tr></tbody></table>
<table><tbody><tr><th>Medical Acts</th><th> </th><th> </th><th> </th><th> </th></tr><tr><td>Stay Nbr</td><td>Order of Medical Unit within the Stay</td><td>Medical Unit</td><td>Act</td><td>Nbr of Acts</td></tr><tr><td>1</td><td>1</td><td>Pneumologie</td><td>Examen</td><td>1</td></tr><tr><td>1</td><td>2</td><td>Radiologie</td><td>Opération</td><td>10</td></tr><tr><td>1</td><td>3</td><td>Pneumologie</td><td>Examen</td><td>1000</td></tr><tr><td>2</td><td>1</td><td>Radiologie</td><td>Examen</td><td>10000</td></tr><tr><td>3</td><td>1</td><td>Orthopédie</td><td>Opération</td><td>100000</td></tr><tr><td>3</td><td>1</td><td>Orthopédie</td><td>Examen</td><td>100000</td></tr><tr><td>4</td><td>1</td><td>Psychiatrie</td><td>Examen</td><td>1000000</td></tr></tbody></table>
Edit: added additional information and a standalone cube and report to help understanding.
I created a small standalone cube and icCube report to show and explain the issue; it can be downloaded here:
Cube and Report showing the issue
In the report, we can see that the system takes the stays that include « Examen » acts and then merges them with all the stays that include the « Radiologie » Medical Unit, even though the two are not linked…
Result expected for widget above is 1 instead of 11.
Results expected here are 11, 1, 10 instead of 11, 11, 10.


Employing a large discrete observation space in OpenAI Gym

I am creating a custom environment in OpenAI Gym, and I'm having some trouble navigating the observation space.
Every timestep, the agent is given two potential students to accept or deny admission to - these are randomized and are part of the observation space. As the reward is based on which students are currently enrolled (who we have accepted in the past), we need to keep track of who has been accepted and who has not within the state space (there are a limited number of spots available to students). Each student has a 'major' (1-15) and a 'minor' (1-5) which, in the simulator I built, have weights associated with them that have a bearing on the reward, so they must be included in the state space. After a number of timesteps (varies depending on the major/minor combination), students graduate and can be removed from the list of enrolled students (and removed from being represented in the state space).
Thus, I currently have something like:
# named space_dict so that it does not shadow the gym `spaces` module
space_dict = {
'potential_student_I': spaces.Tuple(((spaces.Discrete(15), spaces.Discrete(5)))),
'potential_student_II': spaces.Tuple(((spaces.Discrete(15), spaces.Discrete(5)))),
'enrolled_student_I': spaces.Tuple(((spaces.Discrete(16), spaces.Discrete(6)))),
'enrolled_student_II': spaces.Tuple(((spaces.Discrete(16), spaces.Discrete(6)))),
'enrolled_student_III': spaces.Tuple(((spaces.Discrete(16), spaces.Discrete(6)))),
}
self.observation_space = spaces.Dict(space_dict)
In the above code, there's only room for three potential accepted students to be represented. These are spaces.Tuple(((spaces.Discrete(16), spaces.Discrete(6)))) rather than spaces.Tuple(((spaces.Discrete(15), spaces.Discrete(5)))) because the list doesn't necessarily need to be filled, so there are extra options for 'NULL'.
Is there a better way to do this? I thought about maybe using one-hot encoding or something similar. Ideally this environment could have up to 50 enrolled students, which obviously is not efficient if I continue representing the observation space the way I currently am. I plan on using a neural net because of the large state space, but I'm caught up on how to efficiently represent the observation space.
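One way to picture the one-hot idea mentioned above is a fixed-length vector: each potential student is one-hot encoded over the 15 x 5 major/minor combinations, and the enrolled list is collapsed into a count per combination. This is only a sketch under assumptions that are not in the question: the order of enrolled students does not matter, the remaining time to graduation is tracked elsewhere, and majors and minors are 0-based. The function names are hypothetical. An empty enrollment list is simply all zeros, so no extra 'NULL' value is needed, and the length stays the same for 3 or 50 enrolled students.

```python
N_MAJORS = 15
N_MINORS = 5
N_CELLS = N_MAJORS * N_MINORS  # 75 major/minor combinations


def one_hot_student(major, minor):
    """One-hot encode a 0-based (major, minor) pair as a flat list of length 75."""
    vec = [0] * N_CELLS
    vec[major * N_MINORS + minor] = 1
    return vec


def enrolled_counts(students):
    """Count enrolled students per (major, minor) cell; the length is always 75."""
    counts = [0] * N_CELLS
    for major, minor in students:
        counts[major * N_MINORS + minor] += 1
    return counts


def build_observation(potential_a, potential_b, enrolled):
    """Concatenate both potential students and the enrollment counts."""
    return (one_hot_student(*potential_a)
            + one_hot_student(*potential_b)
            + enrolled_counts(enrolled))


obs = build_observation((0, 4), (14, 0), [(2, 1), (2, 1), (7, 3)])
print(len(obs))  # 225
```

A flat vector like this is the kind of input a neural net usually expects; in Gym it could plausibly be declared as a spaces.Box or spaces.MultiDiscrete of length 225, though that choice is not tested here.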

Is there a way to combine these variables in a way that makes sense?

Hello stack overflow community!
I am a sociology student working on a thesis project comparing home value appreciation and neighborhood racial composition over time.
I'm currently using two separate data sources and trying to combine them in a way that makes sense without aggregating anything.
The first data source is GIS data which has information on home sales in each year by home. The second is census data which has yearly estimates of racial composition by census tract. Both are in .csv formats.
My goal is to create a set of variables for each home row in the GIS data which represents the racial composition for the tract the home is in at the year it was sold (e.g. home 1 | 2010| $500,000 | Census tract 10 | 10% white).
I began doing this by going into Stata and using the following strategy:
For example, if I'm looking at a home sold in 2010 in Census tract 10 and I find that this tract was 10% white in 2010, using something like
If censustract=10 and year=2010, replace percentwhite = 10
However, this seemed incredibly time consuming, as I'm using data that go back decades and a couple dozen Census tracts.
Does anyone have any suggestions on how I might do this smarter, not harder? The first thought I had was to aggregate the data by census tract and year, but was hoping to avoid that if possible. Thank you so much in advance for your help and have a terrific day and start to the new year!
It sounds like you can simply merge census data onto your GIS data. That will be much less painful than using -replace-. Here's an example:
*GIS data: information on home sales in each year by home
clear
input censustract house_id year house_value_k
10 100 2010 200
11 101 2020 500
11 102 1980 100
end
tempfile GIS_data
save `GIS_data'
*census data: yearly estimates of racial composition by census tract
clear
input censustract year percentwhite
10 2010 20
10 2000 10
11 2010 25
11 2000 5
end
tempfile census_data
save `census_data'
*easy method: merge the census data onto your GIS data
use `GIS_data', clear
merge m:1 censustract year using `census_data'
drop if _merge==2
list
*hard method: use -replace-
use `GIS_data', clear
gen percentwhite=.
replace percentwhite=20 if censustract==10 & year==2010
replace percentwhite=10 if censustract==10 & year==2000
replace percentwhite=25 if censustract==11 & year==2010
replace percentwhite=5 if censustract==11 & year==2000
list
Both methods "work", but using -merge- is much easier and less prone to errors.
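Note: I intentionally created the data sets so that the merge wouldn't be perfect. You will likely want to drop some of the unmatched observations in that case. In the code above I dropped the observations with _merge==2, i.e. census rows that matched no home sale; homes with no matching census row (_merge==1) are kept, with percentwhite missing.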
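For readers who want to see the logic of the many-to-one merge outside Stata, here is a minimal Python sketch using the same toy data as above. It is an illustration of the idea, not a replacement for -merge-: the census rows become a lookup keyed on (censustract, year), and each home row picks up its tract's value for the year of sale. The variable names `lookup` and `merged` are my own.

```python
# Same toy data as in the Stata example above.
homes = [
    {"censustract": 10, "house_id": 100, "year": 2010, "house_value_k": 200},
    {"censustract": 11, "house_id": 101, "year": 2020, "house_value_k": 500},
    {"censustract": 11, "house_id": 102, "year": 1980, "house_value_k": 100},
]
census = [
    {"censustract": 10, "year": 2010, "percentwhite": 20},
    {"censustract": 10, "year": 2000, "percentwhite": 10},
    {"censustract": 11, "year": 2010, "percentwhite": 25},
    {"censustract": 11, "year": 2000, "percentwhite": 5},
]

# One census row per (tract, year): this is the "1" side of the m:1 merge.
lookup = {(row["censustract"], row["year"]): row["percentwhite"] for row in census}

# Every home keeps its own row; homes without a census match get None,
# which corresponds to _merge==1 with a missing percentwhite in Stata.
merged = [
    dict(home, percentwhite=lookup.get((home["censustract"], home["year"])))
    for home in homes
]

for row in merged:
    print(row)
```

Only the first home (tract 10, sold in 2010) finds a match here, which mirrors the imperfect merge in the Stata example.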

scrapy xpath not returning desired results. Any idea?

Please look at this page http://164.100.47.132/LssNew/psearch/QResult16.aspx?qref=15845. As you would have guessed, I am trying to scrape all the fields on this page. All fields are yield-ed properly except the Answer field. What I find odd is that the page structure for the question and answer is almost the same (Table[1] and Table[2]); the question scrapes perfectly but the Answer does not. Here are my xpaths:
question:
['q_main'] = Selector(response).xpath('//*[@id="ctl00_ContPlaceHolderMain_GridView2"]/tbody/tr/td/table[1]/tbody/tr/td/text()').extract()
works perfectly
Answer:
['q_answer'] = Selector(response).xpath('//*[@id="ctl00_ContPlaceHolderMain_GridView2"]/tbody/tr/td/table[2]/tbody/tr[2]/td/text()').extract()
returns a blank. I have reproduced the full xpath, as returned by/verified in Xpath Helper and the console.
What am I overlooking? What am I not able to see?
It seems your xpath has a problem.
Check out this demo from the scrapy shell:
In [1]: response.xpath('//tr[td[@class="mainheaderq" and contains(font/text(), "ANSWER")]]/following-sibling::tr/td[@class="griditemq"]//text()').extract()
Out[1]:
[u'\r\n\r\n',
u'MINISTER OF STATE(I/C) FOR COAL, POWER AND NEW & RENEWABLE ENERGY (SHRI PIYUSH GOYAL)\r\n\r\n ',
u'(a) & (b): So far 29 coal mines have been auctioned under the provisions of Coal Mines (Special Provisions) \r\nAct, 2015 and the Rules made thereunder. The auction process for non-regulated sector viz. Iron and Steel, \r\nCement and Captive Power was based on forward bidding process where bidders had to submit their final price \r\noffer above the applicable floor price. In case of Power sector which is a regulated one, reverse bidding \r\nmethodology was adopted where bidders had to submit bids below the applicable ceiling price, which shall be \r\ntaken as fuel cost in determination of power tariff. In case, bid price reaches Rs. zero in reverse bidding, \r\nthe bidding is based on additional premium payable to the concerned State Government, over and above the \r\nfixed reserve price of Rs. 100/- per tonne.\r\n\r\n',
u'\r\nRevenue which would accrue to the coal bearing State Government concerned comprises of Upfront payment \r\nas prescribed in the tender document, Auction proceeds and Royalty on per tonne of coal production. State-wise \r\ndetails of 29 coal mines auctioned so far along-with specified end-uses and estimated revenue which would accrue \r\nto coal bearing state during the life of mine/lease period as given below:\r\n',
u'\r\n\r\nS.No\tState\t\tSpecified End \u2013Use\t\t\tName of Coal Mine\t\tEstimated Revenueduring \r\n\t\t\t\t\t\t\t\t\t\t\t\tthe life of mine/lease \r\n\t\t\t\t\t\t\t\t\t\t\t\tperiod (Rs. In Crores)\r\n1\tChattishgarh\tNon-Regualted Sector\t\t\tChotia\t\t\t\t51596\r\n\t\t\t\t\t\t\t\tGare Palma IV-4\t\r\n\t\t\t\t\t\t\t\tGare Palma IV-5\t\r\n\t\t\t\t\t\t\t\tGare Palma IV-7\t\r\n\t\t\t\t\t\t\t\tGare-Palma Sector-IV/8\r\n2\tJharkhand\tNon-Regualted Sector\t\t\tBrinda and Sasai\t\t49272\r\n\t\t\t\t\t\t\t\tDumri\r\n\t\t\t\t\t\t\t\tKathautia\r\n\t\t\t\t\t\t\t\tLohari\r\n\t\t\t\t\t\t\t\tMeral\r\n\t\t\t\t\t\t\t\tMoitra\r\n\t\t\tPower\t\t\t\t\tGaneshpur\r\n\t\t\t\t\t\t\t\tJitpur\r\n\t\t\t\t\t\t\t\tTokisud North\r\n3\tMadhya Pradesh\tNon-Regualted Sector\t\t\tBicharpur\t\t\t42811\r\n\t\t\t\t\t\t\t\tMandla North\r\n\t\t\t\t\t\t\t\tMandla-South\r\n\t\t\t\t\t\t\t\tSialGhoghri\r\n\t\t\tPower\t\t\t\t\tAmelia North\r\n4\tMaharashtra\tNon-Regualted Sector\t\t\tBelgaon\t\t\t\t2738\r\n\t\t\t\t\t\t\t\tMarkiMangli III\r\n\t\t\t\t\t\t\t\tNerad Malegaon\r\n5\tOdisha\t\tPower\t\t\t\t\tMandakini\t\t\t33741\r\n\t\t\t\t\t\t\t\tTalabira-I\r\n\t\t\t\t\t\t\t\tUtkal - C\r\n6\tWest Bengal\tNon-Regualted Sector\t\t\tArdhagram\t\t\t13354\r\n\t\t\tPower\t\t\t\t\tSarisatolli\r\n\t\t\t\t\t\t\t\tTrans Damodar\r\n\tTotal\t\t\t\t\t\t\t(29) coal blocks\t\t193512\r\n',
u'\r\n\r\n\r\nCoal mine has been assigned to successful bidder as Designated Custodian in view of a court case.\r\n\r\n',
u'\r\nIn addition, an estimated amount of Rs. 1,41,854 Crores would accrue to coal bearing States from allotment \r\nof 38 coal mines to Central and State PSU\u2019s.\r\n\r\n',
u'Out of these 29 coal mines, 16 are operational coal mines included in Schedule-II of the Act and 13 are \r\nnon-operational included in Schedule-III of the Act. Milestones for development and production of coal \r\nfrom the auctioned coal mines have been prescribed under the Coal Mines Development and Production Agreement \r\nsigned with the Successful Bidder. \r\n\r\n ',
u'(c) & (d): Yes, Sir. A few complaints were received regarding cartelization in bidding. It is not possible to \r\nconclusively establish the same until investigation are carried out by Competent Authority. ',
u'\r\n\r\n\r\nThe Government has not approved the recommendation of NA for declaration of successful bidder in case of \r\n4 coal mines namely Gare Palma IV/2&3, Gare Palma IV/1 and Tara as final closing bid price was not found \r\nto be reflecting fair value. ',
u'\r\n\r\n\r\n']
This sometimes happens when you are dealing with tables; for more information you can refer to this.
At least part of the source of your difficulty lies in the fact that the code you see in the console is not the source html that your spider gets as a response (and on which the selectors operate).
In particular, it is extremely common for a <table> to not include a <tbody>; but when your browser translates the html to the DOM tree, it slaps in <tbody> tags. And there was a time when much of the layout of webpages was actually accomplished with (crazily) nested tables. As a result, the DOM of such a website will typically have many more <tbody> elements than the html source.
What this means in practical terms is that:
It is generally a good idea to find a relatively simple xpath (or CSS selector, or ...) for the element(s) you want to select -- not the behemoth you sometimes get from your developer tools.
It is generally a bad idea to include /tbody in your xpath (unless there is an associated attribute, indicating that the tag exists in the source html).
For the site in question,
response.xpath('//td[@class="griditemq"]').extract()
returns a list with the first element the question and the second element the answer.
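The effect of a browser-inserted <tbody> can be reproduced without Scrapy. The sketch below uses Python's standard xml.etree.ElementTree on a made-up, well-formed fragment that stands in for the raw HTML a spider receives; the fragment and its text are hypothetical, not the real page. A path that includes tbody finds nothing, while a short attribute-based path finds both cells.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the raw HTML source:
# note that it contains no <tbody> element, unlike the browser's DOM view.
RAW_HTML = """
<table id="grid">
  <tr><td class="griditemq">question text</td></tr>
  <tr><td class="griditemq">answer text</td></tr>
</table>
"""

root = ET.fromstring(RAW_HTML)  # root is the <table> element

# Path copied from developer tools, including the tbody the browser added.
with_tbody = root.findall("./tbody/tr/td")

# Short path anchored on an attribute that exists in the source.
by_class = root.findall(".//td[@class='griditemq']")
texts = [td.text for td in by_class]

print(with_tbody)  # [] because there is no tbody in the source
print(texts)       # ['question text', 'answer text']
```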

Flood analysis using HEC-RAS

Hi everyone, I am working through the basic HEC-GeoRAS tutorial. When I exported the geometry to HEC-RAS and computed the Steady Flow Analysis in HEC-RAS, it gave me this error: 'a horizontal manning n value needs to be specified on first station'
Can anyone help me with this?
When using HEC-GeoRAS, it is the user's responsibility to assign Manning's n coefficients to each cross section. That is, unless you first created a LandUse feature class in the map document. If you created an empty LandUse feature class then the "N_Value" fields will be blank and since the calculations in the program start at the first station, you will see an error just as you stated.
From the Purdue University Tutorial:
To assign Manning’s n to cross-sections, click on RAS Geometry->Manning’s n Values->Extract n Values. Confirm LandUse for Land Use, choose N_Value for Manning Field, XSCutLines for XS Cut Lines, leave the default name Manning for XS Manning Table, and click OK.

Stock management of assemblies and its sub parts (relationships)

I have to track the stock of individual parts and kits (assemblies) and can't find a satisfactory way of doing this.
Sample bogus and hyper simplified database:
Table prod:
prodID 1
prodName Flux capacitor
prodCost 900
prodPrice 1350 (900*1.5)
prodStock 3
-
prodID 2
prodName Mr Fusion
prodCost 300
prodPrice 600 (300*2)
prodStock 2
-
prodID 3
prodName Time travel kit
prodCost 1200 (900+300)
prodPrice 1560 (1200*1.3)
prodStock 2
Table rels
relID 1
relSrc 1 (Flux capacitor)
relType 4 (is a subpart of)
relDst 3 (Time travel kit)
-
relID 2
relSrc 2 (Mr Fusion)
relType 4 (is a subpart of)
relDst 3 (Time travel kit)
prodPrice: it's calculated based on the cost, but not in a linear way. In this example, for costs of 500 or less the markup is 200%; for costs of 500-1000 the markup is 150%; for costs of 1000+ the markup is 130%.
That's why the time travel kit is much cheaper than the sum of its individual parts.
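The tiered markup can be written out as a small function. This is a sketch in Python rather than the PHP used by the application, and the function names are hypothetical. One assumption is labeled in the code: the description leaves open which tier a cost of exactly 1000 belongs to, so it is placed in the 150% tier here. Integer percentages are used to avoid floating-point rounding.

```python
def markup_percent(cost):
    """Return the markup, in percent of cost, for the tiers described above."""
    if cost <= 500:
        return 200
    if cost <= 1000:  # assumption: a cost of exactly 1000 falls in this tier
        return 150
    return 130


def price(cost):
    """Selling price for a given cost, using integer arithmetic."""
    return cost * markup_percent(cost) // 100


kit_price = price(900 + 300)            # Time travel kit, cost 1200
parts_price = price(900) + price(300)   # Flux capacitor + Mr Fusion sold separately
print(kit_price, parts_price)  # 1560 1950
```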
prodStock: here is my problem. I can sell kits or the individual parts, so the stock of the kits is virtual.
The problem when I buy:
Some providers sell me the Time travel kit as a whole (with one barcode) and some sell me the individual parts (with a different barcode).
So when I load the stock I don't know how to allocate it.
The problem when I sell:
If I only sold kits, calculating the stock would be easy: "I have 3 Flux capacitors and 2 Mr Fusions, so I have 2 Time travel kits and a Flux capacitor"
But I can sell kits or individual parts. So I have to track the stock of the individual parts and the possible kits at the same time (and I have to compensate for the sale price).
Probably this is really simple, but I can't see a simple solution.
To sum up: I have to find a way of tracking the stock, and the database/program is the one that has to do it (I can't ask the clerk to correct the stock).
I'm using PHP + MySQL, but this is more a logical problem than a programming one.
Update: Sadly, Eagle's solution won't work.
The relationships can be, and are, recursive (one kit uses another kit).
There are kits that use more than one of the same part (2 flux capacitors + 1 Mr Fusion).
I really need to store a value for the stock of the kit. The same database is used for the web page where users want to buy the parts, and I should show the available stock (otherwise they won't even try to buy). I can't afford to calculate the stock on every user search on the web page.
But I liked the idea of a boolean marking the stock as virtual.
Okay, well first of all since the prodStock for the Time travel kit is virtual, you cannot store it in the database, it will essentially be a calculated field. It would probably help if you had a boolean on the table which says if the prodStock is calculated or not. I'll pretend as though you had this field in the table and I'll call it isKit for now (where TRUE implies it's a kit and the prodStock should be calculated).
Now to calculate the amount of each item that is in stock:
select p.prodID, p.prodName, p.prodCost, p.prodPrice, p.prodStock from prod p where not isKit
union all
select p.prodID, p.prodName, p.prodCost, p.prodPrice, min(c.prodStock) as prodStock
from
prod p
inner join rels r on (p.prodID = r.relDst and r.relType = 4)
inner join prod c on (r.relSrc = c.prodID and not c.isKit)
where p.isKit
group by p.prodID, p.prodName, p.prodCost, p.prodPrice
I used the alias c for the second prod to stand for 'component'. I explicitly wrote not c.isKit since this won't work recursively. union all is used rather than union for efficiency reasons, since they will both return the same results.
Caveats:
This won't work recursively (e.g. if a kit requires components from another kit).
This only works on kits that require only one of a particular item (e.g. if a time travel kit were to require 2 flux capacitors and 1 Mr. Fusion, this wouldn't work).
I didn't test this so there may be minor syntax errors.
This only calculates the prodStock field; to do the other fields you would need similar logic.
If your query is much more complicated than what I assumed, I apologize, but I hope that this can help you find a solution that will work.
As for how to handle the data when you buy a kit, this assumes you would store the prodStock in only the component parts. So for example if you purchase a time travel kit from a supplier, instead of increasing the prodStock on the kit product, you would increase it on the Flux capacitor and the Mr. Fusion.
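The first two caveats (nested kits and kits that need several of the same part) can be handled if the calculation is done in application code instead of a single query. The following is a sketch in Python, not the PHP + MySQL of the question, under assumptions that go beyond the original schema: each relationship carries a quantity (the rels table shown has no such column), there are no circular kit definitions, and stock is stored only on the individual parts. All names are hypothetical. The idea is to flatten a kit into its base parts first and then take the minimum of stock divided by required quantity.

```python
from collections import Counter


def base_requirements(prod_id, components):
    """Return a Counter of base part id -> quantity needed to build one prod_id.

    `components` maps a kit id to a list of (child id, quantity) pairs;
    any id that is not a key of `components` is treated as a plain part.
    """
    if prod_id not in components:
        return Counter({prod_id: 1})
    total = Counter()
    for child_id, qty in components[prod_id]:
        for part_id, needed in base_requirements(child_id, components).items():
            total[part_id] += qty * needed
    return total


def virtual_stock(prod_id, components, part_stock):
    """Number of units of prod_id that can be supplied from the part stock."""
    needs = base_requirements(prod_id, components)
    return min(part_stock.get(part, 0) // qty for part, qty in needs.items())


# Stock from the question: 3 Flux capacitors (id 1) and 2 Mr Fusions (id 2).
part_stock = {1: 3, 2: 2}
components = {
    3: [(1, 1), (2, 1)],  # Time travel kit
    4: [(1, 2), (2, 1)],  # hypothetical kit needing 2 Flux capacitors
    5: [(3, 1), (1, 1)],  # hypothetical kit that contains another kit
}

print(virtual_stock(3, components, part_stock))  # 2
print(virtual_stock(4, components, part_stock))  # 1
print(virtual_stock(5, components, part_stock))  # 1
```

If the stock has to be stored rather than computed per search, the same function could be run once after each purchase or sale to refresh a cached kit value; that caching step is not shown here.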