Here is the background.
I have panel data covering 1972-2020.
I have one dependent variable (inflation) and five independent variables: imp, exp, tra, gro, and pro.
I have 10 countries, 5 small and 5 big, numbered 1-5 and 6-10.
I want to show the effect of the independent variables on the dependent variable, by decade (70s, 80s, 90s, 00s, 10s), for each country, so that the output tells us how the regression coefficients differ across countries and decades.
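As a rough illustration of the set-up described above (not a full estimation), the sketch below assigns each year to a decade and counts how many observations one country contributes per decade. It assumes annual data, which the question does not state, and decade_of is a made-up helper name. With five regressors plus an intercept, each country-decade regression would be left with very few residual degrees of freedom, and 2020 falls outside the five listed decades.

```python
from collections import Counter


def decade_of(year: int) -> int:
    """Return the first year of the decade, e.g. 1987 -> 1980."""
    return (year // 10) * 10


# Assumption: one observation per country per year, 1972-2020 inclusive
years = range(1972, 2021)
obs_per_decade = Counter(decade_of(y) for y in years)

n_params = 5 + 1  # five regressors (imp, exp, tra, gro, pro) plus an intercept
for decade, n_obs in sorted(obs_per_decade.items()):
    # Residual degrees of freedom for a separate regression in this cell
    print(decade, n_obs, "residual df:", n_obs - n_params)
```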
What is the Stata code for adding region fixed effects in an ordinary least squares regression? My dependent variable is the sales volume of a product and the independent variable is a dummy: 1 for a red pamphlet, 0 for a blue pamphlet, distributed to a sample of people across five districts. I want to include region fixed effects in the model. I tried generating dummy variables for the five regions and adding them to the model.
Is this approach correct? If not, which one is?
reg pamph sale income plotsize region1 region2 region3 region4 region5
There are a number of ways to control for group fixed effects.
The simplest (IMO) in your situation is to use a factor variable.
For example:
webuse nlswork
reg ln_w grade age i.ind_code
In your case this would look like:
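reg sale pamph income plotsize i.region
This assumes that region is a single variable holding a unique id for each region. With i.region Stata creates the dummies itself and omits one as the base category. Note that regress takes the dependent variable first, so sale (the outcome described in the question) is listed before pamph.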
Other options are areg (see help areg) or the user-written reghdfe (see help reghdfe once it is installed):
areg ln_w grade age, absorb(ind_code)
reghdfe ln_w grade age, absorb(ind_code)
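To see why the dummy-variable and absorb approaches agree on the slope coefficients, here is a minimal Python sketch (not Stata) of the idea behind absorbing a group variable: subtract each group's mean from the outcome and the regressor, then fit OLS on what is left. The data are made up, and within_slope is a hypothetical helper that handles one regressor only.

```python
from collections import defaultdict


def within_slope(x, y, group):
    """OLS slope of y on x after removing group means (single regressor)."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # group -> [sum x, sum y, count]
    for xi, yi, g in zip(x, y, group):
        s = sums[g]
        s[0] += xi
        s[1] += yi
        s[2] += 1
    num = den = 0.0
    for xi, yi, g in zip(x, y, group):
        sum_x, sum_y, n = sums[g]
        dx = xi - sum_x / n  # deviation from the group mean of x
        dy = yi - sum_y / n  # deviation from the group mean of y
        num += dx * dy
        den += dx * dx
    return num / den


# Toy data (made up): y = 2*x plus a region-specific intercept
region = [1, 1, 1, 2, 2, 2, 3, 3, 3]
x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
shift = {1: 10.0, 2: -5.0, 3: 30.0}
y = [2.0 * xi + shift[g] for xi, g in zip(x, region)]

fe_slope = within_slope(x, y, region)
# Putting every row in one group gives plain pooled OLS, with no fixed effects
pooled_slope = within_slope(x, y, [0] * len(x))
print("with region effects:", fe_slope)
print("pooled:", pooled_slope)
```

In this toy example the pooled slope is pulled away from 2 by the region intercepts, while the demeaned version recovers it.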
I have two csv files containing countries and values that correspond to each country.
The data from CSV 1 denotes the number of times a country has been attacked on their own soil.
The data from CSV 2 denotes the number of times a country has attacked another country abroad.
There is overlap between the two sets of data and I intend to demonstrate values from both data sets in one grey scale range to be shown on a choropleth map.
I have some (obviously) phony data below to demonstrate what I'm working with.
TARGET.csv
country, code, value
Iran, IRN, 5
Russia, RUS, 4
United States, USA, 0
Egypt, EGY, 2
Spain, ESP, 1
ATTACKER.csv
country, code, value
Iran, IRN, 3
Russia, RUS, 9
United States, USA, 4
Egypt, EGY, 0
Spain, ESP, 0
There are more targets than attackers.
I want to ensure that I represent the data accurately, but do not know how I would create a normalized range of values between -1 and 1.
It is my understanding that displaying the data in this way would accurately represent the reality best, but I feel like I may be wrong.
In summation:
1) Am I thinking about this problem properly? Is this even the right way to think about displaying the data?
2) What is the proper language used to describe my question?
I am usually able to figure these things out but I'm stumped with dead-end search queries.
3) How do I make sure that my range is normalized? Notice that the USA above appears as the only attacker that has never been a target. Would that make the USA the nearest value to +1, despite Russia's larger number of attacks?
I would appreciate whatever input you all can offer.
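For concreteness, here are two possible readings of "normalized between -1 and 1", sketched in Python on the phony data above. Neither is claimed to be the right one. balance_index is a made-up name, and treating a country that is missing from one file as a zero is an assumption.

```python
# Phony sample data from the question, keyed by country code
target = {"IRN": 5, "RUS": 4, "USA": 0, "EGY": 2, "ESP": 1}
attacker = {"IRN": 3, "RUS": 9, "USA": 4, "EGY": 0, "ESP": 0}

# Assumption: a country missing from one file counts as 0 there
codes = sorted(set(target) | set(attacker))


def balance_index(attacks: int, hits: int) -> float:
    """(attacks - hits) / (attacks + hits); 0.0 when both are zero."""
    total = attacks + hits
    return (attacks - hits) / total if total else 0.0


# Option A: per-country ratio, which ignores how large the counts are
ratio = {c: balance_index(attacker.get(c, 0), target.get(c, 0)) for c in codes}

# Option B: raw difference divided by the largest absolute difference
diff = {c: attacker.get(c, 0) - target.get(c, 0) for c in codes}
largest = max(abs(d) for d in diff.values()) or 1
scaled = {c: d / largest for c, d in diff.items()}

for c in codes:
    print(c, round(ratio[c], 2), round(scaled[c], 2))
```

With the sample numbers, option A puts the USA at +1 (never targeted) and Russia lower, while option B puts Russia at +1 and the USA at 0.8, so the answer to question 3 depends on which definition is chosen.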
I have sales, advertising spend, and price data for 10 brands in the same industry from 2013-2018. I want to develop an equation to predict 2019 sales.
The variables I have are price and ad spend by type: PricePerUnit, Magazine, News, Outdoor, Broadcasting, Print.
My confusion is this. One option is to run the regression using only 2018 data, with 2018 sales as the target variable, adding a variable such as Past_2Years_Sales (2016-17) to the price and ad spend variables above (see the image of the data for clarity). With this layout I have a sample size of only 10, as there are only 10 brands. I think this is too small for linear regression to give reliable results.
The second option, which increases the sample size, is to treat brand + year as an observation instead of brand. That raises the sample size to 60: Brand A has 6 observations (A-2013, A-2014, A-2015, ..., A-2018), Brand B has B-2013, B-2014, ..., B-2018, and so on for the 10 brands (see the image of the data).
Is the second option a valid way to run the regression? What is the right way to run a regression with such a small sample?
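To make the second option concrete, here is a small Python sketch of the brand-year layout. The sales numbers are placeholders invented for illustration, and prev_sales is a hypothetical lagged feature standing in for the past-sales idea above. Note that rows from the same brand are not independent observations, so 60 rows do not carry the information of 60 unrelated brands.

```python
brands = [chr(ord("A") + i) for i in range(10)]  # brands A..J
years = list(range(2013, 2019))                  # 2013-2018 inclusive

# Placeholder sales figures (invented), keyed by (brand, year)
sales = {
    (b, y): 100 + 10 * i + (y - 2013)
    for i, b in enumerate(brands)
    for y in years
}

# One row per brand-year, with last year's sales as an extra feature
rows = [
    {
        "brand": b,
        "year": y,
        "sales": sales[(b, y)],
        "prev_sales": sales.get((b, y - 1)),  # None for the first year
    }
    for b in brands
    for y in years
]

# Rows without a previous year cannot use the lagged feature
usable = [r for r in rows if r["prev_sales"] is not None]
print(len(rows), "brand-year rows;", len(usable), "with a lagged value")
```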
I have a 3D cylinder chart that I am having some problems with. I want to sort the cylinders so that the highest value is at the back and the lowest at the front. Otherwise the tallest cylinders cover the smallest ones.
I have tried sorting the categories both A-Z and Z-A, but I really need the order to be dynamic, based on the values. I have also tried sorting by the actual value field, both A-Z and Z-A, but this seems to return completely random results.
The data in the database looks like this (example). I use a parameter to filter by supplier.
Date        category_Type  cost  supplier
01/01/2013  apple          $5    abc
01/01/2013  pear           $10   def
01/01/2013  banana         $15   cgi
01/02/2013  apple          $7    etc
01/02/2013  pear           $12   etc
01/02/2013  banana         $18   etc
I believe I need some form of expression that sorts the series by cost, since both A-Z and Z-A in this instance produce cylinders that block other cylinders.
I have tried sorting the series group by =Sum(Fields!cost.Value, "DataSet1") and by =Fields!cost.Value, but both seem to return random results.
I would be happy even if I could achieve a custom sort such as "banana, pear, apple", although for some suppliers this would still cause me an issue.
Edit 1: strangely enough, this works with a line chart but not with a 3D cylinder chart.
Edit 2: an example is attached. I want the tallest cylinders at the back, but the methods mentioned above do not work.
In Chart Area Properties -> 3D Options, enable series clustering:
Choose this option to cluster series groups. When multiple series for bar or column charts are clustered, they are displayed along two distinct rows in the chart area. If series are not clustered, their corresponding data points are displayed adjacent to each other in one row. This option is applicable only to bar and column charts.
Also try changing the Rotation and Inclination degrees to get a better view.
Decreasing the wall thickness can help too.
I have to track the stock of individual parts and kits (assemblies) and can't find a satisfactory way of doing this.
A made-up, heavily simplified sample database:
Table prod:
prodID 1
prodName Flux capacitor
prodCost 900
prodPrice 1350 (900*1.5)
prodStock 3
-
prodID 2
prodName Mr Fusion
prodCost 300
prodPrice 600 (300*2)
prodStock 2
-
prodID 3
prodName Time travel kit
prodCost 1200 (900+300)
prodPrice 1560 (1200*1.3)
prodStock 2
Table rels
relID 1
relSrc 1 (Flux capacitor)
relType 4 (is a subpart of)
relDst 3 (Time travel kit)
-
relID 2
relSrc 2 (Mr Fusion)
relType 4 (is a subpart of)
relDst 3 (Time travel kit)
prodPrice: it is calculated from the cost, but not linearly. In this example, for costs of 500 or less the price is 200% of the cost; for costs of 500-1000 it is 150%; for costs above 1000 it is 130%.
That's why the Time travel kit is much cheaper than its individual parts bought separately (1560 versus 1350 + 600 = 1950).
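The tiered pricing rule can be sketched in Python as follows. price_from_cost is a made-up name, and the handling of a cost of exactly 1000 is an assumption, since the description leaves that boundary open. Integer percentages are used to avoid floating-point rounding.

```python
def price_from_cost(cost: int) -> float:
    """Price as a percentage of cost, using the tiers from the example."""
    if cost <= 500:
        percent = 200
    elif cost <= 1000:  # assumption: exactly 1000 falls in the middle tier
        percent = 150
    else:
        percent = 130
    return cost * percent / 100


flux_capacitor = price_from_cost(900)    # 1350.0
mr_fusion = price_from_cost(300)         # 600.0
kit = price_from_cost(900 + 300)         # 1560.0
print(flux_capacitor + mr_fusion, "for the parts versus", kit, "for the kit")
```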
prodStock: here is my problem. I can sell kits or the individual parts, so the stock of the kits is virtual.
The problem when I buy:
Some providers sell me the Time travel kit as a whole (with one barcode) and some sell me the individual parts (with different barcodes).
So when I load the stock I don't know which product to assign it to.
The problem when I sell:
If I only sold kits, calculating the stock would be easy: "I have 3 Flux capacitors and 2 Mr Fusions, so I have 2 Time travel kits and a Flux capacitor."
But I can sell kits or individual parts. So I have to track the stock of the individual parts and of the possible kits at the same time (and I have to compensate for the sale price).
This is probably really simple, but I can't see a simple solution.
In summary: I have to find a way of tracking the stock, and the database/program has to do it (I can't ask the clerk to correct the stock).
I'm using PHP + MySQL, but this is more a logic problem than a programming one.
Update: Sadly, Eagle's solution won't work.
The relationships can be, and are, recursive (one kit uses another kit).
There are kits that use more than one of the same part (2 Flux capacitors + 1 Mr Fusion).
I really need to store a value for the stock of the kit. The same database is used for the web page where users buy the parts, and I have to show the available stock (otherwise they won't even try to buy). I can't afford to calculate the stock on every user search on the web page.
But I liked the idea of a boolean marking the stock as virtual.
Okay, first of all: since the prodStock for the Time travel kit is virtual, you cannot store it in the database; it is essentially a calculated field. It would probably help to have a boolean on the table that says whether prodStock is calculated or not. I'll pretend you have this field and call it isKit for now (where TRUE means the product is a kit and its prodStock should be calculated).
Now to calculate the amount of each item that is in stock:
select p.prodID, p.prodName, p.prodCost, p.prodPrice, p.prodStock from prod p where not isKit
union all
select p.prodID, p.prodName, p.prodCost, p.prodPrice, min(c.prodStock) as prodStock
from
prod p
inner join rels r on (p.prodID = r.relDst and r.relType = 4)
inner join prod c on (r.relSrc = c.prodID and not c.isKit)
where p.isKit
group by p.prodID, p.prodName, p.prodCost, p.prodPrice
I used the alias c for the second prod to stand for 'component'. I explicitly wrote not c.isKit since this won't work recursively. union all is used rather than union for efficiency reasons: the two branches cannot overlap, so both return the same rows and union all skips the duplicate-removal step.
Caveats:
This won't work recursively (e.g. if a kit requires components from another kit).
This only works on kits that require only one of a particular item (e.g. if a Time travel kit were to require 2 Flux capacitors and 1 Mr Fusion, this wouldn't work).
I didn't test this so there may be minor syntax errors.
This only calculates the prodStock field; to do the other fields you would need similar logic.
If your query is much more complicated than what I assumed, I apologize, but I hope that this can help you find a solution that will work.
As for how to handle the data when you buy a kit, this assumes you store prodStock only on the component parts. So, for example, if you purchase a Time travel kit from a supplier, instead of increasing the prodStock on the kit product, you would increase it on the Flux capacitor and the Mr Fusion.
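The two caveats above (nested kits, and kits needing several units of one part) could be handled in application code by flattening each kit into its base parts first. The Python sketch below is one possible approach, not the query above translated. It assumes a per-relationship quantity (a hypothetical relQty column that the sample rels table does not have), and product 4 is an invented nested kit added only for illustration.

```python
from collections import Counter

# kit id -> list of (component id, quantity); quantities stand in for a
# hypothetical relQty column that is not in the sample schema
components = {
    3: [(1, 1), (2, 1)],  # Time travel kit = 1 Flux capacitor + 1 Mr Fusion
    4: [(3, 1), (1, 1)],  # invented kit: 1 Time travel kit + 1 Flux capacitor
}
stock = {1: 3, 2: 2}      # real stock is kept on base parts only


def flatten(prod_id, components, qty=1, seen=()):
    """Expand a product into base parts: {part id: units needed}."""
    if prod_id in seen:
        raise ValueError(f"cycle in kit definitions at product {prod_id}")
    if prod_id not in components:  # a base part, nothing to expand
        return Counter({prod_id: qty})
    total = Counter()
    for child, child_qty in components[prod_id]:
        total.update(flatten(child, components, qty * child_qty, seen + (prod_id,)))
    return total


def kit_stock(prod_id, components, stock):
    """How many units of prod_id could be assembled from current part stock."""
    needs = flatten(prod_id, components)
    return min(stock.get(part, 0) // n for part, n in needs.items())


print(kit_stock(3, components, stock))  # Time travel kit
print(kit_stock(4, components, stock))  # invented nested kit
```

If a stored value is required for the web page, the result could be written back to the kit's prodStock whenever the stock of one of its parts changes, rather than being computed on every search.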