anova_test not returning Mauchly's for three-way within-subject ANOVA

I am using a data set called sleep (found here: https://drive.google.com/file/d/15ZnsWtzbPpUBQN9qr-KZCnyX-0CYJHL5/view) to run a three-way within-subject ANOVA comparing Performance based on Stimulation, Deprivation, and Time. I have successfully done this before using anova_test from rstatix. I want to look at the sphericity output, but it doesn't appear in the output. I have gotten it to appear with other three-way within-subject datasets, so I'm not sure why this is happening here. Here is my code:
anova_test(data = sleep, dv = Performance, wid = Subject, within = c(Stimulation, Deprivation, Time))
I also tried saving the result to an object and using get_anova_table, but that didn't look any different.
sleep_aov <- anova_test(data = sleep, dv = Performance, wid = Subject, within = c(Stimulation, Deprivation, Time))
get_anova_table(sleep_aov, correction = "GG")
This is an idealized dataset I pulled from the internet, so I'm starting to think the data have a W of 1 (perfect sphericity) and rstatix is skipping this output. Is this something anova_test does?
Here also is my code using a dataset that does return Mauchly's:
weight_loss_long <- pivot_longer(data = weightloss, cols = c(t1, t2, t3), names_to = "time", values_to = "loss")
weight_loss_long$time <- factor(weight_loss_long$time)
anova_test(data = weight_loss_long, dv = loss, wid = id, within = c(diet, exercises, time))

Not an expert at all, but it might be because your factors have only two levels.
From anova_summary() help:
"Value
return an object of class anova_test a data frame containing the ANOVA table for independent measures ANOVA. However, for repeated/mixed measures ANOVA, it is a list containing the following components are returned:
ANOVA: a data frame containing ANOVA results
Mauchly's Test for Sphericity: If any within-Ss variables with more than 2 levels are present, a data frame containing the results of Mauchly's test for Sphericity. Only reported for effects that have more than 2 levels because sphericity necessarily holds for effects with only 2 levels.
Sphericity Corrections: If any within-Ss variables are present, a data frame containing the Greenhouse-Geisser and Huynh-Feldt epsilon values, and corresponding corrected p-values. "


Measurement-repeated ANCOVA in 2x2 Mixed Design

I am calculating an ANOVA with repeated measures in a 2x2 mixed design in R. For this I use one of the following inputs:
(1)
res.aov <- anova_test(data = datac, dv = Stress, wid = REF, between = Gruppe, within = time)
get_anova_table(res.aov)
(2)
aov <- datac %>%
  anova_test(dv = Stress, wid = REF, between = Gruppe, within = time, type = 3)
aov
Both lead to the same results. Now I want to add a covariate from the first measurement time point. So far I have not been able to find a suitable R input for an ANCOVA with this repeated-measures design.
Does anyone perhaps have an idea?
Many greetings
ANOVA <- aov(Stress ~ time + covariate, data = data)
summary(ANOVA)
For a simple ANCOVA, the usual ANOVA input with the covariate added applies. Unfortunately, I have no idea how this works in the repeated-measures design.

How to calculate SAVI with MODIS in Google Earth Engine (Getting Error: Image.select: Pattern 'B2' did not match any bands.)

I am trying to calculate SAVI vegetation index using MODIS data. But I am getting an error showing:
Image.select: Pattern 'B2' did not match any bands.
Code:
countries = ee.FeatureCollection("USDOS/LSIB_SIMPLE/2017")
canada = countries.filter(ee.Filter.eq("country_na", "Canada"))
image = ee.ImageCollection("MODIS/061/MOD09A1")\
    .filterDate('2017-01-01','2017-12-31')\
    .filter(ee.Filter.lt('CLOUDY_PIXEL_PERCENTAGE',10))\
    .filterBounds(canada)\
    .median()\
    .clip(canada)
savi = image.expression(
    '1.5*((NIR-RED)/(NIR+RED+0.5))',{
        'NIR': image.select('B2'),
        'RED': image.select('B1')
    }).rename('savi')
saviVis = {'min':0.0, 'max':1, 'palette':['yellow', 'green']}
Map = geemap.Map()
Map.addLayer(savi, saviVis, 'SAVI')
Map
Why am I getting this error? Isn't B1 designated to Red and B2 to NIR?
The general thing to do when you hit this type of problem is to start examining the dataset for what is actually there: how many images you are matching, what properties and bands those images have, and so on. I found two problems:
Your filter criteria matched zero images. Therefore the collection is empty, and therefore the median() image from that collection has no bands at all. (You can check this by putting the collection in a variable and printing the size() of it.) You will need to adjust the criteria.
It seems that the main reason they didn't match is that the images in MODIS/061/MOD09A1 do not have a CLOUDY_PIXEL_PERCENTAGE property.
The band names for MODIS/061/MOD09A1 are not B1, B2, ... but sur_refl_b01, sur_refl_b02 and so on. You can see this with the Inspector in the Earth Engine Code Editor, or on the dataset description page.
Perhaps you were working from information about a different dataset?
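For instance, here is a quick check of both points in the Python API; this is a sketch assuming the same earthengine-api/geemap session as your snippet (ee.Initialize() may need your own credentials):
import ee
ee.Initialize()

# The 2017 MOD09A1 collection, without the cloud filter
col = ee.ImageCollection("MODIS/061/MOD09A1").filterDate('2017-01-01', '2017-12-31')
print(col.size().getInfo())  # how many images your date filter matched
# Adding the cloud filter drops it to zero, since the property doesn't exist here
print(col.filter(ee.Filter.lt('CLOUDY_PIXEL_PERCENTAGE', 10)).size().getInfo())  # 0
# The actual band names: sur_refl_b01, sur_refl_b02, ...
print(col.first().bandNames().getInfo())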
With the two problems above fixed, your code produces some results. This is the (JS) version I produced while testing (Code Editor link):
var countries = ee.FeatureCollection("USDOS/LSIB_SIMPLE/2017");
var canada = countries.filter(ee.Filter.eq("country_na", "Canada"));
var images = ee.ImageCollection("MODIS/061/MOD09A1")
    .filterDate('2017-01-01', '2017-12-31')
    // .filter(ee.Filter.lt('CLOUDY_PIXEL_PERCENTAGE', 10))
    .filterBounds(canada);
// print(images);
var image = images.median().clip(canada);
Map.addLayer(canada);
Map.addLayer(image);
var savi = image.expression(
    '1.5*((NIR-RED)/(NIR+RED+0.5))', {
      'NIR': image.select('sur_refl_b02'),
      'RED': image.select('sur_refl_b01')
    }).rename('savi');
var saviVis = {'min': 0.0, 'max': 1, 'palette': ['yellow', 'green']};
Map.addLayer(savi, saviVis, 'SAVI');
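Since the original snippet was Python/geemap, here is the same fix translated back into that form; an untested sketch under the same assumptions as the question's code:
import ee
import geemap

ee.Initialize()

countries = ee.FeatureCollection("USDOS/LSIB_SIMPLE/2017")
canada = countries.filter(ee.Filter.eq("country_na", "Canada"))

# No CLOUDY_PIXEL_PERCENTAGE filter: MOD09A1 images don't carry that property
image = (ee.ImageCollection("MODIS/061/MOD09A1")
         .filterDate('2017-01-01', '2017-12-31')
         .filterBounds(canada)
         .median()
         .clip(canada))

savi = image.expression(
    '1.5*((NIR-RED)/(NIR+RED+0.5))', {
        'NIR': image.select('sur_refl_b02'),
        'RED': image.select('sur_refl_b01'),
    }).rename('savi')

saviVis = {'min': 0.0, 'max': 1, 'palette': ['yellow', 'green']}
Map = geemap.Map()
Map.addLayer(savi, saviVis, 'SAVI')
Map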

How to get dataset into array

I have worked through all the tutorials and searched for "load csv tensorflow" but just can't get the logic of it all. I'm not a total beginner, but I don't have much time to complete this, and I've been suddenly thrown into TensorFlow, which is unexpectedly difficult.
Let me lay it out:
A very simple CSV file of 184 columns that are all float numbers. A row is simply today's price, three buy signals, and the previous 180 days' prices:
close = tf.placeholder(float, name='close')
signals = tf.placeholder(bool, shape=[3], name='signals')
previous = tf.placeholder(float, shape=[180], name = 'previous')
This article: https://www.tensorflow.org/guide/datasets
It covers how to load data pretty well. It even has a section on converting to NumPy arrays, which is what I need to train and test the net. However, as the author says in the article leading to this web page, it is pretty complex. It seems like everything is geared toward data manipulation, whereas we have already normalized our data (nothing has really changed in AI since 1983 in terms of inputs, outputs, and layers).
Here is a way to load it, but it doesn't get the data into NumPy, and there is no example that leaves the data unmanipulated.
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    with open('/BTC1.csv') as csv_file:
        csv_reader = csv.reader(csv_file, delimiter=',')
        line_count = 0
        for row in csv_reader:
            ?????????
            line_count += 1
I need to know how to get the csv file into the
close = tf.placeholder(float, name='close')
signals = tf.placeholder(bool, shape=[3], name='signals')
previous = tf.placeholder(float, shape=[180], name = 'previous')
so that I can follow the tutorials to train and test the net.
Your question is not entirely clear to me. If I understand correctly (tell me if I'm wrong), you are asking how to feed data into your model? There are several ways to do so.
Use placeholders with feed_dict during the session. This is the basic and easiest one, but it often suffers from training performance issues. For further explanation, check this post.
Use queues. They are hard to implement and badly documented; I don't suggest them, because they have been superseded by the third method.
tf.data API.
...
So to answer your question by the first method:
# get your array outside the session
with open('/BTC1.csv') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    dataset = np.asarray([data for data in csv_reader], dtype=np.float32)  # cast the CSV strings to floats
close_col = dataset[:, 0]       # today's price
signal_cols = dataset[:, 1:4]   # the three buy signals
previous_cols = dataset[:, 4:]  # the previous 180 days' prices
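A quick sanity check on the resulting shapes (my addition, assuming the 184-column layout described in the question and N rows):
print(close_col.shape, signal_cols.shape, previous_cols.shape)  # (N,), (N, 3), (N, 180)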
# let's say you load 100 rows each time for training
batch_size = 100
# define placeholders like you did
...
with tf.Session() as sess:
    ...
    for i in range(number_iter):
        start = i * batch_size
        end = (i + 1) * batch_size
        sess.run(train_operation, feed_dict={close: close_col[start:end],
                                             signals: signal_cols[start:end],
                                             previous: previous_cols[start:end]})
By the third method:
# retrieve your columns like before
...
# let's say you load 100 rows each time for training
batch_size = 100
# construct your input pipeline: mix data --> assemble batches
batch = tf.data.Dataset.from_tensor_slices((close_col, signal_cols, previous_cols))
batch = batch.shuffle(close_col.shape[0]).batch(batch_size)
iterator = batch.make_initializable_iterator()
iter_init_operation = iterator.initializer
c_it, s_it, p_it = iterator.get_next()  # get-next-batch operation, evaluated automatically at each iteration within the session
# replace the close, signals, previous placeholders in your model by c_it, s_it, p_it when you define your model
...
with tf.Session() as sess:
    # you need to initialize the variables and the iterator
    sess.run([tf.global_variables_initializer(), iter_init_operation])
    ...
    for i in range(number_iter):
        sess.run(train_operation)
Good luck!

Django: ORM/SQL query speed significantly decreased after adding an additional BooleanField (SQL tinyint) to a Django filter

Using MySQL and the latest Django:
I have a vaguely complex Django query that works quite quickly, until I add an additional AND on a BooleanField.
See below:
queriedForms = queryFormtype.form_set.filter(is_public=True)
newQuery = queriedForms.filter(formrecordattributevalue__record_value__icontains=term['TVAL'], formrecordattributevalue__record_attribute_type__pk=rtypePK)
newQuery = newQuery.filter(flagged_for_deletion=False)
logger.info(newQuery.query)
term['count'] = newQuery.count()
If I remove either the initial is_public=True or the final flagged_for_deletion=False, it runs incredibly fast. If I use both as filters, the time for the count() call increases by something like 2000%.
The different QuerySet.query outputs are below:
SELECT `maqluengine_form`.`id`, `maqluengine_form`.`form_name`, `maqluengine_form`.`form_number`, `maqluengine_form`.`form_geojson_string`, `maqluengine_form`.`hierarchy_parent_id`, `maqluengine_form`.`is_public`, `maqluengine_form`.`project_id`, `maqluengine_form`.`date_created`, `maqluengine_form`.`created_by_id`, `maqluengine_form`.`date_last_modified`, `maqluengine_form`.`modified_by_id`, `maqluengine_form`.`sort_index`, `maqluengine_form`.`form_type_id`, `maqluengine_form`.`flagged_for_deletion` FROM `maqluengine_form` INNER JOIN `maqluengine_formrecordattributevalue` ON (`maqluengine_form`.`id` = `maqluengine_formrecordattributevalue`.`form_parent_id`) WHERE (`maqluengine_form`.`form_type_id` = 319 AND `maqluengine_form`.`is_public` = True AND `maqluengine_formrecordattributevalue`.`record_value` LIKE %seal% AND `maqluengine_formrecordattributevalue`.`record_attribute_type_id` = 18510 AND `maqluengine_form`.`flagged_for_deletion` = False)
SELECT `maqluengine_form`.`id`, `maqluengine_form`.`form_name`, `maqluengine_form`.`form_number`, `maqluengine_form`.`form_geojson_string`, `maqluengine_form`.`hierarchy_parent_id`, `maqluengine_form`.`is_public`, `maqluengine_form`.`project_id`, `maqluengine_form`.`date_created`, `maqluengine_form`.`created_by_id`, `maqluengine_form`.`date_last_modified`, `maqluengine_form`.`modified_by_id`, `maqluengine_form`.`sort_index`, `maqluengine_form`.`form_type_id`, `maqluengine_form`.`flagged_for_deletion` FROM `maqluengine_form` INNER JOIN `maqluengine_formrecordattributevalue` ON (`maqluengine_form`.`id` = `maqluengine_formrecordattributevalue`.`form_parent_id`) WHERE (`maqluengine_form`.`form_type_id` = 319 AND `maqluengine_form`.`is_public` = True AND `maqluengine_formrecordattributevalue`.`record_value` LIKE %seal% AND `maqluengine_formrecordattributevalue`.`record_attribute_type_id` = 18510)
The first takes about 20-30 seconds to perform the count(), while the second, with only one of the two BooleanFields, takes less than a second.
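One way to investigate why (my own suggestion, not from the original post; QuerySet.explain() requires Django 2.1+) is to compare the execution plans MySQL chooses for the two variants, where fastQuery is a hypothetical name for the variant without the second boolean:
# Print the plan for the slow query (both booleans)...
print(newQuery.explain())
# ...and for the fast variant without flagged_for_deletion
fastQuery = queriedForms.filter(formrecordattributevalue__record_value__icontains=term['TVAL'],
                                formrecordattributevalue__record_attribute_type__pk=rtypePK)
print(fastQuery.explain())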
EDIT: Since the question apparently isn't obvious enough: why does adding an additional AND on a BooleanField increase the query time by roughly 2000%? Can anyone help figure out why that's occurring? Thanks.
EDIT: I also discovered that using exclude(is_public=False) rather than filter(is_public=True) has the same effect as the solution below. Does anyone happen to know why exclude() works fine whereas filter() does not?
Solution I came up with after a night's rest:
--I keep the query as is (I need it for later, because it continues getting chain-filtered).
--I need the count() from this stage, which takes substantially longer than it should with the additional BooleanField AND.
--So I take a temporary values list to perform a len() on instead:
queriedForms = queryFormtype.form_set.all()
newQuery = queriedForms.filter(formrecordattributevalue__record_value__icontains=term['TVAL'], formrecordattributevalue__record_attribute_type__pk=rtypePK)
newQuery = newQuery.filter(flagged_for_deletion=False)
tempQuery = newQuery.values_list('is_public', flat=True)
finalQuery = [entry for entry in tempQuery if entry]  # drop any rows where is_public is False
term['count'] = len(finalQuery)
The counts on the chained filters that follow use the same technique. It's significantly faster, if not as fast as removing one of the Booleans from the filters.
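As a side note (my own tweak, not part of the original solution), the same trick works without materializing the intermediate list, since the values_list() queryset yields the booleans directly:
# Stream the is_public flags instead of building a list just to len() it
term['count'] = sum(1 for flag in tempQuery if flag)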

Build network with shortcut using torch

I have a network with two inputs, X and Y.
X is concatenated with Y and passed through the network to get result1; at the same time, X is concatenated with result1 as a shortcut.
It's easy if there is only one input.
branch = nn.Sequential()
branch:add(....) --some layers
net = nn.Sequential()
net:add(nn.ConcatTable():add(nn.Identity()):add(branch))
net:add(...)
But when it comes to two inputs, I don't actually know how to do it. Besides, nngraph is not allowed. Does anyone know how to do it?
You can use the table modules, have a look at this page: https://github.com/torch/nn/blob/master/doc/table.md
net = nn.Sequential()
triple = nn.ParallelTable()
duplicate = nn.ConcatTable()
duplicate:add(nn.Identity())
duplicate:add(nn.Identity())
triple:add(duplicate)
triple:add(nn.Identity())
net:add(triple)
net:add(nn.FlattenTable())
-- at this point the network transforms {X,Y} into {X,X,Y}
separate = nn.ConcatTable()
separate:add(nn.SelectTable(1))
separate:add(nn.NarrowTable(2,2))
net:add(separate)
-- now you get {X,{X,Y}}
parallel_XY = nn.ParallelTable()
parallel_XY:add(nn.Identity()) -- preserves X
parallel_XY:add(...) -- whatever you want to do from {X,Y}
net:add(parallel_XY)
parallel_Xresult = nn.ParallelTable()
parallel_Xresult:add(...) -- whatever you want to do from {X,result}
net:add(parallel_Xresult)
output = net:forward({X,Y})
The idea is to start with {X,Y}, duplicate X, and then do your operations. This is admittedly a bit convoluted; nngraph exists precisely to make this kind of wiring easier.