Joining a .csv and a vector layer in QGIS - csv

I have two layers that I need to join in QGIS. One is a vector layer containing the geometry (a series of polygons, each identified by an id). The other is a .csv file with information about these polygons, but it does not hold a single record per polygon, which is where my problem with joins comes from. It is a time-series dataset in which a value is recorded for each date and polygon (not for every date, but almost).
An example of the .csv file would be:
id   polygon  date   cost
1    A1       01-01  100
2    A2       01-01  500
...  ...      ...    ...
100  A1       02-01  250
101  A2       02-01  360
102  A3       02-01  150
The goal of joining the two files is to have each polygon painted (with the help of the "temporal" tool) depending on whether it exceeds a certain value of the cost field.
I have tried to create a relation from "project", but that only gave me access to the form.
Thank you very much!

The following solution works for me.
Add the .csv as a Delimited Text layer with the correct data types.
Vector general -> Join attributes by field value. Choose the join type "Create separate feature for each matching feature (one-to-many)".
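To make the effect of that join type concrete, here is a minimal sketch in plain Python (outside QGIS) of what a one-to-many join produces: one output feature per matching CSV row, each carrying a copy of the polygon geometry. The WKT geometries, the threshold of 300 and the "exceeds" field are made up for illustration; only the CSV columns come from the question.

```python
import csv
import io

# Sample rows mirroring the table in the question.
CSV_TEXT = """id,polygon,date,cost
1,A1,01-01,100
2,A2,01-01,500
100,A1,02-01,250
101,A2,02-01,360
102,A3,02-01,150
"""

# Stand-in for the vector layer: polygon id -> geometry (made-up WKT).
polygons = {
    "A1": "POLYGON((0 0, 1 0, 1 1, 0 1, 0 0))",
    "A2": "POLYGON((1 0, 2 0, 2 1, 1 1, 1 0))",
    "A3": "POLYGON((2 0, 3 0, 3 1, 2 1, 2 0))",
}


def join_one_to_many(polygons, csv_text, threshold):
    """Return one record per matching CSV row, copying the polygon geometry."""
    joined = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        geometry = polygons.get(row["polygon"])
        if geometry is None:
            continue  # CSV row without a matching polygon is skipped
        cost = float(row["cost"])
        joined.append({
            "polygon": row["polygon"],
            "date": row["date"],
            "cost": cost,
            "exceeds": cost > threshold,  # hypothetical flag to style on
            "geometry": geometry,
        })
    return joined


features = join_one_to_many(polygons, CSV_TEXT, threshold=300)
for f in features:
    print(f["polygon"], f["date"], f["cost"], f["exceeds"])
```

Because every (polygon, date) pair becomes its own feature, the date column can then drive the temporal filtering and the cost column the styling.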


MySQL ST_CONTAINS returns false for MULTIPOLYGON with more than 1 polygon

I have a bunch of locations (points) with coordinates stored as geometry points. I've imported spatial data of the provinces in my country and I'm trying to determine the province in which each location lies, which worked for 26 of 28 provinces in total.
Before importing the data I noticed that all provinces had their geometries defined as POLYGON except the two in question that were defined as MULTIPOLYGON, so for consistency's sake I converted all to MULTIPOLYGON where the majority contain data for just 1 polygon.
I am now testing with one province whose MULTIPOLYGON definition contains 3 polygon geometries. The point I'm testing with lies within the third polygon; I have confirmed this by testing all 3 polygons manually using the function found in this question:
Here's my SQL which returns 0 (on pastebin because SO won't let me post it here)
https://pastebin.com/raw/bkNeBgcL
If I remove the first two polygons, then it returns 1
https://pastebin.com/raw/UfTax1K5
My guess at what's going wrong is that MySQL expects the point to be in all 3 polygons at the same time. I need to tell it to return true if the point is in any of the polygons inside the multipolygon, but how?
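For reference, the "any of the polygons" semantics described above can be sketched in plain Python with a ray-casting test. This does not diagnose what MySQL is doing; it only spells out the expected result and gives a way to repeat the manual per-polygon check. The three squares and the test point are made up, polygons are assumed to have no holes, and points exactly on a boundary are not handled specially.

```python
def point_in_ring(point, ring):
    """Ray-casting test: True if the point lies inside the ring of (x, y) vertices."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def point_in_multipolygon(point, multipolygon):
    """True if the point is inside ANY polygon of the multipolygon."""
    return any(point_in_ring(point, ring) for ring in multipolygon)


# Made-up multipolygon of three disjoint unit squares.
multipolygon = [
    [(0, 0), (1, 0), (1, 1), (0, 1)],
    [(2, 0), (3, 0), (3, 1), (2, 1)],
    [(4, 0), (5, 0), (5, 1), (4, 1)],
]
point = (4.5, 0.5)  # lies in the third square only

per_polygon = [point_in_ring(point, ring) for ring in multipolygon]
print(per_polygon, point_in_multipolygon(point, multipolygon))
```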

Trying to contour a CSV file in QGIS

I have rainfall data which I have imported as a csv file. It's 185 lines like this:
Name, Longitude, Latitude, Elevation, TotalPrecipitation
BURLINGTON, -72.932, 41.794, 505, M
BAKERSVILLE, -73.008, 41.842, 686, 42.40
BARKHAMSTED, -72.964, 41.921, 710, M
NORFOLK 2 SW, -73.221, 41.973, 1340, 44.22
Looking at the layer properties, the latitude and longitude are brought in as "double" but the rainfall amounts come in as "text", so I can't contour them.
How can I get beyond this point, and where do I go to do the contouring? Do I go to Vector:Contour? Will it understand that M is missing data, or will the Ms still exist if the column is converted to "double"?
I'm a little confused. Thanks for the help.
I think I might be able to help.
Since your points are scattered irregularly across an area, you could do the following:
Load the CSV into QGIS as a point layer whose attribute table includes your most important value, Total Precipitation. Let's call it the TEST layer.
Processing Toolbox -> TIN Interpolation -> select the TEST layer. As the interpolation attribute choose "Total precipitation" and use the green "+" symbol to add this selection. Don't forget the Extent option, where you define the bounds of your interpolation; I would preferably not go beyond the extent of the layer I am working on. The output raster size is also important: avoid a small number of rows. A value of about 10 is an option to keep the interpolation efficient.
https://www.qgistutorials.com/en/docs/3/interpolating_point_data.html
Main bar -> Raster -> Extraction -> Contour
As the input layer select the raster produced by the interpolation step (Contour works on a raster, not on the TEST point layer). The interval between contour lines can be 10 (in the units of your precipitation column). For the attribute name, put PRECIPITATION -> click Run
Your precipitation lines are ready! Now you can Right-click -> Properties -> Symbology (change color) or -> Labels (add labels based on the PRECIPITATION attribute).
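On the "M" question: a column containing any non-numeric entry is detected as text, so the M rows have to be dealt with before the precipitation column can be read as a number. One option, sketched below in plain Python, is to write a cleaned copy of the CSV without those rows and load that copy instead. The function name and the choice to drop the rows (rather than keep them with an empty value) are assumptions for illustration.

```python
import csv
import io

# The sample rows from the question.
SAMPLE = """Name, Longitude, Latitude, Elevation, TotalPrecipitation
BURLINGTON, -72.932, 41.794, 505, M
BAKERSVILLE, -73.008, 41.842, 686, 42.40
BARKHAMSTED, -72.964, 41.921, 710, M
NORFOLK 2 SW, -73.221, 41.973, 1340, 44.22
"""


def drop_missing(src, dst, column="TotalPrecipitation", missing="M"):
    """Copy CSV rows from src to dst, skipping rows whose column holds the missing marker."""
    # skipinitialspace strips the blank after each comma in headers and values
    reader = csv.DictReader(src, skipinitialspace=True)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    kept = 0
    for row in reader:
        value = row[column].strip()
        if value == missing:
            continue
        float(value)  # raises ValueError on any other non-numeric entry
        writer.writerow(row)
        kept += 1
    return kept


out = io.StringIO()
kept = drop_missing(io.StringIO(SAMPLE), out)
print(kept)
print(out.getvalue())
```

With real files, pass handles opened with open(path, newline="") instead of the StringIO objects.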

Cumulative Frequency Tables and Chart Output

I'm working with some rather large time-series data sets related to futures prices and am in the process of converting some calculations which I previously did in Excel to R. This conversion has been relatively straightforward thus far, but I'm having a bit of trouble replicating my histograms with their cumulative frequency distributions in R as I had them in Excel. If you're familiar with Excel, the Histogram function in the Data Analysis ToolPak automatically creates a Cumulative Frequency Distribution table next to the histogram, with the cumulative percentage of each bin (in this case, each Price Level).
I've had some success creating some basic histograms using ggplot, here is a snippet of that code:
    ggplot(data=CrudeRaw, aes(x=CrudeRaw$X7_1_F)) +
      geom_histogram(breaks=seq(X7_F_M_L, X7_F_M_H, by=0.01),
                     col="blue",
                     fill="white",
                     alpha=0.2) +
      labs(title="X7 1 Month Price Distribution", x="Price Levels",
           y="Frequency") +
      xlim(c(X7_F_M_L, X7_F_M_H)) +
      ylim(c(0,100))
Several questions regarding formatting and usage.
a) CrudeRaw is a dataframe which contains roughly 276 rows and no fewer than 50 columns. For the purposes of this project I've chopped the data into 20-period, 60-period, 120-period, 180-period, and 240-period subsets. The data is in chronological order by date.
Question(s): as far as I can tell, ggplot cannot take numeric vectors, only data frames, so I can only feed it the entire df even though I am interested in creating distributions for the aforementioned subsets. Is there a way that I can still do this?
b) How do I get every bin (price) to show up on the x-axis rather than a number marking every 5 bins (-15, -10, -5, 0, 5 ..., 15)?
c) I've successfully created a cumulative frequency table using the following code:
    round(cbind(cumsum(table(X7_F)))/NROW(X7_F),2)
But I'd like a way to either output each of these tables (of which there are many) to a CSV file or, ideally, create a "report" of sorts with R which can be saved to a pdf, or perhaps even place each table within the histogram it is associated with.
d) I've done some searching on how to output data to a CSV file, but it wasn't clear from the examples I went over how I could output multiple arrays to the same sheet or workbook, en masse. That is, I would like to output my 20, 60, 120, 180, and 240 period arrays of prices to the same workbook. I'm thinking that by creating another dataframe I could then pass these subsets of the data to the ggplot function, which is what I mentioned having trouble with in part a).
e) Lastly (for now), how do I overlay the CFD onto my histograms?
Please advise if you require any additional information or colour in order to help me and many thanks in advance for your responses!
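As a rough sketch of points c) and d), here is the same cumulative-frequency calculation written in plain Python rather than R, with several subsets written into one CSV in long format (one row per period and price level). The sample prices, the column names and the "period" key are made up; in R the analogous idea would be to stack the per-subset tables into one data frame before writing it out.

```python
import csv
import io
from collections import Counter


def cumulative_frequency(values, digits=2):
    """Return (value, count, cumulative share) rows, sorted by value."""
    counts = Counter(values)
    total = len(values)
    rows, running = [], 0
    for value in sorted(counts):
        running += counts[value]
        rows.append((value, counts[value], round(running / total, digits)))
    return rows


# Hypothetical price subsets standing in for the 20- and 60-period slices.
subsets = {
    "20": [0.01, 0.02, 0.02, 0.03],
    "60": [0.01, 0.01, 0.02, 0.03, 0.03, 0.03, 0.04, 0.05],
}

out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerow(["period", "price", "count", "cum_share"])
for period, prices in subsets.items():
    for price, count, share in cumulative_frequency(prices):
        writer.writerow([period, price, count, share])

print(out.getvalue())
```

Keeping the period as a column means one file holds all subsets, and the same long table can be filtered or grouped by period when plotting.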

The intersection of the wordlist and the top features selected based on weights

In the training process for a text classification case, the wordlist generated by the Process Documents module contains about 15000 words. Separately, I applied the feature selection modules, i.e., Weight by Information Gain and Select by Weights, to select the top 500 features. Both the wordlist and the selected weights are stored. Is there any way to apply these 500 weights to the wordlist and construct a short wordlist that exactly matches the 500 weights? In other words, I would like the intersection of the original wordlist (about 15000 words) and the top 500 features (or top 500 words based on the weights).
The following shows the script I am using. The stored weights object (circled in red) has two columns: the first is the word (attribute) and the second is the corresponding weight value. Based on this, we can select the top 500 or any other number of top features. The original wordlist (circled in red) can have 15000 words, i.e. a matrix with 15000 rows.
My question is how to generate a filtered wordlist object based on the ranked weights object.
I have posted this question on Rapidminer forum. Please follow the update there.
You should post a representative process. In the absence of that it's difficult to give help but my view is that you could take the 500 word example set and process it again to make a word list from it.
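Independently of RapidMiner, the intersection being asked for is simple to state. The sketch below shows the logic in plain Python, assuming both objects have been exported (the wordlist as a list of words, the weights as word/weight pairs). The sample words, weights and function names are made up and are not RapidMiner operators.

```python
def top_words(weights, n):
    """Return the set of the n words with the highest weight."""
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    return {word for word, _ in ranked[:n]}


def filter_wordlist(wordlist, weights, n):
    """Keep the wordlist entries that are among the n top-weighted words, preserving order."""
    keep = top_words(weights, n)
    return [word for word in wordlist if word in keep]


# Made-up stand-ins for the 15000-word list and its information-gain weights.
wordlist = ["price", "the", "market", "and", "oil", "a"]
weights = {"price": 0.9, "the": 0.01, "market": 0.7, "and": 0.02, "oil": 0.8, "a": 0.0}

short = filter_wordlist(wordlist, weights, n=3)
print(short)
```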

Importing Nodes with Coordinates to Gephi from CSV

This question seems pretty stupid but I actually fail to find a simple solution to this. I have a csv file that is structured like this:
0 21 34.00 34.00
1 23 35.00 25.00
2 25 45.00 65.00
The first column is the node's id; the second is an unimportant attribute. The 3rd and 4th columns are supposed to be the x and y positions of the nodes.
I can import the file into the Data Laboratory without problems, but I can't work out how to tell Gephi to use the x and y attributes as the corresponding node properties. All I want is for Gephi to set the x property to the value of the x attribute (and likewise for y). Also see picture.
Thanks for your help!
In the Layout window, you can select "Geo Layout" and define which columns are used as Latitude and Longitude.
The projection might come in weird if you do not actually have GeoData, but for me, this is fine.
In Gephi 0.8 there was a plugin called Recast column. This plugin is unfortunately not ported to Gephi 0.9 yet, but it allowed you to set Standard (hidden) Columns in the Node Table, from visible values in the nodes table. Thus if you have two columns of type Float or Decimal that represent your coordinates, you could set the coordinate values of your nodes.
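Either approach needs the coordinate columns to be identifiable by name after import. A small sketch, assuming the file is whitespace-separated with no header row as shown in the question, that rewrites it as comma-separated text with a header; the column names "attr", "x_pos" and "y_pos" are made up and can be anything you will recognise when choosing the columns.

```python
import csv
import io

# The sample rows from the question.
RAW = """0 21 34.00 34.00
1 23 35.00 25.00
2 25 45.00 65.00
"""


def to_gephi_csv(raw_text, header=("Id", "attr", "x_pos", "y_pos")):
    """Turn whitespace-separated rows into comma-separated text with a header row."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    for line in raw_text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(header):
            raise ValueError(f"expected {len(header)} fields, got {len(fields)}: {line!r}")
        writer.writerow(fields)
    return out.getvalue()


converted = to_gephi_csv(RAW)
print(converted)
```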