I'm working on a program that takes in some data from a MySQL database, changes some numbers, and then overwrites the old data in MySQL with the new values. (Specifically, the data I'm taking in is output from a weather forecasting model.) What I'm struggling with is replacing the old data in the database with the edited data.
In my program, the new data (solar radiation values) is the third column of the matrix WxData, so it can be accessed with WxData(:,3).
In the MySQL database, the values I want to change are under the column titled "radiation" in the table "wrf". "dbConn" is the name of the database connection.
I tried something like
update(dbConn, 'wrf', {'radiation'}, WxData(:,3), 'WHERE radiation > -1')
The update function in the MATLAB Database Toolbox requires a WHERE clause input, so I just put something that is always true. But this method doesn't work: it ends up changing every single radiation value in the database table to the same number (possibly the value at WxData(1,3)).
I've tried a couple other ways but nothing worked. How can I just replace the whole column of radiation values in the database with a new column? Seems like it should be simple.
SQL does not think in columns or vectors, so your method simply cannot work; it is not a syntax issue.
If you want to do this, assign IDs to the rows in SQL, import those IDs into MATLAB as well, and then put the ID in the WHERE clause and run the update once per row.
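For illustration, the per-row statements that loop would issue look something like this (a sketch, assuming a hypothetical integer primary key column id on wrf; the numeric values are placeholders standing in for WxData(i,3)):

-- Assumes a hypothetical primary key column `id` on wrf;
-- one UPDATE is issued per row of WxData.
UPDATE wrf SET radiation = 123.4 WHERE id = 1;
UPDATE wrf SET radiation = 118.9 WHERE id = 2;
-- ...and so on for each remaining row.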
I have a table that contains data from repeated experiments (for example, site A has one sample, and the lab processed the sample three times, obtaining slightly different values). I need to average these results in a separate table, but what I have read on the Microsoft support site is that a query that pulls data into another table with a calculated field is not possible in Access.
Can I query multiple data points from one table into a single calculated field in another table? Thank you.
UPDATE
I ended up doing a lot of manual adjustments of the file format to create a calculated field in the existing table that averages each site's data, so my problem is, for my current purposes, solved. However, I would still like to understand. Following up with you both, I think the problem was that I had repeated non-unique IDs between rows, when I probably should have made data columns with unique variable names so that I could query each variable name for an average.
So, instead of putting each site separately on the y-axis, I reformatted the table by putting the sample number for each site on the x-axis.
Using this second format, I was at least able to create a calculated field that produces an average value for each site.
Would there have been a way to write a query using the first layout? Luckily, my data set was not very large, so I could handle the reformat manually, but with thousands of data entries that would not have been feasible.
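Something like this grouped make-table query is what I imagine might have worked with the first layout (the table and column names samples, site, and result are hypothetical, not my actual schema - I'm not sure this is right, which is why I'm asking):

-- Access make-table query: write one averaged row per site into a new table.
SELECT site, AVG(result) AS avg_result
INTO site_averages
FROM samples
GROUP BY site;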
Also, here is the link to the page I mentioned originally: https://support.office.com/en-ie/article/add-a-calculated-field-to-a-table-14a60733-2580-48c2-b402-6de54fafbde3
Thanks all.
I am somewhat new to MS Access, and I have inherited an application with a table that uses the Lookup feature to replace a code with a value from a query against another table.
When I first used this table and exported it to Excel for analysis, I somehow got the base ID number (or whatever it would be called) rather than the translated lookup value. Now, when I do this, I get the translated text. The biggest problem is that while the base value is unique, the translated values are not, so I cannot use them for the work I am doing.
Can someone explain how to get the underlying ID value rather than the lookup value? Is there some setting I can use, or some way to reference the field upon which the lookup is based? When I query the ID field, I get the lookup value. I know that the first time I did this, the spreadsheet contained the ID number, not the text.
For now, I created a copy of the table and removed the lookup information from this copy, but I know I did not run into this when I did this the first time.
Thanks.
When you export to Excel, leave "Export data with formatting and layout" unchecked. This will create a spreadsheet with the raw data values in Lookup fields.
[Image: Access export settings dialog]
I have a table with a column for different fruit names - Apple, Orange, Banana etc. These fruit names can have duplicates.
Right now if I do a SQL Select, I get the names as it is. I want to change the data so that every "Apple" gets replaced with "Sweet Apple" and every "Orange" gets replaced with "Mandarin".
I know I can use the REPLACE function in my SQL queries. However, I don't want to (or can't) modify my SQL queries; I was trying to leave changing the SQL as a last resort, because it would have to be done in several different Node.js scripts.
I am wondering if there is some way, in the database itself, to make it return the altered data automatically: sort of like a filter/pipeline/constraint (I am not sure what to call it) that is set on a specific column of a table and automatically applies this replacement to any data queried from that table.
I would like an answer mainly for Postgres and MySQL, and if possible for SQL Server too.
No. The closest would be triggers on INSERT and UPDATE to replace the data as it comes in, but you cannot override data that is being queried without specifying it in the query. What you can do is create a view that returns the replaced strings.
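For example, a view along these lines works in Postgres, MySQL, and SQL Server (the table and column names fruits, id, and name are hypothetical), though your scripts would then have to select from the view instead of the table:

-- A view that maps the stored names to the display names on the way out.
CREATE VIEW fruits_renamed AS
SELECT id,
       CASE name
           WHEN 'Apple' THEN 'Sweet Apple'
           WHEN 'Orange' THEN 'Mandarin'
           ELSE name
       END AS name
FROM fruits;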
It has its drawbacks, but you could do this:
Rename the Fruit column to something else, then create a new computed column that does the REPLACE you want on the newly renamed column, and give this new column the name of the old column, so that all your existing queries hit the new column (see the sketch below).
The drawback is that any existing INSERT/UPDATE statements have to be changed to write to the new name of the old column.
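A rough sketch of that in SQL Server (the table fruits and the renamed column name_raw are hypothetical):

-- Rename the original column out of the way (SQL Server syntax).
EXEC sp_rename 'fruits.name', 'name_raw', 'COLUMN';
-- Add a computed column under the old name that applies the replacements.
ALTER TABLE fruits
    ADD name AS (REPLACE(REPLACE(name_raw, 'Apple', 'Sweet Apple'), 'Orange', 'Mandarin'));

MySQL and Postgres have generated columns that can do roughly the same thing, though the syntax differs.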
Well, the idea of a database is storing values, so when querying those values you have few options for modifying them on the way out. That kind of handling belongs in your programming language: whenever you output a value retrieved from the database, wrap it in a filter and return whatever you need instead of the stored value, e.g.:
// Map stored values to display values in application code.
public static string filter(string value)
{
    if (value == "Apple")
        return "Sweet Apple";
    if (value == "Orange")
        return "Mandarin";
    return value;
}
I am quite new to Pentaho Spoon, and I would like to import the records of a CSV file into a database table. However, only unique records should be imported. That is why I need to compare EACH record with all records already in the database table, in order to determine whether the record should be imported or not.
So far, I have tried the suggested CRUD pattern, which looks like this:
As you can see in the picture, I merge the Excel input and the table input (ignore the cast steps; I needed to cast a value because the float formats differed: the database format was #.000000 and the CSV float format was #.0).
After the merge join, I compare the flag (which is given by the Merge Rows (diff) step): if the compared records are new, I import them into the database table; if they are changed, I update the record; and if they are deleted or identical, I simply do nothing. So far, so good.
But here is the problem: if I shuffle the records of the CSV input file and run the transformation again, all the records are imported anew, and consequently there are duplicates in my database table (which I wanted to avoid). To emphasize again: the right way to solve this is for each row of the CSV input file to be compared with ALL entries in the database table.
How can I realize this? Any suggestions? Thank you so much in advance!!
The Merge Rows (diff) step expects its input to be sorted. Normally you would have been warned of this by a pop-up.
Put a Sort Rows step on the output flow of the Excel input, before it reaches the Merge Rows (diff).
You should do the same between the Table Input and the Merge Rows (diff). Of course, you may think you could do the sorting in the SQL statement of the Table Input.
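That would be the tempting shortcut: something like this in the Table Input (table and key names are hypothetical):

-- Sort on the database side instead of adding a Sort Rows step.
SELECT *
FROM target_table
ORDER BY key_field;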
However, there is a beginner trap here. You have three other steps (Output Rows, Update, and Delete) which operate on the same table, and these steps may lock it. As all the steps in Kettle run concurrently, you do not know which step will fire first, and the table may be locked and never able to return even the first record. This is known in jargon as an auto-lock, and the way to avoid it is to put a Sort Rows step in between as a buffer.
You can use the 'Dimension lookup/update' step, which provides the functionality you are trying to achieve.
Thanks,
Nilesh
I have a job in Talend that is designed to bring together some data from different databases: one is a MySQL database and the other an MSSQL database.
What I want to do is match a selection of loan numbers from the MySQL database (about 82,000 loan numbers) to the corresponding information we have housed in the MSSQL database.
However, the tables in MSSQL to which I am joining the MySQL data are much larger (~2 million rows) and quite wide, and thus cost much more time to query. Ideally I could perform an inner join between the two tables on the loan number, but since they are in different databases this is not possible. The inner join performed inside a tMap occurs only after the Lookup input has already returned its data set, which is quite large (especially since this particular MSSQL query executes a user-defined function for each loan number).
Is there any way to create a global variable out of the output from the MySQL query (namely, the loan numbers it selects) and use that global variable in an IN clause in the MSSQL query?
This should be possible. I'm not working in MySQL but I have something roughly equivalent here that I think you should be able to adapt to your needs.
I've never actually answered a Stack Overflow question before, and I don't have the reputation to post all the screenshots this needs, so I'm going to write it out in words here and post the whole thing, complete with illustrations, on my blog in case you need more info (quite likely, I should think!).
First, I've got some data coming out of the table and getting filtered by tFilterRow_1 to keep only the rows I'm interested in.
The next step is to limit it to just the field I want to use in the variable. I've used tMap_3 rather than a tFilterColumns because the field I'm using is a string and I wanted to concatenate single quotes around it; if you're using an integer you might not need to do that. And if you have a lot of repetition, you might also want to add a tUniqRow in there to save a lot of unnecessary work.
The next step is the one that does the magic. I've got a list like this:
'A1'
'A2'
'B1'
'B2'
etc., and I want to turn it into 'A1','A2','B1','B2' so I can slot it into my WHERE clause. For this, I've used tAggregateRow_1, selecting "list" as the aggregate function.
Next up, we want to take this list and put it into a context variable (I've already created the context variable in the metadata - you know how to do that, right?). Use another tMap component feeding into a tContextLoad component. tContextLoad always has two columns in its schema, so map the output of the tAggregateRow to the "value" column and enter the name of the variable in the "key" column. In this example, my context variable is called MyList.
Now your list is loaded as a text string and stored in the context variable, ready for retrieval. So open up a new database input and embed the variable in the SQL code like this:
"SELECT distinct MY_COLUMN
from MY_SECOND_TABLE where the_selected_row in ("+
context.MyList+")"
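With the example list above loaded into MyList, the query that actually reaches the server would be plain SQL like this:

SELECT DISTINCT MY_COLUMN
FROM MY_SECOND_TABLE
WHERE the_selected_row IN ('A1','A2','B1','B2')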
It should be as easy as that, and when I whipped it up it worked first time, but let me know if you have any trouble and I'll see what I can do.