Display Cell in a Table Form - Octave

I am wondering if it is possible to display a cell array in table form while using Octave. It would make it easier for me to read the information in the cell. When it is displayed as a single column, I have to scroll down all the time.

The default display of a cell array looks kinda complicated:
octave> a = {"foo", "bar", "qux"; "baz", "nof", "zot"}
a =
{
[1,1] = foo
[2,1] = baz
[1,2] = bar
[2,2] = nof
[1,3] = qux
[2,3] = zot
}
It would be much nicer to see something such as:
octave> a = {"foo", "bar", "qux"; "baz", "nof", "zot"}
a(1:2, 1:3) =
{
"foo" "bar" "qux"
"baz" "nof" "zot"
}
However, a cell array can have anything inside each of its cells. This includes other cell arrays, structs, and very long lines with linebreaks. Because of this, a sensible display of a cell array is very dependent on its contents and on what the user is interested in taking from them.
What this means is that it is up to the user to create a function that displays what he wants. I will provide an example that is useful for the case above, which is the most common case, i.e., a 2-dimensional cell array of short one-line strings. The solution is to create a format for printf on the fly from the longest string in each column, like so:
octave> a = {"foobar", "bar", "qux"; "baz", "nofnot", "zotr"};
octave> col_width = max (cellfun (@numel, a)) + 4
col_width =
   10   10    8
octave> row_template = [sprintf("%%%is", col_width) "\n"]
row_template = %10s%10s%8s
octave> printf (row_template, a'{:})
    foobar       bar     qux
       baz    nofnot    zotr
Or on a single line:
octave> printf ([sprintf("%%%is", max (cellfun (@numel, a)) + 4) "\n"], a'{:})
    foobar       bar     qux
       baz    nofnot    zotr
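If you need this often, the two steps above can be wrapped into a small helper function (a sketch; print_cell is just an illustrative name, not a built-in):
function print_cell (a)
  # Build a printf template from the longest string in each column,
  # then print the cell row by row (assumes a 2-D cell of one-line strings).
  col_width = max (cellfun (@numel, a)) + 4;
  row_template = [sprintf("%%%is", col_width) "\n"];
  printf (row_template, a'{:});
endfunction
Calling print_cell (a) then prints the aligned rows shown above.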
Larger cell arrays with a more complex but organized structure can instead make use of Octave's dataframe package. For example:
octave> pkg load dataframe
octave> C = {"Patient Name", "Volume", "Quality", "Owner"; "Joe", 200, .95, "MR"; "Dana", 186, .93, "MR"; "Cassidy", 197, .96, "SP"};
octave> dataframe (C)
ans = dataframe with 3 rows and 4 columns
 _1  Patient_Name  Volume  Quality  Owner
 Nr          char  double   double   char
  1           Joe     200  0.95000     MR
  2          Dana     186  0.93000     MR
  3       Cassidy     197  0.96000     SP

You can use the Tablicious package to display information in tabular form.
Installation:
pkg install https://github.com/apjanke/octave-tablicious/releases/download/v0.3.5/tablicious-0.3.5.tar.gz;
To answer the question:
you can use the cell2table function from the Tablicious package to convert a cell array to a table.
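For example, a cell array C like the one displayed below could have been created like this (a sketch, since the original post only shows its display):
C = {1, "Adam", 23, "M"; 2, "Jane", 34, "F"; 3, "Amit", 22, "M"; 4, "Leena", 45, "F"; 5, "Desh", 100, "M"}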
C =
{
[1,1] = 1
[2,1] = 2
[3,1] = 3
[4,1] = 4
[5,1] = 5
[1,2] = Adam
[2,2] = Jane
[3,2] = Amit
[4,2] = Leena
[5,2] = Desh
[1,3] = 23
[2,3] = 34
[3,3] = 22
[4,3] = 45
[5,3] = 100
[1,4] = M
[2,4] = F
[3,4] = M
[4,4] = F
[5,4] = M
}
out = cell2table(C)
prettyprint(out)
Output:
out =
table: 5 rows x 4 variables
VariableNames: Var1, Var2, Var3, Var4
------------------------------
| Var1 | Var2  | Var3 | Var4 |
------------------------------
| 1    | Adam  | 23   | M    |
| 2    | Jane  | 34   | F    |
| 3    | Amit  | 22   | M    |
| 4    | Leena | 45   | F    |
| 5    | Desh  | 100  | M    |
------------------------------
Link to the official documentation:
https://tablicious.janklab.net/user-guide/tablicious.pdf

Related

How to import CSV file into Octave and keep the column headers

I am trying to import a CSV file so that I can use it with the k-means clustering algorithm. The file contains 6 columns and over 400 rows. Here is a picture of the Excel document I used (before exporting it to a CSV file). In essence, I want to be able to use the column header names in my code, so that I can refer to the columns by name when plotting the data as well as when clustering it.
I looked into some other documentation and came up with this code, but nothing came out as output when I put it into the command window:
[Player BA OPS RBI OBP] = CSVIMPORT( 'MLBdata.csv', 'columns', {'Player', 'BA', 'OPS', 'RBI', 'OBP'}
The only thing that has worked for me so far is the dlmread function, but it returns 0 when there is a string of words:
N = dlmread('MLBdata.csv')
Given file data.csv with the following contents:
Player,Year,BA,OPS,RBI,OBP
SandyAlcantara,2019,0.086,0.22,4,0.117
PeteAlonso,2019,0.26,0.941,120,0.358
BrandonLowe,2019,0.27,0.85,51,0.336
MikeSoroka,2019,0.077,0.22,3,0.143
Open an Octave terminal and type:
pkg load io
C = csv2cell( 'data.csv' )
resulting in the following cell array:
C =
{
[1,1] = Player
[2,1] = SandyAlcantara
[3,1] = PeteAlonso
[4,1] = BrandonLowe
[5,1] = MikeSoroka
[1,2] = Year
[2,2] = 2019
[3,2] = 2019
[4,2] = 2019
[5,2] = 2019
[1,3] = BA
[2,3] = 0.086000
[3,3] = 0.2600
[4,3] = 0.2700
[5,3] = 0.077000
[1,4] = OPS
[2,4] = 0.2200
[3,4] = 0.9410
[4,4] = 0.8500
[5,4] = 0.2200
[1,5] = RBI
[2,5] = 4
[3,5] = 120
[4,5] = 51
[5,5] = 3
[1,6] = OBP
[2,6] = 0.1170
[3,6] = 0.3580
[4,6] = 0.3360
[5,6] = 0.1430
}
From there on, you can collect that data into arrays or structs as you like and continue working (a struct-based sketch follows the tablicious example below). One nice option is Andrew Janke's 'tablicious' package:
octave:13> pkg load tablicious
octave:14> T = cell2table( C(2:end,:), 'VariableNames', C(1,:) );
octave:15> prettyprint(T)
-------------------------------------------------------
| Player         | Year | BA    | OPS   | RBI | OBP   |
-------------------------------------------------------
| SandyAlcantara | 2019 | 0.086 | 0.22  | 4   | 0.117 |
| PeteAlonso     | 2019 | 0.26  | 0.941 | 120 | 0.358 |
| BrandonLowe    | 2019 | 0.27  | 0.85  | 51  | 0.336 |
| MikeSoroka     | 2019 | 0.077 | 0.22  | 3   | 0.143 |
-------------------------------------------------------
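And if you prefer to stay with plain arrays or structs rather than a table, here is a minimal sketch of the "collect that data into arrays or structs" route mentioned above (field names are taken from the header row; this is only an illustration, not part of the original answer):
S = struct ();
for k = 1:columns (C)
  col = C(2:end, k);
  if all (cellfun (@isnumeric, col))
    S.(C{1, k}) = cell2mat (col);   # numeric column -> column vector
  else
    S.(C{1, k}) = col;              # text column -> cell array of strings
  endif
endfor
S.RBI is then the column vector [4; 120; 51; 3] and S.Player the cell array of player names.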

Combining dataframes and adding values of common elements

I have multiple data sets like this
data set 1
index | name | val
    1 | a    |   1
    2 | b    |   0
    3 | c    |   3
data set 2
index | name | val
    1 | g    |   4
    2 | a    |   2
    3 | k    |   3
    4 | l    |   2
I want to combine these data sets in such a way that if both data sets have a row with a common name (in this example, "a"), the combined data set has only a single row for it, where the value is the sum of the two values; in this case the combined row for "a" would have a val of 3 (2+1). The index numbers of the elements do not matter. Is there an effective way to do this in Excel itself? I'm new to querying data, but I'm trying to learn. If I can do this in pandas (I'm trying to make myself familiar with it) or SQL, I will do so. My data sets are of different sizes.
Use:
df3 = df1.groupby('name').sum().add(df2.groupby('name').sum(), fill_value=0).reset_index()
# The ' val' column (note the leading space) presumably comes from a stray space in one
# input frame's header; the next two lines fold it into 'val' and then drop it.
df3['val'] = df3.fillna(0)[' val'] + df3.fillna(0)['val']
df3 = df3.drop([' val'], axis=1)
print(df3)
Output:
  name  index  val
0    a    3.0  3.0
1    b    2.0  0.0
2    c    3.0  3.0
3    g    1.0  4.0
4    k    3.0  3.0
5    l    4.0  2.0
In SQL you can try the query below:
select name,sum(val)
from
(select index,name,val from dataset1
union all
select index,name,val from dataset2) tmp
group by name
In Pandas:
df3=pd.concat([df1,df2],ignore_index=True)
df3.groupby(['name']).sum()
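For completeness, here is a self-contained sketch of the concat-and-groupby approach using the sample data from the question (the 'index' column is left out, since the question says it does not matter):
import pandas as pd

# Sample data from the question.
df1 = pd.DataFrame({'name': ['a', 'b', 'c'], 'val': [1, 0, 3]})
df2 = pd.DataFrame({'name': ['g', 'a', 'k', 'l'], 'val': [4, 2, 3, 2]})

# Stack both frames, then sum val per name.
df3 = pd.concat([df1, df2], ignore_index=True).groupby('name', as_index=False).sum()
print(df3)
#   name  val
# 0    a    3
# 1    b    0
# 2    c    3
# 3    g    4
# 4    k    3
# 5    l    2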

How to pass a cell reference to an Apps Script custom function?

Assuming that:
A1 = 3
B1 = customFunc(A1) // will be 3
In my custom function:
function customFunc(v) {
return v;
}
v will be 3. But I want to access the cell object A1.
The following was transcribed from a comment.
Input:
+---+---+
| | A |
+---+---+
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 4 | 4 |
+---+---+
I want to copy A1:A4 to B1:C2 using a custom function.
Desired result:
+---+---+---+---+
| | A | B | C |
+---+---+---+---+
| 1 | 1 | 1 | 2 |
| 2 | 2 | 3 | 4 |
| 3 | 3 | | |
| 4 | 4 | | |
+---+---+---+---+
To achieve the desired result of splitting an input list into multiple rows, you can try the following approach.
function customFunc(value) {
if (!Array.isArray(value)) {
return value;
}
// Filter input that is more than a single column or single row.
if (value.length > 1 && value[0].length > 1) {
throw "Must provide a single value, column or row as input";
}
var result;
if (value.length == 1) {
// Extract single row from 2D array.
result = value[0];
} else {
// Extract single column from 2D array.
result = value.map(function (x) {
return x[0];
});
}
// Return the extracted list split in half between two rows.
return [
result.slice(0, Math.round(result.length/2)),
result.slice(Math.round(result.length/2))
];
}
Note that it doesn't require working with cell references. It purely deals with manipulating the input 2D array and returning a transformed 2D array.
Using the function produces the following results:
A1:A4 is hardcoded, B1 contains =customFunc(A1:A4)
+---+---+---+---+
| | A | B | C |
+---+---+---+---+
| 1 | a | a | b |
| 2 | b | c | d |
| 3 | c | | |
| 4 | d | | |
+---+---+---+---+
A1:D1 is hardcoded, A2 contains =customFunc(A1:D1)
+---+---+---+---+---+
| | A | B | C | D |
+---+---+---+---+---+
| 1 | a | b | c | d |
| 2 | a | b | | |
| 3 | c | d | | |
+---+---+---+---+---+
A1:B2 is hardcoded, C1 contains =customFunc(A1:B2), and the error message is "Must provide a single value, column or row as input"
+---+---+---+---------+
| | A | B | C |
+---+---+---+---------+
| 1 | a | c | #ERROR! |
| 2 | b | d | |
+---+---+---+---------+
This approach can be built upon to perform more complicated transformations by processing more arguments (e.g. number of rows to split into, number of items per row, split into rows instead of columns, etc.) or perhaps by analyzing the values themselves.
Below is a quick example of performing arbitrary transformations by creating a function that takes a function as an argument.
This approach has the following limitations though:
you can't specify a function in a cell formula, so you'd need to create wrapper functions to call from cell formulas (see the wrapper sketch after the log output below)
this performs a uniform transformation across all of the cell values
The function:
/**
 * @param {Object|Object[][]} value The cell value(s).
 * @param {function=} opt_transform An optional function used to transform the values.
 * @returns {Object|Object[][]} The transformed values.
 */
function customFunc(value, opt_transform) {
var transform = opt_transform || function(x) { return x; };
if (!Array.isArray(value)) {
return transform(value);
}
// Filter input that is more than a single column or single row.
if (value.length > 1 && value[0].length > 1) {
throw "Must provide a single value, column or row as input";
}
var result;
if (value.length == 1) {
// Extract single row from 2D array.
result = value[0].map(transform);
} else {
// Extract single column from 2D array.
result = value.map(function (x) {
return transform(x[0]);
});
}
// Return the extracted list split in half between two rows.
return [
result.slice(0, Math.round(result.length/2)),
result.slice(Math.round(result.length/2))
];
}
And a quick test:
function test_customFunc() {
// Single cell.
Logger.log(customFunc(2, function(x) { return x * 2; }));
// Row of values.
Logger.log(customFunc([[1, 2, 3 ,4]], function(x) { return x * 2; }));
// Column of values.
Logger.log(customFunc([[1], [2], [3], [4]], function(x) { return x * 2; }));
}
Which logs the following output:
[18-06-25 10:46:50:160 PDT] 4.0
[18-06-25 10:46:50:161 PDT] [[2.0, 4.0], [6.0, 8.0]]
[18-06-25 10:46:50:161 PDT] [[2.0, 4.0], [6.0, 8.0]]
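Since a cell formula cannot pass a function value, one way around the first limitation above is a small wrapper per transformation (a sketch; DOUBLE_AND_SPLIT is just an illustrative name, not part of the answer's code):
// Wrapper so the doubling transform can be used from a sheet,
// e.g. =DOUBLE_AND_SPLIT(A1:A4) entered in a cell.
function DOUBLE_AND_SPLIT(value) {
  return customFunc(value, function (x) { return x * 2; });
}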

Google Sheets script - find matching tuples and copy data between sheets

I have 2 sheets in the same spreadsheet, call them sheet1 and sheet2. In each sheet, every row describes some hardware component and its properties. The point of sheet2 is to eventually replace the outdated sheet1.
Simple example, (real sheets are hundreds of lines long):
sheet1:
componentId | prop1 | prop2 | prop3 | isvalid
---------------------------------------------
1 | x1 | y1 | z1 | yes
2 | x1 | y2 | z3 | yes
3 | x2 | y1 | z1 | yes
sheet2:
componentId | quantity | prop1 | prop2 | prop3 | prop4 | isvalid
----------------------------------------------------------------
15 | 4 | x1 | y1 | z1 | w1 | TBD
23 | 25 | x3 | y3 | z2 | w1 | TBD
33 | 3 | x1 | y2 | z3 | w2 | TBD
The final column "isValid" in sheet1 has been manually populated. What I would like to do is write a script that iterates through sheet1, producing a tuple of the property values, and then looks for matching property value tuples in sheet2. If there is a match, I would like to copy the "isValid" field from sheet1 to the "isValid" field in sheet2.
What I have so far is the following, but I am experiencing an error, "The coordinates or dimensions of the range are invalid" - see the comment in the code below showing where the error occurs. And the entire thing feels really hacky. Was hoping someone could maybe point me in a better direction? Thanks in advance.
function arraysEqual(a, b) {
if (a === b) return true;
if (a == null || b == null) return false;
if (a.length != b.length) return false;
for (var i = 0; i < a.length; ++i) {
if (a[i] !== b[i]) return false;
}
return true;
}
function copySheetBasedOnRowTuples(){
var ss = SpreadsheetApp.getActiveSpreadsheet();
var sheet1 = ss.getSheetByName('sheet 1 name');
var sheet2 = ss.getSheetByName('sheet 2 name');
s2data = sheet2.getDataRange().getValues()
s1data = sheet1.getDataRange().getValues()
for( i in s1data ){
sheet1Tuple = [ s1data[i][1], s1data[i][2], s1data[i][3] ]
// Now go through sheet2 looking for this tuple,
// and if we find it, copy the data in sheet1 column 4
// to sheet2 column 6 for the rows that matched (i and j)
for ( j in s2data){
sheet2Tuple = [ s2data[j][2], s2data[j][3], s2data[j][4] ]
if ( arraysEqual(sheet1Tuple, sheet2Tuple) ){
// ERROR HAPPENS HERE
sheet2.getRange(j, 6).setValue( sheet1.getRange( i, 4 ).getValue() )
}
}
}
}
The reason for the error is the difference in starting index between arrays and ranges. Array indices start from 0, while the row and column of getRange() start from 1. So how about this modification?
From :
sheet2.getRange(j, 6).setValue( sheet1.getRange( i, 4 ).getValue() )
To :
sheet2.getRange(j+1, 7).setValue( sheet1.getRange( i+1, 5 ).getValue() )
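In other words (a small illustration of the offset, not part of the original answer):
var values = sheet.getDataRange().getValues();
// values[0][0] holds the contents of cell A1, i.e. sheet.getRange(1, 1),
// so in general values[i][j] corresponds to sheet.getRange(i + 1, j + 1).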
If this was not useful for you, please tell me. I would like to modify it.

Query XY array pair for y value at arbitrary x in SQL

I'd like to make a database of products. Each product have characteristics described as an array of x values and corresponding y values.
And I'd like to query products for certain characteristics.
Example product data:
ProductA_x = [10, 20, 30, 40, 50]
ProductA_y = [2, 10, 30, 43, 49]
ProductB_x = [11, 22, 33, 44, 55, 66]
ProductB_y = [13, 20, 42, 35, 28, 21]
Now I'd like to get a list of products where y < 35 at x = 31.
In the example data case, I should get ProductA.
If I use MySQL, what would be a good way to define table(s) to
achieve this query at SQL level?
Would it become easier if I could use PostgreSQL? (Use
Array or JSON type??)
One way I was advised was to make a table to specify xy pairs for x range. First data is for range x[0] to x[1], next data is for x[1] to x[2]. Something like this.
| ProductID | x1 | x2 | y1 | y2 |
| --------- | -- | -- | -- | -- |
| 1 | 10 | 20 | 2 | 10 |
| 1 | 20 | 30 | 10 | 30 |
| 1 | 30 | 40 | 30 | 43 |
| 1 | 40 | 50 | 43 | 49 |
| 2 | 11 | 22 | 13 | 20 |
| 2 | 22 | 33 | 20 | 42 |
| 2 | 33 | 44 | 42 | 35 |
| 2 | 44 | 55 | 35 | 28 |
| 2 | 55 | 66 | 28 | 21 |
Then I could query for (x1 <= 31 AND 31 < x2) AND (y1 < 35 OR y2 < 35)
This solution is not too bad but I wonder if there is cleverer approach.
Please note that x array is guaranteed to be incremental but different product would have different starting x value, step size and number of points. And x value to be searched for may not exist as exact value in x array.
The length of real x and y arrays would be about 2000. I expect I'd have about 10,000 products.
It would be best if corresponding y value can be interpolated but searching y value at nearest x value is acceptable.
Since every X corresponds to exactly one Y, the sane table definition on a classic relational database would be:
CREATE TABLE product (id serial not null unique, sku text primary key, ....);
CREATE TABLE product_xy (product_id int not null references product(id),
x int not null,
y int not null,
primary key(product_id, x));
That would make your query manageable in all cases.
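For example, the query from the question ("y < 35 at x = 31") could then be answered with a nearest-x lookup along these lines (a sketch using window functions, so PostgreSQL or MySQL 8+; interpolation would instead need a join of the two neighbouring rows):
-- For each product, take the row whose x is closest to 31,
-- then keep only the products whose y there is below 35.
SELECT product_id, x, y
FROM (
  SELECT product_id, x, y,
         ROW_NUMBER() OVER (PARTITION BY product_id ORDER BY ABS(x - 31)) AS rn
  FROM product_xy
) t
WHERE rn = 1
  AND y < 35;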
On PostgreSQL 9.3 you could use a LATERAL subquery to effectively use arrays but I don't think it would be easier than just going with a relational design to start with. The only case where you would want to store the info in an array in PostgreSQL is if ordinality mattered on the x array. Then the design becomes slightly more complex because the following array combinations are not semantically the same:
array[1, 2, 3] x
array[4, 5, 6] y
and
array[2, 1, 3] x
array[5, 4, 6] y
If those need to be distinct, then go with an array-based solution in PostgreSQL (note that in both cases the same x value corresponds with the same y value, but the ordering of the pairs differs). Otherwise, go with a standard relational design. If you have to go with arrays, then your better option is to have a 2-dimensional xy array, something like:
array[
array[1, 2, 3],
array[4, 5, 6]
] xy
You could then have functions which process these pairs on the array as a whole, but the point is that in this case the xy array represents a single atomic value in a specific domain, where ordinality matters in both dimensions and therefore the value can be processed at once. In other words, if ordinality matters in both dimensions, then you have a single value in your domain, and so this does not violate first normal form. If ordinality along either dimension does not matter, then it does violate first normal form.