Parse data from csv file into given format using Prolog - csv

I have the csv file that contains data as:
A B C
A - 4 5
B 8 - 6
C 2 3 -
I want to have facts in the following form:
num(a,b,4).
num(a,c,5).
num(b,a,8).
num(b,c,6).
num(c,a,2).
num(c,b,3).
There should not be facts for similar alphabets like num(a,a,-).
I am using prolog's csv_read_file as:
csv_read_file(Path, Rows, [functor(num), arity(4)]), maplist(assert, Rows).
and its giving me output as:
Rows = [num('', 'A', 'B', 'C'), num('A', -, 4, 5), num('B', 8, -, 6), num('C', 2, 3, -)]
It seems to be a basic question, but I am not able to think about condition to perform this. Any help will be highly appreciated.
As per Isabelle Newbie Answer:
Open :- csv_read_file('Path', Rows, [functor(num), arity(4)]), table_entry(Rows, Row).
header_row_entry(Header,Row,Entry):-
arg(1, Row, RowName),
functor(Header, _, Arity),
between(2,Arity,ArgIndex),
arg(ArgIndex, Header, ColumnName),
arg(ArgIndex, Row, Value),
Entry = num(RowName, ColumnName, Value),
writeln(Entry).
table_entry(Entries, Entry):-
Entries = [Header | Rows],
member(Row, Rows),
header_row_entry(Header, Row, Entry).
Now, can anyone explain how and where I should use maplist to convert the rows in form of facts (neglect filtering of '-' and lowercase for now) so that when I query:
?-num(A,B,X).
I should get:
X=4
Next task is, I want to implement depth first search algorithm on it. Any details regarding this will be highly appreciated.

Consider a table header num('', 'A', 'B', 'C') and a row in the table num('B', 8, -, 6). From this you want to compute a table entry identified by the row's name, which here is 'B', and by a column name: the column name being 'A' for the first value (8), 'B' for the second (-), 'C' for the third (6).
Here's a simple way to do this, involving some typing and the obligatory copy-and-paste errors:
header_row_entry(Header, Row, Entry) :-
Header = num('', ColumnName, _, _),
Row = num(RowName, Value, _, _),
Entry = num(RowName, ColumnName, Value).
header_row_entry(Header, Row, Entry) :-
Header = num('', _, ColumnName, _),
Row = num(RowName, _, Value, _),
Entry = num(RowName, ColumnName, Value).
header_row_entry(Header, Row, Entry) :-
Header = num('', _, _, ColumnName),
Row = num(RowName, _, _, Value),
Entry = num(RowName, ColumnName, Value).
This enumerates all the entries in a row on backtracking:
?- Header = num('', 'A', 'B', 'C'), Row = num('B', 8, -, 6),
header_row_entry(Header, Row, Entry).
Header = num('', 'A', 'B', 'C'),
Row = num('B', 8, -, 6),
Entry = num('B', 'A', 8) ;
Header = num('', 'A', 'B', 'C'),
Row = num('B', 8, -, 6),
Entry = num('B', 'B', -) ;
Header = num('', 'A', 'B', 'C'),
Row = num('B', 8, -, 6),
Entry = num('B', 'C', 6).
To enumerate all the entries in an entire table, it remains to enumerate all rows, then enumerate row entries as above. Here this is:
table_entry(Entries, Entry) :-
Entries = [Header | Rows],
member(Row, Rows),
header_row_entry(Header, Row, Entry).
And now, given your table:
?- Table = [num('', 'A', 'B', 'C'), num('A', -, 4, 5), num('B', 8, -, 6), num('C', 2, 3, -)], table_entry(Table, Entry).
Table = [num('', 'A', 'B', 'C'), num('A', -, 4, 5), num('B', 8, -, 6), num('C', 2, 3, -)],
Entry = num('A', 'A', -) ;
Table = [num('', 'A', 'B', 'C'), num('A', -, 4, 5), num('B', 8, -, 6), num('C', 2, 3, -)],
Entry = num('A', 'B', 4) ;
Table = [num('', 'A', 'B', 'C'), num('A', -, 4, 5), num('B', 8, -, 6), num('C', 2, 3, -)],
Entry = num('A', 'C', 5) ;
Table = [num('', 'A', 'B', 'C'), num('A', -, 4, 5), num('B', 8, -, 6), num('C', 2, 3, -)],
Entry = num('B', 'A', 8) ;
Table = [num('', 'A', 'B', 'C'), num('A', -, 4, 5), num('B', 8, -, 6), num('C', 2, 3, -)],
Entry = num('B', 'B', -) . % etc.
Depending on what you want exactly, it remains to lowercase the row and column names (the irritatingly named downcase_atom in SWI-Prolog, for example) and filter out the - entries. You can then assert the entries using a failure-driven loop or by collecting all of them using findall and asserting using maplist.
Now that we have a working solution, we might want header_row_entry to be a bit nicer. We can use arg/3 to capture more explicitly that we are trying to pair a column name and a value that are at the same argument position in their respective header and row terms:
header_row_entry(Header, Row, Entry) :-
arg(1, Row, RowName),
functor(Header, _, Arity),
between(2, Arity, ArgIndex),
arg(ArgIndex, Header, ColumnName),
arg(ArgIndex, Row, Value),
Entry = num(RowName, ColumnName, Value).
This is shorter than the above and applicable to any number of columns in the table.

Related

Musician wants to know the programming term for this function

My apologies that I don't use always use programming terms in my description - I am a musician who has only dabbled in programming.
Suppose I have a list of numbers named a:
a = (0, 2, 4, 5, 7, 9, 11, 12)
and from the list a the following list of numbers is randomly generated to create list b:
b = (4, 7, 5, 7, 9, 11, 9)
Then I want to have "transpositions" (this is a musical term) of list b, such as those shown in lists c, d, and e:
c = (5, 9, 7, 9, 11, 12, 11) or d = (2, 5, 4, 5, 7, 9, 11, 9) or e = (0, 4, 2, 4, 5, 7, 9, 7)
What do you call this "transposition" of this list of numbers in programming terms, and what bit of programming code would accomplish this task? My best guess is that it has to do with the indexing of the numbers in list a, and when list b is created the indexes of list b are put in a list such as list f:
f = (2, 4, 3, 4, 5, 6, 5)
so that the "transposition" is accomplished by adding or subtracting a specific number from each number in list f. For example, the numbers in list c are generated by adding 1 to each of the numbers in list f:
(3, 5, 4, 5, 6, 7, 6)
the numbers in list d are generated by subtracting 1 to each of the numbers in list f:
(1, 3, 2, 3, 4, 5, 4)
and the numbers in list e are generated by subtracting 2 to each of the index numbers taken from list f:
(0, 2, 1, 2, 3, 4, 3)
Or if anyone has a better idea, please let me know.
An operation like "add 1 to every member of this list, to generate a new list" is usually called a map.

How to fill the missing date in a sql table

I have a sql table related to discontinuous dates:
CREATE TABLE IF NOT EXISTS date_test1 ( items CHAR ( 8 ), trade_date date );
INSERT INTO `date_test1` VALUES ( 'a', '2020-03-20');
INSERT INTO `date_test1` VALUES ( 'b', '2020-03-20');
INSERT INTO `date_test1` VALUES ('a', '2020-03-21');
INSERT INTO `date_test1` VALUES ( 'c', '2020-03-22');
INSERT INTO `date_test1` VALUES ( 'd', '2020-03-22');
INSERT INTO `date_test1` VALUES ('a', '2020-03-25');
INSERT INTO `date_test1` VALUES ( 'e', '2020-03-26');
In this table, '2020-03-23' and '2020-03-24' are missed. I want to fill them by their previous data, in this table, the '2020-03-22' data.
Expected result:
The number of continues missing dates and of the records in one day are both uncertain.
So how to do this in mysql?
This solution uses Python and assumes that there aren't so many rows that they cannot be read into memory. I do not warrant this code free from defects; use at your own risk. So I suggest you run this against a copy of your table or make a backup first.
This code uses the pymysql driver.
import pymysql
from datetime import date, timedelta
from itertools import groupby
import sys
conn = pymysql.connect(db='x', user='x', password='x', charset='utf8mb4', use_unicode=True)
cursor = conn.cursor()
# must be sorted by date:
cursor.execute('select items, trade_date from date_test1 order by trade_date, items')
rows = cursor.fetchall() # tuples: (datetime.date, str)
if len(rows) == 0:
sys.exit(0)
groups = []
for k, g in groupby(rows, key=lambda row: row[1]):
groups.append(list(g))
one_day = timedelta(days=1)
previous_group = groups.pop(0)
next_date = previous_group[0][1]
for group in groups:
next_date = next_date + one_day
while group[0][1] != next_date:
# missing date
for tuple in previous_group:
cursor.execute('insert into date_test1(items, trade_date) values(%s, %s)', (tuple[0], next_date))
print('inserting', tuple[0], next_date)
conn.commit()
next_date = next_date + one_day
previous_group = group
Prints:
inserting c 2020-03-23
inserting d 2020-03-23
inserting c 2020-03-24
inserting d 2020-03-24
Discussion
With your sample data, after the rows are fetched, rows is:
(('a', datetime.date(2020, 3, 20)), ('b', datetime.date(2020, 3, 20)), ('a', datetime.date(2020, 3, 21)), ('c', datetime.date(2020, 3, 22)), ('d', datetime.date(2020, 3, 22)), ('a', datetime.date(2020, 3, 25)), ('e', datetime.date(2020, 3, 26)))
After the following is run:
groups = []
for k, g in groupby(rows, key=lambda row: row[1]):
groups.append(list(g))
groups is:
[[('a', datetime.date(2020, 3, 20)), ('b', datetime.date(2020, 3, 20))], [('a', datetime.date(2020, 3, 21))], [('c', datetime.date(2020, 3, 22)), ('d', datetime.date(2020, 3, 22))], [('a', datetime.date(2020, 3, 25))], [('e', datetime.date(2020, 3, 26))]]
That is, all the tuples with the same date are grouped together in a list so it becomes to easier to detect missing dates.

single table reccuring relatives

Can't name it less confusing, sorry...
Imagine the DB table with 3 columns:
object_id - some entity,
relation_key - some property of the object,
bundle_id - we must generalize different objects with this id.
Table has unique key for [object_id, relation_key]:
single object can't have duplicated relation_key,
but different objects can have equal relation_key
Some oxygen understanding with the picture:
Plenty objects can have deep relations by relation_key, all this objects will be related with bundle_id
How can I update bundle_id column with correct values using just single query?
I can write procedure but this way is unsuitable for me.
I look for statement like:
"UPDATE example [join example ON ...] SET bundle_id = ... WHERE ..."
there is "before" schema for mysql:
CREATE TABLE `example` (
`bundle_id` INT(11) DEFAULT NULL,
`object_id` INT(11) NOT NULL,
`relation_key` INT(11) NOT NULL,
PRIMARY KEY (`object_id`,`relation_key`)
);
INSERT INTO `example`(`object_id`, `relation_key`)
VALUES (1, 4), (1, 5), (1, 6), (2, 6), (2, 7), (2, 8), (3, 4),
(3, 9), (3, 10), (4, 11), (4, 12), (4, 13), (5, 14), (5, 15), (5, 16), (6, 17), (6, 11), (6, 18);
Here is the example "before": fiddle example (sqlfiddle stuck for this moment)
And "after" will look like like if you do the queries :
UPDATE `example` SET `bundle_id` = 1 WHERE `object_id` IN (1, 2, 3);
UPDATE `example` SET `bundle_id` = 2 WHERE `object_id` IN (4, 6);
UPDATE `example` SET `bundle_id` = 3 WHERE `object_id` IN (5);
object1 related to object2 by key=6,
object3 related TO object1 by key=4,
so ... objs 1, 2, 3 are related together.
here must be first bundle_id=1.
there is no other keys linking another objects to 1, 2, 3
object_id=4 related to object_id=6 by key=11
so ... obj [4, 6] are related together.
here must be second bundle_id=2,
there is no other keys linking another objects to 4, 6
object_id=5 has no relations to other objects
all object's key belong to itself.
here must be second bundle_id=3,
there is no other keys linking another objects to 5

slice scala LIst by elements value

there is List like this
List(List(9, 1, 9),
List(9, 1, 8),
List(8, 2, 7),
List(8, 2, 6),
List(7, 3, 5),
List(7, 3, 4))
i want to slice list by second element, like this
List(List(List(9, 1, 9), List(9, 1, 8)), //second elem is "1"
List(List(8, 2, 7), List(8,2,6)), //second elem is "2"
List(List(7, 3, 5), List(7,3,4)), //second elem is "3"
I can do it by using verbose for loop, but it's too verbose.
I also try to use span, but I fail to get second element.
what is the best way?
val l = List(List(9, 1, 9),
List(9, 1, 8),
List(8, 2, 7),
List(8, 2, 6),
List(7, 3, 5),
List(7, 3, 4))
l.groupBy( {case first::second::tail => second} ).values.toList

Compare two strings according to defined values in a query

I have to compare two string fields containing letters but not alphabetically.
I want to compare them according to this order :
"J" "L" "M" "N" "P" "Q" "R" "S" "T" "H" "V" "W" "Y" "Z"
So if I compare H with T, H will be greater than T (unlike alphabetically)
And if I test if a value is greater than 'H' (> 'H') I will get all the entries containing the values ("V" "W" "Y" "Z") (again, unlike alphabetical order)
How can I achieve this in one SQL query?
Thanks
SELECT *
FROM yourtable
WHERE
FIELD(col, 'J', 'L', 'M', 'N', 'P', 'Q', 'R', 'S', 'T', 'H', 'V', 'W', 'Y', 'Z') >
FIELD('H', 'J', 'L', 'M', 'N', 'P', 'Q', 'R', 'S', 'T', 'H', 'V', 'W', 'Y', 'Z')
^ your value
Or also:
SELECT *
FROM yourtable
WHERE
LOCATE(col, 'JLMNPQRSTHVWYZ')>
LOCATE('H', 'JLMNPQRSTHVWYZ')
Please see fiddle here.
You can do
SELECT ... FROM ... ORDER BY yourletterfield='J' DESC, yourletterfield='L' DESC, yourletterfield='M' DESC, ...
The equality operator will evaluate to "1" when it's true, "0" when false, so this should give you the desired order.
There's actually a FIELD() function that will make this a bit less verbose. See this article for details.
SELECT ... FROM ... ORDER BY FIELD(yourletterfield, 'J', 'L', 'M', ...)