I made a program that converts csv files into sdf files. Those files were supposed to go in another converter that turns them into something called a "Nist MS Library". The problem is that my file doesn't get accepted by the converter for "No spectra have been converted" and I don't understand why.
The files seem identical to me and I think I'm missing something about the specific file extension.
I'm really sorry if this doesn't belong here, I will delete the post if this is the case, but I really do not know where to ask.
I tried to make the "mass spectral peaks" integer, floats, delete them and put some values that I knew for sure that were accepted by the Nist converter, but nothing seems to work.
I will put 2 molecules, the first one is mine, the one that doesn't get accepted, the other one is the one that is fine for the program.
Coumarin
No Structure
0 0 0 0 0 0 0 0 0 0 0
> <NAME>
Coumarin
> <INCHIKEY>
> <FORMULA>
> <MW>
> <CASNO>
91645
> <ID>
2
> <COMMENT>
SAFC Cat. n. W526509\nColumn: SLB-5ms part#28471-U; Supelcowax-10 part#24079; Equity-1 part#28046-U;\nwww.sigmaaldrich.com |RI:1438|
> <SYNONYMS>
Coumarin
> <NUM PEAKS>
140
> <MASS SPECTRAL PEAKS>
39 1
39 233
40 38
40 0
40 0
41 0
41 1
41 5
42 2
42 2
43 35
43 0
43 12
44 26
45 20
45 67
46 7
46 4
46 2
46 1
47 0
48 4
49 32
49 3
50 5
50 166
51 183
52 20
53 35
54 7
55 1
56 0
58 0
59 2
59 23
60 6
60 11
61 89
62 213
63 503
64 164
64 13
65 15
65 7
66 9
68 4
71 0
71 1
72 2
73 1
73 19
74 0
74 32
75 39
76 14
77 11
77 1
78 0
78 0
79 0
79 9
79 0
80 2
80 1
81 0
81 0
81 1
82 0
82 0
83 0
83 0
84 6
84 0
85 17
86 2
86 2
87 28
88 7
89 523
90 581
91 25
91 48
92 40
92 36
93 4
93 5
94 1
94 0
97 2
98 6
98 0
99 2
99 0
100 0
100 1
101 5
102 3
103 0
103 0
103 0
104 0
105 0
105 0
106 0
106 0
106 0
107 0
108 0
109 0
110 0
110 0
111 0
111 0
112 0
112 0
113 0
116 0
117 6
117 0
118 1000
119 94
120 34
120 10
121 4
121 0
122 0
122 0
131 0
135 0
145 1
146 0
146 547
147 58
148 5
183 0
246 0
334 0
351 0
359 0
382 0
> <RI value>
1430.1
$$$$
ETHYL HYDROSULFIDE
(C) 2015 John Wiley & Sons, Inc.
CAS rn = 75081, Library ID = 1
3 2 0 0 0 0 0 0 0 0999 V2000
0.0000 0.2061 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
0.7146 -0.2061 0.0000 S 0 0 0 0 0 0 0 0 0 0 0 0
-0.7146 -0.2061 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
1 2 1 0 0 0 0
1 3 1 0 0 0 0
M END
> <NAME>
Ethyl hydrosulfide
> <SYNONYMS>
Ethanethiol
$:28DNJIEGIFACGWOD-UHFFFAOYSA-N
$:29n=703/0/1 p=620/0/1
> <FORMULA>
C2H6S
> <MW>
62
> <CASNO>
75081
> <ID>
1
> <COMMENT>
WileyID="LM_FFNSC3_1" RI1="703 (SLB-5MS (Hydro))" RI2="392 (SLB-5MS (FAMEs))" RI3="620 (Supelcowax-10 (FAMEs)" RI4="568 (Supelcowax-10 (FAEEs)" Contributor="Prof. L. Mondello (Chromaleont s.r.l./Univ. Messina, Italy)"
> <NUM PEAKS>
21
> <MASS SPECTRAL PEAKS>
44 20
45 235
46 147
47 727
48 20
49 32
50 2
51 2
52 2
53 2
54 2
55 2
56 16
57 84
58 115
59 84
60 16
61 155
62 999
63 36
64 44
$$$$
I have a table of wind directions and strengths over a 24 hour period, sample data at the bottom of this question.
only directions that have strengths are stored in the database, I'm currently using the following SQL:
SELECT winddirection, avespeed
FROM wp_weather_data
WHERE ID%10 = 0
what I would like to return is an entry for every degree (0 value for any degree not in the db) and where there are multiple entries for a given degree to only return the highest value. Oh, and they need to be in ascending order of degrees.
Is this possible?
This is so I can plot a wind distribution chart on a polar chat plugin in WordPress.
sample data returned from the above sql:
294 2
271 3
269 2
285 3
289 2
123 1
130 1
144 1
160 0
168 0
161 0
135 0
138 0
331 0
115 0
136 0
161 0
267 0
114 0
265 0
204 0
248 1
206 0
199 1
250 2
244 3
257 3
272 5
267 5
208 3
221 3
223 4
253 6
233 5
I'm trying to fine-tune tesseract 4.1.1 on my own specific data according to this guide. I want it to become able to detect and recognize text in boxes like that:
I have generated a number of images like that and corresponding to them .box files containing bounding boxes with text. To reproduce my issue here i'm going to show my pipeline using only one image. Here is the .box file for the image above:
0 1804 1659 1858 1813 0
5 1804 1659 1858 1813 0
9 1804 1659 1858 1813 0
9 1266 715 1334 1169 0
7 1266 715 1334 1169 0
8 1266 715 1334 1169 0
3 1266 715 1334 1169 0
6 1266 715 1334 1169 0
8 1266 715 1334 1169 0
0 1266 715 1334 1169 0
5 1266 715 1334 1169 0
3 1266 715 1334 1169 0
2 876 303 930 607 0
7 876 303 930 607 0
2 876 303 930 607 0
8 876 303 930 607 0
2 876 303 930 607 0
2 876 303 930 607 0
8 1671 120 1725 224 0
0 1671 120 1725 224 0
5 300 1278 354 1482 0
2 300 1278 354 1482 0
3 300 1278 354 1482 0
7 300 1278 354 1482 0
7 917 1451 975 1605 0
6 917 1451 975 1605 0
4 917 1451 975 1605 0
1 1058 1310 1132 1716 0
9 1058 1310 1132 1716 0
8 1058 1310 1132 1716 0
7 1058 1310 1132 1716 0
7 1058 1310 1132 1716 0
1 1058 1310 1132 1716 0
8 1058 1310 1132 1716 0
6 1058 1310 1132 1716 0
3 998 76 1070 382 0
4 998 76 1070 382 0
4 998 76 1070 382 0
8 998 76 1070 382 0
3 998 76 1070 382 0
6 998 76 1070 382 0
3 722 548 776 652 0
2 722 548 776 652 0
7 1782 1332 1838 1586 0
7 1782 1332 1838 1586 0
2 1782 1332 1838 1586 0
6 1782 1332 1838 1586 0
2 1782 1332 1838 1586 0
1 714 140 768 244 0
2 714 140 768 244 0
0 220 500 278 754 0
5 220 500 278 754 0
5 220 500 278 754 0
6 220 500 278 754 0
6 220 500 278 754 0
8 1676 1052 1742 1406 0
4 1676 1052 1742 1406 0
5 1676 1052 1742 1406 0
9 1676 1052 1742 1406 0
1 1676 1052 1742 1406 0
2 1676 1052 1742 1406 0
4 1676 1052 1742 1406 0
5 357 161 419 317 0
1 357 161 419 317 0
4 357 161 419 317 0
9 1424 848 1480 952 0
8 1424 848 1480 952 0
0 438 324 498 478 0
6 438 324 498 478 0
9 438 324 498 478 0
8 1503 1246 1559 1700 0
1 1503 1246 1559 1700 0
8 1503 1246 1559 1700 0
5 1503 1246 1559 1700 0
3 1503 1246 1559 1700 0
0 1503 1246 1559 1700 0
5 1503 1246 1559 1700 0
5 1503 1246 1559 1700 0
4 1503 1246 1559 1700 0
8 1553 477 1609 581 0
4 1553 477 1609 581 0
3 527 258 581 512 0
7 527 258 581 512 0
7 527 258 581 512 0
9 527 258 581 512 0
1 527 258 581 512 0
6 1665 1592 1727 1748 0
8 1665 1592 1727 1748 0
3 1665 1592 1727 1748 0
5 595 1362 651 1766 0
9 595 1362 651 1766 0
3 595 1362 651 1766 0
9 595 1362 651 1766 0
4 595 1362 651 1766 0
3 595 1362 651 1766 0
3 595 1362 651 1766 0
1 595 1362 651 1766 0
I have also converted the image into .tiff format and placed it in the same directory with .box file. Lets say we have 87.tiff and 87.box inside the directory.
Next i generate 87.lstmf file using
tesseract 87.tiff 87 lstm.train
Next i extract model using
combine_tessdata -e /usr/share/tesseract-ocr/4.00/tessdata/rus.traineddata lstm_model/rus.lstm
Next i create train.txt file containing the single line: 87.lstmf
Finally, i create bash script train.sh
/usr/bin/lstmtraining \
--model_output output/fine_tuned \
--continue_from lstm_model/rus.lstm \
--traineddata /usr/share/tesseract-ocr/4.00/tessdata/rus.traineddata \
--train_listfile train.txt \
--eval_listfile train.txt \
--max_iterations 400\
--debug_level -1
And when i run it, i have the following logs:
$ bash train.sh
Loaded file lstm_model/rus.lstm, unpacking...
Warning: LSTMTrainer deserialized an LSTMRecognizer!
Continuing from lstm_model/rus.lstm
Loaded 1/1 lines (1-1) of document 87.lstmf
Loaded 1/1 lines (1-1) of document 87.lstmf
Compute CTC targets failed!
Compute CTC targets failed!
Compute CTC targets failed!
Compute CTC targets failed!
The message "Compute CTC targets failed!" repeats infinitely until i interrupt the script.
What am i doing wrong? I'm also concerned about message "Loaded 1/1 lines (1-1)" since i have multiple bounding boxes on the image.
For lstm training on images you may need to have the .box files in the new lstm format (these can be generated by running tesseract with the lstmbox argument):
TrainingTesseract-4.00
so after each line of text mark it with a special line:
<tab> <left> <bottom> <right> <top> <page>
then run lstm.train .
I'm trying to solve a MySQL problem without going crazy. Not sure if it is feasible or not.
Data come from a door/light sensor to detect if toilet is occupied. When door is closed or opened, I get the info + light info. If I have info of closed door and light<10, I say that toilet is not occupied, if light>10, toilet is occupied, and if door is open, toilet is not occupied.
Here is an example of my data :
id wc_id door_open light time
138 0 1 64 2018-10-10 12:28:51
139 0 0 58 2018-10-10 12:34:00
140 0 0 54 2018-10-10 12:34:38
141 0 1 68 2018-10-10 12:35:11
142 0 1 3 2018-10-10 12:35:36
143 0 0 60 2018-10-10 12:37:56
144 0 0 60 2018-10-10 12:37:57
145 0 0 57 2018-10-10 12:38:30
146 0 1 65 2018-10-10 12:43:53
147 0 1 3 2018-10-10 12:44:17
148 0 0 63 2018-10-10 13:10:55
149 0 0 59 2018-10-10 13:11:16
150 0 1 71 2018-10-10 13:12:09
151 0 1 4 2018-10-10 13:12:14
152 0 1 1 2018-10-10 13:15:07
153 0 0 62 2018-10-10 13:17:18
154 0 0 58 2018-10-10 13:18:01
155 0 1 68 2018-10-10 13:19:20
156 0 1 3 2018-10-10 13:19:56
157 0 1 42 2018-10-10 13:26:41
158 0 0 63 2018-10-10 13:26:44
159 0 0 58 2018-10-10 13:27:39
160 0 1 71 2018-10-10 13:27:40
161 0 1 3 2018-10-10 13:28:37
The idea is at the end to have only a series of door_open to 0 to 1, it's not possible to have two 0 or two 1 consecutively.
So I need to keep first door_open=0 with light>10 following a door_open=1, and first door_open=1 after door_open=0, whatever light value.
Is it possible with MySQL? I use MariaDB 10.3.9.
Thanks for your ideas.
The output should be like that :
id wc_id door_open light time
139 0 0 58 12:34:00
141 0 1 68 12:35:11
143 0 0 60 12:37:56
146 0 1 65 12:43:53
148 0 0 63 13:10:55
150 0 1 71 13:12:09
153 0 0 62 13:17:18
155 0 1 68 13:19:20
158 0 0 63 13:26:44
160 0 1 71 13:27:40
(I simplified the time, it's not really important here)
Here is a fiddle
This query should do what you want. It uses a MySQL variable to delay the value of door_open by 1 row, and then returns rows where door_open=0 with light>10 following a door_open=1, and first door_open=1 after door_open=0, whatever light value:
SELECT events.*, #door_open := door_open
FROM events
JOIN (SELECT #door_open := 1) do
WHERE #door_open = 0 AND door_open = 1 OR
#door_open = 1 AND door_open = 0 AND light > 10
Output (from your fiddle data):
id toilet_id door_open light time #door_open := door_open
101 0 false 62 2018-10-10T11:39:31Z 0
103 0 true 69 2018-10-10T11:39:34Z 1
104 0 false 62 2018-10-10T11:42:16Z 0
106 0 true 68 2018-10-10T11:45:50Z 1
109 0 false 56 2018-10-10T12:13:11Z 0
Updated SQLFiddle
Here is the potential answer to my problem, after working on Nick solution. I had to reorder my table (after deleting rows) to avoid an order mess.
select es.id,
es.idNext,
es.toilet_id,
es.time,
es.nextTime,
timediff(es.nextTime, es.time) AS duration
from (
SELECT id, toilet_id, time,
#door_open := door_open as door_open,
lead(id, 1) OVER(ORDER BY id) idNext,
lead(time, 1) OVER(ORDER BY id) nextTime
FROM events e
JOIN (SELECT #door_open := 1) do
WHERE #door_open = 0 AND door_open = 1 OR
#door_open = 1 AND door_open = 0 AND light > 20
) es
where
es.door_open=0 and
timediff(es.nextTime, es.time)>5
Next thing is to update the query to use a partition over toilet_id to separate data from each id.
I have 1000+ customers. I require customer report.
Here debit = potato + onion + ginger. Credit is commission.Balance will be updated every time. It will be balance - debit and balance + credit alternatively.
Grocery data report is as: Data is filled through php form with mysql_fetch_array query. Here few customers are as sample. and few data fields.
id cus_id cus_name potato onion ginger debit credit balance
1 12 munna 10 25 28 63 0 37
2 16 anil 24 56 84 164 0 136
3 34 palash 17 47 51 115 0 85
4 45 dimpy 35 64 39 138 0 112
Table grocery before and after entering new data:
id cus_id cus_name potato onion ginger debit credit balance
1 12 munna 10 25 28 63 0 37
2 16 anil 24 56 84 164 0 136
3 34 palash 17 47 51 115 0 85
4 45 dimpy 35 64 39 138 0 112
5 12 munna 0 0 0 0 6 43
6 16 anil 0 0 0 0 16 152
7 34 palash 0 0 0 0 12 97
8 45 dimpy 0 0 0 0 14 126
My problem is :
I am unable to update balance column, cus_name wise and cus_id wise and insert all data into mysql database. Suggest me with mysql query.