I have a CSV file that look something like below: i.e. not in Prolog format
james,facebook,intel,samsung
rebecca,intel,samsung,facebook
Ian,samsung,facebook,intel
I am trying to write a Prolog predicate that reads the file and returns a list that looks like
[[james,facebook,intel,samsung],[rebecca,intel,samsung,facebook],[Ian,samsung,facebook,intel]]
to be used further in other predicates.
I am still a beginner and have found some good information from SO and modified them to see if I can get it but I`m stuck because I only generate a list that looks like this
[[(james,facebook,intel,samsung)],[(rebecca,intel,samsung,facebook)],[(Ian,samsung,facebook,intel)]]
which means when I call the head of the inner lists I get (james,facebook,intel,samsung) and not james.
Here is the code being used :- (seen on SO and modified)
stream_representations(Input,Lines) :-
read_line_to_codes(Input,Line),
( Line == end_of_file
-> Lines = []
; atom_codes(FinalLine, Line),
term_to_atom(LineTerm,FinalLine),
Lines = [[LineTerm] | FurtherLines],
stream_representations(Input,FurtherLines)
).
main(Lines) :-
open('file.txt', read, Input),
stream_representations(Input, Lines),
close(Input).
The problem lies with term_to_atom(LineTerm,FinalLine).
First we read a line of the CSV file into a list of character codes in
read_line_to_codes(Input,Line).
Let's simulate input with atom_codes/2:
?- atom_codes('james,facebook,intel,samsung',Line).
Line = [106, 97, 109, 101, 115, 44, 102, 97, 99|...].
Then we recompose the original atom read in into FinalLine (this seems wasteful, there must be a way to hoover up a line into an atom directly)
?- atom_codes('james,facebook,intel,samsung',Line),
atom_codes(FinalLine, Line).
Line = [106, 97, 109, 101, 115, 44, 102, 97, 99|...],
FinalLine = 'james,facebook,intel,samsung'.
The we try to map this atom in FinalLine into a term, LineTerm, using term_to_atom/2
?- atom_codes('james,facebook,intel,samsung',Line),
atom_codes(FinalLine, Line),
term_to_atom(LineTerm,FinalLine).
Line = [106, 97, 109, 101, 115, 44, 102, 97, 99|...],
FinalLine = 'james,facebook,intel,samsung',
LineTerm = (james, facebook, intel, samsung).
You see the problem here: LineTerm is not quite a list, but a nested term using the functor , to separate elements:
?- atom_codes('james,facebook,intel,samsung',Line),
atom_codes(FinalLine, Line),
term_to_atom(LineTerm,FinalLine),
write_canonical(LineTerm).
','(james,','(facebook,','(intel,samsung)))
Line = [106, 97, 109, 101, 115, 44, 102, 97, 99|...],
FinalLine = 'james,facebook,intel,samsung',
LineTerm = (james, facebook, intel, samsung).
This ','(james,','(facebook,','(intel,samsung))) term will thus also be in the final result, just written differently: (james,facebook,intel,samsung) and packed into a list:
[(james,facebook,intel,samsung)]
You do not want this term, you want a list. You could use atomic_list_concat/2 to create a new atom that can be read as a list:
?- atom_codes('james,facebook,intel,samsung',Line),
atom_codes(FinalLine, Line),
atomic_list_concat(['[',FinalLine,']'],ListyAtom),
term_to_atom(LineTerm,ListyAtom),
LineTerm = [V1,V2,V3,V4].
Line = [106, 97, 109, 101, 115, 44, 102, 97, 99|...],
FinalLine = 'james,facebook,intel,samsung',
ListyAtom = '[james,facebook,intel,samsung]',
LineTerm = [james, facebook, intel, samsung],
V1 = james,
V2 = facebook,
V3 = intel,
V4 = samsung.
But that's rather barbaric.
We must do this whole processing in fewer steps:
Read a line of comma-separated strings on input.
Transform this into a list of either atoms or strings directly.
DCGs seem like the correct solution. Maybe someone can add a two-liner.
I have this SQL that started to run very slow on production server ( >20s ). When I run it locally on my laptop on the same DB it runs under 1s.
I know that SQL is not optimal but I tried to run EXPLAIN but I don't quite understand the result.
SELECT ST.id,
ST.lot_id,
ST.warehouse_location_id,
WL.name,
P.id AS product_id,
P.sku,
P.code,
P.NAME AS product_name,
P.has_serial,
Sum(ST.quantity) AS qty,
Group_concat(ST.id) AS stock_transaction_ids
FROM stock_transactions ST LEFT JOIN products P
ON (P.id = ST.product_id)
LEFT JOIN warehouse_locations WL
ON (WL.id = ST.warehouse_location_id)
LEFT JOIN
(
SELECT stock_transaction_id,
Group_concat(serial_id) AS serials
FROM stock_transactions_serials
GROUP BY stock_transaction_id ) STS
ON STS.stock_transaction_id = ST.id
WHERE (
ST.document_type = "Goods transfer" AND
ST.document_id IN (806, 807, 808, 809, 810, 811, 812, 824, 827, 831, 835, 838, 839, 841, 868, 869, 870, 871, 872, 873, 874, 875, 876, 877, 878, 879, 880, 881, 882, 883, 884, 885, 886, 887, 888, 889, 890, 944, 945, 946, 954, 955, 956, 957, 961, 965, 966, 967, 2240, 2241, 2242, 2243, 2244, 2245, 2246, 2247, 2248, 2249, 2250, 2251, 2252, 2253, 2254, 2255, 2256, 2257, 2258, 2259, 2260, 2261, 2262, 2263, 2264, 2265, 2266, 2267, 2268, 2269, 2270, 2271, 2272, 2273, 2274, 2275, 2276, 2277, 2278, 2279, 2280, 2281, 2282, 2283, 2284, 2285, 2286, 2287, 2288, 2289, 2290, 2291, 2292, 2293, 2294, 2295, 2296, 2297, 2298, 2299, 2300, 2301, 2302, 2303, 2304, 2305, 2306, 2307, 2308, 2309, 2310, 2311, 2312, 2313, 2314, 2315, 2316, 2317, 2318, 2319, 2320, 2321, 2322, 2323, 2324, 2325, 2326, 2327, 2328, 2329, 2330, 2331, 2332, 2333, 2334, 2335, 2337, 2338, 2339, 2340, 2341, 2342, 2343, 2344, 2345, 2346, 2347, 2348, 2349, 2350, 2351, 2352, 2353, 2354, 2355, 2356, 2357, 2358, 2359, 2360, 2361, 2362, 2363, 2364, 2365, 2366, 2367, 2368, 2369, 2370, 2371, 2372, 2373, 2374, 2375, 2376, 2377, 2378, 2379, 2380, 2381, 2382, 2383, 2384, 2385, 2386, 2387, 2388, 2389, 2390, 2391, 2392, 2393, 2394, 2395, 2396, 2397, 2398, 2399, 2400, 2401, 2402, 2403, 2404, 2405, 2406, 2407, 2408, 2409, 2410, 2411, 2412, 2413, 2414, 2415, 2416, 2417, 2418, 2419, 2420, 2421, 2422, 2423, 2424, 2425, 2426, 2427, 2428, 2429, 2430, 2431, 2432, 2437, 2438, 2439, 2440, 2441, 2442, 2443, 2444, 2445, 2446, 2447, 2448, 2449, 2450, 2451, 2452, 2453, 2454, 2455, 2456, 2457, 2458, 2459, 2460, 2461, 2462, 2463, 2464, 2465, 2466, 2467, 2468, 2469, 2470, 2471, 2472, 2473, 2474, 2475, 2476, 2477, 2478, 2479, 2480, 2481, 2482, 2483, 2484, 2485, 2486, 2487, 2488, 2489, 2490, 2491, 2492, 2493, 2494, 2495, 2496, 2497, 2498, 2499, 2500, 2501, 2502, 2503, 2504, 2505, 2506, 2507, 2508, 2509, 2510, 2511, 2512, 2513, 2514, 2515, 2516, 2517, 2518, 2519, 2520, 2521, 2522, 2523, 2524, 2525, 2526, 2527, 2528, 2529, 2530, 2531, 2532, 2534, 2535, 2536, 2537, 2538, 2539, 2540, 2541, 2542, 2543, 2544, 2545, 2546, 2547, 2548, 2549, 2550, 2551, 2552, 2553, 2554, 2555, 2556, 2557, 2558, 2559, 2560, 2561, 2562, 2563, 2564, 2565, 2566, 2567, 2568, 2569, 2570, 2571, 2572, 2573, 2574, 2575, 2576, 2577, 2578, 2579, 2580, 2581, 2583, 2584, 2585, 2586, 2587, 2588, 2589, 2590, 2591, 2592, 2593, 2594, 2595, 2596, 2597, 2598, 2599, 2600, 2601, 2602, 2603, 2604, 2605, 2606, 2607, 2608, 2609, 2610, 2611, 2612, 2613, 2614, 2615, 2616, 2617, 2618, 2619, 2620, 2621, 2622, 2623, 2624, 2625, 2626, 2627, 2628, 2629, 2630, 3707, 3708, 10194, 10244, 10246, 10247, 10248, 10249, 10426, 10428, 27602, 27603, 28587, 28588, 28589, 28590, 28591, 28592, 28593, 28594, 28595, 28596, 28597, 28598, 28599, 28600, 28601, 28602, 28603, 28604, 28605, 28606, 28607, 28608, 28609, 28610, 28611, 28612, 28613, 28614, 28615, 28616, 28617, 28618, 28619, 28620, 28621, 28622, 28623, 28624, 28625, 28626, 28627, 28628, 28629, 28630, 28631, 28632, 28633, 28634, 28635, 28636, 28637, 28638, 28639, 28640, 28641, 28642, 28643) )
GROUP BY ST.product_id,
ST.lot_id,
ST.warehouse_location_id
HAVING qty > 0
Explain gets me this:
Can someone help understand what's going on? I will see if I can come back with SQL fiddle as well.
It looks like the problem is your derived table:
(
SELECT stock_transaction_id,
Group_concat(serial_id) AS serials
FROM stock_transactions_serials
GROUP BY stock_transaction_id ) STS
ON STS.stock_transaction_id = ST.id
I can't see you using serials anywhere else in the query, so I'm guessing this is not doing very much for you, but the cost is high - the left join isn't using an index when joining. I'd suggest rewriting that as a join on stock_transaction_serials, rather than a derived table.
you have
ON STS.stock_transaction_id = ST.id
WHERE (
ST.document_type = "Goods transfer" AND
ST.document_id IN (806, 807, 808, 809, 810, 811, 812, -- etc
If these values are in goods_transfer_items in a field called id then this would be exactly functionally the same:
ON STS.stock_transaction_id = ST.id
JOIN goods_transfer_items ON goods_transfer_items.id = ST.document_id
AND goods_transfer_items.work_order_number=12345'
AND ST.document_type = "Goods transfer"
Not only is this much shorter but it will also be much faster because SQL servers can optimize joins.
This is functionally IDENTICAL to the NOT IN
prior answer below
You should take the values listed here
ST.document_id IN (806, 807, 808, 809, 810, 811, 812, 824, 827, 831, 835, 838, 839, 841, 868, 869, 870, 871, 872, 873, 874, 875, 876, 877, 878, 879, 880, 881, 882, 883, 884, 885, 886, 887, 888, 889, 890, 944, 945, 946, 954, 955, 956, 957, 961, 965, 966, 967, 2240, 2241, 2242, 2243, 2244, 2245, 2246, 2247, 2248, 2249, 2250, 2251, 2252, 2253, 2254, 2255, 2256, 2257, 2258, 2259, 2260, 2261, 2262, 2263, 2264, 2265, 2266, 2267, 2268, 2269, 2270, 2271, 2272, 2273, 2274, 2275, 2276, 2277, 2278, 2279, 2280, 2281, 2282, 2283, 2284, 2285, 2286, 2287, 2288, 2289, 2290, 2291, 2292, 2293, 2294, 2295, 2296, 2297, 2298, 2299, 2300, 2301, 2302, 2303, 2304, 2305, 2306, 2307, 2308, 2309, 2310, 2311, 2312, 2313, 2314, 2315, 2316, 2317, 2318, 2319, 2320, 2321, 2322, 2323, 2324, 2325, 2326, 2327, 2328, 2329, 2330, 2331, 2332, 2333, 2334, 2335, 2337, 2338, 2339, 2340, 2341, 2342, 2343, 2344, 2345, 2346, 2347, 2348, 2349, 2350, 2351, 2352, 2353, 2354, 2355, 2356, 2357, 2358, 2359, 2360, 2361, 2362, 2363, 2364, 2365, 2366, 2367, 2368, 2369, 2370, 2371, 2372, 2373, 2374, 2375, 2376, 2377, 2378, 2379, 2380, 2381, 2382, 2383, 2384, 2385, 2386, 2387, 2388, 2389, 2390, 2391, 2392, 2393, 2394, 2395, 2396, 2397, 2398, 2399, 2400, 2401, 2402, 2403, 2404, 2405, 2406, 2407, 2408, 2409, 2410, 2411, 2412, 2413, 2414, 2415, 2416, 2417, 2418, 2419, 2420, 2421, 2422, 2423, 2424, 2425, 2426, 2427, 2428, 2429, 2430, 2431, 2432, 2437, 2438, 2439, 2440, 2441, 2442, 2443, 2444, 2445, 2446, 2447, 2448, 2449, 2450, 2451, 2452, 2453, 2454, 2455, 2456, 2457, 2458, 2459, 2460, 2461, 2462, 2463, 2464, 2465, 2466, 2467, 2468, 2469, 2470, 2471, 2472, 2473, 2474, 2475, 2476, 2477, 2478, 2479, 2480, 2481, 2482, 2483, 2484, 2485, 2486, 2487, 2488, 2489, 2490, 2491, 2492, 2493, 2494, 2495, 2496, 2497, 2498, 2499, 2500, 2501, 2502, 2503, 2504, 2505, 2506, 2507, 2508, 2509, 2510, 2511, 2512, 2513, 2514, 2515, 2516, 2517, 2518, 2519, 2520, 2521, 2522, 2523, 2524, 2525, 2526, 2527, 2528, 2529, 2530, 2531, 2532, 2534, 2535, 2536, 2537, 2538, 2539, 2540, 2541, 2542, 2543, 2544, 2545, 2546, 2547, 2548, 2549, 2550, 2551, 2552, 2553, 2554, 2555, 2556, 2557, 2558, 2559, 2560, 2561, 2562, 2563, 2564, 2565, 2566, 2567, 2568, 2569, 2570, 2571, 2572, 2573, 2574, 2575, 2576, 2577, 2578, 2579, 2580, 2581, 2583, 2584, 2585, 2586, 2587, 2588, 2589, 2590, 2591, 2592, 2593, 2594, 2595, 2596, 2597, 2598, 2599, 2600, 2601, 2602, 2603, 2604, 2605, 2606, 2607, 2608, 2609, 2610, 2611, 2612, 2613, 2614, 2615, 2616, 2617, 2618, 2619, 2620, 2621, 2622, 2623, 2624, 2625, 2626, 2627, 2628, 2629, 2630, 3707, 3708, 10194, 10244, 10246, 10247, 10248, 10249, 10426, 10428, 27602, 27603, 28587, 28588, 28589, 28590, 28591, 28592, 28593, 28594, 28595, 28596, 28597, 28598, 28599, 28600, 28601, 28602, 28603, 28604, 28605, 28606, 28607, 28608, 28609, 28610, 28611, 28612, 28613, 28614, 28615, 28616, 28617, 28618, 28619, 28620, 28621, 28622, 28623, 28624, 28625, 28626, 28627, 28628, 28629, 28630, 28631, 28632, 28633, 28634, 28635, 28636, 28637, 28638, 28639, 28640, 28641, 28642, 28643) )
Put them in a table, create an index on it and then inner join to it instead of using IN
I have an old vb6 program which queries an access 2000 database. I have a fairly long query which looks something like this:
Select * from table where key in ( 0, 1, 2, 3, 4, 5, 6, 7, 11, 12, 13, 14, 15, 19, 20, 21, 24, 27, 29, 30, 35, 38, 39, 40, 42, 43, 44, 46, 47, 49, 50, 53, 56, 59, 60, 61, 63, 64, 65, 66, 67, 68, 72, 76, 80, 84, 86, 89, 90, 91, 93, 94, 98, 99, 10041, 10042, 10045, 10046, 10047, 10049, 10057, 10060, 10089, 32200, 32202, 32203, 32204, 32205, 32207, 32214, 32245, 32303, 32314, 32403, 32405, 32414, 32415, 32503, 32703, 32803, 32903, 33003, 33014, 33102, 33103, 33303, 33403, 33405, 33601, 33603, 33604, 33614, 33705, 33714, 33901, 33903, 33914, 34001, 34105, 34114, 34203, 34303, 34401, 34501, 34601, 34603, 34604, 34605, 34803, 41001, 41005, 41007, 41013, 42001, 42005, 42007, 42013, 43001, 43002, 44001, 44007, 46001, 46007, 99999, 9999999)
However, when I look at the RecordSource of the data object, it seems that the query is being truncated to this (which is obviously not syntactically valid and throws an error):
Select * from table where key in ( 0, 1, 2, 3, 4, 5, 6, 7, 11, 12, 13, 14, 15, 19, 20, 21, 24, 27, 29, 30, 35, 38, 39, 40, 42, 43, 44, 46, 47, 49, 50, 53, 56, 59, 60, 61, 63, 64, 65, 66, 67, 68, 72, 76, 80, 84, 86, 89, 90, 91, 93, 94, 98, 99, 100
My data source looks like this:
Begin VB.Data dtaList
Caption = "dtaList"
Connect = "Access 2000;"
DatabaseName = ""
DefaultCursorType= 0 'DefaultCursor
DefaultType = 2 'UseODBC
Exclusive = 0 'False
Height = 345
Left = 960
Options = 0
ReadOnly = 0 'False
RecordsetType = 1 'Dynaset
RecordSource = ""
Top = 4440
Visible = 0 'False
Width = 2295
End
I've tried running the full query in the access database itself which works fine.
Is this a limitation in the VB.Data object, or is there some other explanation? Is there any way I can get around this issue?
Unfortunately I am unable to upgrade to a newer version of access.
The truncated version of the SQL statement you posted is 246 characters long, so it appears that something along the line is limiting the length of the SQL string to somewhere around 255 characters. As you have discovered by pasting the query into Access itself, the actual size limit of an Access query string is much larger (around 64,000 characters, I believe).
I remember running across a similar issue years ago but my problem was an INSERT statement that was writing some rather long strings to the database. The workaround in that case was to use a parameter query (which I realize, in hindsight, that I should have been using anyway). It greatly shortened the length of the SQL string because the parameters were passed separately. Unfortunately that workaround probably wouldn't help you because even if you dynamically created a parameterized version of the query it wouldn't be all that much shorter than the current SQL string.
Another workaround would be to write all of those numbers for the IN clause as rows in a temporary table named something like [inValues], and then use the query
SELECT [table].*
FROM
[table]
INNER JOIN
[inValues]
ON [table].[key] = [inValues].[key]