I need to implement a gulp task where the stream splits into multiple parallel branches.
gulp.src("./src/index.coffee")
|
|
V
.pipe(coffee())
|
...
|
V
_____________________________________________
| | . . . | |
| | . . . | |
V V V V
.pipe(replace("a", "b")) .pipe(replace("a", "c")) .pipe(replace("a", "y"))
| | | |
| | | |
V V V V
.pipe(gulp.dest("./b")) .pipe(gulp.dest("./c")) .pipe(gulp.dest("./y"))
Does anyone have ideas on how to do this?
I'm currently trying to build a MySQL (version 8) query to get a list of filters with an article count. I know I can use Elasticsearch to achieve the desired result, but the requirement is to use MySQL.
Data
DB Fiddle
Query
SELECT sf.name, sff.title, sff.key, COUNT(DISTINCT sfa.id) AS articles_count
FROM shop_filters AS sf
INNER JOIN shop_filter_facets AS sff ON sff.filter_id = sf.id
LEFT JOIN shop_facetables AS sfa ON (
sfa.facet_id = sff.id AND sfa.facetable_id IN (
SELECT sfa.facetable_id
FROM shop_filter_facets AS sff
INNER JOIN shop_facetables AS sfa ON sfa.facet_id = sff.id
INNER JOIN shop_filters AS sf ON sf.id = sff.filter_id
GROUP BY sfa.facetable_id
HAVING (
sf.name = 'filter_1' AND MAX(sff.key = 1252884110) = 1
OR MAX(sff.key = 1741157870) = 1
)
)
)
GROUP BY sf.name, sff.title, sff.key
Output
As you can see, the other filter_1 items have a count of 0. They should display a count higher than zero. What am I missing in the query above?
Expected output
An example of how the faceted search should behave:
You just needed to change an AND to an OR in the HAVING clause:
SELECT sf.name, sff.title, sff.key, COUNT(DISTINCT sfa.id) AS articles_count
FROM shop_filters AS sf
INNER JOIN shop_filter_facets AS sff ON sff.filter_id = sf.id
LEFT JOIN shop_facetables AS sfa ON (
sfa.facet_id = sff.id AND sfa.facetable_id IN (
SELECT sfa.facetable_id
FROM shop_filter_facets AS sff
INNER JOIN shop_facetables AS sfa ON sfa.facet_id = sff.id
INNER JOIN shop_filters AS sf ON sf.id = sff.filter_id
GROUP BY sfa.facetable_id
HAVING (
sf.name = 'filter_1' OR MAX(sff.key = 1252884110) = 1
OR MAX(sff.key = 1741157870) = 1
)
)
)
GROUP BY sf.name, sff.title, sff.key;
The Results:
| name | title | key | articles_count |
| -------- | --------- | ---------- | -------------- |
| filter_1 | Facet 1-A | 1741157870 | 5 |
| filter_1 | Facet 1-B | 9401707597 | 4 |
| filter_1 | Facet 1-C | 8395537669 | 27 |
| filter_1 | Facet 1-D | 1252884110 | 18 |
| filter_1 | Facet 1-E | 885500301 | 1 |
| filter_2 | Facet 2-A | 5454540233 | 4 |
| filter_2 | Facet 2-B | 2418516648 | 3 |
| filter_2 | Facet 2-C | 2808696733 | 4 |
| filter_2 | Facet 2-D | 8692535611 | 5 |
| filter_2 | Facet 2-E | 6389292333 | 0 |
| filter_2 | Facet 2-F | 5107586138 | 4 |
| filter_2 | Facet 2-G | 9464620325 | 3 |
| filter_2 | Facet 2-H | 1166556565 | 0 |
| filter_2 | Facet 2-I | 2739765054 | 0 |
| filter_3 | Facet 3-A | 1112385648 | 23 |
| filter_4 | Facet 4-A | 2883255908 | 2 |
| filter_4 | Facet 4-B | 1507996583 | 3 |
| filter_4 | Facet 4-C | 7632658109 | 3 |
| filter_4 | Facet 4-D | 2990697496 | 2 |
| filter_5 | Facet 5-A | 2051629771 | 16 |
| filter_5 | Facet 5-B | 6620949318 | 6 |
| filter_5 | Facet 5-C | 8962757449 | 2 |
| filter_5 | Facet 5-D | 2020077129 | 2 |
View on DB Fiddle
How would I be able to retrieve a full-tree from the current structure, or refactor the current table structure to allow for an optimized recursive query?
Issue
Unable to retrieve full-tree of components from base component without iteration.
A single component can have an undefined number of connections (depth).
Components do not have a parent property, as each component can be related to multiple components.
Unable to recursively update affected attribute values of a component.
For example, if the price of a component changes, the price should be updated for every component that has it among its components_of.
Current Structure
component
primary key (id)
| id | price |
|----|------ |
| A | 1 |
| B | 1 |
| C | 1 |
| D | 2 |
| E | 2 |
component_closure
unique index (component, component_of)
index (component_of)
FK (component) References component (id)
FK (component_of) References component (id)
| component | component_of |
|-----------|--------------|
| D | B |
| D | C |
| B | A |
| E | C |
| E | A |
Resulting Graph Model:
Example query:
UPDATE component
SET price = 2
WHERE id = 'A';
Desired Result (* indicates recursively updated values)
| id | price |
|----|------ |
| A | 2 |
| B | 2 | *
| C | 1 |
| D | 3 | *
| E | 3 | *
I am thinking I would need to store the entire tree relationship in the component_closure table, so that I would be able to retrieve the component_of relationships of all components and use a depth column to determine the order of components. Though that seems wasteful when the full-tree is not needed, such as for immediate components_of.
For example:
| component | component_of | depth |
|-----------|--------------|-------|
| D | B | 1 |
| D | A | 2 |
| D | C | 1 |
Yes, if you want to store the transitive closure, you need to store all paths.
For some operations, it's even helpful to store the path of length 0:
| component | component_of | depth |
|-----------|--------------|-------|
| D | D | 0 |
| D | B | 1 |
| D | A | 2 |
| C | C | 0 |
| B | B | 0 |
| B | A | 1 |
| A | A | 0 |
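With the full closure stored, retrieving the complete tree under a component becomes a single query rather than an iteration. A minimal sketch against the table above (nothing beyond what is described there):

SELECT component_of, depth
FROM component_closure
WHERE component = 'D'
  AND depth > 0          -- skip the length-0 self-path
ORDER BY depth;          -- immediate parts first, deepest parts last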
In MySQL 8.0, none of this will be needed. We'll finally be able to use recursive queries.
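A sketch of what that might look like, assuming component_closure holds only the direct edges as in the current structure:

-- MySQL 8.0+: walk the graph from component 'D' without a precomputed closure
WITH RECURSIVE tree (component, component_of, depth) AS (
    SELECT component, component_of, 1
    FROM component_closure
    WHERE component = 'D'
  UNION ALL
    SELECT t.component, cc.component_of, t.depth + 1
    FROM tree AS t
    INNER JOIN component_closure AS cc ON cc.component = t.component_of
)
SELECT component_of, depth
FROM tree
ORDER BY depth;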
This question already has an answer here:
MySQL pivot row into dynamic number of columns
https://i.stack.imgur.com/5Pw2L.png
I have a problem that I would like to solve with pre-processing in MySQL. I have a SELECT that returns multiple rows and columns per id. I would like to transform the rows into columns for the same id, grouping them as in the attached figure. The column names are not important to me because I only need the values for each id. The source data looks like this:
+---+---+---+---+-----+---+
| 1 | a | b | c | ... | x |
+---+---+---+---+-----+---+
| 1 | d | e | f | ... | y |
+---+---+---+---+-----+---+
| 2 | g | h | i | ... | z |
+---+---+---+---+-----+---+
| 2 | j | k | l | ... | q |
+---+---+---+---+-----+---+
| 3 | m | n | o | ... | w |
+---+---+---+---+-----+---+
| 3 | p | q | r | ... | t |
+---+---+---+---+-----+---+
The desired result:
+---+---+---+---+-----+---+---+---+---+-----+---+
| 1 | a | b | c | ... | x | d | e | f | ... | y |
+---+---+---+---+-----+---+---+---+---+-----+---+
| 2 | g | h | i | ... | z | j | k | l | ... | q |
+---+---+---+---+-----+---+---+---+---+-----+---+
| 3 | m | n | o | ... | w | p | q | r | ... | t |
+---+---+---+---+-----+---+---+---+---+-----+---+
Unfortunately, there is no way to create columns like that on the fly if you have a variable number of id repetitions in a table.
You could use GROUP_CONCAT to collapse the values of each column into one comma-separated column:
Select Id, Group_Concat(Col1) As Col1,
Group_Concat(Col2) As Col2,
Group_Concat(Col3) As Col3, ...
Group_Concat(Coln) As Coln
From table
Group By Id
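One caveat: without an explicit ORDER BY inside GROUP_CONCAT, MySQL does not guarantee the order of the concatenated values, so values coming from the same source row may not stay aligned across the generated columns. If there is a sortable column, you can order each aggregate by it; a sketch where your_table and seq are placeholder names:

SELECT id,
       GROUP_CONCAT(col1 ORDER BY seq) AS col1,
       GROUP_CONCAT(col2 ORDER BY seq) AS col2,
       GROUP_CONCAT(col3 ORDER BY seq) AS col3
FROM your_table
GROUP BY id;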
I'm just wondering if anyone can tell me if I'm on the right track here.
I have a database that contains temperature values tracked by a device and sorted by datetime. My goal is to create a reporting chart (ex. line chart) via ChartJS. Now the thing is that this table contains thousands of rows and I've never worked with this much data before.
I'm thinking of prompting for a date range and using a date query similar to the one here. I would then return it as a JsonResult and have ChartJS make use of it. Is this good enough?
Below are the results from some naive tests I ran (code included too) with 1 chart and 1 dataset on IE11. You'd have to run your own tests specific to the type of chart you are using by adjusting each of the chart options available (read ahead before you start :-)).
Returning a subset of the data will definitely have a positive impact, but whether the improvement is noticeable enough to justify the compromise is very subjective and will need actual measurement (if warranted) to figure out.
When you are considering end-to-end performance, there is no alternative to instrumentation, tuning, and more instrumentation in a production-like environment, and of course micro-optimization is the root of all evil (and many missed coffee breaks). A short (and by no means complete) list of other factors to consider would be serialization / deserialization performance, network time, server configuration (compression, etc.), and so on.
The tests below are for the client side on a desktop, and they measure time only. If mobile is a target environment, you'll definitely want to run tests in that environment to look at memory / CPU usage as well.
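On the "return a subset" point: one common way to shrink thousands of readings before they ever reach the chart is to aggregate them on the database side, e.g. one averaged value per hour over the selected date range. A sketch assuming a MySQL backend; the readings table and its recorded_at / temperature columns are placeholder names:

SELECT DATE_FORMAT(recorded_at, '%Y-%m-%d %H:00:00') AS bucket,
       AVG(temperature) AS avg_temperature
FROM readings
WHERE recorded_at >= '2017-01-01'      -- date range supplied by the user prompt
  AND recorded_at <  '2017-02-01'
GROUP BY bucket
ORDER BY bucket;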
Stupidly Simple Test
var POINTS = 5000;
var ANIMATION = false;
var BEZIERCURVE = false;
var SCALEOVERRIDE = false;
var ITERATIONS = 10;
// data generation
var data = [];
var labels = [];
for (var i = 0; i < POINTS; i++) {
labels.push("A")
data.push(Math.random() * 100)
}
var chartData = {
labels: labels,
datasets: [
{
data: data
}
]
};
// our charting function
function drawChart() {
var chart;
var startTime = (new Date()).getTime();
chart = new Chart(document.getElementById("myChart").getContext("2d")).Line(chartData, {
animation: ANIMATION,
scaleOverride: SCALEOVERRIDE,
scaleSteps: 10,
scaleStepWidth: 10,
scaleStartValue: 0,
bezierCurve: BEZIERCURVE,
onAnimationComplete: function () {
output.push((new Date()).getTime() - startTime);
if (chart !== undefined)
chart.destroy();
j++;
if (j < ITERATIONS)
setTimeout(drawChart, 0);
else
console.log(output);
}
});
}
var j = 0;
var output = [];
drawChart();
Results
To be taken with a pinch of salt, lime and tequila. Times are in ms and based on 10 iterations.
----------------------------------------------------------------------------------------
| Type | Points | Animation | Bezier | Scale Override | Mean | Median |
----------------------------------------------------------------------------------------
| Bar | 10 | N | - | N | 2.7 | 3 |
| Bar | 100 | N | - | N | 14 | 13.5 |
| Bar | 1000 | N | - | N | 128.5 | 127.5 |
| Bar | 5000 | N | - | N | 637.4 | 626.5 |
| Bar | 10 | Y | - | N | 997.2 | 997 |
| Bar | 100 | Y | - | N | 1003.5 | 1006.5 |
| Bar | 1000 | Y | - | N | 3417.1 | 3418.5 |
| Bar | 5000 | Y | - | N | 17086.6 | 17085 |
| Bar | 10 | N | - | Y | 3.2 | 3 |
| Bar | 100 | N | - | Y | 14.5 | 14.5 |
| Bar | 1000 | N | - | Y | 127.2 | 125.5 |
| Bar | 5000 | N | - | Y | 638 | 632.5 |
| Bar | 10 | Y | - | Y | 996.6 | 997 |
| Bar | 100 | Y | - | Y | 999.4 | 999 |
| Bar | 1000 | Y | - | Y | 3441.9 | 3433.5 |
| Bar | 5000 | Y | - | Y | 16985.6 | 16959.5 |
| Line | 10 | N | Y | Y | 3.6 | 4 |
| Line | 100 | N | Y | Y | 16.4 | 16 |
| Line | 1000 | N | Y | Y | 146.7 | 145.5 |
| Line | 5000 | N | Y | Y | 821.5 | 820.5 |
| Line | 10 | N | N | Y | 2.9 | 3 |
| Line | 100 | N | N | Y | 14.3 | 14 |
| Line | 1000 | N | N | Y | 131 | 127 |
| Line | 5000 | N | N | Y | 643.9 | 635.5 |
| Line | 10 | N | N | N | 3.1 | 3 |
| Line | 100 | N | N | N | 15.6 | 15 |
| Line | 1000 | N | N | N | 131.9 | 133.5 |
| Line | 5000 | N | N | N | 666 | 660 |
As expected, scale overrides have an impact (but only a little), turning off Bezier curves has a noticeable impact, and there is not much difference between using a bar chart vs. a line chart (at least for the configurations I ran). Animation has a noticeable impact as the number of points goes up (however, I'd assume a simpler easing function would be faster).
I have some language data in a MySQL table containing about 3.8 million rows (with indexes on virtually all fields):
+---------+-----------+----------+--------+----------------+----------+--------+---------+---------+
| theWord | lcTheWord | spelling | thePOS | theUSAS | register | period | variety | theDate |
+---------+-----------+----------+--------+----------------+----------+--------+---------+---------+
| to | to | l | TO | Z5 | p | 1 | b | 1608 |
| direct | direct | l | VVI | M6 | p | 1 | b | 1608 |
| others | others | l | NN2 | A6.1-/Z8 | p | 1 | b | 1608 |
| . | . | o | . | PUNC | p | 1 | b | 1608 |
| Both | both | u | DB2 | N5 | p | 1 | b | 1608 |
| his | his | l | APPGE | Z8m | p | 1 | b | 1608 |
| eyes | eyes | l | NN2 | B1 | p | 1 | b | 1608 |
| are | are | l | VBR | A3+ | p | 1 | b | 1608 |
| never | never | l | RR | T1/Z6 | p | 1 | b | 1608 |
| at | at | l | RR21 | N3.8+[i281.2.1 | p | 1 | b | 1608 |
So the same word can (and often will) be contained in the table multiple times, some with "l" for lowercase and some with "u" for uppercase.
I would now like to compare capitalisation of individual words across time-periods (e.g. 1 vs. 8), variety ("b" = British English, "a" = American English) etc. by creating output that is ranked by the proportion of upper to lowercase spelling. I will at some stage also want to restrict the data to certain parts-of-speech tags (thePOS) or semantic tags (theUSAS).
Unfortunately, my knowledge in SQL is very limited - and although I've tried quite a few things (e.g. joining the table with itself and trying to work out things from there), I have so far failed miserably.
Just to give you an example of the kind of things I have been trying:
SELECT l.theWord, count(l.theWord) as freq_low, count(u.theWord) as freq_up
FROM table_name l
INNER JOIN table_name u ON l.lcTheWord = u.lcTheWord
group by l.lcTheWord;
This is clearly the wrong approach, as it doesn't seem to use the necessary indexes (and takes too long for me to even see what it does...)
I realise this is a far less specific question than the guidelines suggest. Apologies! However, I'm wondering whether some kind soul could give me some pointers so that I can go on from there...?
Many thanks in advance!
Sebastian
I do not think that you need a self join here - a GROUP BY should be sufficient. You can count words with 'u's and 'l's in the spelling column like this:
SELECT
lcTheWord
, SUM(CASE spelling WHEN 'u' THEN 1 ELSE 0 END) AS UpperCount
, SUM(CASE spelling WHEN 'l' THEN 1 ELSE 0 END) AS LowerCount
FROM table_name
GROUP BY lcTheWord
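If you also want the output ranked by the proportion of upper- to lowercase spelling, and restricted to a given period or variety as described in the question, you could extend the same query along these lines (the WHERE values are only examples):

SELECT
    lcTheWord
  , SUM(CASE spelling WHEN 'u' THEN 1 ELSE 0 END) AS UpperCount
  , SUM(CASE spelling WHEN 'l' THEN 1 ELSE 0 END) AS LowerCount
  , SUM(CASE spelling WHEN 'u' THEN 1 ELSE 0 END)
    / NULLIF(SUM(CASE spelling WHEN 'l' THEN 1 ELSE 0 END), 0) AS upper_to_lower_ratio
FROM table_name
WHERE period = 1                        -- e.g. compare period 1 vs. period 8 in separate runs
  AND variety = 'b'                     -- 'b' = British English, 'a' = American English
GROUP BY lcTheWord
ORDER BY upper_to_lower_ratio DESC;     -- a NULL ratio means the word never appears lowercase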