SQL query with avg and group by

If I understand what you need, try this: SELECT id, pass, AVG(val) AS val_1 FROM data_r1 GROUP BY id, pass; Or, if you want just one row for every id, this: SELECT d1.id, (SELECT IFNULL(ROUND(AVG(d2.val), 4), 0) FROM data_r1 d2 WHERE d2.id = d1.id AND pass = 1) AS val_1, (SELECT IFNULL(ROUND(AVG(d2.val), 4), 0) FROM … Read more

Calculating the averages for each KEY in a Pairwise (K,V) RDD in Spark with Python

Now a much better way to do this is to use the rdd.aggregateByKey() method. Because this method is so poorly documented in the Apache Spark with Python documentation (which is why I wrote this Q&A), until recently I had been using the above code sequence. But again, it’s less efficient, so avoid doing … Read more
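
The excerpt above is cut off before it shows any code, so the following is only a sketch of the idea usually paired with aggregateByKey(): carry a (sum, count) pair per key, fold values into it inside each partition, then merge the pairs across partitions. It runs in plain Python without Spark; the names ZERO, seq_op, comb_op and average_by_key, and the sample data, are hypothetical and not taken from the original answer.

```python
# Zero value: (running sum, running count) for a key with no data yet.
ZERO = (0.0, 0)


def seq_op(acc, value):
    """Fold one value into a (sum, count) accumulator within a partition."""
    return (acc[0] + value, acc[1] + 1)


def comb_op(a, b):
    """Merge two (sum, count) accumulators coming from different partitions."""
    return (a[0] + b[0], a[1] + b[1])


def average_by_key(partitions):
    """Pure-Python stand-in for the aggregate-then-divide pattern (no Spark)."""
    # Step 1: aggregate inside each partition independently.
    per_partition = []
    for part in partitions:
        accs = {}
        for key, value in part:
            accs[key] = seq_op(accs.get(key, ZERO), value)
        per_partition.append(accs)

    # Step 2: merge the per-partition accumulators key by key.
    merged = {}
    for accs in per_partition:
        for key, acc in accs.items():
            merged[key] = comb_op(merged.get(key, ZERO), acc)

    # Step 3: turn each (sum, count) pair into an average.
    return {key: total / count for key, (total, count) in merged.items()}


# Hypothetical (key, value) pairs split across two "partitions".
partitions = [
    [("a", 1.0), ("b", 4.0), ("a", 3.0)],
    [("a", 5.0), ("b", 6.0)],
]
averages = average_by_key(partitions)
print(averages)
```

With an actual pair RDD, the same two functions could presumably be passed as rdd.aggregateByKey((0.0, 0), seq_op, comb_op).mapValues(lambda t: t[0] / t[1]); treat that line as an assumption about how the truncated answer continues.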

Mysql Average on time column?

Try this: SELECT SEC_TO_TIME(AVG(TIME_TO_SEC(`login`))) FROM Table1; Test data: CREATE TABLE `login` (duration TIME NOT NULL); INSERT INTO `login` (duration) VALUES ('00:00:20'), ('00:01:10'), ('00:20:15'), ('00:06:50'); Result: 00:07:09

Exponential Moving Average Sampled at Varying Times

This answer is based on my good understanding of low-pass filters (“exponential moving average” is really just a single-pole lowpass filter), but my hazy understanding of what you’re looking for. I think the following is what you want: First, you can simplify your equation a little bit (looks more complicated but it’s easier in code). I’m … Read more
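
The author's own equation is cut off above, so the sketch below is not that equation. It shows one common way to run a single-pole low-pass filter over irregularly spaced samples: the smoothing weight is recomputed from the time gap as alpha = 1 - exp(-dt / tau). The function name irregular_ema and the time constant tau are assumptions made for illustration.

```python
import math


def irregular_ema(samples, tau):
    """Exponential moving average over (time, value) pairs with uneven spacing.

    tau is the filter time constant, in the same units as the timestamps.
    Returns the filtered value after each sample.
    """
    if tau <= 0:
        raise ValueError("tau must be positive")

    it = iter(samples)
    try:
        prev_t, ema = next(it)  # seed the filter with the first sample
    except StopIteration:
        return []

    out = [ema]
    for t, x in it:
        dt = t - prev_t
        # A longer gap means the old average has decayed more,
        # so the new sample gets a larger weight.
        alpha = 1.0 - math.exp(-dt / tau)
        ema += alpha * (x - ema)
        out.append(ema)
        prev_t = t
    return out


# Hypothetical samples: (timestamp, value), deliberately unevenly spaced.
filtered = irregular_ema([(0.0, 0.0), (1.0, 1.0), (4.0, 1.0)], tau=1.0)
print(filtered)
```

When the samples are evenly spaced, dt is constant and alpha reduces to the fixed weight of an ordinary exponential moving average.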

Finding moving average from data points in Python

As numpy.convolve is pretty slow, those who need a fast performing solution might prefer an easier to understand cumsum approach. Here is the code: cumsum_vec = numpy.cumsum(numpy.insert(data, 0, 0)) ma_vec = (cumsum_vec[window_width:] - cumsum_vec[:-window_width]) / window_width where data contains your data, and ma_vec will contain moving averages of window_width length. On average, cumsum is about … Read more
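
To make the cumulative-sum trick concrete, here is a small sketch of the same idea using only the standard library (itertools.accumulate in place of numpy.cumsum), so it can be run without NumPy. The function name moving_average and the sample data are illustrative assumptions; data and window_width mirror the names in the excerpt.

```python
from itertools import accumulate


def moving_average(data, window_width):
    """Moving average via prefix sums.

    Each window sum is the difference of two cumulative sums, so the whole
    pass is O(len(data)) regardless of window_width.
    """
    if window_width <= 0 or window_width > len(data):
        raise ValueError("window_width must be between 1 and len(data)")

    # Prepend 0 so that cumsum[i] is the sum of the first i elements,
    # matching numpy.insert(data, 0, 0) in the excerpt.
    cumsum = [0] + list(accumulate(data))

    return [
        (cumsum[i + window_width] - cumsum[i]) / window_width
        for i in range(len(data) - window_width + 1)
    ]


# Hypothetical input.
ma_vec = moving_average([1, 2, 3, 4, 5], 2)
print(ma_vec)
```

One caveat with this approach in general: for long floating-point series the cumulative sums grow large, so rounding error can accumulate in the differences.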

How to efficiently compute average on the fly (moving average)?

Your solution is essentially the “standard” optimal online solution for keeping a running average without storing big sums: you process one number at a time without going back to earlier numbers, and you use only a constant amount of extra memory. If you want a … Read more
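
The asker's solution is not shown in this excerpt, so the sketch below assumes it is the usual incremental update mean += (x - mean) / n, which matches the properties described: one value at a time, constant memory, and no large running sum. The class name RunningMean is hypothetical.

```python
class RunningMean:
    """Online mean: constant memory, one value at a time, no stored sum."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, x):
        """Fold one new value into the mean and return the updated mean."""
        self.count += 1
        # Move the mean toward x by 1/count of the remaining distance.
        self.mean += (x - self.mean) / self.count
        return self.mean


# Hypothetical stream of values.
rm = RunningMean()
history = [rm.update(x) for x in (2, 4, 6)]
print(history)
```

Because only the count and the current mean are kept, the state never grows with the length of the stream.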