Averaging over the last hour and last eight hours.

This is fairly simple but I'm not sure what a "good" approach is. Basically I need to track how many times a user has performed an action over the last hour and last eight hours. I'll receive a notification of some sort when they perform this action and could get the current time, but I'm not sure what a fast way of determining how many times this has occurred in a given time period is.

I was thinking perhaps an array with the time of each in milliseconds and just find the first index that was < 8 hours or < 1 hour ago and then subtract the index from the current size. I think that would be problematic though. To keep the array from continuously growing I'd have to shift everything over to the left which means an expensive arraycopy. Perhaps if I only did the resizing when there were > 20 or elements I no longer care about or something would help, but then I'd have to iterate over 20 elements I don't care about everytime it's calculated, which is once a second.

Any algorithms that will solve this?

[1047 byte] By [kablaira] at [2007-10-2 9:17:52]
# 1

Here is a method with a granularity of a minute (i.e. it only updates the average value every minute - you can change it to give whatever granularity you want)

Keep an array with 60 elements in it. The array counts the number of times that the user performed the action each minute.

The array in indexed modulo real time, and you make entries based on actual time of day. So when a user performs an action at 8:38:45 you increment slot 38. Each slot counts the number of actions that took place in that particular minute. When the next minute rolls around, you move to the next slot, set the value to zero (thus throwing out the events that happened over an hour ago, and prepare to count the new events)

The sum of the entire array is of course the total number of events that have taken place in the last hour.

You do not need to actually run through the array and total the events every time you want to calculate the average. You can use the fact that you increase the total by one every time you get a new event and you decrease the total by X everytime you zero out a slot that had X for its event count.

So you can just keep a single value, Total, along with your array that goes up by one on every event and then drops back a little every minute, when you throw out the hour old events.

marlin314a at 2007-7-16 23:24:58 > top of Java-index,Other Topics,Algorithms...
# 2

> Here is a method with a granularity of a minute (i.e.

> it only updates the average value every minute - you

> can change it to give whatever granularity you want)

>

> Keep an array with 60 elements in it. The array

> counts the number of times that the user performed

> the action each minute.

>

> The array in indexed modulo real time, and you make

> entries based on actual time of day. So when a user

> performs an action at 8:38:45 you increment slot 38.

> Each slot counts the number of actions that took

> place in that particular minute. When the next minute

> rolls around, you move to the next slot, set the

> value to zero (thus throwing out the events that

> happened over an hour ago, and prepare to count the

> new events)

>

> The sum of the entire array is of course the total

> number of events that have taken place in the last

> hour.

>

> You do not need to actually run through the array and

> total the events every time you want to calculate the

> average. You can use the fact that you increase the

> total by one every time you get a new event and you

> decrease the total by X everytime you zero out a slot

> that had X for its event count.

>

> So you can just keep a single value, Total, along

> with your array that goes up by one on every event

> and then drops back a little every minute, when you

> throw out the hour old events.

That is probably a good solution to the question I posed, but I realize now I was not thorough or accurate enough in describing the context and requirements. I don't think that will work without a lot of unnecessary memory, but maybe I'm wrong.

The user action is actually the successful completion of entering a form, a bill to be more precise. I expect this to happen 100-300 times an hour to give an idea of the frequency. Given a period of time, if they've been at it for that period of time or longer I need to find the number of bills completed in that most recent period of time. Otherwise, I need to use what is available to predict what they will reach in that period of time. In other words, if the time period is an hour I need to figure out how many bills were completed in the last hour, but if they've been going for only five minutes I need to predict what they will have completed in an hour including those last five minutes. Then I need to do the same thing for seven hours, using the last seven hours or less of data.

I guess I'd just have to use a granularity of a minute or less and calculate how many bills over how many minutes and make a prediction on that? I suppose I was hoping for something more precise, but now that I think about it that will probably suffice.

How would you implement it though? Create a "minute" array of size 60 initially, then when reaching the end of that array adding the total for that array to an "hourly" array (keeps track of bills in that hour) and then start over? I guess then you'd just have to add each from the hourly array to the current total and calculate based upon that. At any given time I'd have the number of bills processed in the last x hours and y minutes, which is enough to calculate how much can be completed at that rate in a given hour or seven hour day.

I'll try this tomorrow when I get to work, thank you very much I was needlessly complicating it with timestamps. In fact, this inadvertently (or perhaps purposefully?) solves an issue with not calculating time when a user is at lunch or on break.At least I think it does.

kablaira at 2007-7-16 23:24:58 > top of Java-index,Other Topics,Algorithms...
# 3

I just realized a problem. If I total the amount for a given hour and add that I lose the minute granularity for the prievious hour, which I need. I guess that's pretty trivial to fix though by simply having a "previous hour" array with a granularity of a minute. Plus, I suppose 60 integers isn't much memory at all, so I don't know why I'm so overly concerned about it.

Premature optimization is the devil.

kablaira at 2007-7-16 23:24:58 > top of Java-index,Other Topics,Algorithms...
# 4

Even if you went with the most obvious approach, it would be something like this, I would think.

Say a scenarious somewhat worse than you were indicating: 5000 entries over 8 hours.

Each entry is a long at 8 bytes.

40k. Not so bad, right? On Windows the VM secures itself 11MB the instant you fire it up.

Drake

Drake_Duna at 2007-7-16 23:24:58 > top of Java-index,Other Topics,Algorithms...
# 5

> Even if you went with the most obvious approach, it

> would be something like this, I would think.

>

> Say a scenarious somewhat worse than you were

> indicating: 5000 entries over 8 hours.

>

> Each entry is a long at 8 bytes.

>

> 40k. Not so bad, right? On Windows the VM secures

> itself 11MB the instant you fire it up.

>

> Drake

Yes, hence my slapping myself for even worrying about it. Besides, there would be no reason to use anything bigger than a short. Plus, there's only going to be 2406 of those with a granularity of 3 seconds. With a granularity of a minute it would only be 126. That's nothing.

kablaira at 2007-7-16 23:24:58 > top of Java-index,Other Topics,Algorithms...
# 6

Just from an organization point of view, I would a List of Objects that hold all the times in an hour (in a List). There's no reason to use an array directly. That's re-inventing the wheel.

When the hour ends, you create a new Hour Object and statrt marking times. When the list is n + 1 hours long, you can drop the first one.

dubwaia at 2007-7-16 23:24:58 > top of Java-index,Other Topics,Algorithms...
# 7

I think the use of the array as described is a circular buffer - used for queues of information that has to be passed between threads (or at least that's the context I learned of it in).

As for using the list, unless this is in 5.0 (and even then), using an array may simply be more convenient.

~Cheers

Adeodatusa at 2007-7-16 23:24:58 > top of Java-index,Other Topics,Algorithms...
# 8

> I think the use of the array as described is a

> circular buffer - used for queues of information that

> has to be passed between threads (or at least that's

> the context I learned of it in).

>

> As for using the list, unless this is in 5.0 (and

> even then), using an array may simply be more

> convenient.

>

> ~Cheers

I have a hard time seeing it that way. Implementing a circular buffer isn't terribly hard but does take some coding and that means testing and probably bugs.

dubwaia at 2007-7-16 23:24:58 > top of Java-index,Other Topics,Algorithms...