VoIP Billing: Handling that BIG Cisco file
Working on an interesting problem. The file I have to process is 16 GB. Each record represents half of a phone call; the problem is that the matching half can come before OR after it in the file, though always within a few records (fewer than 10). Now, I can't a) read the file into memory and deal with the records there, b) sort the file (this routine has to run often and the file could get even bigger, so resources are an unknown), or c) write a second file containing only the type 2 records for lookup, also for resource reasons. What to do?
I worked on a subset of 20,000 records to start out with.
My solution depends upon knowing that, while the records may be out of sequence, they are dependably within a few records of each other. I keep a cache of records I haven't found matches for yet. When I read a new record, I either find a match for it in the cache, or I save it in the cache. This is the driving algorithm:
if (isVoipCallHistoryRecord(str)) {
  if (isVoipCallLeg1LandlineHalfRecord(str)) {
    callleg1++;
    if (isCallLeg2VoipHalfFoundAlready(str))
      processCall12();
    else
      saveCallLeg1LandlineHalf(str);
  } else {
    if (isVoipCallLeg2VoipHalfRecord(str)) {
      callleg2++;
      if (isCallLeg1LandlineHalfFoundAlready(str))
        processCall21();
      else
        saveCallLeg2VoipHalf(str);
    }
  }
} else {
  otherrecords++;
}
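To make the find-or-save idea concrete, here is a minimal, self-contained sketch. The record layout ("<callId>,<leg>") and the names HalfMatcher, accept and completedCalls are made up for illustration; the real Cisco records and my helper methods look different. Matched entries are deliberately left in the cache, since trimming is a separate concern.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical record layout, for illustration only: "<callId>,<leg>" with leg 1 or 2.
class HalfMatcher {
    // One small cache per leg: call IDs seen so far on that leg.
    private final List<String> seenLeg1 = new ArrayList<>();
    private final List<String> seenLeg2 = new ArrayList<>();
    int completedCalls = 0;

    void accept(String record) {
        String[] fields = record.split(",");
        String callId = fields[0];
        boolean isLeg1 = fields[1].equals("1");

        List<String> own = isLeg1 ? seenLeg1 : seenLeg2;
        List<String> other = isLeg1 ? seenLeg2 : seenLeg1;

        if (other.contains(callId)) {
            // The other half was already seen: the call is complete.
            // Stand-in for processCall12() / processCall21().
            completedCalls++;
        } else {
            // No match yet: remember this half until its partner shows up.
            own.add(callId);
        }
    }
}
```

Feeding it the records "A,1", "B,2", "A,2", "B,1", "C,1" in that order completes two calls (A and B) and leaves C waiting for its other half. The linear contains() search is only tolerable because the cache is kept small.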
If I try to remove each record from the cache as soon as I've found its match, things slow down too much. If I make the cache so big that I never have to remove anything, it's STILL too slow, because that huge array has to be searched to find any record. (I did start the ArrayList with an initial capacity.)
The removeRange(int fromIndex, int toIndex) method is what's required here. I create an ArrayList with a capacity of 100 records; then, when the 100th record is added, I remove records 0-90. Thus, I don't remove too often, and I don't remove the last few records, which might still contain type 1 records that haven't been matched yet!
The trick here is to subclass ArrayList. I need the removeRange method, but it's a protected method! Therefore, it's only callable from classes that extend ArrayList (or that live in the java.util package), not from ordinary code that merely holds an ArrayList. A curious design decision...
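A rough sketch of such a subclass follows. The class name TrimmingCache and the limit/keepLast parameters are my own placeholders, not anything from the JDK; the only library call relied on is ArrayList's protected removeRange(int fromIndex, int toIndex), whose toIndex is exclusive.

```java
import java.util.ArrayList;

// Hypothetical helper: an ArrayList that trims its oldest entries when it fills up.
class TrimmingCache<E> extends ArrayList<E> {
    private final int limit;     // trim once the list reaches this size
    private final int keepLast;  // number of newest entries that survive a trim

    TrimmingCache(int limit, int keepLast) {
        super(limit);            // initial capacity, so the backing array never has to grow
        this.limit = limit;
        this.keepLast = keepLast;
    }

    @Override
    public boolean add(E e) {
        boolean added = super.add(e);
        if (size() >= limit) {
            // removeRange is protected in ArrayList; it is callable here
            // only because this class extends ArrayList.
            // toIndex is exclusive: this drops everything except the last keepLast entries.
            removeRange(0, size() - keepLast);
        }
        return added;
    }
}
```

With new TrimmingCache<Integer>(100, 10), adding the values 0 through 99 triggers one trim on the 100th add, leaving the ten entries 90 through 99. One bulk removal shifts the surviving elements once, rather than once per removed record.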
These are rough notes for the article I want to write in a bit.
12:39:23 PM