Brownian motion

The quick brown fox jumps over the lazy dog

 

Posts Tagged ‘FreeRadius’

Python is the winner?-)

Please see some info here. I hope to translate this post into English.

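;; Print every blank-line-separated record of FILE that contains
;; a line matching PATTERN.
;; Usage: guile -s fradlog_extract.scm PATTERN FILE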
 
(use-modules (ice-9 rdelim) (ice-9 regex))
 
(define argv (program-arguments))
 
(define filename (car (cdr (cdr argv))))
(define inputfile (open-input-file filename))
 
(define pattern-string (cadr argv))
(define pattern (make-regexp pattern-string))
 
 
(define records 0)
(define lines 0)
(define selected 0)
 
(define (read-all-lines)
    (let loop ((stack '()) (good #f))
        (let ((line (read-line inputfile)))
            (set! stack (append stack (list line)))
            (set! lines (+ 1 lines))
 
            (if (regexp-exec pattern line)
                (set! good #t)
                (if (equal? "" line)
                    (begin
                        (if (eq? good #t)
                            (begin
                                (for-each (lambda (line) (display line) (newline)) stack)
                                (set! selected (+ 1 selected))))
                        (set! good #f)
                        (set! stack '())
                        (set! records (+ 1 records)))))
 
            (if (not (eof-object? (peek-char inputfile)))
                (loop stack good)))))
 
(read-all-lines)
 
(use-modules (ice-9 format))
 
(format (current-error-port)
"~d records (~d lines) processed
~d records matched
Pattern was: '~a'
" records lines selected pattern-string)
 
;; vim: ts=2:

Scheme:

$ time guile -s fradlog_extract.scm 'Station-Id = \"4494.....\"' detail-20090519 > part-scheme
183764 records (4563405 lines) processed
447 records matched
Pattern was: 'Station-Id = "4494....."'
 
real    0m27.653s
user    0m27.550s
sys     0m0.110s

awk:

$ time awk -f fradlog_extract.awk pattern='Station-Id = \"4494.....\"' detail-20090519 > part-awk
183764 records (4563405 lines) processed
447 records matched
Pattern was: 'Station-Id = "4494....."'
 
real    0m21.680s
user    0m21.490s
sys     0m0.090s

Python:

$ time python fradlog_extract.py 'Station-Id = "4494....."' detail-20090519 > part-python
183764 records (4563405 lines) processed
447 records matched
Pattern was: 'Station-Id = "4494....."'
 
real    0m9.766s
user    0m9.670s
sys     0m0.060s
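
To put the three runs side by side, the wall-clock (real) times above can be turned into ratios. This is only a small arithmetic sketch: the variable names are mine, and the numbers are copied from the time output above.

```python
# Wall-clock ("real") times in seconds, copied from the three runs above.
real_seconds = {
    "Scheme": 27.653,
    "awk": 21.680,
    "Python": 9.766,
}

# Name of the run with the smallest wall-clock time.
fastest = min(real_seconds, key=real_seconds.get)

# How many times longer each run took than the fastest one.
slowdown = {name: t / real_seconds[fastest] for name, t in real_seconds.items()}

for name, ratio in sorted(slowdown.items(), key=lambda item: item[1]):
    print("%-6s %7.3fs  x%.2f" % (name, real_seconds[name], ratio))
```

On these numbers the awk run took roughly 2.2 times as long as the Python one, and the Guile run roughly 2.8 times as long.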

Extract part of FreeRadius’ log — Python

I have just written about «Extract part of FreeRadius’ log» with a little awk script. Back then I decided that awk would be easier and quicker than Python.

Here is a Python script that does the same thing (and is written in the same way):

#!/usr/bin/python
#
#
 
import sys, re
 
pattern = sys.argv[1]
file = open(sys.argv[2])
 
cp = re.compile(pattern)
 
total = 0
selected = 0
good = False
lines = 0
 
set = []
 
while True:
    line = file.readline()
    if not line:
        break
 
    lines += 1
 
    set.append(line)
 
    if cp.search(line):
        good = True
 
    if line == '\n':
        if good:
            sys.stdout.write(''.join(set))
            selected += 1
 
        good = False
        set = []
        total += 1
 
sys.stderr.write("%i records (%i lines) processed\n" %(total, lines))
sys.stderr.write("%i records matched\n" % selected)
sys.stderr.write("Pattern was: '%s'\n\n" % pattern)

Nothing special, you see.

What do I find interesting about this script? It runs about 27% faster than awk (26.85 s versus 34.19 s elapsed in the runs below). And I don’t know how to optimize my awk script :-)

Take a look, this is awk:

time awk -f cutlog.awk pattern='Station-Id = \"XXXYYZ[0-2]\"' detail-YYYYMMDD > detail-YYYYMMDD.part
276358 records (6874776 lines) processed
49574 records matched
Pattern was: 'Station-Id = "XXXYYZ[0-2]"'
 
33.90user 0.29system 0:34.19elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+306minor)pagefaults 0swaps

This is Python:

time python cutlog.py 'Station-Id = "XXXYYZ[0-2]"' detail-YYYYMMDD > detail-YYYYMMDD.part
276358 records (6874776 lines) processed
49574 records matched
Pattern was: 'Station-Id = "XXXYYZ[0-2]"'
 
26.60user 0.24system 0:26.85elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+732minor)pagefaults 0swaps

update:
Well, I have tried to «optimize»:

    # instead of:
    if cp.search(line):
        good = True
    # write:
    if not good and cp.search(line):
        good = True
        continue
    # and likewise in the awk script.

It makes no significant difference. Neither does changing the order of the checks (first «if line is empty» and then «if line matches», or the other way round), using print instead of printf, and so on.
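
For anyone on Python 3, the same record-by-record logic, including the «not good and ...» short-circuit from the update, can be sketched as a generator. This is only a sketch: the name matching_records and the sample lines are made up for illustration, and I have not timed it against the runs above.

```python
import re


def matching_records(lines, pattern):
    """Yield each blank-line-terminated record that contains a match.

    `lines` is any iterable of lines that keep their trailing newline,
    for example an open file. `matching_records` is a name made up for
    this sketch; it is not part of the original script.
    """
    cp = re.compile(pattern)
    record = []
    good = False
    for line in lines:
        record.append(line)
        # Stop searching once the current record is known to match.
        if not good and cp.search(line):
            good = True
        if line == "\n":
            if good:
                yield "".join(record)
            good = False
            record = []


# Tiny made-up sample: two records, only the first one matches.
sample = [
    "Mon Jan  1 00:00:00 2001\n",
    '\tCalled-Station-Id = "449400001"\n',
    "\n",
    "Mon Jan  1 00:00:01 2001\n",
    '\tCalled-Station-Id = "123400001"\n',
    "\n",
]

for record in matching_records(sample, 'Station-Id = "4494....."'):
    print(record, end="")
```

Like the script above, it emits a record only when it sees the terminating empty line, so a last record without one is silently dropped.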

Extract part of FreeRadius’ log

Today I needed to extract some records from FreeRadius’ log: those with a Called- or Calling-Station-Id like XXXYYZ[0-2]. I started to type #!/usr/bin/python in a new file, but «suddenly» decided to do it with awk.

In case you don’t know: FreeRadius’ log consists of «sections» separated by empty lines. Every section contains a record Called-Station-Id = <number> (or Calling-Station-Id), and I need to extract the sections for calls to or from specific numbers.

So, here is the code, five minutes of work:

#
# cutlog.awk :
#
 
BEGIN {
    # flag to indicate "good" set of lines:
    good = 0
    # counters:
    selected = 0
    total = 0
}
{
    # add line to set:
    set = set $0 "\n"
 
    # if wanted condition occur, flag set as good:
    if ($0 ~ pattern) {
        good = 1
    }
 
    # if empty line...
    if (!length) {
        # print set if it is "good":
        if (good) {
            printf ("%s", set)
            selected += 1
        }
        total += 1
        # start to work again:
        good = 0
        set = ""
    }
}
END {
    printf("%i records (%i lines) processed\n", total, NR) > "/dev/stderr"
    printf("%i records matched\n", selected) > "/dev/stderr"
    printf("Pattern was: '%s'\n\n", pattern) > "/dev/stderr"
}

Now we can extract the needed records:

awk -f cutlog.awk pattern='Station-Id = \"XXXYYZ[0-2]\"' detail-YYYYMMDD > detail-YYYYMMDD.part
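
Note that the pattern begins at «Station-Id» and the match is unanchored, so it selects sections by either Called-Station-Id or Calling-Station-Id. A quick check illustrates this. It is only a sketch: the numbers are made up, and Python's re module stands in for awk's matching here, on the assumption that both treat this simple pattern the same way.

```python
import re

# Made-up prefix standing in for the XXXYYZ placeholder in the post.
pattern = re.compile(r'Station-Id = "555123[0-2]"')

lines = [
    'Called-Station-Id = "5551230"',    # matches: last digit is 0
    'Calling-Station-Id = "5551232"',   # matches: Calling- also ends in Station-Id
    'Called-Station-Id = "5551237"',    # no match: last digit outside [0-2]
    'Called-Station-Id = "55512301"',   # no match: closing quote is part of the pattern
]

# search() looks for the pattern anywhere in the line, like awk's ~ operator.
matched = [line for line in lines if pattern.search(line)]
for line in matched:
    print(line)
```

The closing quote in the pattern is what keeps longer numbers with the same prefix from slipping through.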

UPDATE: Python is better ;-)
