AWS Glue - Icecast Classifier

Posted on Mon 17 February 2020 in Computing

I've been using Athena + Glue to get stats from Icecast logs.

The Icecast access log format seems to be the same as the default Apache combined log format, but with one extra field appended.

By adding a custom classifier, we can get Glue to detect the schema correctly.

The classifier can be added in the AWS console by going to:

AWS Glue -> Crawlers -> Classifiers -> Add classifier

Use the following Grok pattern:

%{TIMESTAMP_ISO8601} %{COMBINEDAPACHELOG} %{NUMBER:duration}
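
If you'd rather not click through the console, the same classifier can probably be created with boto3. This is only a rough sketch: the classifier name and classification label ("icecast") are placeholders I've made up, and it assumes boto3 is installed and AWS credentials with Glue permissions are already configured.

```python
# Grok pattern from this post, unchanged.
GROK_PATTERN = "%{TIMESTAMP_ISO8601} %{COMBINEDAPACHELOG} %{NUMBER:duration}"


def build_classifier_request(name="icecast", classification="icecast"):
    """Build the keyword arguments for Glue's create_classifier call.

    Both `name` and `classification` are hypothetical placeholder values;
    pick whatever suits your own setup.
    """
    return {
        "GrokClassifier": {
            "Name": name,
            "Classification": classification,
            "GrokPattern": GROK_PATTERN,
        }
    }


def create_classifier(request):
    """Send the request to AWS Glue (needs boto3 and valid credentials)."""
    import boto3  # imported here so the request can be built without boto3

    glue = boto3.client("glue")
    glue.create_classifier(**request)
```

Calling `create_classifier(build_classifier_request())` should then register the classifier, after which it still needs to be attached to the crawler that reads the Icecast logs.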

References

[http://lists.xiph.org/pipermail/icecast/2006-May/010446.html]

aws