Paper pick
The ability of an ML model to deal with noisy training data depends in large part on the loss function used during training. For classification tasks, the standard choice is the logistic loss. However, the logistic loss falls short on noisy training examples because of two of its properties:
- Because the loss is unbounded, outliers far from the decision boundary can dominate the overall loss
- Because the softmax has a short tail, mislabeled examples near the decision boundary can stretch it
Google Research tackles these problems in a recent paper by introducing a "bi-tempered" generalization of the logistic loss with two tunable temperature parameters, each addressing one of the failure modes above.
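The generalization rests on two "tempered" primitives that replace the ordinary log and exp: a tempered logarithm that is bounded below when its temperature is less than 1 (capping the loss from outliers), and a tempered exponential with a heavier tail when its temperature is greater than 1 (softening the boundary's sensitivity to nearby mislabeled points). A minimal sketch of these two functions, written from their standard definitions rather than taken from the authors' code:

```python
import math

def log_t(x, t):
    """Tempered logarithm: (x^(1-t) - 1) / (1 - t).
    Reduces to log(x) as t -> 1. For t < 1 it is bounded
    below by -1/(1-t), so the loss from any single outlier is capped."""
    if t == 1.0:
        return math.log(x)
    return (x ** (1.0 - t) - 1.0) / (1.0 - t)

def exp_t(x, t):
    """Tempered exponential: [1 + (1-t) * x]_+ ^ (1/(1-t)).
    Reduces to exp(x) as t -> 1, and is the inverse of log_t.
    For t > 1 its tail decays polynomially rather than exponentially,
    yielding a heavier-tailed softmax."""
    if t == 1.0:
        return math.exp(x)
    return max(1.0 + (1.0 - t) * x, 0.0) ** (1.0 / (1.0 - t))
```

For example, with t = 0.5 the loss contribution -log_t(p) of a point with predicted probability p near zero approaches 2 rather than infinity, which is what keeps far-away outliers from dominating training.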