How to leverage a flow matrix for network monitoring
A flow matrix is a representation of the IP traffic map; it can be used in many ways to troubleshoot, monitor and optimize network infrastructures. Let's take a closer look at all the use cases for the traffic matrix!
What is a traffic / flow matrix?
Here is a general definition of a traffic matrix: « it is an abstract representation of the traffic volume flowing between sets of source and destination pairs. Each element in the matrix denotes the amount of traffic between a source and destination pair. There are many variants: depending on the network layer under study, sources and destinations could be routers or even whole networks. And “Amount” is generally measured in the number of bytes or packets, but could refer to other quantities such as connections. ».
The network flow matrix is used to display the geography of network traffic between host groups: the most common traffic matrix shows the quantity of traffic sent from one IP subnet to another.
Nevertheless, a flow matrix can use other grouping criteria (e.g. VLAN) and metrics other than traffic volume (number of packets, sessions, performance metrics, etc.).
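To make the definition concrete, here is a minimal Python sketch of the most common variant, a subnet-to-subnet volume matrix. The subnet names, the address plan and the flow records are all hypothetical; in practice the records would come from whatever flow or packet data your monitoring solution collects.

```python
from collections import defaultdict
from ipaddress import ip_address, ip_network

# Hypothetical subnet plan; replace with your own addressing scheme.
SUBNETS = {
    "users-paris": ip_network("10.1.0.0/16"),
    "users-london": ip_network("10.2.0.0/16"),
    "dc-servers": ip_network("192.168.10.0/24"),
}


def group_of(ip: str) -> str:
    """Return the name of the first subnet containing ip, or 'other'."""
    addr = ip_address(ip)
    for name, net in SUBNETS.items():
        if addr in net:
            return name
    return "other"


def build_matrix(flows):
    """Sum bytes per (source group, destination group) pair."""
    matrix = defaultdict(int)
    for src, dst, nbytes in flows:
        matrix[(group_of(src), group_of(dst))] += nbytes
    return dict(matrix)


# Illustrative flow records: (source IP, destination IP, bytes).
flows = [
    ("10.1.0.5", "192.168.10.20", 1200),
    ("10.1.3.9", "192.168.10.21", 800),
    ("10.2.0.7", "192.168.10.20", 500),
    ("192.168.10.20", "10.1.0.5", 4000),
    ("10.2.0.7", "8.8.8.8", 100),
]

matrix = build_matrix(flows)
for (src, dst), total in sorted(matrix.items()):
    print(f"{src:>12} -> {dst:<12} {total:>6} bytes")
```

Swapping `group_of` for a VLAN or site lookup, or summing packets or sessions instead of bytes, gives the other variants mentioned above.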
Although network traffic matrixes are most often used by network engineering teams to drive network optimization, design and anomaly detection, the use cases for a flow matrix correspond to very distinct situations.
The different use cases
Here is a list of common uses of flow mapping:
- Traffic volume and geography analysis
Network engineers and architects need data to drive their design and optimization decisions. In complex and highly distributed networks, keeping track of all types of network usage is a complex task. A network flow matrix is a way to represent this complexity in a simple way.
- Performance monitoring
A network map can also be extremely useful to troubleshoot performance degradations, provided it can display metrics other than volumes. For example, a flow matrix can be extremely powerful to accelerate the resolution of slowdowns when it shows network and application performance indicators such as:
- Packet loss, retransmission, TTL expired
- Network latency
- End user response times
- Security monitoring
Finally, network traffic matrixes are very helpful when it comes to monitoring threats through the network traffic.
1. Capacity planning
To make sure the network infrastructure offers capacity in line with demand, network teams need a constant view of who requires what capacity or bandwidth for which application or usage.
2. Anomaly detection
In case of bandwidth hogs, misuse or unplanned bandwidth requirements, network teams need to be able to easily locate where the excessive demand is coming from, so that they can mitigate its impact on the other network applications (stop, delay, compress, optimize).
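As a rough illustration, a bandwidth hog can be located by looking for source / destination pairs that carry a disproportionate share of the total volume. The sketch below assumes a matrix stored as a dictionary of bytes per pair; the group names and the 50% threshold are arbitrary assumptions, not recommended values.

```python
def find_hogs(matrix, share_threshold=0.5):
    """Return (pair, share) for pairs above share_threshold, biggest first."""
    total = sum(matrix.values())
    if total == 0:
        return []
    hogs = [
        (pair, volume / total)
        for pair, volume in matrix.items()
        if volume / total > share_threshold
    ]
    return sorted(hogs, key=lambda item: item[1], reverse=True)


# Illustrative matrix: bytes per (source group, destination group).
matrix = {
    ("users-paris", "dc-servers"): 2_000,
    ("users-london", "dc-servers"): 500,
    ("backup-vlan", "dc-servers"): 7_500,
}

hogs = find_hogs(matrix, share_threshold=0.5)
for (src, dst), share in hogs:
    print(f"{src} -> {dst} carries {share:.0%} of the traffic")
```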
3. Infrastructure migration and change management
When planning an important infrastructure migration (datacenter move, change in key network devices such as routers and firewalls), network teams need complete visibility of:
- Who is communicating with whom?
- Who is using common services, who is not (e.g. DHCP, DNS, …)?
- What are the dependencies between servers taking part in an application chain?
This data is mandatory to make sure that the new equipment will be configured appropriately and that the migration will not cause any outage or performance degradation.
In a second step, a flow matrix can be used to identify any configuration that still requires an update. Here are some examples of patterns that can be recognized easily with a flow matrix:
- Systems trying to communicate with hosts in deprecated IP subnets
- Flows ending in error and not reaching their destination (for example, by showing the one-way flows or mapping the ICMP error message data)
On this topic, you may be interested in reading further information in this article: "How to mitigate the performance risk of data center migrations"
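The two post-migration patterns listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the deprecated range `10.9.0.0/16` and the flow records are invented, and a flow is treated as "one-way" simply when no traffic was observed in the reverse direction.

```python
from ipaddress import ip_address, ip_network

# Hypothetical range decommissioned by the migration.
DEPRECATED = [ip_network("10.9.0.0/16")]


def one_way_pairs(flows):
    """Return (src, dst) pairs with no traffic seen in the reverse direction."""
    seen = set(flows)
    return sorted(pair for pair in seen if (pair[1], pair[0]) not in seen)


def to_deprecated(flows):
    """Return (src, dst) pairs whose destination is in a deprecated subnet."""
    return sorted(
        {
            (src, dst)
            for src, dst in flows
            if any(ip_address(dst) in net for net in DEPRECATED)
        }
    )


# Illustrative observations: (source IP, destination IP).
flows = [
    ("10.1.0.5", "192.168.10.20"),
    ("192.168.10.20", "10.1.0.5"),  # reply seen: healthy flow
    ("10.1.0.8", "10.9.4.1"),       # still pointing at the old subnet
    ("10.2.0.7", "192.168.10.99"),  # never answered
]

print("one-way:", one_way_pairs(flows))
print("to deprecated subnets:", to_deprecated(flows))
```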
4. Spotting performance holes
A matrix that shows, across all datacenters, which flows are impacted by a packet loss increase or a network slowdown can save hours in a troubleshooting operation. Instantly pointing out the source / destination pair(s) impacted enables network administrators to focus their attention on the right network paths and set of devices. If you want to go further on this topic, see the article "How to handle IT performance complaints".
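A performance matrix of this kind could be sketched as follows, using the retransmission rate as the indicator. The record layout, the group names and the 2% alert threshold are assumptions made for the example; any other per-pair metric (latency, response time) would fit the same structure.

```python
from collections import defaultdict


def retransmission_matrix(records):
    """Return the retransmission rate per (source group, destination group)."""
    packets = defaultdict(int)
    retrans = defaultdict(int)
    for src, dst, pkts, rtx in records:
        packets[(src, dst)] += pkts
        retrans[(src, dst)] += rtx
    return {
        pair: retrans[pair] / packets[pair]
        for pair in packets
        if packets[pair] > 0
    }


# Illustrative records: (source group, destination group, packets, retransmissions).
records = [
    ("users-paris", "dc-1", 1000, 5),
    ("users-paris", "dc-2", 1000, 80),
    ("users-london", "dc-1", 2000, 10),
    ("users-london", "dc-2", 500, 45),
]

rates = retransmission_matrix(records)
# Assumed alert threshold of 2% retransmissions.
degraded = sorted(pair for pair, rate in rates.items() if rate > 0.02)
print("degraded pairs:", degraded)
```

In this made-up data set both degraded pairs share the destination `dc-2`, which is the kind of pattern that narrows the investigation to one path or set of devices.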
5. End user experience mapping
Monitoring where users are having a bad experience when accessing applications (and, for that purpose, being able to compare their performance with all the other user groups and datacenters) helps IT operations teams focus on the performance holes and pinpoint the root cause of application delivery failures. To learn more about this, you should read our section on Real User Monitoring.
6. Identify changes in the network traffic pattern
Having a baseline of the geography of the network traffic helps to pinpoint immediately where the traffic pattern has changed in a complex environment.
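One simple way to compare a current matrix with its baseline is sketched below. It assumes both matrices are dictionaries of volume per pair and flags a pair when it appears, disappears, or changes by at least a factor of two in either direction; that factor is an arbitrary assumption to tune for your environment.

```python
def changed_pairs(baseline, current, ratio=2.0):
    """Label pairs that are new, gone, or changed by at least `ratio`."""
    changes = {}
    for pair in baseline.keys() | current.keys():
        before = baseline.get(pair, 0)
        after = current.get(pair, 0)
        if before == after:
            continue
        if before == 0:
            changes[pair] = "new"
        elif after == 0:
            changes[pair] = "gone"
        elif after / before >= ratio or before / after >= ratio:
            changes[pair] = "changed"
    return changes


# Illustrative matrices: volume per (source group, destination group).
baseline = {
    ("users-paris", "dc-1"): 1000,
    ("users-london", "dc-1"): 400,
    ("users-paris", "dc-2"): 300,
}
current = {
    ("users-paris", "dc-1"): 1100,
    ("users-london", "dc-1"): 1200,
    ("users-london", "dc-3"): 50,
}

changes = changed_pairs(baseline, current)
for pair, label in sorted(changes.items()):
    print(pair, label)
```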
7. Track viral infections
Keeping track of machines communicating with non-existent / non-routed subnets can help identify machines infected by viruses and worms. The network matrix provides a single view to identify such patterns.
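As a hedged sketch of this idea, the code below counts, for each source host, how many distinct destinations fall outside the known routed subnets. The routed list, the flow records and the threshold of three targets are hypothetical, and the example assumes the flows are internal traffic only (internet destinations would need their own handling).

```python
from collections import defaultdict
from ipaddress import ip_address, ip_network

# Hypothetical list of subnets that actually exist and are routed.
ROUTED = [ip_network("10.1.0.0/16"), ip_network("192.168.10.0/24")]


def is_routed(ip: str) -> bool:
    """Return True if ip belongs to one of the known routed subnets."""
    addr = ip_address(ip)
    return any(addr in net for net in ROUTED)


def suspicious_sources(flows, min_targets=3):
    """Count distinct unrouted destinations per source; keep the heavy ones."""
    targets = defaultdict(set)
    for src, dst in flows:
        if not is_routed(dst):
            targets[src].add(dst)
    return {
        src: len(dsts)
        for src, dsts in targets.items()
        if len(dsts) >= min_targets
    }


# Illustrative observations: (source IP, destination IP).
flows = [
    ("10.1.0.66", "10.77.0.1"),
    ("10.1.0.66", "10.78.0.1"),
    ("10.1.0.66", "10.79.0.1"),
    ("10.1.0.66", "192.168.10.20"),
    ("10.1.0.5", "192.168.10.20"),
    ("10.1.0.5", "10.50.0.1"),
]

suspects = suspicious_sources(flows)
print(suspects)
```

A host that probes many non-existent subnets is not proof of infection, only a candidate worth investigating.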
What are the prerequisites to build a network traffic matrix?
To build a usable network matrix, you need to make sure your network monitoring solution:
- Shows a wide-angle view of network traffic (from all the user locations to all the datacenters, with no layer 2 or 3 filtering)
- Represents all the datacenter traffic (including the east-west traffic carried in your virtual and cloud environments; read "Best practices for performance troubleshooting in a virtual / cloud data center" for more information on this)
- Scales to handle the traffic load and renders the data fast enough
- Offers sufficient retention times to build a baseline and compare normal and abnormal periods
- Provides a complete set of metrics (not just traffic volume, but also performance indicators)
To make the most of network traffic to troubleshoot performance degradations and monitor end user response times, you need to take a new approach to how you analyse and gather information from your network traffic.
We have summarised our vision of how you can accelerate your diagnostics using wire data in a short guide; download it now!