Extracting payload hash from network traffic #12
Comments
Hi Arun, we did experiment with hashing the TCP or UDP Data field of a packet as a way to detect retransmissions and duplicated packets. In some branch or another, I think there is code to print out the data field as a hex number. Is this the sort of thing you had in mind? Thx!
Exactly, that was what I was looking for. If I could get the data field, then I could compute a hash of it. In my case, a sha256 hash of the payload will suffice.
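For illustration, hashing an extracted data field could look like the minimal sketch below. It assumes the payload bytes have already been obtained by some other means (for example, decoded from a hex dump of the data field); the function name `payload_hash` and the sample payload are hypothetical and are not part of mercury.

```python
import hashlib


def payload_hash(payload: bytes) -> str:
    """Return the SHA-256 digest of a packet's data field as a hex string."""
    return hashlib.sha256(payload).hexdigest()


# Hypothetical sample: the bytes of "GET / HTTP/1.1\r\n", given as hex.
sample_payload = bytes.fromhex("474554202f20485454502f312e310d0a")
print(payload_hash(sample_payload))
```

Two packets carrying the same data field produce the same digest, so comparing digests is enough to flag candidate duplicates.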
Since there is no need for cryptographic collision resistance, and there is a need for speed, I had used the xxhash library https://github.com/Cyan4973/xxHash. It performed quite well in tests. I can't find the code that I had experimented with; I think it was never committed into the git repo. It added a new JSON element that holds the xxhash of the entire TCP data field of a packet, something like this:
{"tcp":{"data_hash":"474554202f20485454502"}, "src_ip":"192.168.113.237", "dst_ip":"35.224.99.156", "protocol":6, "src_port":53560, "dst_port":80, "event_start":1565200503.658237}
The hash provides a practical way to detect duplicated packets, which seem to happen all the time in network capture environments: whatever JSON processing is being done can look for repeated data_hash values. I think the data_hash output could be a useful aid in debugging network capture systems, especially ones with multiple capture interfaces. However, what I'd personally find more useful would be a mercury option that detected duplicate packets and ignored them (by only processing and reporting on the first packet, and ignoring any following ones). Does that line up with your thinking, or do you have some other use cases in mind? Thanks!
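As a rough sketch of the downstream JSON processing described here, the Python below keeps only the first record seen for each data_hash. It assumes one JSON object per line with a `tcp.data_hash` element as in the example record; that element came from uncommitted experimental code, so the field name is an assumption. The function name `drop_duplicates` is hypothetical, and keying on the address and port fields together with the hash is a design choice of this sketch, not something taken from mercury.

```python
import json


def drop_duplicates(json_lines):
    """Yield parsed records, skipping any whose flow and data_hash were already seen."""
    seen = set()
    for line in json_lines:
        record = json.loads(line)
        data_hash = record.get("tcp", {}).get("data_hash")
        if data_hash is None:
            # No hash available (assumed field missing): pass the record through.
            yield record
            continue
        # Assumption: identical payloads on different flows are not duplicates,
        # so the addresses and ports are part of the key.
        key = (
            record.get("src_ip"),
            record.get("dst_ip"),
            record.get("src_port"),
            record.get("dst_port"),
            data_hash,
        )
        if key in seen:
            continue
        seen.add(key)
        yield record


example = (
    '{"tcp":{"data_hash":"474554202f20485454502"}, "src_ip":"192.168.113.237", '
    '"dst_ip":"35.224.99.156", "protocol":6, "src_port":53560, "dst_port":80, '
    '"event_start":1565200503.658237}'
)
unique_records = list(drop_duplicates([example, example]))
print(len(unique_records))
```

The `seen` set grows without bound in this sketch; a long-running deduplicator would need to expire old entries, for example by time window.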
Yes, that is my requirement: to detect duplicate packets based on the payload hash value. One reason for using mercury is that it can handle a high volume of traffic. Is there any way I could help or contribute to integrating that feature in mercury? Thanks!
Thanks for the offer to help. I have a bunch of other changes in progress. After those are done, how about I add a hash-based deduplicator as a compile-time option, and you can build it with that option and test it out in your environment?
Sure, that will be great. Thanks for your help. In the meantime, I will also work on it.
I was wondering whether it would be possible to extract the payload or the payload hash from network traffic, along with the fingerprints, using mercury. Are there any options for it? We can do it with tcpdump, but it does not give fingerprints. Any pointers will be helpful. Thanks.