Segmentation fault during Windows pcap processing (maybe caused by detect-MHR) #3534
Comments
Hmm. The plot thickens. I thought that while waiting to hear back on this one I'd try testing with the rest of the many test pcaps at https://archive.wrccdc.org/pcaps/2018/ but with that It didn't take long to find https://archive.wrccdc.org/pcaps/2018/wrccdc.2018-03-23.010356000000000.pcap.gz also triggers a segfault. So maybe that has a different root cause or maybe the root cause is the same and commenting out Anyway, just figured I'd throw that on the pile in case it helps with isolating the problem and verifying a fix.
Well detect-mhr specifically is potentially very relevant. That script does one thing: look for downloads of a certain mime type: https://github.com/zeek/zeek/blob/master/scripts/policy/frameworks/files/detect-MHR.zeek#L18-L24 Then for any matches it looks up the file hash using a TXT DNS query via
There's only one other script that does this and might be triggered by a random pcap run: ssh/interesting-hostnames.zeek. Does that other pcap crash if you disable both the MHR and the SSH script?
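The lookup flow described above (file hash in, TXT DNS query out) can be sketched roughly as follows. This is only an illustration, not the Zeek script itself: the function names `mhr_query_name` and `parse_mhr_txt` are hypothetical, and both the zone `malware.hash.cymru.com` and the "timestamp detection-rate" answer layout are assumptions based on how the Team Cymru Malware Hash Registry is commonly queried, not something stated in this thread. No DNS query is actually sent; that step is exactly where the real resolver (and so c-ares) would come in.

```python
import hashlib

# ASSUMPTION: zone used for Malware Hash Registry lookups; not stated in the thread.
MHR_ZONE = "malware.hash.cymru.com"


def mhr_query_name(sha1_hex, zone=MHR_ZONE):
    """Build the DNS name that would be queried (TXT) for a file's SHA-1."""
    digest = sha1_hex.strip().lower()
    if len(digest) != 40 or any(c not in "0123456789abcdef" for c in digest):
        raise ValueError("expected a 40-character hex SHA-1 digest")
    return f"{digest}.{zone}"


def parse_mhr_txt(txt):
    """Parse a TXT answer of the assumed form '<unix timestamp> <detection rate>'.

    Returns (timestamp, detection_rate) or None if the answer does not match.
    """
    parts = txt.strip().split(" ", 1)
    if len(parts) != 2:
        return None
    try:
        return int(parts[0]), int(parts[1])
    except ValueError:
        return None


if __name__ == "__main__":
    # Hash of an empty file, used purely as sample input.
    sample = hashlib.sha1(b"").hexdigest()
    print(mhr_query_name(sample))
```

Because every matching download ends in one asynchronous DNS lookup, a pcap with many such downloads exercises the resolver path heavily, which fits the observation that only the DNS-using scripts trigger the crash.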
Thanks @JustinAzoff! Indeed, your theory panned out. I just commented out the SSH (I also just realized that I pasted the wrong pcap into my prior comment, but I just went back and fixed it. Should have been https://archive.wrccdc.org/pcaps/2018/wrccdc.2018-03-23.010356000000000.pcap.gz.) So does knowing the problem reproduces reliably on Windows actually make it any easier to fix? 😬 😄
Does this script make things crash?
We have some tests for DNS, but I think they all run using the fake 'test' resolver, so if it's the real resolver that has the issue, this could have been missed.
Assuming I did it right, it doesn't seem to cause a crash. I put your script into a file
It's definitely possible there's some bug in c-ares on Windows. I'll try to get a backtrace out of a Windows build tomorrow for this.
Got a more useful backtrace finally (after I actually read your repro steps above 🤦🏼‍♂️):
Ah, @philrz, another thing you can try is to use the stock scripts but set ZEEK_DNS_FAKE=1 in the environment. If it crashes without that but runs OK with it set, then yeah, it's definitely something with c-ares or, as the backtrace above shows, the event loop.
@JustinAzoff: Ok, I just tried that, and it seems to validate your "it's definitely something" theory. I was able to use the stock scripts (i.e., both
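For anyone wanting to repeat the A/B check from the exchange above, a minimal harness might look like the sketch below. It runs the same command several times with and without ZEEK_DNS_FAKE=1 and counts non-zero exits. The helper names `count_failures` and `compare_dns_fake` are hypothetical, and treating any non-zero exit code as a crash is a simplifying assumption (the exact code a segfault produces differs between Windows, Git Bash, and POSIX shells).

```python
import os
import subprocess
import sys


def count_failures(cmd, runs=10, dns_fake=False):
    """Run cmd `runs` times and return how many runs exited non-zero."""
    env = dict(os.environ)
    # Start from a clean slate so an inherited setting cannot skew the comparison.
    env.pop("ZEEK_DNS_FAKE", None)
    if dns_fake:
        env["ZEEK_DNS_FAKE"] = "1"
    failures = 0
    for _ in range(runs):
        result = subprocess.run(
            cmd, env=env, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        # ASSUMPTION: any non-zero exit code counts as a crash.
        if result.returncode != 0:
            failures += 1
    return failures


def compare_dns_fake(cmd, runs=10):
    """Compare failure counts with the real resolver vs. ZEEK_DNS_FAKE=1."""
    return {
        "real_resolver": count_failures(cmd, runs, dns_fake=False),
        "fake_resolver": count_failures(cmd, runs, dns_fake=True),
    }


if __name__ == "__main__":
    # Example: python harness.py zeek -r <pcap file> local
    print(compare_dns_fake(sys.argv[1:]))
```

If the failures show up only under "real_resolver", that points at the live DNS path rather than at the scripts themselves, which is the conclusion reached in this thread.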
It's odd because DNS_Mgr and c-ares aren't doing anything untoward that I can tell. It drops the nameserver connection a couple of times but re-establishes at the same time both times. I've tried running against another large pcap and it's not failing there either, though it's doing the same thing. I'll open an issue on the kqueue repo with the crash and see if they have any pointers about what could cause that memory to be invalid.
Update: I spotted mheily/libkqueue#155 as the issue @timwoj mentioned in the last comment above.
I've reproduced this issue using Zeek v6.0.2 that I compiled on a Windows Server 2019 AWS EC2 instance using the instructions from https://docs.zeek.org/en/master/install.html.
I've used my compiled Zeek to successfully generate logs from small/medium pcaps. However, when I try to process the pcap at https://archive.wrccdc.org/pcaps/2018/wrccdc.2018-03-23.010014000000000.pcap.gz (after gunzipping), I can fairly consistently trigger a segmentation fault if I invoke the local script (unmodified from what shipped with the release).
It seems the error is best presented if I drop into Git Bash after calling the BAT script to set the necessary environment variables to run my compiled Zeek.
I then started commenting out lines in local.zeek to try to narrow it down, and it seems that I can avoid the segfault if I comment out this line:
Ten successful runs in a row with that line commented out: