Increase the amount of MaxRecvRetries for thrift socket #5390
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Summary:
to eliminate the effect of dropping privileges in other threads causing poll-ing EINTR errors in thrift.
According to ref to bugzilla.redhat in case of changing privileges
glibc
sends SIGRT_1 to other threads which lead to poll be interrupted. On posix we can not have different credentials for thread of one process. Therefore the solution is either to do not use dropping privileges for the whole osquery process or patch all usages of poll in thrift code. I like first option more because playing with permissions of the wholeosqueryd
can cause unpredicted interferences between threads. For instance the same table can provide different results because some other thread dropping and regaining privileges at the same time.So, the solution for now I'd like to suggest is remove dropping privileges from safe places like reading files with known hostnames or shell history files. And because we can not interact with apt/rpm/yum databases as root and should drop to none user for it I'd suggest to increase the number of attempts to poll in case of EINTR. It can significantly eliminate the problem for now.
To address the problem in issue: #5326
Thanks fmanco for the help to investigate this problem.
Differential Revision: D13781886