Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug: Use RecursiveMutex and read locking on getEventSubscriber #5841

Merged

Conversation

theopolis
Copy link
Member

I noticed osquery_events_tests_inotifytests-test is a flaky test because of getEventSubscriber:

  if (!exists(name_id)) {
    LOG(ERROR) << "Requested unknown event subscriber: " + name_id;
    return nullptr;
  }
  return ef.event_subs_.at(name_id);

This is an unprotected race condition and one (of few) callsite is within EventPublisher::fire. This test exercises the race because it may remove the specific event subscriber before all inotify events have been serviced. Thus the exists works and the at throws.

There are two options IMO to address the issue:

  • Add a lock (which means an embedded ReadLock within a same-thread's WriteLock)
  • Refactor to include a getEventSubscriberUnsafe that does not lock and keep one that waterfalls, which does.

The first option requires a RecursiveMutex since additional callsites are requesting a WriteLock then needing a ReadLock. This seems like the simplest approach.

There is no performance impact measured at the millisecond granularity incurred by introducing the lock and moving to a RecursiveMutex. Note that this code change highlights a few places where the code could be improved, specifically not searching a map twice. I do not want to perform these optimizations within the same PR.

@theopolis theopolis added bug events Related to osquery's evented tables or eventing subsystem labels Sep 26, 2019
@theopolis theopolis force-pushed the fix_events_getEventSubscriber_race branch from 3483222 to 05a23bd Compare September 30, 2019 15:44
@directionless
Copy link
Member

Nice debugging

@theopolis theopolis merged commit 0f3ad48 into osquery:master Oct 2, 2019
muffins pushed a commit to muffins/osquery that referenced this pull request Oct 15, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug events Related to osquery's evented tables or eventing subsystem
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants