Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support pid_with_namespace in more tables #7132

Merged
merged 12 commits into from
Aug 23, 2021

Conversation

nabilschear
Copy link
Contributor

@nabilschear nabilschear commented Jun 1, 2021

Several tables have support for looking inside of docker containers (using the pid_with_namespace option) from the host OS (e.g., deb_packages). This PR adds the same support for several other osquery tables that rely on filesystem access for ground truth. This is just an elaborate cut and paste of existing code from deb_packages into other tables to use the existing generateInNamespace function. I've tested each of these changes by inspecting a docker container from the host.

@nabilschear nabilschear requested review from a team as code owners June 1, 2021 21:00
@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Jun 1, 2021

CLA Signed

The committers are authorized under a signed CLA.

@nabilschear nabilschear force-pushed the namespaces branch 2 times, most recently from 376c1bb to 8ce98ea Compare June 1, 2021 21:07
@theopolis
Copy link
Member

heads up @Smjert

@nabilschear nabilschear changed the title Namespaces Support pid_with_namespace in more tables Jun 2, 2021
@nabilschear
Copy link
Contributor Author

ok, i fixed the broken test (I think)

@nabilschear
Copy link
Contributor Author

nabilschear commented Jun 3, 2021

woohoo all the checks passed @Smjert can you take a look? I think it's your code that I copy and pasted =)

@directionless
Copy link
Member

If I remember correctly, there was some question about whether this could be added to the underlying virtual implementation, without needing to update/maintain it everywhere.

@nabilschear
Copy link
Contributor Author

@Smjert or @theopolis any chance you can get a look at this? I'd like to get this into the next release. Thanks!

@theopolis
Copy link
Member

This seems correct to me, but I am not very confident that I understand these features well.

Copy link
Member

@Smjert Smjert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay, I had a quick look at this initially but I was worried about more tricky and hidden issues.

I had a better look now and there are unfortunately some blockers.
There are mainly a couple of things that have to be checked when adding support for namespace joining to a table (one is a subset of the other).
One is that the table should not try to take use a library that's being potentially used by some other thread, which is sharing locks with.
This will potentially cause a deadlock.

The other is specifically glog macros, which should not be used to do logging in tables. Which is why you have the Logger logger instance that should be used instead.

In these table I see a couple of VLOG() which have not been changed to use the Logger, but that's the minor part.

There are a good portion of tables that use the readFile or forensicReadFile API; unfortunately these APIs have a VLOG and LOG usage here

LOG(WARNING) << "Cannot read file that exceeds size limit: "
<< path.string();
VLOG(1) << "Cannot read " << path.string()
<< " size exceeds limit: " << file_size << " > " << read_max;

To solve this, and ideally in general, APIs shouldn't write anything in the logs, and should return messages via Status that then the caller logs.

The authorized_keys table ends up doing a query against another table via the selectAllFrom.
If I'm not wrong this path contains LOG/VLOG calls too. In general we should avoid doing queries to other tables to implement a table logic, while instead write an API that can be shared and that doesn't log.

There are other tables that use the listInAbsoluteDirectory and resolveFilePattern APIs that in turn use the genGlobs, which in the checkForLoops function contains other logging calls.

@nabilschear
Copy link
Contributor Author

nabilschear commented Jun 14, 2021

Ok @Smjert I took a pass at the fixes you recommended. Here's the status:

  • etc_hosts: readFile -> changed to return status instead of logging
  • dns_resolvers: no changes needed?
  • users: no direct file I/O but there is a mutex for pwdEnumerationMutex for calling getpwnam or getpwuid. refactored to use the _r thread-safe versions of getpwnam etc. Removed mutex.
  • apt_sources: remove VLOG call, calls resolveFilePattern -> genGlobs -> checkForLoops. Removed logging call for symlink loops entirely (this was changed from a LOG to VLOG before, I think it shouldn't be logged at all see Change "Symlink loop" message from warning to verbose #6545)
  • authorized_keys: forensicReadFile-> readFile is good, calls select * on users. See users above.
  • crontab: listFilesInDirectory->genGlobs->checkForLoops already dealt with. forensicReadFile->readFile
  • suid_bin: pass logger into genSuidBinsFromPath
  • yum_sources: pass the logger around. resolveFilePattern->genGlobs->checkForLoops
  • python_packages: pass the logger down into genPackage. Calls readFile (already done). and listDirectoriesInDirectory which eventually gets to checkForLoops.
  • ssh_keys: listFilesInDirectory, forensicReadFile already patched. Calls select * from users

The problem with this is that it is kinda hard to trace all the issues. I tried reading through the code looking for potentially worrying internal library calls.

@nabilschear
Copy link
Contributor Author

nabilschear commented Jun 14, 2021

ok, I went and implemented the users table using the re-entrant _r versions of getpwent, getpwuid, and getpwnam. Code adapted from the man pages.

@nabilschear
Copy link
Contributor Author

@Smjert how does this look? Ready to merge?

Copy link
Member

@Smjert Smjert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I have only done a quick pass, but thank you for making these changes!

The problem with this is that it is kinda hard to trace all the issues. I tried reading through the code looking for potentially worrying internal library calls.

I understand, but unfortunately there isn't really a way around it. This was documented in the original blueprint (#5975).

The plan was always to slowly add support to new tables exactly because the osquery core needed some refactors.

osquery/filesystem/filesystem.cpp Outdated Show resolved Hide resolved
osquery/tables/system/linux/user_groups.cpp Outdated Show resolved Hide resolved
osquery/tables/system/linux/users.cpp Outdated Show resolved Hide resolved
@nabilschear nabilschear force-pushed the namespaces branch 2 times, most recently from da36f9f to a239e2b Compare July 5, 2021 18:26
@nabilschear
Copy link
Contributor Author

ok, I've cleaned up all the readFile related logging and other suggested changes. It's ready for review now.

@nabilschear
Copy link
Contributor Author

@Smjert can you make a final review? I think i've cleaned up everything and fixed the logging in the impacted tables.

@nabilschear nabilschear force-pushed the namespaces branch 2 times, most recently from 0d3bd94 to 7d9f657 Compare July 19, 2021 22:06
@nabilschear
Copy link
Contributor Author

ok, i finally fixed my silly #include issues after rebasing against master. Ready for review @Smjert

@nabilschear
Copy link
Contributor Author

hi, would love to see this merged before the next release! @Smjert @theopolis

@nabilschear
Copy link
Contributor Author

Another ping for getting this merged @Smjert and @theopolis

Copy link
Member

@theopolis theopolis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all of the effort here. Just a few minor requests and then it should be good to go.

Heads up that I did not audit all of the code to assure that ONLY the new logger is used. But I did my best to check at least the code within the table implementations.

osquery/filesystem/filesystem.cpp Outdated Show resolved Hide resolved
osquery/tables/networking/etc_hosts.cpp Outdated Show resolved Hide resolved
osquery/tables/system/linux/user_groups.cpp Outdated Show resolved Hide resolved
osquery/tables/system/linux/user_groups.cpp Outdated Show resolved Hide resolved
osquery/tables/system/linux/user_groups.cpp Outdated Show resolved Hide resolved
osquery/tables/system/posix/apt_sources.cpp Outdated Show resolved Hide resolved
osquery/tables/system/posix/authorized_keys.cpp Outdated Show resolved Hide resolved
osquery/tables/system/posix/crontab.cpp Outdated Show resolved Hide resolved
osquery/tables/system/python_packages.cpp Outdated Show resolved Hide resolved
osquery/tables/system/ssh_keys.cpp Outdated Show resolved Hide resolved
@nabilschear
Copy link
Contributor Author

@theopolis i got all the changes you suggested in

Copy link
Member

@theopolis theopolis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One last round of nitpicks, sorry about this. The thing I really care about is the == -> > change. I think that is my fault for having made the wrong originally code suggestion in the GitHub code review UI.

osquery/tables/system/linux/groups.cpp Outdated Show resolved Hide resolved
osquery/tables/system/linux/user_groups.cpp Outdated Show resolved Hide resolved
osquery/tables/system/linux/users.cpp Outdated Show resolved Hide resolved
@theopolis theopolis added this to the 5.0.0 milestone Aug 21, 2021
@mike-myers-tob mike-myers-tob merged commit 482a751 into osquery:master Aug 23, 2021
aikuchin pushed a commit to aikuchin/osquery that referenced this pull request Jul 11, 2023
…1 to master

* commit 'f72b7c5510b8cd78c9d0450cbd1f31903681caa5': (53 commits)
  Add `TimeoutStopSec` to systemd unit files (osquery#7190)
  Prevent osquery from killing itself when the --force flag is used (osquery#7295)
  Linux: Support AF_PACKET sockets. (osquery#7282)
  libs: update openssl to 1.1.1l (osquery#7293)
  Correct macOS installed app bundle path in osqueryctl and doc (osquery#7289)
  macos path fix in launchd plist (osquery#7288)
  Update osquery installed artifacts default paths in code (osquery#7285)
  Update osquery installed artifacts paths in the documentation (osquery#7286)
  Update packaging SHA (osquery#7279)
  Change to the `disk_encryption` table to support QueryContext (osquery#7209)
  Add feature to skip denylist for event-based queries (osquery#7158)
  Support pid_with_namespace in more tables (osquery#7132)
  audit: socket_events improvements (osquery#7269)
  [linux][packaging] Update packaging paths (osquery#7271)
  Change logger_mode flag to be actually interpreted as an octal (osquery#7273)
  Update `uptime` table descrption (osquery#7270)
  [macOS][packaging] Create an app bundle along with other package_data (osquery#7263)
  Add case sensitive pragma to the pragma/actions authorizer allow list (osquery#7267)
  Fix audit rule removal upon osquery exit (osquery#7221)
  Fix osquery_info build_platform column value on Linux (osquery#7254)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants