Add messages to distributed query results #6352

chrisbroome · 2020-03-31T15:21:41Z

Currently the distributed write API sends back status codes that indicate whether or not a particular query was successful. However, in the event that a query was not successful, we have no other information as to what the failure was. This PR solves the issue by adding a new top-level field to the distributed write results:

messages: A map of queryID -> error message for queries that fail. Successful queries will have an empty string for the message.

Also another top-level key was added for the original sql query statement corresponding to the distributed write request:

statements: A map of queryID -> sql statement.

Together, these fields help TLS distributed api implementors to inform clients about why an error happened - as opposed to just letting them know that some error happened.

terracatta · 2020-03-31T18:57:55Z

@chrisbroome Nice!

Kolide's SaaS Product, K2 provides error messages today, by correlating status logs with the distributed results. This PR will be much nicer because we can deliver the messages simultaneously with the other information.

The key thing that we needed to have in our UX that wasn't obvious at first, was not only do we want to show execution error messages, but also warning messages for queries that succeed.

As an example, consider a user who is remote querying SELECT * FROM chrome_extensions;. The query runs successfully but 0 rows returned. This may lead the user to incorrectly conclude that the remotely queried device doesn't have any chrome extensions, but the reality is, that they are getting zero results because they forgot to JOIN with the users table.

Here is how we show that status in our app today.

I would want those warning messages included in the result output. Is that possible to do?

chrisbroome · 2020-04-01T14:12:54Z

Hmm... I'll have to take a look at the C++ SQL api to see if it's easy to get those warning messages. I'm hoping that it is. The error messages were easy because we were already using them in error log output - just not in the distributed results.

chrisbroome · 2020-04-01T15:49:21Z

All the warnings that appear are currently only LOG messages. They never get returned to calling APIs via Status responses, so I don't think it's a trivial change to make that happen.

Relevant code:

https://github.com/osquery/osquery/blob/master/osquery/sql/virtual_table.cpp#L958-L969

That said, I agree that we need these warnings to be in the distributed query results. We probably need some feedback from regular contributors as to what can be done about that.

zwass · 2020-04-03T22:15:50Z

@chrisbroome Can you talk about the motivation for including the query SQL along with the results?

I am 100% on board with returning the error messages. Been wanting this for a long time and never really looked into it.

chrisbroome · 2020-04-06T13:59:52Z

@zwass The sql statement isn't 100% necessary. But there are a couple of benefits of having it in the distributed write result:

The distributed write result will contain all the information about the query. This is in contrast to the current situation where you get a distributed write result sent to your TLS endpoint and then, to correlate what query it came from, you have to do some kind of lookup (usually to a database) by query ID to get the information about the original query.
It provides an easier debugging experience. It the case where a query fails, you'll usually want to log that somewhere. In my opinion, it's really nice to have the sql statement right there in the response so that (again) I don't have to do another database lookup to get that information. I can just log it and move on.

directionless · 2020-04-09T16:51:51Z

The sql statement isn't 100% necessary. But there are a couple of benefits of having it in the distributed write result

Do we need more discussion about this item?

directionless · 2020-04-09T16:53:16Z

All the warnings that appear are currently only LOG messages. They never get returned to calling APIs via Status responses, so I don't think it's a trivial change to make that happen.

My gut sense is that this is an important addition, but isn't something we need to block on. Is there somewhere to leave a TODO, or an obvious place to fix that?

directionless · 2020-04-14T19:10:24Z

We talked about this again in office hours today. There are still some concerns about including the sql -- among them is that it greatly increases the amount of network traffic and storage bytes.

Do you think you could split the statuses from the statements, so we can merge the former and keep discussing the latter?

chrisbroome · 2020-04-15T05:07:23Z

Sure I can do that!

chrisbroome · 2020-04-30T00:56:01Z

Updated the PR to not include the SQL statements. For some reason the LGTM c/c++ build step is failing. I don't understand why.

Smjert · 2020-04-30T14:40:43Z

Updated the PR to not include the SQL statements. For some reason the LGTM c/c++ build step is failing. I don't understand why.

The LGTM step can be ignored, it's a WIP and it's not required :).

chrisbroome force-pushed the new-distributed-result-fields branch from bca8bb5 to c660f20 Compare April 1, 2020 19:57

chrisbroome force-pushed the new-distributed-result-fields branch 2 times, most recently from f50e3c4 to f002e38 Compare April 8, 2020 19:23

chrisbroome force-pushed the new-distributed-result-fields branch from f002e38 to c3937b1 Compare April 13, 2020 15:43

Add messages to distributed query results

86d5516

chrisbroome force-pushed the new-distributed-result-fields branch from c3937b1 to 86d5516 Compare April 29, 2020 18:53

chrisbroome changed the title ~~Add messages and statements to distributed query results~~ Add messages to distributed query results Apr 29, 2020

theopolis approved these changes May 4, 2020

View reviewed changes

theopolis merged commit 26d94ce into osquery:master May 4, 2020

chrisbroome deleted the new-distributed-result-fields branch May 4, 2020 15:46

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add messages to distributed query results #6352

Add messages to distributed query results #6352

chrisbroome commented Mar 31, 2020

terracatta commented Mar 31, 2020 •

edited

chrisbroome commented Apr 1, 2020

chrisbroome commented Apr 1, 2020

zwass commented Apr 3, 2020

chrisbroome commented Apr 6, 2020 •

edited

directionless commented Apr 9, 2020

directionless commented Apr 9, 2020

directionless commented Apr 14, 2020

chrisbroome commented Apr 15, 2020

chrisbroome commented Apr 30, 2020

Smjert commented Apr 30, 2020

Add messages to distributed query results #6352

Add messages to distributed query results #6352

Conversation

chrisbroome commented Mar 31, 2020

terracatta commented Mar 31, 2020 • edited

chrisbroome commented Apr 1, 2020

chrisbroome commented Apr 1, 2020

zwass commented Apr 3, 2020

chrisbroome commented Apr 6, 2020 • edited

directionless commented Apr 9, 2020

directionless commented Apr 9, 2020

directionless commented Apr 14, 2020

chrisbroome commented Apr 15, 2020

chrisbroome commented Apr 30, 2020

Smjert commented Apr 30, 2020

terracatta commented Mar 31, 2020 •

edited

chrisbroome commented Apr 6, 2020 •

edited