Processor functions order lead to problems with timestamps #3820

Open
AlonZivony opened this issue Jan 23, 2024 · 5 comments

@AlonZivony
Collaborator

Description

Recently we changed the pipeline to have the processEvent stage, in which processor functions are called for the specific event.
We also added support there for processor functions that apply to all events, and normalizeEventCtxTimes is registered this way.
However, this introduced some difficulties.
The all-events processor functions are called only after all the event-specific processor functions have run.
This means that no event-specific processor function can rely on the timestamps, because at that point they are still monotonic rather than absolute (since epoch).
One example is the processor for sched_process_exec, which creates the captured exec files. It uses the timestamp for the captured files, so since the change those times are no longer useful.
This is not the only place affected.
We should introduce a mechanism for registering processor functions to run either before or after the time fix, according to their needs.
I expect more ordering problems like this in the future.
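
To make the ordering problem concrete, here is a minimal sketch in Go. It is not Tracee's actual code: the `Event` type, the `processEvent` signature, the `normalizeTime` helper, and the boot time value are all simplified assumptions that only mimic the ordering described above.

```go
package main

import "fmt"

// Event is a hypothetical, simplified stand-in for a pipeline event.
type Event struct {
	Name      string
	Timestamp int64 // nanoseconds; monotonic until normalized
}

type processor func(*Event)

// bootTimeNs is an assumed boot time in nanoseconds since the Unix epoch.
const bootTimeNs int64 = 1_700_000_000_000_000_000

// normalizeTime plays the role of the all-events time fix.
func normalizeTime(e *Event) { e.Timestamp += bootTimeNs }

// processEvent mimics the reported ordering: event-specific processors
// run first, all-events processors run last.
func processEvent(e *Event, specific map[string][]processor, all []processor) {
	for _, p := range specific[e.Name] {
		p(e)
	}
	for _, p := range all {
		p(e)
	}
}

// timestampSeenByExecProcessor returns the timestamp observed inside the
// event-specific processor and the final timestamp after the whole stage.
func timestampSeenByExecProcessor(monotonicNs int64) (seen, final int64) {
	specific := map[string][]processor{
		"sched_process_exec": {func(e *Event) { seen = e.Timestamp }},
	}
	e := &Event{Name: "sched_process_exec", Timestamp: monotonicNs}
	processEvent(e, specific, []processor{normalizeTime})
	return seen, e.Timestamp
}

func main() {
	seen, final := timestampSeenByExecProcessor(5_000_000_000)
	fmt.Println(seen)  // 5000000000 (still monotonic inside the processor)
	fmt.Println(final) // 1700000005000000000 (normalized, but too late)
}
```

The event-specific processor only ever observes the monotonic value, which is why anything it derives from the timestamp is not epoch-based.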


@NDStrahilevitz
Collaborator

I think we can do something like AllBefore and AllAfter: it is explicit and gets the point across.
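
A rough sketch of what an AllBefore/AllAfter split could look like in Go. Everything here is hypothetical: the `Registry` type, the `RegisterAllBefore`, `Register`, and `RegisterAllAfter` method names, and the labels used in the demo are assumptions for illustration, not an existing API.

```go
package main

import (
	"fmt"
	"strings"
)

// Event is a hypothetical, simplified pipeline event.
type Event struct{ Name string }

type Processor func(*Event) error

// Registry keeps three groups of processors: those that run before all
// event-specific processors, the event-specific ones, and those that run after.
type Registry struct {
	allBefore []Processor
	specific  map[string][]Processor
	allAfter  []Processor
}

func NewRegistry() *Registry {
	return &Registry{specific: make(map[string][]Processor)}
}

func (r *Registry) RegisterAllBefore(p Processor) { r.allBefore = append(r.allBefore, p) }
func (r *Registry) RegisterAllAfter(p Processor)  { r.allAfter = append(r.allAfter, p) }
func (r *Registry) Register(name string, p Processor) {
	r.specific[name] = append(r.specific[name], p)
}

// Process runs the groups in a fixed order and stops at the first error.
func (r *Registry) Process(e *Event) error {
	groups := [][]Processor{r.allBefore, r.specific[e.Name], r.allAfter}
	for _, group := range groups {
		for _, p := range group {
			if err := p(e); err != nil {
				return err
			}
		}
	}
	return nil
}

// callOrder registers processors in a scrambled order and reports the
// order in which they actually ran.
func callOrder() string {
	var calls []string
	record := func(label string) Processor {
		return func(*Event) error {
			calls = append(calls, label)
			return nil
		}
	}
	r := NewRegistry()
	r.RegisterAllAfter(record("cleanup"))
	r.Register("sched_process_exec", record("capture exec"))
	r.RegisterAllBefore(record("normalize time"))
	_ = r.Process(&Event{Name: "sched_process_exec"})
	return strings.Join(calls, " -> ")
}

func main() {
	fmt.Println(callOrder()) // normalize time -> capture exec -> cleanup
}
```

With this shape, the execution order depends only on the group a processor is registered in, not on the order of registration calls.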

@AlonZivony
Collaborator Author

This is possible and would solve the current issue (I guess that by AllBefore you mean "before all processors").
The only reason not to do it is if we think a more detailed ordering will be needed in the future, which would eventually require implementing a priority mechanism.
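
For comparison, a priority mechanism might be sketched as follows. This is only an illustration of the idea: the `Chain` type, the `Add` method, the priority values, and the processor labels are all made up, and lower numbers are assumed to run first.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Event is a hypothetical, simplified pipeline event.
type Event struct{ Name string }

type entry struct {
	priority int
	label    string
	run      func(*Event)
}

// Chain keeps its processors sorted by ascending priority.
type Chain struct{ entries []entry }

// Add inserts a processor. The stable sort keeps processors with equal
// priority in the order they were added.
func (c *Chain) Add(priority int, label string, run func(*Event)) {
	c.entries = append(c.entries, entry{priority, label, run})
	sort.SliceStable(c.entries, func(i, j int) bool {
		return c.entries[i].priority < c.entries[j].priority
	})
}

// Process runs every processor in priority order and returns the labels
// in the order they ran.
func (c *Chain) Process(e *Event) string {
	labels := make([]string, 0, len(c.entries))
	for _, en := range c.entries {
		en.run(e)
		labels = append(labels, en.label)
	}
	return strings.Join(labels, " -> ")
}

func demoOrder() string {
	noop := func(*Event) {}
	c := &Chain{}
	c.Add(50, "capture exec", noop)
	c.Add(0, "normalize time", noop)
	c.Add(50, "second exec processor", noop)
	c.Add(100, "cleanup", noop)
	return c.Process(&Event{Name: "sched_process_exec"})
}

func main() {
	// normalize time -> capture exec -> second exec processor -> cleanup
	fmt.Println(demoOrder())
}
```

AllBefore and AllAfter are then just two special cases (the lowest and highest priority), at the cost of every registration having to pick a number.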

@yanivagman added this to the v0.22.0 milestone May 9, 2024
@yanivagman
Collaborator

Recently I have been thinking about pushing the time normalization to the sink stage.
The reason is that normalization is required for user-facing outputs, while for internal usage we can do something similar to what is suggested in #3726.
By moving the normalization to the sink stage, we don't need to worry about processor order, or about calling the same processor methods more than once (e.g. for signatures/derived events), which would cause "double normalization" with wrong values.
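
The "double normalization" hazard and the sink-stage alternative can be illustrated with a small Go sketch. The `Event` type, the `normalizeInPlace` and `toSink` helpers, and the boot time constant are assumptions made for this example and do not reflect the real pipeline code.

```go
package main

import "fmt"

// Event is a hypothetical, simplified pipeline event.
type Event struct{ TimestampNs int64 }

// bootTimeNs is an assumed boot time in nanoseconds since the Unix epoch.
const bootTimeNs int64 = 1_700_000_000_000_000_000

// normalizeInPlace mutates the event, so running it twice on the same
// event adds the boot time twice.
func normalizeInPlace(e *Event) { e.TimestampNs += bootTimeNs }

// toSink returns a normalized copy for user-facing output and leaves the
// internal event (passed by value) untouched.
func toSink(e Event) Event {
	e.TimestampNs += bootTimeNs
	return e
}

func main() {
	const monotonicNs = 5_000_000_000

	// In-place normalization applied twice, e.g. once for the original
	// event and once more when it is reprocessed.
	twice := Event{TimestampNs: monotonicNs}
	normalizeInPlace(&twice)
	normalizeInPlace(&twice)
	fmt.Println(twice.TimestampNs == bootTimeNs+monotonicNs) // false

	// Normalization only at the sink: internal value stays monotonic.
	internal := Event{TimestampNs: monotonicNs}
	out := toSink(internal)
	fmt.Println(out.TimestampNs == bootTimeNs+monotonicNs) // true
	fmt.Println(internal.TimestampNs == monotonicNs)       // true
}
```

Because the conversion happens once, on a copy, at the very end, no earlier stage can apply it a second time by accident.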

@AlonZivony
Collaborator Author

If #3726 is implemented, I don't think there is a need either.
However, creating such a solution will probably use the processor functions as well...
Either way, this will solve the problem as long as the time change occurs before all other processor functions.

@yanivagman
Collaborator

yanivagman commented May 20, 2024

> If #3726 is implemented, I don't think there is a need as well. However, to create such a solution will probably use the processor functions as well... Either way this will solve the problem if the time change will occur before all other processor functions

Processor functions can use the GetEpoch method of the timestamp (as described in #3726) and don't need to care how the time is represented internally, so there is no need to make any time change before the processor functions run.
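
As an illustration of this idea, a timestamp type with a GetEpoch method could look roughly like the Go sketch below. Only the method name comes from this thread; the `Timestamp` struct, its fields, the `NewTimestamp` constructor, the return type, and the `captureFileName` helper with its `exec.<ns>` naming are assumptions, and the actual design in #3726 may differ.

```go
package main

import "fmt"

// Timestamp is a hypothetical wrapper that hides how time is stored.
// Here it is assumed to keep the raw monotonic value plus the boot time.
type Timestamp struct {
	monotonicNs int64
	bootTimeNs  int64
}

func NewTimestamp(monotonicNs, bootTimeNs int64) Timestamp {
	return Timestamp{monotonicNs: monotonicNs, bootTimeNs: bootTimeNs}
}

// GetEpoch returns nanoseconds since the Unix epoch. It never mutates the
// timestamp, so calling it repeatedly always yields the same value.
func (t Timestamp) GetEpoch() int64 { return t.bootTimeNs + t.monotonicNs }

// captureFileName is a made-up example of a processor-side consumer that
// needs epoch time regardless of when normalization happens.
func captureFileName(ts Timestamp) string {
	return fmt.Sprintf("exec.%d", ts.GetEpoch())
}

func main() {
	ts := NewTimestamp(5_000_000_000, 1_700_000_000_000_000_000)
	fmt.Println(ts.GetEpoch())       // 1700000005000000000
	fmt.Println(ts.GetEpoch())       // 1700000005000000000 (unchanged)
	fmt.Println(captureFileName(ts)) // exec.1700000005000000000
}
```

Since the conversion is a read-only computation, it does not matter in which order processors run or how many times they ask for the epoch value.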
