Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update cppcheck to version 2.6.3 and skip analysis for third party code #7455

Merged
merged 5 commits into from
Mar 2, 2022

Conversation

Smjert
Copy link
Member

@Smjert Smjert commented Jan 22, 2022

Also improve the analysis performance by skipping analysis
on all third party code.

In practice the check_source_code job has gone from ~30m to ~5m.

Also improve the analysis performance by skipping analysis
on all third party code.
@Smjert Smjert added the CI/CD Anything about the Continuous Integration or Continuous Deployment tool used by the repository label Jan 22, 2022
@Smjert Smjert requested review from a team as code owners January 22, 2022 18:10
@Smjert Smjert added the Linux label Jan 22, 2022
Copy link
Member

@directionless directionless left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the second COPY will cause the final image to be slightly larger. If you think it's worth it, we could. Otherwise see comments.

FROM base3 AS cppcheck
ENV cppcheckVer 2.6.3
WORKDIR /root
RUN case $(uname -m) in amd64|x86_64) git clone https://github.com/danmar/cppcheck.git \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I can live wit this, it is unnecessary (and IMO confusing). The docker pattern of creating large RUN lines is to avoid bloating the final image with unneeded layers.

But we achieve that in the final FROM scratch AS builder anyhow.

Copy link
Member Author

@Smjert Smjert Jan 31, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But here I need to conditionally run the operations, so the long list of commands has to run on amd64, while the very last mkdir is for arm64 to avoid failing the COPY operation later.

Are you suggesting to separate each command on it's own RUN and put an if everytime to check if the platform is amd64?
Wouldn't that be even more noisy?

Meanwhile I realized I've slightly messed up the order between the cppcheck step and the cleaning of the apt database.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I didn't even notice the if. I don't know if there's anything cleaner.

cppcheck step and the cleaning of the apt database

If you do the layers right, you don't need to clean the apt db.

Copy link
Member Author

@Smjert Smjert Jan 31, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you do the layers right, you don't need to clean the apt db.

How do you mean?
The only way I know to leave things behind is to do a separate FROM x AS y as we've been doing and then copy portions from it, but stuff installed from apt can land in various folders, I don't know if it's worth to try to piece things like that in the final step via copies?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think so. Still from apt can get sprayed everywhere, and it's hard to clean up. But that doesn't matter if you're throwing the whole container away and only doing COPY --from=cppcheck /root/cppcheck/install/usr/local/ /usr/local/

I think we can probably keep this, but I find it hard to read.

Copy link
Member Author

@Smjert Smjert Mar 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The apt cleaning step is not meant for the stuff installed in the cppcheck step, but for the initial installation of packages:
https://github.com/osquery/osquery/pull/7455/files#diff-dc9d3419831a3c9190d25d38293841d6a8268c2146641a5b9b55e7ee70927669L4-L28

This stuff installed here has to persist, it's not thrown away.

Maybe I'm doing it wrong; the objective is to remove the cached packages that the very first step has left behind, so the intention is to run those commands in the base3 layer, which is the source for base4, which is the destination for the cppcheck layer.

To be even more clear, I hope, the way I'm thinking about it is that we have a base layer/view which will contain the final state of the container.
This is the view where we do the first installation of various packages via APT; what APT installs has to be permanent and found in the final view.
The difficulty of APT changing many things in the system I was mentioning is this: If it was easy to know perfectly what are all the files that APT changed, excluding its caches, then we could copy all of those into a new layer where APT has not run, and so you drop everything else behind.
But this is not the case.

Then what happens is that the base view/layer is copied multiple times (new layers are created) and new pieces are added to it.

Then we come to the cppcheck layer; while it is based from this evolving view, it's temporary. Only a piece of it it's copied, because it's easily contained under a single folder and thrown away.

Finally we cleanup the base view/layer from stuff we don't need.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're talking cross purposes, or mostly agreeing, or something like that.

I'm observing that the length of the RUN command in the cppcheck container makes it hard to follow, and that since we're going to throw most of this container away later, we can use multiple RUN lines.

You, reasonable, pointed out that it's a mechanism to switch on platform. And possible the simplest.

For cleaning up the apt db stuff, if we wanted to toss any rm things in into base3 that seems reasonable. I don't think we need to have a working apt system at the end of this,

Comment on lines 69 to 70
COPY --from=base3 / /
COPY --from=cppcheck /root/cppcheck/install/usr/local/ /usr/local/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds a second COPY, which I think counts as a second layer. Somewhat mixed on that.

I'd probably:

FROM base3 as base4
COPY --from=cppcheck /root/cppcheck/install/usr/local/ /usr/local/

FROM scratch as build
COPY --from=base4 / /

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is me not knowing enough Docker and having the wrong expectations; I'll change it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love this architecturally, but AFAIK it's what we get

Copy link
Member

@directionless directionless left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll approve, since I think the exclusions are worth getting in soon. Even if I think the docker stuff is hard to read

@directionless directionless changed the title Update cppcheck to version 2.6.3 Update cppcheck to version 2.6.3 and skip analysis for third party code Mar 2, 2022
@Smjert Smjert merged commit 415caf2 into osquery:master Mar 2, 2022
@Smjert Smjert deleted the stefano/improvement/cppcheck2 branch March 2, 2022 15:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI/CD Anything about the Continuous Integration or Continuous Deployment tool used by the repository Linux
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants