-
Notifications
You must be signed in to change notification settings - Fork 8.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Check context every 128 labels instead of 100 #14118
Check context every 128 labels instead of 100 #14118
Conversation
Follow up on prometheus#14096 As promised, I bring a benchmark, which shows a very small improvement if context is checked every 128 iterations of label instead of every 100. It's much easier for a computer to check modulo 128 than modulo 100. This is a very small 0-2% improvement but I'd say this is one of the hottest paths of the app so this is still relevant. Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Oleg.
@jesusvazquez nerd-sniped me, so I'll do the same with the rest, here's the difference in assembly between doing modulo 128 and modulo 100: TEXT main.modulo128eq0(SB), NOSPLIT|NOFRAME|ABIInternal, $0-8
FUNCDATA $0, gclocals·g2BeySu+wFnoycgXfElmcg==(SB)
FUNCDATA $1, gclocals·g2BeySu+wFnoycgXfElmcg==(SB)
FUNCDATA $5, main.modulo128eq0.arginfo1(SB)
FUNCDATA $6, main.modulo128eq0.argliveinfo(SB)
PCDATA $3, $1
TESTQ $127, AX
SETEQ AL
RET
TEXT main.modulo100eq0(SB), NOSPLIT|NOFRAME|ABIInternal, $0-8
FUNCDATA $0, gclocals·g2BeySu+wFnoycgXfElmcg==(SB)
FUNCDATA $1, gclocals·g2BeySu+wFnoycgXfElmcg==(SB)
FUNCDATA $5, main.modulo100eq0.arginfo1(SB)
FUNCDATA $6, main.modulo100eq0.argliveinfo(SB)
PCDATA $3, $1
MOVQ $-8116567392432202711, CX
IMULQ AX, CX
MOVQ $368934881474191032, DX
ADDQ DX, CX
ROLQ $62, CX
MOVQ $184467440737095516, DX
CMPQ DX, CX
SETCC AL
RET |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM learned something new today with this
a % 128 = 0
is the same as a & 127 = 0
Which is a faster operation since it just requires doing the AND and not the whole divison and remainer comparison https://godbolt.org/z/4ETM4W8b7
Small clarifications: there is no need to explicitly rewrite |
Follow up on #14096
As promised, I bring a benchmark, which shows a very small improvement if context is checked every 128 iterations of label instead of every 100.
It's much easier for a computer to check modulo 128 than modulo 100. This is a very small 0-2% improvement but I'd say this is one of the hottest paths of the app so this is still relevant.