-
Notifications
You must be signed in to change notification settings - Fork 369
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cache TTLs for individual IP addresses in DNS responses. #6732
Conversation
d491609
to
5fbf0f1
Compare
ipMetaDataHolder[ipStr] = ipWithTTL{ | ||
ip: ipMeta.ip, | ||
expirationTime: ipMeta.expirationTime, | ||
} | ||
minTimeToReQuery = minTime(ipMeta.expirationTime, minTimeToReQuery) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since thees lines appear four times at least, might make sense to make it a helper.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Dyanngg Yes, that was a nice improvement.
@@ -76,11 +76,16 @@ func (fs *fqdnSelectorItem) matches(fqdn string) bool { | |||
// expirationTime of the records, which is the DNS response | |||
// receiving time plus lowest applicable TTL. | |||
type dnsMeta struct { | |||
expirationTime time.Time | |||
//expirationTime time.Time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove if no longer needed
@@ -253,8 +258,8 @@ func (f *fqdnController) getIPsForFQDNSelectors(fqdns []string) []net.IP { | |||
} | |||
for fqdn := range fqdnsMatched { | |||
if dnsMeta, ok := f.dnsEntryCache[fqdn]; ok { | |||
for _, ip := range dnsMeta.responseIPs { | |||
matchedIPs = append(matchedIPs, ip) | |||
for _, ipWithMetaData := range dnsMeta.responseIPs { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: s/ipWithMetaData/ipData
|
||
currentTime := time.Now() | ||
ipWithTTLMap := make(map[string]ipWithTTL) | ||
minTimeToReQuery := time.Unix(1<<63-62135596801, 999999999) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what time is this supposed to be? At the very least there should be a comment to explain. But based on usage, I am not sure why we are not just using the zero value for time.Time
:
var minTimeToReQuery time.Time
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@antoninbas Actually ,
minTimeToReQuery := time.Unix(1<<63-62135596801, 999999999)
i had chosen above expression to establish a reference point for tracking the minimum time to re-query DNS requests as IPs expire. Since our goal in this PR for this bug/issue is to send requests as and when an IP expires, initializing minTimeToReQuery
to a maximum time value allows us to compare and update it as we process various time values , ultimately allowing us to determine the earliest re-query time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What you could do though is to make earlierOf
take time.Time
pointers as parameters, and return the non-nil value if one of t1/t2 is nil. That way you would not need to initialize minTimeToReQuery
. Note that in that case the return value should also be a pointer IMO (since earlierOf(nil, nil) is nil), and when it's used eventually you should check that minTimeToReQuery
is not nil
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Dyanngg , yes i think using that approach we can avoid initialising minTimeToReQuery
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am fine with the pointer solution, but it seems to me that we could just pick a reasonable time such as now + 24*time.Hour
and use that (with an appropriate comment of course). The time is only used if len(ipWithTTLMap) > 0
anyway. And it seems reasonable to resend a query after 24h even if we don't really need to? At least it won't impact correctness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, i think this was the point i was missing . Maybe i had set it to too large a value under assumption that i needed a very large value. The 24 hour time frame seems more reasonable as a replacement to existing large value that i had put.
minTimeToReQuery := time.Unix(1<<63-62135596801, 999999999) | ||
addressUpdate := false | ||
|
||
ipMapFiller := func(ip string, ipMeta ipWithTTL, minTime *time.Time) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the name of this function (ipMapFiller
) is really hurting the readability of the code. I would suggest something like addIPtoMap
/ addIPToCache
/ addIPToNewCache
/ setIPInMap
/ ...
Additionally, minTime
does not need to be a parameter to this function: you are always using &minTimeToReQuery
as the argument value, and minTimeToReQuery
is captured by the closure (by reference).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes , my bad. The closure captures the variable minTimeToReQuery
and should not be passed as an argument.
t.Errorf("Expected TTL for %s to be %v, got %v", ipStr, expectedTTL, ipMeta.expirationTime) | ||
} | ||
} | ||
println("\n\n") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't use a print statement in a test like this
// sample IP with time simulating expired time, i thought that using negative time will | ||
// simulate expired time as it will always equate to before when compared to any time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
be more concise in your comments:
// expired IP that should be removed from the cache when onDNSResponse is called
// we get new IP | ||
"192.0.2.3": {ip: net.ParseIP("192.0.2.3"), expirationTime: currentTime.Add(10 * time.Second)}, | ||
// and an exisiting IP | ||
"192.0.2.1": {ip: net.ParseIP("192.0.2.1"), expirationTime: currentTime.Add(10 * time.Second)}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't the test more useful if the new TTL for this IP is different from the current one? If the new TTL in the response is greater than the existing one, the new one should be sued. If it is less than the existing one, the current one should remain in use.
}, | ||
}, | ||
{ | ||
name: "old IP expired", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a strict subset of the previous test, and doesn't need to be included
there are 2 possible approaches here:
- have a single unit test case with multiple different IPs, ensuring that all the branches / cases of the
onDNSResponse
are tested - have multiple test cases, each testing a different branch / scenario
Usually, we prefer the second approach because it makes test failures easier to identify / troubleshoot.
test/e2e/framework.go
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these changes are unrelated to the rest of the PR, and should be removed
f.setFQDNMatchSelector(fqdn, selectorItem) | ||
for ipStr, ipMeta := range newDNSresponseIPs { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does not make sense to be in part of the for loop... We don't want to call addIPToCache
for as many times as the number of selectorItems matching the fqdn
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Dyanngg yes updated it.
if mustCacheResponse { | ||
|
||
if len(ipWithTTLMap) > 0 { | ||
addressUpdate = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is correct. Isn't addressUpdate
supposed to be determined by whether FQDN <-> IP mappings have changed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Dyanngg updated .
@@ -405,80 +409,102 @@ func (f *fqdnController) deleteRuleSelectedPods(ruleID string) error { | |||
|
|||
func (f *fqdnController) onDNSResponse( | |||
fqdn string, | |||
responseIPs map[string]net.IP, | |||
lowestTTL uint32, | |||
newDNSresponseIPs map[string]ipWithTTL, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the camel format, it should be newDNSResponseIPs
, however, I feel DNS and response is kind of duplicate with the function name, and given you name the struct ipWithTTL
, I would just name the parameter newIPWithTTLs
, or newIPWithTTLMap
to correspond to the variable at L424.
//minTimeToReQuery Establishes a maximum reference time for tracking the minimum re-query time to DNS, as IPs expire. | ||
minTimeToReQuery := time.Now().Add(24 * time.Hour) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
//minTimeToReQuery Establishes a maximum reference time for tracking the minimum re-query time to DNS, as IPs expire. | |
minTimeToReQuery := time.Now().Add(24 * time.Hour) | |
// minTimeToRequery establishes a maximum reference time for tracking the minimum requery time to DNS, as IPs expire. | |
minTimeToRequery := time.Now().Add(24 * time.Hour) |
// This IP entry has already expired and not seen in the latest DNS response. | ||
// It should be removed from the cache. | ||
|
||
for cachedIpStr, cachedIpMeta := range cachedDNSMeta.responseIPs { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for cachedIpStr, cachedIpMeta := range cachedDNSMeta.responseIPs { | |
for cachedIPStr, cachedIPMeta := range cachedDNSMeta.responseIPs { |
// It should be removed from the cache. | ||
|
||
for cachedIpStr, cachedIpMeta := range cachedDNSMeta.responseIPs { | ||
if _, exist := newDNSresponseIPs[cachedIpStr]; !exist { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if _, exist := newDNSresponseIPs[cachedIpStr]; !exist { | |
if newIPMeta, exist := newDNSresponseIPs[cachedIpStr]; !exist { |
then use it at L461, instead of querying the map another time.
if mustCacheResponse { | ||
|
||
if len(ipWithTTLMap) > 0 { | ||
addressUpdate = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when no IP is changed, I think the map is not empty, why this is set to true?
responseIPs map[string]ipWithTTL | ||
} | ||
|
||
type ipWithTTL struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be ipWithExpiration
(we store the expiration time, not the TTL)
@@ -405,80 +408,115 @@ func (f *fqdnController) deleteRuleSelectedPods(ruleID string) error { | |||
|
|||
func (f *fqdnController) onDNSResponse( | |||
fqdn string, | |||
responseIPs map[string]net.IP, | |||
lowestTTL uint32, | |||
newIPWithTTLMap map[string]ipWithTTL, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment: it should be newIPWithExpirationMap
.
currentTime := time.Now() | ||
ipWithTTLMap := make(map[string]ipWithTTL) | ||
|
||
//minTimeToRequery establishes a maximum reference time for tracking the minimum re-query time to DNS, as IPs expire. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: leave whitespace after //
if !addressUpdate { | ||
addressUpdate = true | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typically, you just unconditionally write addressUpdate = true
in this case. I'm pretty sure there is no advantage in checking the value first. If anything, the extra branching probably makes it worse.
if !addToCache { | ||
addToCache = true | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
if !exists { | ||
t.Errorf("Expected %s to be found in dns cache.", ipStr) | ||
continue | ||
} | ||
|
||
if cachedIPMeta.expirationTime.Before(time.Now()) { | ||
t.Errorf("Expected %s to be found with a valid TTL,but it has expired.", ipStr) | ||
} | ||
|
||
if !cachedIPMeta.expirationTime.Equal(expectedTTL) { | ||
t.Errorf("Expected TTL for %s to be %v, got %v", ipStr, expectedTTL, cachedIPMeta.expirationTime) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use testify assertions for consistency, not t.Error
test/e2e/framework.go
Outdated
randomPatchAnnotationKey = "test.antrea.io/random-value" | ||
annotationValueLen = 8 | ||
iperfPort = 5201 | ||
iperfSvcPort = 9999 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I pointed this out before: please remove these unrelated changes from the PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reminder
//minTimeToRequery establishes a maximum reference time for tracking the minimum re-query time to DNS, as IPs expire. | ||
minTimeToRequery := currentTime.Add(24 * time.Hour) | ||
|
||
addIPtoIPWithTTLMap := func(ip string, ipMeta ipWithTTL) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some nits: addIPtoIPWithTTLMap -> updateMapWithIP, also minTimeToRequery -> timeToRequery as the former sounds like a duration rather than a timestamp
} else { | ||
// First time seeing this domain. | ||
// check if this needs to be tracked, by checking its presence in the datapath rules. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// check if this needs to be tracked, by checking its presence in the datapath rules. | |
// check if this needs to be tracked, by checking if it matches any Antrea FQDN policy selectors. |
// If a FQDN policy had been applied then there must be rule records but because it's | ||
// not in cache hence its FQDN:SelectorItem mapping may not be present. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand this comment, but I'm pretty sure that it can be covered by the previous one
// timeToRequery establishes a maximum reference time for tracking the minimum re-query time to DNS, as IPs expire. | ||
timeToRequery := currentTime.Add(24 * time.Hour) | ||
|
||
updateMapWithIP := func(ip string, ipMeta ipWithExpiration) { | ||
ipWithExpirationMap[ip] = ipMeta | ||
if ipMeta.expirationTime.Before(timeToRequery) { | ||
timeToRequery = ipMeta.expirationTime | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// timeToRequery establishes a maximum reference time for tracking the minimum re-query time to DNS, as IPs expire. | |
timeToRequery := currentTime.Add(24 * time.Hour) | |
updateMapWithIP := func(ip string, ipMeta ipWithExpiration) { | |
ipWithExpirationMap[ip] = ipMeta | |
if ipMeta.expirationTime.Before(timeToRequery) { | |
timeToRequery = ipMeta.expirationTime | |
} | |
} | |
// timeToRequery establishes a maximum reference time for tracking the minimum re-query time to DNS, as IPs expire. | |
var timeToRequery *time.Time | |
updateMapWithIP := func(ip string, ipMeta ipWithExpiration) { | |
ipWithExpirationMap[ip] = ipMeta | |
if timeToRequery == nil || ipMeta.expirationTime.Before(*timeToRequery) { | |
timeToRequery = &ipMeta.expirationTime | |
} | |
} |
To avoid introducing the vague duration 24 * time.Hour
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another way is to move least expiration calculation to a function of dnsMeta
, then there is no need to have this function, and it can save a copy in "First time seeing this domain" case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is awesome. I also had earlier thought on similar lines but i thought of it using as :
var timeToRequery time.Time
but as you know it meant timeToRequery
gets set to its zero value and would not have worked because its zero value would have always been lesser, making me to set an arbitrary value to it.
// timeToRequery establishes a maximum reference time for tracking the minimum re-query time to DNS, as IPs expire. | ||
timeToRequery := currentTime.Add(24 * time.Hour) | ||
|
||
updateMapWithIP := func(ip string, ipMeta ipWithExpiration) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updateMapWithIP := func(ip string, ipMeta ipWithExpiration) { | |
updateIPWithExpiration := func(ip string, ipMeta ipWithExpiration) { |
// check for presence of already cached IPs in the new response. | ||
for cachedIPStr, cachedIPMeta := range cachedDNSMeta.responseIPs { | ||
if newIPMeta, exist := newIPWithExpiration[cachedIPStr]; !exist { | ||
// An already cached IP has been found in new response. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment says the opposite of the code.
@@ -38,6 +39,12 @@ func newMockFQDNController(t *testing.T, controller *gomock.Controller, dnsServe | |||
if dnsServer != nil { | |||
dnsServerAddr = *dnsServer | |||
} | |||
mockDnsQueryQueue := workqueue.NewTypedRateLimitingQueueWithConfig( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mockDnsQueryQueue := workqueue.NewTypedRateLimitingQueueWithConfig( | |
mockDNSQueryQueue := workqueue.NewTypedRateLimitingQueueWithConfig( |
test/e2e/framework.go
Outdated
randomPatchAnnotationKey = "test.antrea.io/random-value" | ||
annotationValueLen = 8 | ||
iperfPort = 5201 | ||
iperfSvcPort = 9999 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reminder
Could you remove Also, it's perhaps time to squash the commits and write a good commit message. |
{ | ||
name: "new IP added", | ||
existingDNSCache: map[string]dnsMeta{ | ||
"fqdn-test-pod.lfx.test": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can use testFQDN
.
assert.Equal(t, len(cachedDnsMetaData.responseIPs), len(tc.expectedIPs), "Expected %d IPs in cache, got %d", len(tc.expectedIPs), len(cachedDnsMetaData.responseIPs)) | ||
|
||
for ipStr, expectedTTL := range tc.expectedIPs { | ||
cachedIPMeta, exists := cachedDnsMetaData.responseIPs[ipStr] | ||
|
||
assert.True(t, exists, "Expected %s to be found in dns cache.", ipStr) | ||
assert.False(t, cachedIPMeta.expirationTime.Before(currentTime), "Expected %s to have a valid TTL, but it has expired.", ipStr) | ||
assert.Equal(t, expectedTTL, cachedIPMeta.expirationTime, "Expected TTL for %s to be %v, got %v", ipStr, expectedTTL, cachedIPMeta.expirationTime) | ||
|
||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you could just change tc.dnsResponseIPs
to map[string]ipWithExpiration
, then one line code can validate the result and cover all cases.
assert.Equal(t, tc.expectedIPs, cachedDNSMetaData.responseIPs)
workqueue.NewTypedItemExponentialFailureRateLimiter[string](1*time.Millisecond, 100*time.Millisecond), | ||
workqueue.TypedRateLimitingQueueConfig[string]{ | ||
Name: "fqdn", | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should use a fake Clock
to speed up enqueue, see
antrea/pkg/agent/controller/ipseccertificate/ipsec_certificate_controller_test.go
Line 280 in 47ce51e
fakeController := newFakeController(t, fakeClock) |
|
||
} | ||
|
||
if tc.expectedItem == testFQDN { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if tc.expectedItem == testFQDN { | |
if tc.expectedItem != "" { |
…onses. - Implement tracking of individual IP TTLs for FQDNs in the DNS cache. - Upon expiration of an IP's TTL, trigger a new DNS query. - Evict only those IPs that are no longer present in the latest DNS response and have exceeded their original TTL. Signed-off-by: Hemant <hkbiet@gmail.com>
eb3f4ba
to
e98030f
Compare
} | ||
f.dnsQueryQueue.AddAfter(fqdn, recordTTL.Sub(time.Now())) | ||
f.dnsQueryQueue.AddAfter(fqdn, timeToReQuery.Sub(currentTime)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a comment explaining why timeToReQuery
cannot be nil here. Theoretically I think it cannot, but
it is not obvious at all
fakeClock := clocktesting.NewFakeClock(time.Now()) | ||
currentTime := fakeClock.Now() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
each test case should have its own clock, otherwise with your current code, one test advancing the clock will impact subsequent test cases
you can do something like this:
currentTime := time.Now()
// use currentTime to define your test cases, as you are doing now
// declare fakeClock := clocktesting.NewFakeClock(currentTime) at the beginning of each test case
currentTime := time.Now() | ||
ipWithExpirationMap := make(map[string]ipWithExpiration) | ||
|
||
// timeToReQuery establishes a maximum reference time for tracking the minimum re-query time to DNS, as IPs expire. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// timeToReQuery establishes a maximum reference time for tracking the minimum re-query time to DNS, as IPs expire. | |
// timeToRequery establishes the time after which a new DNS query should be sent for the FQDN. | |
// timeToRequery will be set to the minimum expiration time of all IPs added or updated in the cache. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggesting this as I think the current comment is not very helpful or accurate
ipWithExpirationMap := make(map[string]ipWithExpiration) | ||
|
||
// timeToReQuery establishes a maximum reference time for tracking the minimum re-query time to DNS, as IPs expire. | ||
var timeToReQuery *time.Time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can we rename timeToReQuery
to timeToRequery
?
f, _ := newMockFQDNController(t, controller, nil) | ||
mockDNSQueryQueue := workqueue.NewTypedRateLimitingQueueWithConfig( | ||
workqueue.NewTypedItemExponentialFailureRateLimiter[string](minRetryDelay, maxRetryDelay), | ||
workqueue.TypedRateLimitingQueueConfig[string]{ | ||
Name: "fqdn-test", | ||
Clock: fakeClock, | ||
}, | ||
) | ||
f.dnsQueryQueue = mockDNSQueryQueue | ||
f.dnsEntryCache = tc.existingDNSCache |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while this may work, a better approach is to have a clock.Clock
as a field in fqdnController
, and make sure the entire implementation of fqdnController
uses that clock (e.g., calls clock.Now()
instead of time.Now()
). The default implementation will use clock.RealClock{}
, but newMockFQDNController
should support injecting a custom clock (fake clock).
with this approach, the test function won't need to build a custom workqueue
} | ||
|
||
if tc.expectedItem != "" { | ||
fakeClock.Step(5 * time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this 5 * time.Second
values comes out of nowhere. I know it is based on the inputs to the test case (existingDNSCache
, etc.) but it should itself be a part of the test case definition, otherwise it looks like some magical value
what I would do is remove tc.expectedItem
and add tc.expectedRequeryAfter *time.Duration
as a test case input. This way you only have one input for this. You already know that the expected item is testFQDN
, so this doesn't need to be an input.
I'm also fixing a couple of issues in the snippet below:
if tc.expectedRequeryAfter != nil {
fakeClock.Step(*tcexpectedRequeryAfter)
// needed to avoid blocking on Get() in case of failure
require.Eventually(t, func() bool { f.dnqQueryQueue.Len() > 0 }, 1*time.Second, 10*time.Millisecond)
item, _ := f.dnsQueryQueue.Get()
f.dnsQueryQueue.Done(item)
assert.Equal(t, testFqdn, item)
} else {
// make sure that there is no requery
assert.Never(t, func() bool { f.dnqQueryQueue.Len() > 0 }, 100*time.Millisecond, 10*time.Millisecond)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for feedback and proper outline on this . I got some logical insights from this, for instance, I was directly calling Get()
on the queue after advancing the time, without checking if the queue really contained any items or not because i "assumed" that it really must contain the item. This could have lead to blocking indefinitely until an item was actually available, and using require.Eventually
helps ensure that the queue actually has items before attempting to retrieve them.
- Replace time.Now() usages with the clock set in fqdnController. - Inject fakeClock for tests. Removed mockDnsQueryQueue implementation. Signed-off-by: Hemant <hkbiet@gmail.com>
@@ -17,7 +17,7 @@ package networkpolicy | |||
import ( | |||
"context" | |||
"fmt" | |||
"math" | |||
"k8s.io/utils/clock" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be in the second import group
@@ -152,6 +156,12 @@ type fqdnController struct { | |||
ipv4Enabled bool | |||
ipv6Enabled bool | |||
gwPort uint32 | |||
// clock allows for time-dependent operations within the fqdnController. | |||
// By using a clock.Clock as a field, we ensure that all time-related calls throughout the | |||
// implementation of the fqdnController utilize this clock, for example , calls to clock.Now() instead of time.Now() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// implementation of the fqdnController utilize this clock, for example , calls to clock.Now() instead of time.Now() | |
// implementation of the fqdnController utilize this clock |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"This patch enables the caching of TTLs for individual IPs in DNS responses." is not a good commit message title. Please follow the "git commit" link in https://github.com/antrea-io/antrea/blob/main/CONTRIBUTING.md#getting-reviewers to write a proper one.
{ | ||
name: "existing DNS cache not impacted", | ||
existingDNSCache: map[string]dnsMeta{ | ||
"fqdn-test-pod.lfx.test": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"fqdn-test-pod.lfx.test": { | |
testFQDN: { |
Same comment applies to other cases
expectedIPs: map[string]ipWithExpiration{ | ||
"192.1.1.3": {ip: net.ParseIP("192.1.1.3"), expirationTime: currentTime.Add(5 * time.Second)}, | ||
}, | ||
expectedRequeryAfter: getDuration(5 * time.Second), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
expectedRequeryAfter: getDuration(5 * time.Second), | |
expectedRequeryAfter: ptr.To(5 * time.Second), |
and no need to create getDuration
}, | ||
}, | ||
{ | ||
name: "old IP resent in DNS response is retained with an updated TTL fetched from response", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could add another IP to existingDNSCache
and dnsResponseIPs
, the former has shorter expirationTime than the latter, to validate it's always the later expiration time is used.
…upport - Implement tracking of TTLs for individual IPs linked to FQDNs in the DNS cache. - Automatically trigger a new DNS query when an IP's TTL expires. - Evict only those IPs that are no longer present in the latest DNS response and have surpassed their original TTL. - Added clock field of type clock.Clock in fqdnController. - Pass clock while initializing fqdn controller. Signed-off-by: Hemant <hkbiet@gmail.com>
…antrea into cache-individual-ip-ttl
Signed-off-by: Hemant <hkbiet@gmail.com>
Signed-off-by: Hemant <hkbiet@gmail.com>
// timeToRequery passed below is guaranteed to be non-nil here because updateIPWithExpiration() is called on each IP in | ||
// newIPWithExpiration that either isn't cached in the existing DNS cache or it remains unexpired. This ensures that timeToReQuery is | ||
// always set to the earliest expiration time in the response, hence also enabling proper DNS re-query scheduling. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
try to wrap lines to 100 chars at the max
…tuality test to 1 second Signed-off-by: Hemant <hkbiet@gmail.com>
…ueue. Updated onDNSResponse() to requeue DNS queries even when same IPs are receieved in a response. Added and updated comments as per review. Signed-off-by: Hemant <hkbiet@gmail.com>
Signed-off-by: Hemant <hkbiet@gmail.com>
Updated comment clarifying the case when ipWithExpirationMap is of zero length in onDNSResponse. Updated test case which ensures the requery time to be the later time amongst the TTL already cached and TTL received in DNS response for cases involving same IPs, as present in DNS cache, being received. Signed-off-by: Hemant <hkbiet@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only one small comment on my side, otherwise LGTM
regplaced dnsQueryQueue.AddAfter logic as before. Signed-off-by: Hemant <hkbiet@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM except a format issue
…ts from external imports Signed-off-by: Hemant <hkbiet@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
…NSResponse test. - Seemingly a race condition between the event of creation of Timer by DelayeingQueue and Step function of fakeClock is failing the test. - The fix Uses the idea in antrea-io#6407 - The fix uses the ,already implemented wrapper, around fakeClock with a method that counts the number of times DelayedQueue's Timer has been called. Signed-off-by: Hemant <hkbiet@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Signed-off-by: Hemant <hkbiet@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
/test-all |
This is a draft PR for work towards #6722 .