Job:
#OCPBUGS-23746 issue 3 weeks ago openshift-apiserver ClusterOperator should not blip Available=False on brief missing HTTP content-type New
Issue 15637203: openshift-apiserver ClusterOperator should not blip Available=False on brief missing HTTP content-type
Description: h2. Description of problem:
 
 Seen [in 4.15 update CI|https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.15-e2e-azure-ovn-upgrade/1727427846533550080]:
 {code:none}
 : [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
 Run #0: Failed	1h28m25s
 {  1 unexpected clusteroperator state transitions during e2e test run 
 
 Nov 22 21:47:32.876 - 1s    E clusteroperator/openshift-apiserver condition/Available reason/APIServices_Error status/False APIServicesAvailable: rpc error: code = Unknown desc = malformed header: missing HTTP content-type}
 {code}
 While the Kube API server, if that's what's missing the header, is supposed to always be available, an issue that only persists for 1s is not long enough to warrant [immediate admin intervention|https://github.com/openshift/api/blob/c3f7566f6ef636bb7cf9549bf47112844285989e/config/v1/types_cluster_operator.go#L149-L153]. Teaching the openshift-apiserver operator to stay {{Available=True}} for this kind of brief hiccup, while still going {{Available=False}} for issues where [at least part of the component is non-functional, and that the condition requires immediate administrator intervention|https://github.com/openshift/api/blob/c3f7566f6ef636bb7cf9549bf47112844285989e/config/v1/types_cluster_operator.go#L149-L153], would make it easier for admins and SREs operating clusters to identify when intervention is required.
 h2. Version-Release number of selected component (if applicable):
 {code:none}
 $ w3m -dump -cols 200 'https://search.ci.openshift.org/?maxAge=48h&type=junit&search=clusteroperator/openshift-apiserver+should+not+change+condition/Available' | grep '^periodic-.*4[.]15.*failures match' | sort
 periodic-ci-openshift-multiarch-master-nightly-4.15-ocp-e2e-ibmcloud-ovn-multi-ppc64le (all) - 4 runs, 100% failed, 25% of failures match = 25% impact
 periodic-ci-openshift-multiarch-master-nightly-4.15-ocp-e2e-ibmcloud-ovn-multi-s390x (all) - 4 runs, 25% failed, 200% of failures match = 50% impact
 periodic-ci-openshift-multiarch-master-nightly-4.15-upgrade-from-nightly-4.14-ocp-ovn-remote-libvirt-s390x (all) - 5 runs, 100% failed, 40% of failures match = 40% impact
 periodic-ci-openshift-multiarch-master-nightly-4.15-upgrade-from-stable-4.14-ocp-e2e-upgrade-azure-ovn-arm64 (all) - 5 runs, 40% failed, 50% of failures match = 20% impact
 periodic-ci-openshift-multiarch-master-nightly-4.15-upgrade-from-stable-4.14-ocp-e2e-upgrade-azure-ovn-heterogeneous (all) - 5 runs, 20% failed, 100% of failures match = 20% impact
 periodic-ci-openshift-release-master-ci-4.15-e2e-aws-ovn-upgrade (all) - 5 runs, 20% failed, 200% of failures match = 40% impact
 periodic-ci-openshift-release-master-ci-4.15-e2e-aws-upgrade-ovn-single-node (all) - 1 runs, 100% failed, 100% of failures match = 100% impact
 periodic-ci-openshift-release-master-ci-4.15-e2e-azure-ovn-upgrade (all) - 50 runs, 56% failed, 21% of failures match = 12% impact
 periodic-ci-openshift-release-master-ci-4.15-e2e-gcp-ovn-upgrade (all) - 80 runs, 44% failed, 17% of failures match = 8% impact
 periodic-ci-openshift-release-master-ci-4.15-upgrade-from-stable-4.14-e2e-aws-ovn-upgrade (all) - 80 runs, 30% failed, 13% of failures match = 4% impact
 periodic-ci-openshift-release-master-ci-4.15-upgrade-from-stable-4.14-e2e-azure-sdn-upgrade (all) - 80 runs, 43% failed, 6% of failures match = 3% impact
 periodic-ci-openshift-release-master-ci-4.15-upgrade-from-stable-4.14-e2e-gcp-ovn-rt-upgrade (all) - 50 runs, 16% failed, 63% of failures match = 10% impact
 periodic-ci-openshift-release-master-ci-4.15-upgrade-from-stable-4.14-from-stable-4.13-e2e-aws-sdn-upgrade (all) - 1 runs, 100% failed, 100% of failures match = 100% impact
 periodic-ci-openshift-release-master-nightly-4.15-e2e-aws-ovn-single-node-serial (all) - 5 runs, 100% failed, 100% of failures match = 100% impact
 periodic-ci-openshift-release-master-nightly-4.15-e2e-aws-ovn-upgrade-rollback-oldest-supported (all) - 5 runs, 40% failed, 50% of failures match = 20% impact
 periodic-ci-openshift-release-master-nightly-4.15-e2e-aws-sdn-upgrade (all) - 50 runs, 18% failed, 11% of failures match = 2% impact
 periodic-ci-openshift-release-master-nightly-4.15-e2e-gcp-ovn-etcd-scaling (all) - 1 runs, 100% failed, 100% of failures match = 100% impact
 periodic-ci-openshift-release-master-nightly-4.15-e2e-ibmcloud-csi (all) - 1 runs, 100% failed, 100% of failures match = 100% impact
 periodic-ci-openshift-release-master-nightly-4.15-e2e-vsphere-ovn-techpreview (all) - 5 runs, 40% failed, 50% of failures match = 20% impact
 periodic-ci-openshift-release-master-nightly-4.15-upgrade-from-stable-4.14-e2e-aws-upgrade-ovn-single-node (all) - 1 runs, 100% failed, 100% of failures match = 100% impact
 periodic-ci-openshift-release-master-nightly-4.15-upgrade-from-stable-4.14-e2e-metal-ipi-sdn-bm-upgrade (all) - 5 runs, 100% failed, 20% of failures match = 20% impact
 periodic-ci-openshift-release-master-okd-scos-4.15-e2e-aws-ovn-upgrade (all) - 15 runs, 47% failed, 14% of failures match = 7% impact
 {code}
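
The {{impact}} column in the listing above appears to be roughly the product of the two percentages before it (for example, 56% failed times 21% of failures matching gives about 12%). The sketch below parses one summary line and recomputes that estimate. The regular expression and the function names are my own, not part of the search tool, and because the listed percentages are already rounded the recomputed value can differ from the listed one by a point.

```python
import re

# Matches summary lines of the form shown above, e.g.
# "<job> (all) - 50 runs, 56% failed, 21% of failures match = 12% impact"
LINE_RE = re.compile(
    r"^(?P<job>\S+) \(all\) - (?P<runs>\d+) runs, "
    r"(?P<failed>\d+)% failed, (?P<match>\d+)% of failures match = "
    r"(?P<impact>\d+)% impact$"
)


def parse_summary(line):
    """Return the fields of one summary line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "job": m.group("job"),
        "runs": int(m.group("runs")),
        "failed_pct": int(m.group("failed")),
        "match_pct": int(m.group("match")),
        "impact_pct": int(m.group("impact")),
    }


def estimated_impact(failed_pct, match_pct):
    """Approximate impact (percent of all runs) from the two rounded percentages."""
    return failed_pct * match_pct / 100.0


line = (
    "periodic-ci-openshift-release-master-ci-4.15-e2e-azure-ovn-upgrade (all) - "
    "50 runs, 56% failed, 21% of failures match = 12% impact"
)
row = parse_summary(line)
print(row["job"], row["runs"], estimated_impact(row["failed_pct"], row["match_pct"]))
```

Filtering the parsed rows on {{runs}} is one way to set aside the single-run jobs whose 100% impact says little about frequency.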
 
 The impact rates are low enough that I haven't checked older 4.y.  And it's possible that some of those matches have the operator going {{Available=False}} for other reasons besides {{APIServices_Error}}:
 
 {code:none}
 $ curl -s 'https://search.ci.openshift.org/search?maxAge=48h&type=junit&name=4.15.*upgrade&context=0&search=clusteroperator/openshift-apiserver.*condition/Available.*status/False' | jq -r 'to_entries[].value | to_entries[].value[].context[]' | sed 's|.*clusteroperator/\([^ ]*\) condition/Available reason/\([^ ]*\) status/False.*|\1 \2|' | sort | uniq -c | sort -n
       2 openshift-apiserver APIServerDeployment_NoPod
       2 openshift-apiserver APIServerDeployment_PreconditionNotFulfilled
      19 openshift-apiserver APIServices_Error
      22 openshift-apiserver APIServerDeployment_NoDeployment
 {code}
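
For anyone who prefers to do the tally outside a shell pipeline, here is a minimal Python sketch of the same {{sed | sort | uniq -c}} step. It assumes the context lines have already been extracted (the {{jq}} part of the pipeline above). The function name is hypothetical, the first sample line is the one quoted in the description, and the second is a made-up entry for illustration only. Unlike {{sed}}, it skips lines that do not match instead of passing them through.

```python
import re
from collections import Counter

# Same capture as the sed expression above: the operator name and the
# reason of an Available=False entry.
ENTRY_RE = re.compile(
    r"clusteroperator/(\S+) condition/Available reason/(\S+) status/False"
)


def tally_reasons(context_lines):
    """Count (operator, reason) pairs among Available=False context lines."""
    counts = Counter()
    for line in context_lines:
        m = ENTRY_RE.search(line)
        if m:
            counts[(m.group(1), m.group(2))] += 1
    return counts


sample = [
    # Taken from the CI run quoted in the description.
    "Nov 22 21:47:32.876 - 1s    E clusteroperator/openshift-apiserver "
    "condition/Available reason/APIServices_Error status/False "
    "APIServicesAvailable: rpc error: code = Unknown desc = malformed header: "
    "missing HTTP content-type",
    # Hypothetical line, for illustration only.
    "E clusteroperator/openshift-apiserver condition/Available "
    "reason/APIServerDeployment_NoPod status/False example message",
    "an unrelated line that should be ignored",
]

counts = tally_reasons(sample)
# Print in ascending count order, like `sort | uniq -c | sort -n`.
for (operator, reason), n in sorted(counts.items(), key=lambda kv: kv[1]):
    print(n, operator, reason)
```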
 
 h2. How reproducible:
 
 {{12% impact}} for {{periodic-ci-openshift-release-master-ci-4.15-e2e-azure-ovn-upgrade}} looks like the highest impact among the jobs with double-digit run counts.
 
 h2. Steps to Reproduce:
 
 Run {{periodic-ci-openshift-release-master-ci-4.15-e2e-azure-ovn-upgrade}} a bunch of times watching the {{openshift-apiserver}} ClusterOperator's {{Available}} condition.
 
 h2. Actual results:
 
 Some very brief blips of {{Available=False}} that self-resolve before an admin could possibly respond to the summons.
 
 h2. Expected results:
 
 No quickly-resolving blips in CI.  No long runs of {{Available=False}} for issues that don't seem worth summoning an admin.  Still going {{Available=False}} for outages that need immediate admin response.
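
One way to picture the expected behavior is a grace period: keep reporting {{Available=True}} until a failure has persisted for some minimum time. The sketch below is only an illustration of that idea in Python. It is not the operator's actual code, and the class name, the {{grace_seconds}} parameter, and the 30-second value are assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DebouncedAvailable:
    """Report Available=False only after a failure has lasted grace_seconds."""

    grace_seconds: float  # hypothetical threshold, not a value from the operator
    failing_since: Optional[float] = None

    def observe(self, healthy: bool, now: float) -> bool:
        """Record one probe result at time `now`; return the status to publish."""
        if healthy:
            # Any successful probe clears the pending failure.
            self.failing_since = None
            return True
        if self.failing_since is None:
            # First failed probe: remember when the failure started.
            self.failing_since = now
        # Stay Available=True until the failure outlasts the grace period.
        return (now - self.failing_since) < self.grace_seconds


d = DebouncedAvailable(grace_seconds=30)
blip = [d.observe(True, 0), d.observe(False, 10), d.observe(True, 11)]
outage = [d.observe(False, 20), d.observe(False, 60)]
print(blip, outage)
```

With these inputs a 1s failure never flips the published status, while a failure that is still present 40s after it began does.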
Status: New
#OCPBUGS-54927 issue 2 days ago CI: API is broken in periodic-ci-openshift-release-master-nightly-4.19-e2e-aws-ovn-single-node-techpreview-serial CLOSED
{code:java}
: [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available {code}
My understanding is that this fails if "openshift-apiserver" moves out of Available to some other state. If this happens often, it may explain why other things are reporting "connection refused". The test last passed on Feb 11th and started failing later the same day, never to recover; I went run by run and could not find a single successful run after that.
periodic-ci-openshift-release-master-ci-4.15-upgrade-from-stable-4.14-e2e-azure-sdn-upgrade (all) - 26 runs, 81% failed, 86% of failures match = 69% impact
#1933139205853024256 junit 23 hours ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#1933206045182660608 junit 19 hours ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#1933070087208570880 junit 27 hours ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#1932946075551797248 junit 36 hours ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#1933453525832962048 junit 3 hours ago
2025-06-13T12:12:37Z: Call to sippy finished after: 1.359756163s
response Body: {"ProwJobName":"periodic-ci-openshift-release-master-ci-4.15-upgrade-from-stable-4.14-e2e-azure-sdn-upgrade","ProwJobRunID":1933453525832962048,"Release":"4.15","CompareRelease":"4.15","Tests":[{"Name":"[bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available","TestID":0,"Risk":{"Level":{"Name":"Low","Level":1},"Reasons":["This test has passed 77.42% of 62 runs on jobs [periodic-ci-openshift-release-master-ci-4.15-upgrade-from-stable-4.14-e2e-azure-sdn-upgrade] in the last 14 days."],"CurrentRuns":62,"CurrentPasses":48,"CurrentPassPercentage":77.41935483870968},"OpenBugs":[]}],"OverallRisk":{"Level":{"Name":"Low","Level":1},"Reasons":["Maximum failed test risk: Low"],"JobRunTestCount":4617,"JobRunTestFailures":1,"NeverStableJob":false,"HistoricalRunTestCount":2216},"OpenBugs":[{"id":16343469,"key":"OCPBUGS-42875","created_at":"2025-04-28T17:08:30.586775Z","updated_at":"2025-06-13T12:12:30.839623Z","deleted_at":null,"status":"ON_QA","last_change_time":"2025-06-12T17:07:56Z","summary":"clusteroperator/cluster-autoscaler blips Degraded=True during upgrade test","affects_versions":["4.18"],"fix_versions":null,"target_versions":["4.20.0"],"components":["Cluster Autoscaler"],"labels":["good-first-issue"],"url":"https://issues.redhat.com/browse/OCPBUGS-42875"}]}
#1933453525832962048 junit 3 hours ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
2 unexpected clusteroperator state transitions during e2e test run.  These did not match any known exceptions, so they cause this test-case to fail:
#1932424606430269440 junit 2 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#1932492162251886592 junit 2 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#1932166766834749440 junit 3 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#1932227958546632704 junit 3 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#1931242280341999616 junit 6 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#1931191644590182400 junit 6 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#1930457064727908352 junit 8 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#1930279495055446016 junit 8 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#1930138180237922304 junit 9 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#1929727180695146496 junit 10 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#1929846899536302080 junit 10 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#1928572384734875648 junit 13 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#1928468881198813184 junit 13 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.

Found in 69.23% of runs (85.71% of failures) across 26 total runs and 1 jobs (80.77% failed) in 150ms
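
As a sanity check on the summary line, the underlying run counts can be recovered from the percentages. This is a small worked example, assuming every matching run is also counted as a failed run; the variable names are my own.

```python
total_runs = 26
failed_pct = 80.77             # "80.77% failed"
match_of_failures_pct = 85.71  # "85.71% of failures"

# 26 * 0.8077 is about 21.0, so 21 of the 26 runs failed.
failed_runs = round(total_runs * failed_pct / 100)

# 21 * 0.8571 is about 18.0, so 18 of the failed runs match the search.
matching_runs = round(failed_runs * match_of_failures_pct / 100)

# Matching runs as a share of all runs, which the summary calls "Found in".
found_in_pct = 100 * matching_runs / total_runs

print(failed_runs, matching_runs, round(found_in_pct, 2))
```

The result agrees with the 69.23% reported in the summary and with the rounded "69% impact" shown for the same job earlier in the results.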