Job:
#OCPBUGS-62517 issue 8 days ago ClusterOperator olm goes Available=False with reason=CatalogdDeploymentCatalogdControllerManager_Deploying or reason=OperatorcontrollerDeploymentOperatorControllerControllerManager_Deploying during updates POST
Issue 17438850: ClusterOperator olm goes Available=False with reason=CatalogdDeploymentCatalogdControllerManager_Deploying or reason=OperatorcontrollerDeploymentOperatorControllerControllerManager_Deploying during updates
Description: Description of problem:
 
 [A component must not report Available=False during the course of a normal upgrade.|https://github.com/openshift/api/blob/7f245291a17ac0bd31cf8ba08530c3355b86dbea/config/v1/types_cluster_operator.go#L156]
 
 ClusterOperator olm goes Available=False with reason=CatalogdDeploymentCatalogdControllerManager_Deploying or reason=OperatorcontrollerDeploymentOperatorControllerControllerManager_Deploying during updates
 
 Example job: https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-master-ci-4.21-e2e-gcp-ovn-upgrade/1972489796022439936
 {code:none}
    Sep 29 04:35:47.504 E clusteroperator/olm condition/Available reason/CatalogdDeploymentCatalogdControllerManager_Deploying status/False CatalogdDeploymentCatalogdControllerManagerAvailable: Waiting for Deployment
 Sep 29 04:35:47.504 - 52s   E clusteroperator/olm condition/Available reason/CatalogdDeploymentCatalogdControllerManager_Deploying status/False CatalogdDeploymentCatalogdControllerManagerAvailable: Waiting for Deployment
 Sep 29 04:42:35.127 E clusteroperator/olm condition/Available reason/OperatorcontrollerDeploymentOperatorControllerControllerManager_Deploying status/False OperatorcontrollerDeploymentOperatorControllerControllerManagerAvailable: Waiting for Deployment
 Sep 29 04:42:35.127 - 12s   E clusteroperator/olm condition/Available reason/OperatorcontrollerDeploymentOperatorControllerControllerManager_Deploying status/False OperatorcontrollerDeploymentOperatorControllerControllerManagerAvailable: Waiting for Deployment
  {code}
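 The interval lines above (the ones carrying a {{- Ns}} duration) can be totalled to see how long the operator spent {{Available=False}} per reason. The following is a minimal sketch, assuming the line layout seen in this log and durations expressed in whole seconds; {{unavailable_seconds}} and {{INTERVAL_RE}} are hypothetical names, not part of any OpenShift tooling.

```python
import re
from collections import defaultdict

# Assumed layout of an interval line, taken from the log excerpt above:
# "<Mon DD HH:MM:SS.mmm> - <N>s   E clusteroperator/<name> condition/Available reason/<reason> status/False ..."
INTERVAL_RE = re.compile(
    r"^\s*\w{3} +\d{1,2} \d{2}:\d{2}:\d{2}\.\d{3} - (?P<secs>\d+)s\s+"
    r"\w clusteroperator/(?P<operator>\S+) condition/Available "
    r"reason/(?P<reason>\S+) status/False\b"
)


def unavailable_seconds(lines):
    """Sum the Available=False durations per (operator, reason).

    Point-in-time lines (no "- Ns" duration) are skipped so that each
    outage is counted once.
    """
    totals = defaultdict(int)
    for line in lines:
        match = INTERVAL_RE.match(line)
        if match:
            key = (match["operator"], match["reason"])
            totals[key] += int(match["secs"])
    return dict(totals)
```

 Fed the four lines above, this sketch would attribute 52s to the catalogd reason and 12s to the operator-controller reason.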
 Version-Release number of selected component (if applicable):
 
 The issue was spotted with a 4.21 to 4.21 upgrade test.
 {code:none}
 INFO[2025-09-29T02:33:17Z] Using explicitly provided pull-spec for release initial (registry.ci.openshift.org/ocp/release:4.21.0-0.ci-2025-09-28-082535)
 INFO[2025-09-29T02:33:17Z] Using explicitly provided pull-spec for release latest (registry.ci.openshift.org/ocp/release:4.21.0-0.ci-2025-09-29-022535)
 {code}
 How reproducible:
 
 It seems to fail every time in [the aggregated job|https://prow.ci.openshift.org/view/gs/test-platform-results/logs/aggregated-gcp-ovn-upgrade-4.21-micro-release-openshift-release-analysis-aggregator/1972561250676117504], but there is also [a green run|https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/30308/pull-ci-openshift-origin-main-e2e-gcp-ovn-upgrade/1971564973029068800] in a similar test.
 {code:none}
 ### failure
 $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/logs/periodic-ci-openshift-release-master-ci-4.21-e2e-gcp-ovn-upgrade/1972489796022439936/artifacts/e2e-gcp-ovn-upgrade/openshift-e2e-test/artifacts/junit/e2e-monitor-tests__20250929-034333.xml | grep 'clusteroperator/olm should not change condition/Available' -A1
     <testcase name="[Monitor:legacy-cvo-invariants][bz-OLM] clusteroperator/olm should not change condition/Available" time="7014.05639286">
         <failure message="">4 unexpected clusteroperator state transitions during e2e test run.  These did not match any known exceptions, so they cause this test-case to fail:&#xA;&#xA;Sep 29 04:35:47.504 E clusteroperator/olm condition/Available reason/CatalogdDeploymentCatalogdControllerManager_Deploying status/False CatalogdDeploymentCatalogdControllerManagerAvailable: Waiting for Deployment&#xA;Sep 29 04:35:47.504 - 52s   E clusteroperator/olm condition/Available reason/CatalogdDeploymentCatalogdControllerManager_Deploying status/False CatalogdDeploymentCatalogdControllerManagerAvailable: Waiting for Deployment&#xA;Sep 29 04:42:35.127 E clusteroperator/olm condition/Available reason/OperatorcontrollerDeploymentOperatorControllerControllerManager_Deploying status/False OperatorcontrollerDeploymentOperatorControllerControllerManagerAvailable: Waiting for Deployment&#xA;Sep 29 04:42:35.127 - 12s   E clusteroperator/olm condition/Available reason/OperatorcontrollerDeploymentOperatorControllerControllerManager_Deploying status/False OperatorcontrollerDeploymentOperatorControllerControllerManagerAvailable: Waiting for Deployment&#xA;&#xA;2 unwelcome but acceptable clusteroperator state transitions during e2e test run.  
These should not happen, but because they are tied to exceptions, the fact that they did happen is not sufficient to cause this test-case to fail:&#xA;&#xA;Sep 29 04:36:39.932 W clusteroperator/olm condition/Available reason/AsExpected status/True CatalogdDeploymentCatalogdControllerManagerAvailable: Deployment is available\nOperatorcontrollerDeploymentOperatorControllerControllerManagerAvailable: Deployment is available (exception: Available=True is the happy case)&#xA;Sep 29 04:42:48.072 W clusteroperator/olm condition/Available reason/AsExpected status/True CatalogdDeploymentCatalogdControllerManagerAvailable: Deployment is available\nOperatorcontrollerDeploymentOperatorControllerControllerManagerAvailable: Deployment is available (exception: Available=True is the happy case)&#xA;</failure>
 
 ### success
 $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/pr-logs/pull/30308/pull-ci-openshift-origin-main-e2e-gcp-ovn-upgrade/1971564973029068800/artifacts/e2e-gcp-ovn-upgrade/openshift-e2e-test/artifacts/junit/e2e-monitor-tests__20250926-142805.xml | grep 'clusteroperator/olm should not change condition/Available' -A1
     <testcase name="[Monitor:legacy-cvo-invariants][bz-OLM] clusteroperator/olm should not change condition/Available" time="0"></testcase>
     <testcase name="[Monitor:legacy-cvo-invariants][bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available" time="0"></testcase>{code}
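 The {{curl | grep}} check above can also be done by parsing the junit XML, which decodes the {{&#xA;}} entities into real line breaks. This is a rough sketch, assuming the file has already been downloaded into a string and that failures appear as a {{<failure>}} child of {{<testcase>}}, as in the output above; {{failing_testcases}} is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def failing_testcases(junit_xml, name_substring):
    """Map each matching, failed testcase name to its failure text lines.

    A testcase counts as failed only if it has a <failure> child; passing
    cases (an empty <testcase> element) are left out of the result.
    """
    root = ET.fromstring(junit_xml)
    failures = {}
    for case in root.iter("testcase"):
        name = case.get("name", "")
        if name_substring not in name:
            continue
        failure = case.find("failure")
        if failure is not None:
            # Character references such as &#xA; arrive here as "\n".
            failures[name] = (failure.text or "").splitlines()
    return failures
```

 For the green run, where the matching {{<testcase>}} has no {{<failure>}} child, the function would return an empty mapping.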
 Steps to Reproduce:
 {code:none}
     1. Run the aggregated job above
     {code}
 Actual results:
 {code:none}
 co/olm goes Available=False during the upgrade test.{code}
 Expected results:
 {code:none}
 co/olm stays Available=True during the upgrade test.{code}
 Additional info:
 {code:none}
 The failures were taken from a 4.21 to 4.21 upgrade test. Earlier versions may be affected too.{code}
Status: POST
#OCPBUGS-23746 issue 6 weeks ago openshift-apiserver ClusterOperator should not blip Available=False on brief missing HTTP content-type POST
Issue 15637203: openshift-apiserver ClusterOperator should not blip Available=False on brief missing HTTP content-type
Description: h2. Description of problem:
 
 Seen [in 4.15 update CI|https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.15-e2e-azure-ovn-upgrade/1727427846533550080]:
 {code:none}
 : [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
 Run #0: Failed 1h28m25s
 {  1 unexpected clusteroperator state transitions during e2e test run 
 
 Nov 22 21:47:32.876 - 1s    E clusteroperator/openshift-apiserver condition/Available reason/APIServices_Error status/False APIServicesAvailable: rpc error: code = Unknown desc = malformed header: missing HTTP content-type}
 {code}
 While the Kube API server, if that's what's missing the header, is supposed to always be available, an issue that only persists for 1s is not long enough to warrant [immediate admin intervention|https://github.com/openshift/api/blob/c3f7566f6ef636bb7cf9549bf47112844285989e/config/v1/types_cluster_operator.go#L149-L153]. Teaching the openshift-apiserver operator to stay {{Available=True}} for this kind of brief hiccup, while still going {{Available=False}} for issues where [at least part of the component is non-functional, and that the condition requires immediate administrator intervention|https://github.com/openshift/api/blob/c3f7566f6ef636bb7cf9549bf47112844285989e/config/v1/types_cluster_operator.go#L149-L153], would make it easier for admins and SREs operating clusters to identify when intervention is required.
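 One way to picture the requested behaviour is a grace period: a failed check only flips the reported status once it has persisted for some minimum time. The sketch below illustrates that idea only; it is not the operator's implementation, and {{DebouncedAvailability}}, {{observe}} and the grace period value are all hypothetical (the issue does not propose a specific threshold).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DebouncedAvailability:
    # Hypothetical grace period in seconds; the issue names no value.
    grace_seconds: float
    # Time at which the current run of failures began, or None while healthy.
    failing_since: Optional[float] = None

    def observe(self, now: float, probe_ok: bool) -> bool:
        """Return True to report Available=True, False for Available=False."""
        if probe_ok:
            self.failing_since = None
            return True
        if self.failing_since is None:
            self.failing_since = now
        # Stay Available=True until the failure has lasted the full grace period.
        return (now - self.failing_since) < self.grace_seconds
```

 With a grace period of, say, 30 seconds, a 1s blip like the one in the log above would never surface as {{Available=False}}, while an outage lasting longer than the grace period still would.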
 h2. Version-Release number of selected component (if applicable):
 {code:none}
 $ w3m -dump -cols 200 'https://search.ci.openshift.org/?maxAge=48h&type=junit&search=clusteroperator/openshift-apiserver+should+not+change+condition/Available' | grep '^periodic-.*4[.]15.*failures match' | sort
 periodic-ci-openshift-multiarch-master-nightly-4.15-ocp-e2e-ibmcloud-ovn-multi-ppc64le (all) - 4 runs, 100% failed, 25% of failures match = 25% impact
 periodic-ci-openshift-multiarch-master-nightly-4.15-ocp-e2e-ibmcloud-ovn-multi-s390x (all) - 4 runs, 25% failed, 200% of failures match = 50% impact
 periodic-ci-openshift-multiarch-master-nightly-4.15-upgrade-from-nightly-4.14-ocp-ovn-remote-libvirt-s390x (all) - 5 runs, 100% failed, 40% of failures match = 40% impact
 periodic-ci-openshift-multiarch-master-nightly-4.15-upgrade-from-stable-4.14-ocp-e2e-upgrade-azure-ovn-arm64 (all) - 5 runs, 40% failed, 50% of failures match = 20% impact
 periodic-ci-openshift-multiarch-master-nightly-4.15-upgrade-from-stable-4.14-ocp-e2e-upgrade-azure-ovn-heterogeneous (all) - 5 runs, 20% failed, 100% of failures match = 20% impact
 periodic-ci-openshift-release-master-ci-4.15-e2e-aws-ovn-upgrade (all) - 5 runs, 20% failed, 200% of failures match = 40% impact
 periodic-ci-openshift-release-master-ci-4.15-e2e-aws-upgrade-ovn-single-node (all) - 1 runs, 100% failed, 100% of failures match = 100% impact
 periodic-ci-openshift-release-master-ci-4.15-e2e-azure-ovn-upgrade (all) - 50 runs, 56% failed, 21% of failures match = 12% impact
 periodic-ci-openshift-release-master-ci-4.15-e2e-gcp-ovn-upgrade (all) - 80 runs, 44% failed, 17% of failures match = 8% impact
 periodic-ci-openshift-release-master-ci-4.15-upgrade-from-stable-4.14-e2e-aws-ovn-upgrade (all) - 80 runs, 30% failed, 13% of failures match = 4% impact
 periodic-ci-openshift-release-master-ci-4.15-upgrade-from-stable-4.14-e2e-azure-sdn-upgrade (all) - 80 runs, 43% failed, 6% of failures match = 3% impact
 periodic-ci-openshift-release-master-ci-4.15-upgrade-from-stable-4.14-e2e-gcp-ovn-rt-upgrade (all) - 50 runs, 16% failed, 63% of failures match = 10% impact
 periodic-ci-openshift-release-master-ci-4.15-upgrade-from-stable-4.14-from-stable-4.13-e2e-aws-sdn-upgrade (all) - 1 runs, 100% failed, 100% of failures match = 100% impact
 periodic-ci-openshift-release-master-nightly-4.15-e2e-aws-ovn-single-node-serial (all) - 5 runs, 100% failed, 100% of failures match = 100% impact
 periodic-ci-openshift-release-master-nightly-4.15-e2e-aws-ovn-upgrade-rollback-oldest-supported (all) - 5 runs, 40% failed, 50% of failures match = 20% impact
 periodic-ci-openshift-release-master-nightly-4.15-e2e-aws-sdn-upgrade (all) - 50 runs, 18% failed, 11% of failures match = 2% impact
 periodic-ci-openshift-release-master-nightly-4.15-e2e-gcp-ovn-etcd-scaling (all) - 1 runs, 100% failed, 100% of failures match = 100% impact
 periodic-ci-openshift-release-master-nightly-4.15-e2e-ibmcloud-csi (all) - 1 runs, 100% failed, 100% of failures match = 100% impact
 periodic-ci-openshift-release-master-nightly-4.15-e2e-vsphere-ovn-techpreview (all) - 5 runs, 40% failed, 50% of failures match = 20% impact
 periodic-ci-openshift-release-master-nightly-4.15-upgrade-from-stable-4.14-e2e-aws-upgrade-ovn-single-node (all) - 1 runs, 100% failed, 100% of failures match = 100% impact
 periodic-ci-openshift-release-master-nightly-4.15-upgrade-from-stable-4.14-e2e-metal-ipi-sdn-bm-upgrade (all) - 5 runs, 100% failed, 20% of failures match = 20% impact
 periodic-ci-openshift-release-master-okd-scos-4.15-e2e-aws-ovn-upgrade (all) - 15 runs, 47% failed, 14% of failures match = 7% impact
 {code}
 
 The impact rates are low enough that I haven't checked older 4.y.  And it's possible that some of those matches have the operator going {{Available=False}} for other reasons besides {{APIServices_Error}}:
 
 {code:none}
 $ curl -s 'https://search.ci.openshift.org/search?maxAge=48h&type=junit&name=4.15.*upgrade&context=0&search=clusteroperator/openshift-apiserver.*condition/Available.*status/False' | jq -r 'to_entries[].value | to_entries[].value[].context[]' | sed 's|.*clusteroperator/\([^ ]*\) condition/Available reason/\([^ ]*\) status/False.*|\1 \2|' | sort | uniq -c | sort -n
       2 openshift-apiserver APIServerDeployment_NoPod
       2 openshift-apiserver APIServerDeployment_PreconditionNotFulfilled
      19 openshift-apiserver APIServices_Error
      22 openshift-apiserver APIServerDeployment_NoDeployment
 {code}
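 For readers less used to {{sed | sort | uniq -c}}, the tallying step of the pipeline above can be approximated in Python. This is a sketch under assumptions: it takes the already-extracted context lines as input (it does not model the JSON returned by the search endpoint), and {{count_reasons}} is a hypothetical name.

```python
import re
from collections import Counter

# Same capture groups as the sed expression: operator name, then reason.
PATTERN = re.compile(
    r"clusteroperator/([^ ]*) condition/Available reason/([^ ]*) status/False"
)


def count_reasons(context_lines):
    """Count (operator, reason) pairs, returned in ascending order of count.

    Unlike the sed command, which passes non-matching lines through
    unchanged, this sketch simply ignores them.
    """
    counts = Counter()
    for line in context_lines:
        match = PATTERN.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return sorted(counts.items(), key=lambda item: item[1])
```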
 
 h2. How reproducible:
 
 {{12% impact}} for {{periodic-ci-openshift-release-master-ci-4.15-e2e-azure-ovn-upgrade}} looks like the highest impact among the jobs with double-digit run counts.
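
 The percentages in the job listing above appear to relate as follows: impact is the share of all runs that matched the search, and "of failures match" is the number of matching runs divided by the number of failed runs. That relationship is inferred from the listed numbers, not taken from the search tool's source, and {{impact}} is a hypothetical helper. It would also explain how a job can show more than 100% of failures matching: presumably more runs matched than failed overall.

```python
def impact(runs, failed, matching):
    """Return (failed %, % of failures matching, impact %) for one job.

    The formula is inferred from the listing, so treat it as an assumption.
    The middle value is None when no run failed, since the ratio is undefined.
    """
    if runs <= 0:
        raise ValueError("runs must be positive")
    failed_pct = 100.0 * failed / runs
    impact_pct = 100.0 * matching / runs
    match_pct = 100.0 * matching / failed if failed else None
    return failed_pct, match_pct, impact_pct
```

 For example, counts of 4 runs, 1 failed and 2 matching (counts chosen to be consistent with the s390x ibmcloud line, not reported by it) give 25% failed, 200% of failures matching and 50% impact. For the azure-ovn-upgrade job, 56% failed times 21% matching is roughly 12%.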
 
 h2. Steps to Reproduce:
 
 Run {{periodic-ci-openshift-release-master-ci-4.15-e2e-azure-ovn-upgrade}} a bunch of times watching the {{openshift-apiserver}} ClusterOperator's {{Available}} condition.
 
 h2. Actual results:
 
 Some very brief blips of {{Available=False}} that self-resolve before an admin could possibly respond to the summons.
 
 h2. Expected results:
 
 No quickly-resolving blips in CI.  No long runs of {{Available=False}} for issues that don't seem worth summoning an admin.  Still going {{Available=False}} for outages that need immediate admin response.
Status: POST
{noformat}
: [Monitor:legacy-cvo-invariants][bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available 2h4m41s {  2 unexpected clusteroperator state transitions during e2e test run.  These did not match any known exceptions, so they cause this test-case to fail:
periodic-ci-openshift-release-master-nightly-4.20-upgrade-from-stable-4.19-e2e-aws-upgrade-ovn-single-node (all) - 26 runs, 35% failed, 267% of failures match = 92% impact
#2011211012128116736 junit 4 hours ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#2011032471063236608 junit 15 hours ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#2011032471063236608 junit 15 hours ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Degraded
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#2010761681541533696 junit 34 hours ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#2010619085703876608 junit 43 hours ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#2010389039177273344 junit 2 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#2010389039177273344 junit 2 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Degraded
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#2010489774887931904 junit 2 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#2010289751592013824 junit 2 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#2010154359366619136 junit 3 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#2010024656693628928 junit 3 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#2009896418327662592 junit 3 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#2009444757146701824 junit 5 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#2009361799576555520 junit 5 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#2009122667155689472 junit 5 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#2009253920748081152 junit 5 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#2009031844208578560 junit 6 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#2009031844208578560 junit 6 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Degraded
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#2008961139647451136 junit 6 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#2008530176341708800 junit 7 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#2008446639198441472 junit 7 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#2007624633041293312 junit 10 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#2007428089902010368 junit 10 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#2007428089902010368 junit 10 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Degraded
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#2007543856442118144 junit 10 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#2007543856442118144 junit 10 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Degraded
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#2007338393578508288 junit 10 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#2007204629862944768 junit 11 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#2007204629862944768 junit 11 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Degraded
0 unexpected clusteroperator state transitions during e2e test run, as desired.
#2006838133529776128 junit 12 days ago
# [bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
0 unexpected clusteroperator state transitions during e2e test run, as desired.

Found in 92.31% of runs (266.67% of failures) across 26 total runs and 1 job (34.62% failed) in 94ms