* Clean up added work_type processing for mesh_code branch
* Track both execution and control capacity
* Remove unused execution_capacity property
* Count all forms of capacity to make test pass
* Force jobs to be on execution nodes, updates on control nodes
* Introduce capacity_type property to abstract some details out
* Update test to cover all job types at same time
* Register OpenShift nodes as control types
* Remove unqualified consumed_capacity from task manager and make unit tests work
* Update unit test for execution vs control TM logic changes
* Fix bug in else handling for work_type method
* Model changes for instance last_seen field to replace modified
* Break up refresh_capacity into smaller units
* Rename execution node methods, fix last_seen clustering
* Use update_fields to make it clear save only affects capacity
* Restructuring to pass unit tests
* Fix bug where a PATCH did not update capacity value
* Introduce utilities for --worker-info health check integration
* Handle case where ansible-runner is not installed
* Add ttl parameter for health check
* Reformulate return data structure and add lots of error cases
* Move up the cleanup tasks, close sockets
* Integrate new --worker-info into the execution node capacity check
* Undo the raw value override from the PoC
* Additional refinement to execution node check frequency
* Put in more complete network diagram
* Follow up on comment to remove modified from health check responsibilities
This requires swapping out the container images
for the execution nodes from awx-ee to the awx image.
For completeness, the hop node image is switched to the raw
receptor image.
A few outright bugs are fixed here:
- the memory calculation was simply wrong
- the execution_capacity calculation was the reverse of what was intended
Drop in a few TODOs about error handling from debugging
Always send websocket messages for high priority events like
playbook_on_stats.
Never send websocket messages for events with no output, unless they
are a high priority event type.
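
The websocket rule above can be sketched as a small predicate. This is only an illustration, not the code from the change: the `event` and `stdout` keys and the `should_emit_websocket` name are assumptions, and `playbook_on_stats` is the only high priority type the notes actually name. Events that do have output but are not high priority are simply passed through here, since the notes do not say how they are treated.

```python
# Hypothetical set of high priority event types; only playbook_on_stats
# is named in the notes above.
HIGH_PRIORITY_EVENTS = frozenset({"playbook_on_stats"})


def should_emit_websocket(event: dict) -> bool:
    """Decide whether a job event should be sent over the websocket.

    Assumes the event type is stored under "event" and the rendered
    output under "stdout" (both key names are assumptions).
    """
    if event.get("event") in HIGH_PRIORITY_EVENTS:
        # High priority events are always sent, even with no output.
        return True
    # Anything else is sent only if it carries some output.
    return bool(event.get("stdout"))
```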
keep pre-upgrade events in an old table (instead of a partition)
- instead of creating a default partition, keep all events in special
"unpartitioned" tables
- track these tables via distinct proxy=true models
- when generating the queryset for a UnifiedJob's events, look at the
creation date of the job; if it's before the date of the migration,
query on the old unpartitioned table, otherwise use the more modern table
that provides auto-partitioning
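
A minimal sketch of the table selection described above, assuming a fixed migration timestamp. The `MIGRATION_DATE` value, the `_unpartitioned_` prefix and the `main_jobevent` table name are placeholders chosen for illustration, not values taken from the change itself.

```python
from datetime import datetime, timezone

# Placeholder: when the partitioning migration ran on this install.
MIGRATION_DATE = datetime(2021, 6, 1, tzinfo=timezone.utc)


def event_table_for(job_created: datetime, base_table: str = "main_jobevent") -> str:
    """Pick the events table to query for a job created at job_created.

    Jobs created before the migration keep their events in the old
    unpartitioned table; newer jobs use the auto-partitioned table.
    """
    if job_created < MIGRATION_DATE:
        return f"_unpartitioned_{base_table}"
    return base_table
```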
Include the EE set on a workflow template in the resolver hierarchy
SUMMARY
This step comes immediately after checking the actual job/template for
an explicitly set EE.
Note that now, because of how jobs are spawned off of workflow nodes,
the call to .resolve_execution_environment() no longer happens in
.create_unified_job(). The job instance within .create_unified_job()
doesn't yet have access to the node that it will be attached to,
making it impossible to use this information in the resolver if called
there.
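
The ordering described above amounts to a first-match lookup. The sketch below is not the real `.resolve_execution_environment()` method: `resolve_ee_sketch` and its parameters are hypothetical, and whatever comes after the workflow template in the real hierarchy is lumped into `fallbacks`.

```python
from typing import Optional, Sequence


def resolve_ee_sketch(
    job_ee: Optional[str],
    template_ee: Optional[str],
    workflow_template_ee: Optional[str],
    fallbacks: Sequence[Optional[str]] = (),
) -> Optional[str]:
    """Return the first explicitly set EE in resolver order.

    The EE set on the workflow template is checked immediately after
    the job and its template, and before any later fallbacks.
    """
    for ee in (job_ee, template_ee, workflow_template_ee, *fallbacks):
        if ee is not None:
            return ee
    return None
```

Because the workflow template is only known once the job is attached to its node, a lookup like this has to run after that attachment rather than at job creation.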
related #9560
ISSUE TYPE
Feature Pull Request
Bugfix Pull Request
COMPONENT NAME
API
Reviewed-by: Shane McDonald <me@shanemcd.com>
Reviewed-by: Christian Adams <rooftopcellist@gmail.com>
* AWX_ISOLATION_SHOW_PATHS will be shared between containers. Strange
file-not-found errors can crop up when multiple containers concurrently
access shared directories that are bind mounted with big Z (:Z). So make
sure we use little z (:z).
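
For reference, the big Z versus little z distinction is the SELinux relabel suffix on a bind mount: `:Z` applies a private label for a single container, while `:z` applies a shared label that several containers can use. A minimal sketch of building such a mount argument follows; the `bind_mount_arg` helper is hypothetical and not part of AWX.

```python
def bind_mount_arg(src: str, dest: str, shared: bool = True, read_only: bool = False) -> str:
    """Build a "src:dest:opts" volume argument for a container runtime.

    shared=True uses the lowercase "z" relabel option, which is the
    choice for paths mounted into more than one container.
    """
    opts = ["z" if shared else "Z"]
    if read_only:
        opts.append("ro")
    return f"{src}:{dest}:{','.join(opts)}"
```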