postgres has a limitation on its notify message size (8k), and the
messages we generate for deep copying functionality easily go over this
limit; instead of passing a giant nested data structure across the
message bus, this change makes it so that we temporarily store the JSON
structure in memcached, and look it up from *within* the task
see: https://github.com/ansible/tower/issues/4162
* Gather brroadcast websocket metrics and push them into redis every
configurable seconds.
* Pop metrics from redis in web view layer to display via the api on
demand
* New tower nodes that are (de)registered in the Instance table are seen
by the websocket layer and connected to or disconnected from by the
websocket broadcast backplane using a polling mechanism.
* This is especially useful for openshift and kubernetes. This will be
useful for standalone Tower in the future when the restarting of Tower
services is not required.
* Sending health about websockets over websockets is not a great idea.
* I tried sending health data via prometheus and encountered problems
that will need PR's to prometheus_client library to solve. Circle back
to this later.
* We can not query the dispatcher running on isolated nodes to see if
the playbook is still running because that is the nature of isolated
nodes, they don't run the dispatcher nor do they run the message broker.
Therefore, we should query the control node that is arbitrating the
isolated work. If the control node process in the dispatcher is dead,
consider the iso job dead.
* This change adds more than just an unsubscribe reply.
* Websockets canrequest to join/leave groups. They do so using a single
idempotent request. This change replies to group requests over the
websockets with the diff of the group subscription. i.e. what groups the
user currenntly is in, what groups were left, and what groups were
joined.
* User in channels session is a lazy user class. This does not conform
to what the generic Role ancestry code expects. The Role ancestry code
expects a User objects. This change converts the lazy object into a
proper User object before calling the permission code path.
* asgiref async_to_sync was causing a Redis connection _for each_ call
to emit_channel_notification i.e. every event that the callback receiver
processes. This is a "known" issue
https://github.com/django/channels_redis/pull/130#issuecomment-424274470
and the advise is to slow downn the rate at which you call
async_to_sync. That is not an option for us. Instead, we put the async
group_send call onto the event loop for the current thread and wait for
it to be processed immediately.
The known issue has to do with event loop + socket relationship. Each
connection to redis is achieved via a socket. That conection can only be
waiting on by the event loop that corresponds to the calling thread.
async_to_sync creates a _new thread_ for each invocation. Thus, a new
connection to redis is required. Thus, the excess redis connections that
can be observed via netstat | grep redis | wc -l.
* Under the new postgres backed notify/listen message queue, this never
actually worked. Without using the database to store state, we can not
provide a at-most-once delivery mechanism w/ multi-readers.
* With this change, work is done ONLY on the node that requested for the
work to be done. Under rabbitmq, the node that was first to get the
message off the queue would do the work; presumably the least busy node.
Set JT.organization with value from its project
Remove validation requiring JT.organization
Undo some of the additional org definitions in tests
Revert some tests no longer needed for feature
exclude workflow approvals from unified organization field
revert awxkit changes for providing organization
Roll back additional JT creation permission requirement
Fix up more issues by persisting organization field when project is removed
Restrict project org editing, logging, and testing
Grant removed inventory org admin permissions in migration
Add special validate_unique for job templates
this deals with enforcing name-organization uniqueness
Add back in special message where config is unknown
when receiving 403 on job relaunch
Fix logical and performance bugs with data migration
within JT.inventory.organization make-permission-explicit migration
remove nested loops so we do .iterator() on JT queryset
in reverse migration, carefully remove execute role on JT
held by org admins of inventory organization,
as well as the execute_role holders
Use current state of Role model in logic, with 1 notable exception
that is used to filter on ancestors
the ancestor and descentent relationship in the migration model
is not reliable
output of this is saved as an integer list to avoid future
compatibility errors
make the parents rebuilding logic skip over irrelevant models
this is the largest performance gain for small resource numbers
This is the old version of this feature from 2019
this allows setting the organization in the data sent
to the API when creating a JT, and exposes the field
in the UI as well
Subsequent commit changes the field from editable
to read-only, but as of this commit, the machinery
is not hooked up to infer it from project