6697 Commits

Author SHA1 Message Date
Ryan Petrello
8f1db173c1 remove a bunch of RabbitMQ references 2020-03-24 18:46:58 -04:00
AlanCoding
653850fa6d Remove duplicated index 2020-03-23 22:54:04 -04:00
AlanCoding
5e595caf5e Add workflow node identifier
Generate new modules WFJT and WFJT node
Touch up generated syntax, test new modules

Add utility method in awxkit

Fix some issues with non-name identifier in
  AWX collection module_utils

Update workflow docs for workflow node identifier

Test and fix WFJT modules survey_spec
Plug in survey spec for the new module
Handle survey spec idempotency and test

add associations for node connections
Handle node credential prompts as well

Add indexes for new identifier field

Test with unicode dragon in name
2020-03-23 22:00:00 -04:00
softwarefactory-project-zuul[bot]
4b497b8cdc Merge pull request #6364 from wenottingham/dont-make-a-tree-that-never-ends-and-just-goes-on-and-on
Preserve symlinks when copying a tree.

Reviewed-by: https://github.com/apps/softwarefactory-project-zuul
2020-03-23 18:57:05 +00:00
chris meyers
e9021bd173 serialize register_queue
* also remove unneeded query
2020-03-23 07:21:17 -04:00
Bill Nottingham
ac68e8c4fe Preserve symlinks when copying a tree.
This avoids creating a recursive symlink tree.
2020-03-20 13:41:16 -04:00
softwarefactory-project-zuul[bot]
b998d93bfb Merge pull request #6360 from chrismeyersfsu/log_notification_failures
log when notifications fail to send

Reviewed-by: https://github.com/apps/softwarefactory-project-zuul
2020-03-20 14:54:30 +00:00
chris meyers
47f5c17b56 log when notifications fail to send
* If a job does not finish within the 5 second timeout, let the user know
that we failed to even try to send the notification.
2020-03-20 09:11:01 -04:00
softwarefactory-project-zuul[bot]
0fb800f5d0 Merge pull request #6344 from chrismeyersfsu/redis-cleanup1
Redis cleanup1

Reviewed-by: https://github.com/apps/softwarefactory-project-zuul
2020-03-20 13:07:40 +00:00
Seth Foster
88fb30e0da Delete jobs without loading objects first
This commit is intended to speed up the cleanup_jobs command in awx. The old
method takes 7+ hours to delete 1 million old jobs. The new method takes
around 6 minutes.

Leverages a sub-classed Collector, called AWXCollector, that does not
load in objects before deleting them. Instead querysets, which are
lazily evaluated, are used in places where Collector normally keeps a
list of objects.

Finally, a couple of tests to ensure parity between old Collector and
AWXCollector. That is, any object that is updated/removed from the
database using Collector should have identical operations using
AWXCollector.

tower issue 1103
2020-03-19 14:14:02 -04:00
Ryan Petrello
d40a5dec8f change when we send job notifications to avoid a race condition
success/failure notifications for *playbooks* include summary data about
the hosts, based on the contents of the playbook_on_stats event

the current implementation suffers from a number of race conditions that
sometimes can cause that data to be missing or incomplete; this change
makes it so that for *playbooks* we build (and send) the notification in
response to the playbook_on_stats event, not the EOF event
2020-03-19 10:01:52 -04:00
chris meyers
5e481341bc flake8 2020-03-19 10:01:20 -04:00
chris meyers
0a1070834d only update the ip address field on the instance
* The heartbeat of an instance is determined to be the last modified
time of the Instance object. Therefore, we want to be careful to only
update very specific fields of the Instance object.
2020-03-19 10:01:20 -04:00
chris meyers
c7de3b0528 fix spelling 2020-03-19 10:01:20 -04:00
chris meyers
7f2e1d46bc replace janky unique channel name w/ uuid
* postgres notify/listen channel names have size limitations as well as
character limitations. Respect those limitations while still
generating a unique channel name.
2020-03-19 08:59:15 -04:00
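A minimal sketch of the idea in the commit above, not the code from the commit itself. It assumes PostgreSQL's default 63-byte identifier limit applies to channel names; `unique_channel_name` and the `reply_to` prefix are hypothetical names chosen for illustration.

```python
import uuid

# Assumption: default PostgreSQL builds limit identifiers to 63 bytes.
MAX_IDENTIFIER_BYTES = 63


def unique_channel_name(prefix="reply_to"):
    """Build a channel name from a fixed prefix plus a random UUID.

    uuid4().hex is 32 lowercase hex characters, so the result contains
    only letters, digits, and underscores, and starts with a letter as
    long as the prefix does.
    """
    name = f"{prefix}_{uuid.uuid4().hex}"
    if len(name.encode("utf-8")) > MAX_IDENTIFIER_BYTES:
        raise ValueError("channel name exceeds the identifier size limit")
    return name
```

Using the hex form rather than the dashed form keeps the name free of characters that would need quoting.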
chris meyers
12158bdcba remove dead code 2020-03-19 08:57:05 -04:00
Egor Margineanu
f858eda6b1 Made OPTIONS optional 2020-03-19 13:43:06 +01:00
Egor Margineanu
3a208a0be2 Added support for PG port and options. related #6340 2020-03-19 13:29:06 +01:00
Ryan Petrello
f1ee963bd0 fix up rebased migrations 2020-03-18 16:19:04 -04:00
chris meyers
87de0cf0b3 flake8, pytest, license fixes 2020-03-18 16:10:20 -04:00
chris meyers
89163f2915 remove redis broker url test
* We use sockets everywhere. Thus, password special characters no longer
are an issue.
2020-03-18 16:10:20 -04:00
Ryan Petrello
1caa2e0287 work around a limitation in postgres notify to properly support copying
postgres has a limitation on its notify message size (8k), and the
messages we generate for deep copying functionality easily go over this
limit; instead of passing a giant nested data structure across the
message bus, this change makes it so that we temporarily store the JSON
structure in memcached, and look it up from *within* the task

see: https://github.com/ansible/tower/issues/4162
2020-03-18 16:10:20 -04:00
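The pattern described in the commit above can be sketched roughly as follows. This is an illustration only: a plain dict stands in for memcached, and `submit_copy`, `run_copy`, and the message fields are hypothetical names, not the ones used in the actual change.

```python
import json
import uuid

# Assumption: the default NOTIFY payload must be shorter than 8000 bytes.
NOTIFY_PAYLOAD_LIMIT = 8000


def submit_copy(cache, payload):
    """Park the large structure in the cache and return a small message."""
    key = f"deep-copy-{uuid.uuid4()}"
    cache[key] = json.dumps(payload)
    message = json.dumps({"task": "deep_copy", "cache_key": key})
    # Only the key crosses the message bus, so the message stays small.
    if len(message.encode("utf-8")) >= NOTIFY_PAYLOAD_LIMIT:
        raise ValueError("message too large for notify")
    return message


def run_copy(cache, message):
    """Inside the task: look the structure up by key, then discard it."""
    key = json.loads(message)["cache_key"]
    return json.loads(cache.pop(key))
```

The message size no longer depends on how deeply nested the copied structure is, which is the point of the indirection.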
chris meyers
d58df0f34a fix sliding window calculation 2020-03-18 16:10:19 -04:00
chris meyers
2b59af3808 safely operate in async or sync context 2020-03-18 16:10:19 -04:00
chris meyers
9e5fe7f5c6 translate Instance hostname to safe analytics name
* More robust translation of Instance hostname to analytics safe name by
replacing all non-alphanumeric characters with _
2020-03-18 16:10:19 -04:00
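A small sketch of the translation described in the commit above, assuming "alphanumeric" means ASCII letters and digits; `safe_analytics_name` is a hypothetical name.

```python
import re


def safe_analytics_name(hostname):
    """Replace every character that is not an ASCII letter or digit with _."""
    return re.sub(r"[^a-zA-Z0-9]", "_", hostname)


print(safe_analytics_name("awx-1.example.com"))  # awx_1_example_com
```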
chris meyers
093d204d19 fix flake8 2020-03-18 16:10:19 -04:00
chris meyers
e25bd931a1 change dispatcher test to make required queue
* No fallback-default queue anymore. Queue must be explicitly provided.
2020-03-18 16:10:19 -04:00
chris meyers
8350bb3371 robust broadcast websocket error handling 2020-03-18 16:10:18 -04:00
chris meyers
d6594ab602 add broadcast websocket metrics
* Gather broadcast websocket metrics and push them into redis at a
configurable interval (in seconds).
* Pop metrics from redis in web view layer to display via the api on
demand
2020-03-18 16:10:18 -04:00
chris meyers
3b9e67ed1b remove channel group model
* Websocket user session <-> group subscription membership now resides
in Redis rather than the database.
2020-03-18 16:10:18 -04:00
chris meyers
3c5c9c6fde move broadcast websocket out into its own process 2020-03-18 16:10:18 -04:00
chris meyers
f5193e5ea5 resolve rebase errors 2020-03-18 16:10:17 -04:00
chris meyers
03b73027e8 websockets aware of Instance changes
* New tower nodes that are (de)registered in the Instance table are seen
by the websocket layer and connected to or disconnected from by the
websocket broadcast backplane using a polling mechanism.
* This is especially useful for openshift and kubernetes. This will be
useful for standalone Tower in the future when the restarting of Tower
services is not required.
2020-03-18 16:10:17 -04:00
chris meyers
c06b6306ab remove health info
* Sending health about websockets over websockets is not a great idea.
* I tried sending health data via prometheus and encountered problems
that will need PRs to the prometheus_client library to solve. Circle back
to this later.
2020-03-18 16:10:17 -04:00
Shane McDonald
45ce6d794e Initial migration of rabbitmq -> redis for k8s installs 2020-03-18 16:10:17 -04:00
chris meyers
be58906aed remove kombu 2020-03-18 16:10:17 -04:00
chris meyers
403e9bbfb5 add websocket health information 2020-03-18 16:10:16 -04:00
chris meyers
ea29f4b91f account for isolated job status
* We cannot query the dispatcher running on isolated nodes to see if
the playbook is still running because, by their nature, isolated
nodes run neither the dispatcher nor the message broker.
Therefore, we should query the control node that is arbitrating the
isolated work. If the control node process in the dispatcher is dead,
consider the iso job dead.
2020-03-18 16:10:16 -04:00
chris meyers
feac93fd24 add websocket group unsubscribe reply
* This change adds more than just an unsubscribe reply.
* Websockets can request to join/leave groups. They do so using a single
idempotent request. This change replies to group requests over the
websockets with the diff of the group subscription, i.e. what groups the
user is currently in, what groups were left, and what groups were
joined.
2020-03-18 16:10:16 -04:00
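The subscription diff described in the commit above could be computed along these lines. This is a sketch under assumptions: the function name and the reply keys (`groups_current`, `groups_left`, `groups_joined`) are hypothetical, and the request is assumed to carry the full desired set of groups.

```python
def group_subscription_diff(current, requested):
    """Return the reply for a request that sets the full group membership."""
    current = set(current)
    requested = set(requested)
    return {
        "groups_current": sorted(requested),
        "groups_left": sorted(current - requested),
        "groups_joined": sorted(requested - current),
    }
```

Because the request names the whole desired set, repeating it is harmless: the second reply reports nothing left and nothing joined.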
chris meyers
088373963b satisfy generic Role code
* User in channels session is a lazy user class. This does not conform
to what the generic Role ancestry code expects. The Role ancestry code
expects a User object. This change converts the lazy object into a
proper User object before calling the permission code path.
2020-03-18 16:10:16 -04:00
chris meyers
5818dcc980 prefer simple async -> sync
* asgiref async_to_sync was causing a Redis connection _for each_ call
to emit_channel_notification i.e. every event that the callback receiver
processes. This is a "known" issue
https://github.com/django/channels_redis/pull/130#issuecomment-424274470
and the advice is to slow down the rate at which you call
async_to_sync. That is not an option for us. Instead, we put the async
group_send call onto the event loop for the current thread and wait for
it to be processed immediately.

The known issue has to do with event loop + socket relationship. Each
connection to redis is achieved via a socket. That connection can only be
waited on by the event loop that corresponds to the calling thread.
async_to_sync creates a _new thread_ for each invocation. Thus, a new
connection to redis is required. Thus, the excess redis connections that
can be observed via netstat | grep redis | wc -l.
2020-03-18 16:10:16 -04:00
chris meyers
dc6c353ecd remove support for multi-reader dispatch queue
* Under the new postgres backed notify/listen message queue, this never
actually worked. Without using the database to store state, we cannot
provide an at-most-once delivery mechanism w/ multi-readers.
* With this change, work is done ONLY on the node that requested for the
work to be done. Under rabbitmq, the node that was first to get the
message off the queue would do the work; presumably the least busy node.
2020-03-18 16:10:16 -04:00
chris meyers
3fec69799c fix websocket job subscription access control 2020-03-18 16:10:15 -04:00
chris meyers
2a2c34f567 combine all the broker replacement pieces
* local redis for event processing
* postgres for message broker
* redis for websockets
2020-03-18 16:10:15 -04:00
chris meyers
558e92806b POC postgres broker 2020-03-18 16:10:15 -04:00
chris meyers
355fb125cb redis events 2020-03-18 16:10:15 -04:00
chris meyers
c8eeacacca POC channels 2 2020-03-18 16:10:12 -04:00
softwarefactory-project-zuul[bot]
eda494be63 Merge pull request #6330 from rooftopcellist/fix_flakey_workflow_functest
Fix flaky workflow test & set junit family

Reviewed-by: Christian Adams <rooftopcellist@gmail.com>
             https://github.com/rooftopcellist
2020-03-18 19:27:26 +00:00
Christian Adams
4a0c371014 Fix flaky workflow test & set junit family 2020-03-18 14:02:33 -04:00
Bill Nottingham
b875c03f4a Clean up a few more cases where we checked the license for features. 2020-03-17 17:19:33 -04:00