Resolved -
We are closing this incident. Our main background processing queue for larger repos has had no backlog in the past 48 hours. We will continue to monitor for recurrence.
May 28, 08:53 PDT
Update -
We've cleared the backlog of background jobs for larger projects from our heavy project queues, with the exception of nine projects whose volume continues to exceed our daily fair-use thresholds. Those nine have been redirected to secondary and tertiary queues that drain only after the primary queue is clear. Most of their jobs have now been processed, but a few projects generate more jobs than a normal day-plus-overnight cycle can drain, so some backlog remains. We're continuing to work those queues down manually. If you think one of your projects may be affected, reach out to us at support@coveralls.io and we'll check its status for you.
May 27, 15:18 PDT
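For readers curious how "drain only after the primary queue is clear" can play out, here is a minimal sketch of strict-priority draining. The `drain` function, the queue layout, and the per-cycle `budget` are hypothetical illustrations, not our actual implementation.

```python
from collections import deque

def drain(queues, budget):
    """Process up to `budget` jobs, always taking from the first non-empty queue.

    queues: list of deques ordered primary, secondary, tertiary (assumed layout).
    budget: maximum number of jobs one cycle can process (assumed parameter).
    Returns the processed jobs in the order they were taken.
    """
    processed = []
    while len(processed) < budget:
        # A lower-priority queue is touched only when every queue ahead of it is empty.
        queue = next((q for q in queues if q), None)
        if queue is None:
            break  # nothing left to process
        processed.append(queue.popleft())
    return processed
```

Under this scheme, whenever a cycle's budget is smaller than the total queued volume, the leftover jobs sit in the lowest-priority queues, which is consistent with the remaining backlog described in this update.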
Update -
We are continuing to monitor for any further issues.
May 25, 13:06 PDT
Update -
We continue to monitor background processing queues for larger repos (5K+ files) while we offload incoming jobs from the outlier repos whose volume led to the original backlog. If you think your repo may have been paused as one of these outliers, please reach out to us at support@coveralls.io and we'll confirm its status and give you an ETA for processing your remaining jobs.
May 21, 09:23 PDT
Update -
We have moved all repos with more than 300 jobs in queue to dedicated queues for processing. These 7 repos were responsible for 75% of the traffic in our heavy repos queue. This move allows us to provide normal processing times for non-outlier repos and dedicated resources to outlier repos. Because of their job volume, outlier repos will still take longer to clear. If you believe your repo may be one of these outliers, please reach out and we'll confirm and give you an idea of when your jobs will be processed. If you'd like faster processing, we can also stop processing older jobs, for instance, anything more than 1 hour old (or 2 hours, or 3 hours).
May 20, 07:06 PDT
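As a rough illustration of the routing described in this update, the sketch below splits repos between the shared heavy queue and dedicated queues by queued-job count. The 300-job threshold comes from the update; the function names and the input shape (a mapping of repo name to queued jobs) are hypothetical, not our actual implementation.

```python
OUTLIER_THRESHOLD = 300  # from the update: repos with more than 300 jobs in queue

def route_repos(queued_jobs, threshold=OUTLIER_THRESHOLD):
    """Split repos into (shared, dedicated) by queued job count.

    queued_jobs: dict mapping repo name -> number of jobs waiting (assumed shape).
    """
    shared, dedicated = {}, {}
    for repo, count in queued_jobs.items():
        # Strictly more than the threshold counts as an outlier.
        target = dedicated if count > threshold else shared
        target[repo] = count
    return shared, dedicated

def outlier_share(queued_jobs, threshold=OUTLIER_THRESHOLD):
    """Fraction of all queued jobs that belong to outlier repos."""
    total = sum(queued_jobs.values())
    if total == 0:
        return 0.0
    _, dedicated = route_repos(queued_jobs, threshold)
    return sum(dedicated.values()) / total
```

A share close to 0.75 from `outlier_share` would correspond to the situation described above, where a handful of repos accounted for most of the heavy-queue traffic.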
Update -
We are keeping this incident open as we continue monitoring.
May 20, 06:48 PDT
Update -
We are still receiving reports of latency for large projects and are seeing backlogs of background jobs in queues dedicated to large repos (5K+ source files). We may pause some outlier repos with excessive numbers of jobs in order to clear general traffic. If you believe you may have been paused, reach out and we'll let you know.
May 19, 06:36 PDT
Monitoring -
A fix has been implemented and we are monitoring the results.
May 18, 10:37 PDT
Identified -
We have received reports of increased report latency for larger projects (5K files and up). We are doing our best to clear the background job queues for these projects. If you have one of these projects, feel free to reach out and we'll check status for you.
May 18, 09:41 PDT