Salesforce Concurrent Long Running Apex Limit Troubleshooting

Salesforce’s Concurrent Long Running Apex Limit is an org-wide limit: no more than 10 synchronous transactions that have been running for more than 5 seconds can be executing at the same time. The Execution Governors and Limits page has this as a footnote:

If more transactions are started while the 10 long-running transactions are still running, they’re denied. HTTP callout processing time is not included when calculating this limit.

Execution Governors and Limits

They’re not kidding when they say the next transaction will be denied. When this limit was reached, production users were randomly unable to do things, including logging in, opening records, running custom code, and other day-to-day actions.

Here are my lessons learned and what you can do to troubleshoot and resolve this.

High-Level Steps

  1. Identify the long running Apex transactions.
  2. Tune / Optimize those synchronous Apex transactions.
  3. Monitor.
  4. Repeat.

Identify the Long running Apex Transactions

First Submit A Salesforce Case

Contact Salesforce immediately asking for more information and the report of what’s taking the most time in your org. When doing this, provide them with the timestamp when it occurred, org id, and anything else you can to help them sift through their logs. Via the case, Salesforce provides a spreadsheet of the top longest running Apex transactions for that day that are contributing to that limit.

Using that spreadsheet, one can now target the top offenders and optimize and tune them. Keep in mind that there could be multiple contributors to this and not a primary one. Your mileage will vary.

Check Your Event Monitoring Logs If Available

If you’ve purchased the standalone Event Monitoring Logs add-on or have it through Salesforce Shield, great! You have access to lots of valuable information. Using the Salesforce Event Log File Browser, which I’ve nicknamed ELF in my head, you can browse your org’s events, including the ConcurrentLongRunningApexLimit event, and see each time it occurred and for whom. Concurrent Long-Running Apex Limit Event Type Documentation

From there, open the ApexExecution event log file in Excel and then:

  1. Expand All the columns
  2. Enable Filtering on all the columns.
  3. On the “Is_Long_Running_Request” column, filter for “1” so only the long-running ones are shown.
  4. Sort by the “Timestamp_Derived” column.
  5. See which Apex transactions were running when the limit was reached by analyzing those rows. If the entry point is “Triggers”, open the ApexTrigger event logs and dig in there.
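The spreadsheet steps above can also be scripted. Here is a minimal Python sketch, assuming the event log file has been downloaded as CSV and that its headers are the upper-case forms of the column names mentioned above (IS_LONG_RUNNING_REQUEST, TIMESTAMP_DERIVED). Check your file’s actual header row and adjust the parameters if they differ. The sample rows, user IDs, and entry points are made up for illustration.

```python
import csv
import io


def long_running_rows(csv_text,
                      flag_col="IS_LONG_RUNNING_REQUEST",
                      time_col="TIMESTAMP_DERIVED"):
    """Return only the rows flagged as long running, oldest first."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Keep rows whose flag column is "1" (the same filter as step 3).
    rows = [row for row in reader if (row.get(flag_col) or "").strip() == "1"]
    # Timestamps in one consistent ISO 8601 format sort correctly as strings.
    rows.sort(key=lambda row: row[time_col])
    return rows


# Made-up sample data; a real file has many more columns and rows.
SAMPLE = """TIMESTAMP_DERIVED,USER_ID,ENTRY_POINT,IS_LONG_RUNNING_REQUEST,RUN_TIME
2021-03-01T14:00:07.000Z,005A,TRIGGERS,1,6100
2021-03-01T14:00:01.000Z,005B,MyController.save,0,300
2021-03-01T14:00:03.000Z,005A,TRIGGERS,1,7200
"""

for row in long_running_rows(SAMPLE):
    print(row["TIMESTAMP_DERIVED"], row["ENTRY_POINT"], row["USER_ID"])
```

With a real file, read it with open(path, newline="") and pass the text in. Grouping the result by user or entry point is a natural next step when one service user dominates the list.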

Even if you have Event Monitoring, still submit a Salesforce case because support can provide even more information than you can get from the logs and can answer questions that are not obvious.

Apex Execution Event Type Documentation, Apex Trigger Event Type Documentation

Optimizing

General optimization strategies:

  1. Tune your code and automation so transactions finish in under 5 seconds
  2. Asynchronous – Move operations to be asynchronous
  3. Reduce Workload

Tune / Optimize Your Apex

There’s no silver bullet here. It entirely depends on your org’s particular structure and content. Optimizing so operations run faster will likely have trade-offs. For example, having lots of process builders will contribute to this limit even though they’re declarative. Consolidating them down to one or a few is considered a best practice, but it will likely make each one much longer and more complex. One could also move that logic to an Apex trigger, a before-save flow, or another faster alternative.

Optimization Guidelines

  • Delete the functionality? Is it even needed any longer?
  • Make sure Apex triggers are bulkified and covered by bulk unit tests.
  • Use appropriate algorithms. Common culprits:
    • Searching through records in a loop by Id or other criteria. One can index the records using a Map<Id, SObject> and then use map.get(recordId) to get the needed record. Alternatively, one can build a map using a composite key where two or more fields from each record are used.
    • Multiple SOQL queries used when one can suffice using semi-joins, sub-selects, and fetching parent records up to 5 levels up.
    • Trying to do too much synchronously. Use asynchronous operations where appropriate. See more info below.
  • Go Off-Platform – If you have a particularly large operation that may even hit async limits, you may want to consider using non-platform software to do the processing. For example, use Heroku, Azure, or your own on-prem servers to do the work and then communicate with Salesforce using its APIs as needed. I’m hoping Salesforce’s Evergreen will become available soon to explore that as an option since it’ll be tightly integrated with Salesforce.
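The map-indexing guideline above is language-independent, so here is a minimal sketch of it in Python rather than Apex. A dict stands in for Apex’s Map, and a tuple of field values plays the role of the composite key. The helper name index_by and the sample records are hypothetical.

```python
def index_by(records, *fields):
    """Build a dict keyed by a tuple of the given field values.

    One pass over the records replaces a repeated linear search.
    If two records share a key, the later one wins.
    """
    return {tuple(record[f] for f in fields): record for record in records}


# Made-up records for illustration.
contacts = [
    {"Id": "003A", "AccountId": "001X", "Email": "a@example.com"},
    {"Id": "003B", "AccountId": "001X", "Email": "b@example.com"},
    {"Id": "003C", "AccountId": "001Y", "Email": "a@example.com"},
]

# Single-field index, comparable to Map<Id, SObject> in Apex.
by_id = index_by(contacts, "Id")

# Composite-key index built from two fields.
by_account_and_email = index_by(contacts, "AccountId", "Email")

match = by_account_and_email.get(("001Y", "a@example.com"))
print(match["Id"] if match else "no match")
```

Each lookup is then a constant-time dictionary access instead of a scan of the whole list inside another loop. In Apex, a composite key is usually built as a concatenated String since tuples are not available there.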

Asynchronous

Instead of using synchronous operations, make them asynchronous or async for short. Keep in mind that making them asynchronous means they run later in a separate execution context. I’m not going to get into all the trade-offs. However, my general guideline is if it doesn’t have to happen in real-time but near real-time or on some schedule is acceptable, go asynchronous.

Also keep in mind that the operation doesn’t have to be completely asynchronous. One can make one or more steps of it async but keep the remainder synchronous.

Async Options

  • Queueable Apex – My favorite async option. It starts as fast or faster than future methods but can be given complex input. Can be used as infrastructure to build a work queue where records are stored in the database. Saving a record starts a queueable to process. With Apex classes implementing a certain interface, one could dynamically instantiate the async handlers and use declarative options to trigger when these options run.
  • Batch Apex – Great for processing very large datasets, especially for manual one off operations or on a schedule using Scheduled Apex.
  • Scheduled Apex – I usually combine Scheduled Apex with Batch Apex for operations that take quite a while to run and that run a few times a day, weekly, or monthly. This suits data that doesn’t need to be very up-to-date and is mostly static, or that only needs to be generated occasionally.
    • Note: One can create scheduled flows now so explore that option first.
  • Future Methods – These are great for smaller, quick one-off operations like sending an email or doing an HTTP callout. I recommend using Queueable Apex instead because one can provide more complex input to it and have better visibility via Apex Jobs in Setup.
  • Asynchronous HTTP Callouts aka Continuations – HTTP callout time DOES contribute to this limit despite the documentation saying it doesn’t. Salesforce support confirmed this, and so do the Apex Execution event monitoring logs, where one can see CPU time in the milliseconds but Callout Time of more than 5 seconds. Continuations helped with one big contributor: a customization that had a UI in Salesforce but used an external API to do its work. The API’s services would often take longer to respond at peak times such as business hours. Continuations don’t count toward this limit, but this was a large refactoring because Continuations require their callbacks to be in the same class. All the API callout code had been encapsulated, so it had to be “exposed”, which took a decent amount of effort.
    • Note: Even though a lot of documentation has only Visualforce Examples, Lightning supports continuations too with Aura and LWC components. However, it’s not supported in all Lightning run-times so check the documentation.
    • Second Note: Continuations have their own limits. One is that they only support up to a 1 Megabyte (MB) response size whereas synchronous Apex callouts support up to 6 MB.
  • Async APIs – If an external system such as an ETL tool or middleware uses the synchronous REST or SOAP APIs to insert, update, or delete large amounts of records on objects with lots of automation, that can be problematic too, especially if multiple concurrent threads or processes are used. A potential quick fix is to reduce the concurrency to fewer than 10 “workers”, but that may mean longer run-times. Another option is to use Async APIs such as the Bulk API, which is designed specifically for large data volumes and handles parallelism on platform.
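The “fewer than 10 workers” quick fix from the last bullet can be sketched on the client side. This is a minimal Python example using a bounded thread pool, assuming the integration is a script you control. The send_batch function is a hypothetical stand-in for whatever synchronous API call your tool makes, and the cap of 5 is an arbitrary choice that leaves headroom for interactive users.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical cap, kept well under the 10 concurrent long-running requests.
MAX_WORKERS = 5


def send_batch(batch):
    """Hypothetical stand-in for one synchronous API call per batch."""
    time.sleep(0.01)  # simulate network and server time
    return len(batch)


def run_throttled(batches, send, max_workers=MAX_WORKERS):
    """Send every batch, with at most max_workers calls in flight at once."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input batches.
        return list(pool.map(send, batches))


batches = [[1, 2, 3], [4, 5], [6], [7, 8, 9, 10]]
results = run_throttled(batches, send_batch)
print(sum(results), "records sent using at most", MAX_WORKERS, "workers")
```

The pool size is the only throttle here. Commercial ETL tools usually expose the same idea as a “parallel threads” or “concurrency” setting.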

Reduce Workload

Another option is to reduce how much work the synchronous operation has to do. For example, throttle back how many records are being processed by reducing the batch size.
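As a minimal sketch of reducing the batch size on the sending side, assuming the caller is a script you control, the Python helper below splits a record list into smaller chunks. The name chunked and the sizes used are hypothetical; the right size depends on how much automation fires per record.

```python
def chunked(records, batch_size):
    """Yield consecutive slices of records, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


# Made-up record IDs for illustration.
record_ids = ["rec-%d" % n for n in range(450)]

# Sending 50 at a time instead of 200 means each request does less work.
small_batches = list(chunked(record_ids, 50))
print(len(small_batches), "batches of up to 50 records")
```

Smaller batches mean more requests overall, so this trades total run-time for shorter individual transactions.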

Monitor & Repeat

As more users get onboarded and more customizations are built, it’s likely that this will be encountered from time to time so you’ll need to monitor your run times during development, testing, and in production. Event Monitoring is a great way to do that. There are other tools that can be purchased that’ll consume the event information and provide alerts and reports on it but I haven’t gotten that far yet. If you have experience with those, let me know in the comments!!

If Event Monitoring isn’t purchased, you can periodically submit a Salesforce case asking for that long-running Apex report. Also, one could periodically turn on debug logs for certain users and monitor various operations.

Repeat the above steps as needed to identify the problem areas and tune them.

Closing Thoughts / Lessons Learned

  • Event Monitoring was very helpful for finding out what was happening, though it took quite a bit of time to find the source. One semi-recurring occurrence was caused by an ETL tool updating records on a heavily automated object with lots of process builders, workflow rules, and an Apex trigger. The Apex Execution logs just kept saying TRIGGERS. It wasn’t until I opened the ApexTrigger logs that I noticed the same user was updating multiple batches of 200 records simultaneously, which indicated some service user was the culprit. It turned out to be our ETL user. The quick fix was to reduce the concurrency to below 10 threads.
  • Huge shoutout to the Salesforce Ohana Slack community who provided great feedback and actionable advice. That’s where I learned about event monitoring. Salesforce Ohana Slack Join Link. They were a lot more helpful than Salesforce support.
  • HTTP Callouts contribute to the limit despite the documentation saying they don’t. Apex continuations can be a great alternative since they don’t contribute to the limit but they have their own limits and will likely require redesigning and refactoring the codebase to support them.
  • This takes time. It could be hours or days and possibly very long days if it’s an ongoing issue.
  • Stay calm and do what you can. You will figure this out.

What other tips, advice, or causes have you seen? Let us know in the comments below.