In 7.x.x the information in stateTransitions was reduced from one element per state transition to only the first transition per state. This considerably reduces the usefulness of stateTransitions for effective workflow analysis and forces the use of the less efficient itemHistory for the same analysis.
Consider a scenario in which we need to analyze the duration spent in each state of a workflow and the frequency of state re-entry.
Prior to the change to stateTransitions, it was only necessary to extract the stateTransitions information. Even with thousands of state transitions, that should not be a huge amount of data, because each stateTransitions entry is fairly compact.
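As a rough illustration of why the full list of transitions is enough for this report, here is a minimal Python sketch. It assumes the transitions have already been extracted into a chronological list of (state, entered_at) pairs; that layout, the function name and the sample states are hypothetical and are not the actual stateTransitions format.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta


def analyse_transitions(transitions, end_time):
    """Summarise time spent per state and how often each state was re-entered.

    transitions: chronological list of (state, entered_at) tuples
                 (hypothetical layout, not the real stateTransitions schema).
    end_time:    moment at which the final state is considered to end.
    """
    time_in_state = defaultdict(timedelta)
    entries = Counter()
    for index, (state, entered_at) in enumerate(transitions):
        # A state lasts until the next transition, or until end_time for the last one.
        if index + 1 < len(transitions):
            left_at = transitions[index + 1][1]
        else:
            left_at = end_time
        time_in_state[state] += left_at - entered_at
        entries[state] += 1
    # The first entry into a state is not a re-entry.
    re_entries = {state: count - 1 for state, count in entries.items()}
    return dict(time_in_state), re_entries


sample = [
    ("Open", datetime(2024, 1, 1)),
    ("In Progress", datetime(2024, 1, 3)),
    ("Open", datetime(2024, 1, 4)),
    ("Done", datetime(2024, 1, 10)),
]
durations, re_entries = analyse_transitions(sample, datetime(2024, 1, 11))
```

With only the first transition per state available, the second entry into "Open" in this sample would be missing, so neither the total time in "Open" nor the re-entry count could be computed.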
Now the only solution is to use itemHistory. itemHistory will always be larger than stateTransitions because it covers all history changes, not just state changes. For the same report it is necessary to retrieve ALL itemHistory, which can be a massive amount of data and far greater than stateTransitions alone.
I do not have additional information on the performance issue that originally prompted the change, which involved 4000 state transitions. However, I suspect the original issue was not necessarily caused directly by the amount of stateTransitions data. It could instead have been caused by the inclusion of stateTransitions changes in itemHistory. That is a very inefficient duplication of information which is already in stateTransitions and which can also be extracted from the state information contained in itemHistory.
Take the example of a work item with 4000 state transitions. For that item, stateTransitions could be considered a significant amount of data, but it is not massive. Suppose, for example, that each state transition can be recorded in 100 bytes (the 100 bytes is only an example for a relative comparison with itemHistory; a more detailed analysis would be needed for an accurate figure). The whole data set for 4000 state transitions is then about 0.38MB, which is not massive.
However, the stateTransitions change history is included in each and every itemHistory element. Consider 4000 itemHistory elements which all duplicate the stateTransitions changes. The quantity of stateTransitions change history then grows per itemHistory element at the rate 1 + 2 + 3 + ... + 4000, so across itemHistory there is a minimum of 8,002,000 state transition records in the work item. At the estimated 100 bytes per state transition, this equates to about 763MB of stateTransitions data contained in itemHistory. We might also expect the figure of 4000 entries in itemHistory to be low, because the history will contain other changes not related to state changes, so the amount of stateTransitions change information in itemHistory could be far greater.
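The estimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the 100 bytes per transition is the same illustrative figure used above, not a measured value, and MB is taken as 1024 * 1024 bytes.

```python
BYTES_PER_TRANSITION = 100  # illustrative figure only, not a measured size
TRANSITIONS = 4000
MB = 1024 * 1024

# Size of one complete stateTransitions list.
full_list_bytes = TRANSITIONS * BYTES_PER_TRANSITION

# If the n-th itemHistory element repeats the n transitions recorded so far,
# the total number of duplicated records is 1 + 2 + ... + 4000.
duplicated_records = TRANSITIONS * (TRANSITIONS + 1) // 2
duplicated_bytes = duplicated_records * BYTES_PER_TRANSITION

print(f"stateTransitions alone: {full_list_bytes / MB:.2f} MB")   # 0.38 MB
print(f"duplicated records:     {duplicated_records}")            # 8002000
print(f"inside itemHistory:     {duplicated_bytes / MB:.0f} MB")  # 763 MB
```

Doubling the number of transitions doubles the first figure but roughly quadruples the last one, which is why the duplication matters far more than the list itself.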
The original performance issue may therefore not have been caused by the quantity of data needed to store all state transitions in stateTransitions, but by the duplication of stateTransitions changes in every itemHistory element, which makes the impact of stateTransitions grow quadratically with the number of transitions.
Instead of limiting stateTransitions to one item per state, it might have been better to keep stateTransitions in full, but either limit or omit the stateTransitions change history in itemHistory. That would considerably reduce the impact of stateTransitions on itemHistory whilst still allowing efficient analysis of state transitions using the stateTransitions information.
In a basic test on work items created prior to the 7.x.x change to stateTransitions, the same report took 6x longer using itemHistory (3 minutes) than using stateTransitions (30 seconds). I would expect a similar performance impact for comparable reports.
My idea is to include the full set of state transitions in the stateTransitions information, for efficient processing of state transitions, but to omit the stateTransitions change information from itemHistory, where it is probably not useful at all (the same information can be determined from the state change information contained in itemHistory).
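To illustrate that nothing would be lost by omitting it, the sketch below shows how state transitions could be recovered from the state value recorded in each history entry. The entry layout (a chronological list of dicts with "state" and "changed_at" keys) and the function name are assumptions made for this example and do not reflect the actual itemHistory format.

```python
def transitions_from_history(history):
    """Rebuild (state, changed_at) transitions from history entries.

    history: chronological list of dicts with 'state' and 'changed_at' keys
             (hypothetical layout, not the real itemHistory schema).
    """
    transitions = []
    previous_state = None
    for entry in history:
        # Only entries where the state differs from the previous one are transitions;
        # other history entries (field edits, comments, ...) are skipped.
        if entry["state"] != previous_state:
            transitions.append((entry["state"], entry["changed_at"]))
            previous_state = entry["state"]
    return transitions


history = [
    {"state": "Open", "changed_at": "2024-01-01"},
    {"state": "Open", "changed_at": "2024-01-02"},         # non-state change
    {"state": "In Progress", "changed_at": "2024-01-03"},
    {"state": "Open", "changed_at": "2024-01-04"},          # re-entry
]
recovered = transitions_from_history(history)
```

In this sample the second entry is ignored because the state did not change, and the re-entry into "Open" is still detected.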