Effect of memory, intolerance and second-order reputation on cooperation

The understanding of cooperative behavior in social systems has been the subject of intense research over the past decades. In this regard, the theoretical models used to explain cooperation in human societies have been complemented with a growing interest in experimental studies to validate the proposed mechanisms. In this work, we rely on previous experimental findings to build a theoretical model based on two cooperation driving mechanisms: second-order reputation and memory. Specifically, taking the Donation Game as a starting point, the agents are distributed among three strategies, namely Unconditional Cooperators, Unconditional Defectors, and Discriminators, where the latter follow a second-order assessment rule: Shunning, Stern Judging, Image Scoring, or Simple Standing. A discriminator will cooperate if the evaluation of the recipient's last actions contained in his memory is above a threshold of (in)tolerance. In addition to the dynamics inherent to the game, another imitation dynamics, involving much longer times (generations), is introduced. The model is approached through a mean-field approximation that predicts the macroscopic behavior observed in Monte Carlo simulations. We found that, while in most second-order assessment rules, intolerance hinders cooperation, it has the opposite (positive) effect under the Simple Standing rule. Furthermore, we show that, when considering memory, the Stern Judging rule shows the lowest values of cooperation, while stricter rules show higher cooperation levels.


I. INTRODUCTION
The presence of cooperative behavior among unrelated individuals remains an open question in the scientific community, constituting one of the current key scientific challenges 1 . Evolutionary Game Theory 2,3 provides a powerful framework to study cooperative behavior 4 , including cooperation in structured populations [5][6][7][8] . Several mechanisms have been proposed to explain cooperation 9 , such as kin selection 10 , direct 11 or indirect reciprocity 12 , group selection 13 , and network reciprocity [14][15][16][17] . Among them, indirect reciprocity does not require repeated interactions between the same pair of partners and offers a clear explanation of how this preference for cooperation has evolved 18,19 . In a population, when an individual exhibits an altruistic behavior towards another one, he pays a cost (including time, energy, or risks) for his helping action even if he cannot get immediate returns. However, if a third party knows of his kind deed, he may provide help to this altruist in a later interaction, so that the original cost paid by the first agent is offset and a net benefit is obtained. That is, the helper receives the benefit not from the beneficiary himself but from another individual. Indirect reciprocity requires public information about individual actions, as well as an evaluation system, so that cooperation can be sustained for a long time [20][21][22] . Thus, it is important to build a feasible and reliable evaluation system 23 to differentiate between altruistic and selfish individuals, and to give the corresponding reward 24,25 to the contributor or punishment [26][27][28] to the cheater.
Regarding the evaluation of individual actions, probably the most popular approach is Image Scoring, proposed by Nowak and Sigmund to explore the role of indirect reciprocity in the evolution of cooperation through computer simulations and theoretical analyses 12,29 . They showed that cooperation can thrive via indirect reciprocity if each agent holds an image score, which is increased (resp., decreased) by one point for each act of helping (resp., not helping). According to this approach, a donor will provide help to a recipient if, and only if, this recipient has a positive score. Therefore, a player will obtain help from others in the future if he has helped more often than he has refused to do so.
However, there is no unanimity on the effectiveness of the Image Scoring rule. As an example, Leimar and Hammerstein 30 indicated theoretically that Sugden's Standing Strategy 31 provided a much more effective mechanism to foster cooperation through indirect reciprocity under a more complex population structure. In Sugden's Standing Model 31 , a player's score only decreases when he refuses to help a recipient with a good score. Unlike Image Scoring, defecting against a player with a bad score does not penalize the donor's reputation. Later, Panchanathan and Boyd 32 also explored the evolution of indirect reciprocity when errors are considered, showing that, under these circumstances, Image Scoring is not an evolutionarily stable strategy (ESS), while the Standing Strategy can be. Hence, considering only the actor's action (usually termed first-order evaluation) is not always enough when designing a reputation evaluation rule: it is necessary to take both the donor's action and the recipient's reputation into account, which is referred to as a second-order assessment rule.
As a further step, Ohtsuki and Iwasa 20,33 exhaustively discussed the aforementioned two rules, together with other second-order reputation evaluation schemes, and found that the Standing Strategy is often more successful than Image Scoring. In particular, they further pointed out that only eight cases, called the "leading eight", significantly facilitate indirect reciprocity. At the same time, extensive experiments 34,35 have also been conducted to illustrate how the Standing or Scoring mechanisms are adopted in human cooperation, indicating that the Standing rule is not superior to the Scoring mechanism due to imperfect information 36 or gossip 37 dissemination in real experiments.
arXiv:2004.01480v1 [physics.soc-ph] 3 Apr 2020
A non-negligible fact is that, with some exceptions 30 , most theoretical works assume a well-mixed structure to study reputation evaluation. Recently, Sasaki et al. 38 investigated the evolution of reputation-based cooperation in a regular lattice considering four leading second-order assessment rules (these rules will be defined and discussed in section II B). Through an agent-based model, they showed that those four rules lead to distinct cooperative behaviors, which strongly depend on the setup, and in particular that the Simple Standing strategy is the most efficient one in terms of the promotion of cooperation on regular ring networks.
It is worth noting that the above-mentioned theoretical models carry out the second-order reputation assessment just according to the last action of a donor and the standing status of a recipient, that is, the historical information on individual actions in those studies reduces to one step. Nevertheless, the historical information (i.e., memory effect) may play a role in decision-making. For instance, Wang et al. 39 presented a memory-based Snowdrift Game on top of regular lattices and scale-free networks, where the fraction of cooperating actions stored in the memory is used to determine the strategy adoption at the next generation, finding that the memory length of individuals plays a distinct role as the cost-to-benefit ratio is changed. In a recent work, Cuesta et al. 40 showed through experiments with memory effect that reputation fosters cooperation and drives network formation. Furthermore, they found that people measure reputation based on all the information available (memory length), giving more weight to the last action. Thus, it is essential to combine the second-order assessment information with the memory effect to further study the role of indirect reciprocity in the evolution of cooperation, and we try to fill this gap in the current work.
The rest of this paper is organized as follows. First, in Section II, we introduce the Donation Game Model with Memory and Second-Order Assessment. Secondly, in section III, we address the model through a mean-field approximation that, while omitting some key features, helps its understanding and qualitatively predicts the macroscopic behavior observed in section IV, where we provide the results of large numerical simulations. Finally, in section V we discuss the implications of the model, together with the conclusions.

II. DONATION GAME MODEL WITH MEMORY AND SECOND-ORDER ASSESSMENT
In this paper, we investigate the Evolutionary Donation Game in a finite size population, as formulated in most spatial indirect reciprocity models 18 . During each interaction, every actor (individual) has just one chance to play as a donor, i.e., donate or not to a recipient, which is chosen within his neighborhood. Furthermore, individual actions history will be considered as a basis of reputation assessment 40 , according to four typical second-order strategies 20,33 below described in section II B. Hence, the memory effect or history of the recent actions will be combined with the reputation-based assessment rule to analyze the evolution of cooperation within a structured population. In what follows, we will describe in detail the newly proposed Donation Game Model with Memory and Second-Order Assessment.

A. Donation Game
In the proposed model, the interaction between any pair of players can be described as a Donation Game, that is, one player is selected as a donor and the other one as a recipient, and subsequently the donor will decide whether he will make a donation to the recipient or not. If he donates, the donor will pay a cost c, and the recipient will obtain a benefit b (b > c); if not, the donor will pay nothing, and the recipient will not receive any benefit. Although the donation does not give any direct benefit to donors, some individuals may choose to donate to show a good image and then increase their chances to get help from others in the future. Thus, the Donation Game is often chosen as a basic framework to explore the role of indirect reciprocity.
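The payoff structure of a single interaction can be written down in a few lines. The following is a minimal sketch of the Donation Game as described above; the function name `donation_payoffs` is hypothetical and only serves illustration.

```python
def donation_payoffs(donates: bool, b: float, c: float) -> tuple:
    """Return (donor_payoff, recipient_payoff) for one Donation Game interaction.

    `donation_payoffs` is a hypothetical helper name, not taken from the paper.
    """
    if not b > c:
        # The text assumes that the benefit exceeds the cost (b > c).
        raise ValueError("the Donation Game described here assumes b > c")
    if donates:
        return (-c, b)   # donor pays the cost c, recipient obtains the benefit b
    return (0.0, 0.0)    # no donation: nobody pays, nobody receives
```

For instance, with b = 3 and c = 1, a donation yields the pair (−1, 3), while a refusal leaves both payoffs at zero.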

B. Second-order Assessment Rules
How to judge the goodness of a recipient is crucial for the model proposed here, and we select four typical second-order assessment rules as the basis for calculating the recipient's score 33 . In Tab. I, we depict the assessment results under these four rules: Shunning, Stern Judging, Image Scoring, and Simple Standing. We summarize their main features as follows:
• Shunning: The donor is positively evaluated when he cooperates (donates) against a cooperator. Otherwise (when he cooperates against a defector or whenever he defects), he will be negatively evaluated. This is the strictest rule to obtain a good image.
• Stern Judging: The donor will be positively evaluated if he cooperates against a cooperator, or if he defects (rejects the donation) against a defector. Otherwise, he will be negatively evaluated. To a certain extent, this rule justifies defection, since rejecting the donation to a recipient with a bad image is not considered a bad action; this allows a player with a bad image to cleanse it by refusing to help another player with a bad image.
• Image Scoring: The donor will be positively evaluated if he cooperates or negatively evaluated if he defects, regardless of the recipient's past actions. In essence, this is a first-order rule since the image of an agent is uniquely determined by his own action.
• Simple Standing: The donor will be negatively evaluated only if he defects against a cooperator. Otherwise (when he defects against a defector or whenever he cooperates), he will be positively evaluated. Hence, the Simple Standing rule is the most tolerant rule for a donor to get a good evaluation among the four rules considered here.

Recipient's previous action   C   C   D   D
Donor's action                C   D   C   D
Shunning                      G   B   B   B
Stern Judging                 G   B   B   G
Image Scoring                 G   B   G   B
Simple Standing               G   B   G   G

TABLE I. For all these rules, when facing a cooperator, a player will be positively evaluated if he cooperates, and negatively if he defects. The differences between these four rules appear when the donor meets a defector. In the second row, C and D (i.e., cooperation and defection) designate the action of the donor facing a recipient whose previous action is displayed in the first row. From the third to the sixth row, G/B denotes that the donor will be evaluated as good (G) or bad (B) after the corresponding actions.
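The four rules can be encoded as a lookup table. The sketch below is one possible encoding, written directly from the verbal rule descriptions; the dictionary name `ASSESSMENT` and the function `assess` are hypothetical. Actions use the two-letter labels introduced later in the text: the first letter is the donor's action and the second one is the recipient's previous action (e.g., DC means defecting against a cooperator).

```python
# Evaluation (G = good, B = bad) of each donor action under each rule.
# Keys are two-letter action labels: donor's action + recipient's previous action.
ASSESSMENT = {
    "shunning":        {"CC": "G", "DC": "B", "CD": "B", "DD": "B"},
    "stern_judging":   {"CC": "G", "DC": "B", "CD": "B", "DD": "G"},
    "image_scoring":   {"CC": "G", "DC": "B", "CD": "G", "DD": "B"},
    "simple_standing": {"CC": "G", "DC": "B", "CD": "G", "DD": "G"},
}


def assess(rule: str, actions: list) -> list:
    """Translate a list of recorded actions into G/B judgements under `rule`."""
    table = ASSESSMENT[rule]
    return [table[action] for action in actions]
```

All four rules agree on the judgement of CC and DC; they only differ in how CD and DD are evaluated, which is the point made in the text about meeting a defector.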

C. Initial Conditions
Let us consider a regular square lattice of linear size L (the total number of players is N = L × L) with periodic boundary conditions, where each node is occupied by a player who has 8 nearest neighbors (i.e., we consider the Moore neighborhood). Initially, each player is randomly assigned, with equal probability, to one of three possible strategies: Cooperator (ALLC), Defector (ALLD), or Discriminator (DISC), which can be described in detail as follows:
• ALLC: the donor always cooperates, that is, ALLC strategists are unconditional cooperators.
• ALLD: the donor never cooperates, that is, ALLD strategists are unconditional defectors.
• DISC: the decision of whether to cooperate or not depends on the estimated reputation score of the recipient, which in turn is based i) on the recipient's last actions and ii) on the donor's assessment rule. We term these strategists discriminators; the assessment rules have been described in Section II B.
Players, as donors, are characterized by four possible actions: CC, CD, DC, DD. Two of these actions, CC and CD, correspond to cooperative actions, namely, CC when cooperating against a cooperator (i.e., the recipient cooperated in his last action) and CD when cooperating against a defector (recipient's last action was to defect). The other two actions, DC and DD, correspond to non-cooperative actions: DC if a player defects against a cooperator and DD if he defects against a defector. In order to characterize the memory effect of individuals in the current model, we will record the action lists for each individual in the most recent M steps 40 .
Regarding the initial conditions, the first M actions of ALLC (resp., ALLD) strategists are randomly chosen from CC or CD (resp., DC or DD), while the first M actions of DISC strategists are randomly taken from the set {CC, CD, DC, DD} and determined by the specific assessment rule of the discriminator (Section II B).
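The lattice set-up described in this subsection can be sketched as follows. This is a minimal illustration, assuming that nodes are labeled by coordinate pairs (x, y); the helper names `moore_neighbors` and `init_strategies` are hypothetical, and the initialization of the M-step action memory is left out.

```python
import random

STRATEGIES = ("ALLC", "ALLD", "DISC")


def moore_neighbors(x: int, y: int, L: int) -> list:
    """Return the 8 Moore neighbors of node (x, y) on an L x L lattice
    with periodic boundary conditions (indices wrap around modulo L)."""
    return [((x + dx) % L, (y + dy) % L)
            for dx in (-1, 0, 1)
            for dy in (-1, 0, 1)
            if (dx, dy) != (0, 0)]


def init_strategies(L: int, rng: random.Random) -> dict:
    """Assign each of the N = L * L nodes one of the three strategies,
    chosen uniformly at random."""
    return {(x, y): rng.choice(STRATEGIES)
            for x in range(L)
            for y in range(L)}
```

With L = 50, `init_strategies` returns N = 2500 players, and a corner node such as (0, 0) has (49, 49) among its neighbors because of the periodic boundaries.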

D. Iteration Procedure
The game evolves as follows: at each elementary time step, a random player (the focal player or donor) chooses a random neighbor (the recipient) and decides whether to cooperate or not. At each period, any player will have, on average, one chance to act as a donor, that is, a period consists of N elementary time steps (Donation Game decisions) that take place in random order.
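One period of this dynamics can be sketched as a loop over N elementary steps. The code below is a minimal illustration under the assumption that donors are drawn uniformly with replacement, so that each player acts as a donor once per period only on average; the names `run_period`, `neighbors_of`, and `interact` are hypothetical.

```python
import random


def run_period(nodes, neighbors_of, interact, rng):
    """Run one period, i.e., N = len(nodes) elementary time steps.

    nodes        -- sequence of player identifiers
    neighbors_of -- function mapping a player to the list of its neighbors
    interact     -- callback executing one Donation Game decision
    rng          -- random.Random instance used for all random choices
    """
    for _ in range(len(nodes)):
        donor = rng.choice(nodes)                    # random focal player
        recipient = rng.choice(neighbors_of(donor))  # random neighbor
        interact(donor, recipient)
```

The callback `interact` is where the strategy-dependent decision (ALLC, ALLD, or DISC) would be applied and the donor's last action recorded.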
Let us explain in detail the dynamical procedure:
• ALLC: if the focal Player-i is an unconditional cooperator, he pays the cost c to Player-j, who obtains the benefit b (i and j payoffs are −c and b, respectively). We record i's last action as CC if Player-j's last action was CC or CD; otherwise, the last action of Player-i is recorded as CD.
• ALLD: if the focal Player-i is an unconditional defector, he rejects the donation to his partner j, and both payoffs are zero. We record i's last action as DC if Player-j's last action was CC or CD; otherwise, the last action of Player-i is recorded as DD.
• DISC: if the focal Player-i is a discriminator, he will calculate the weighted image score of Player-j according to the assessment rule in use (one of the four shown in Tab. I). If Player-j's score is higher than the required minimum reputation H_0, Player-i will donate to j; otherwise, Player-i will reject the donation. H_0 represents a minimum threshold so that the recipient can be considered good enough to be a beneficiary of the donation. It is, therefore, a measure of the intolerance. Finally, Player-i's last action will be updated accordingly.
The detailed decision procedure for DISC players can be further described as follows:
1) Assessment. Player-i will evaluate Player-j's actions to be good (G) or bad (B) according to Tab. I. As an example, we assume that Player-i is a discriminator adopting the Stern Judging rule, and the last M = 5 actions of Player-j are CC, DC, CD, DD, CC. Then, based on Tab. I, Player-i judges Player-j's list of actions to be G, B, B, G, G.
2) Calculation of the weighted score. If the action is judged as good (G) or bad (B), the corresponding score will be 1 or 0, respectively. The final reputation score of Player-j through the eyes of Player-i, r_{j|i}, is defined by Eq. (1), where S denotes the score of j's last action and S̄ represents the average score of j's M last actions 48 . In the above-mentioned example, the last action scores S = 1 and the average score is S̄ = 3/5.
3) Decision. If r_{j|i} > H_0, Player-i will pay the cost c to cooperate with Player-j, who will obtain the benefit b. Otherwise, Player-i will defect, and both payoffs will be zero. We will record Player-i's last action accordingly.
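The three steps above can be sketched in code. The sketch assumes that the weighted score is the convex combination r = w·S + (1 − w)·S̄, which is one form consistent with the description of w as an additional weight of the last action that leaves the mean score unchanged; this functional form, as well as the names `reputation_score`, `disc_donates`, and `action_label`, are assumptions made for illustration.

```python
def reputation_score(judgements: list, w: float) -> float:
    """Weighted score of a recipient from its G/B judgements.

    judgements -- list of "G"/"B", oldest first, most recent last
    w          -- additional weight of the last action (0 <= w <= 1)

    Assumed form: r = w * S + (1 - w) * S_bar, with S the score of the last
    action and S_bar the average score over the whole memory.
    """
    scores = [1.0 if g == "G" else 0.0 for g in judgements]
    last = scores[-1]
    average = sum(scores) / len(scores)
    return w * last + (1.0 - w) * average


def disc_donates(judgements: list, w: float, h0: float) -> bool:
    """A discriminator donates only if the score strictly exceeds H_0."""
    return reputation_score(judgements, w) > h0


def action_label(donor_cooperates: bool, recipient_last_action: str) -> str:
    """Two-letter record of the donor's action: own move followed by the
    recipient's last move (C if that last action was CC or CD, else D)."""
    own = "C" if donor_cooperates else "D"
    other = "C" if recipient_last_action in ("CC", "CD") else "D"
    return own + other
```

Under this assumed form, the Stern Judging example (G, B, B, G, G) with w = 0.165 gives a score of about 0.666, so the recipient would receive the donation for H_0 = 0.5 but not for H_0 = 0.7.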
A generation includes h of the above-described periods. At the end of a generation, all the players synchronously update their current strategies following a Fermi-like updating rule [41][42][43] . Let P_i and P_j be the payoffs of player i and a random neighbor j, accumulated throughout the last generation. Then, Player-i will imitate Player-j's strategy with a probability Prob(i ← j) given by a Fermi-like function of the payoff difference between both players, where K denotes the irrationality of individual choice or the noise of strategy adoption. Note that the model includes two different time scales: a time scale involving payoff-independent decision-making strategies 17,47 , and another longer scale, of evolutionary character, involving the imitation of strategies 41 .
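For illustration, the imitation step can be sketched with the commonly used Fermi form, Prob(i ← j) = 1 / (1 + exp[(P_i − P_j)/K]). This particular expression is an assumption: the text only states that the rule is Fermi-like, so the exact function used in the model may differ. The name `imitation_probability` is hypothetical.

```python
import math


def imitation_probability(p_i: float, p_j: float, k: float) -> float:
    """Probability that player i imitates neighbor j (standard Fermi form).

    p_i, p_j -- payoffs accumulated during the last generation
    k        -- noise of strategy adoption (must be positive)
    """
    if k <= 0:
        raise ValueError("the noise K must be positive")
    x = (p_i - p_j) / k
    if x > 700.0:
        # exp(x) would overflow; the probability is numerically zero here.
        return 0.0
    return 1.0 / (1.0 + math.exp(x))
```

With equal payoffs the probability is 1/2, and for low noise the rule becomes nearly deterministic: the strategy of the neighbor with the higher payoff is almost always imitated.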

III. MEAN-FIELD APPROXIMATION
In this Section, we discuss various approaches to obtain a mean-field solution to the model presented here. These approaches preserve the assessment rules based on second-order reputation while neglecting some aspects related to the spatial distribution and formation of clusters, the length of the memory, and the weight of the last action. The goal of this mean-field approximation is to capture the qualitative behavior of the system and to detect which specific aspects are not reproduced due to the ingredients not considered here. Throughout this section, we will refer to Figs. 1 and 2 (which contain the numerical results that will be discussed in Section IV) to have a visual reference of the parameter space and, also, to compare the predictions with the agent-based numerical results.
Consider a well-mixed population and the low noise case (K ≪ hb), which allows us to assume a deterministic imitation rule. Let ρ_c, ρ_d, and ρ_i be the fraction of ALLC, ALLD, and DISC strategists, respectively. For simplicity, let us consider an initial population defined by ρ_c = ρ_d = ρ_i = 1/3.

A. Image Scoring
Here, we study the case when DISCs are Image Scoring strategists. Discriminators always donate to ALLC, but never to ALLD players. Let r_{i|i} be the mean value of the reputation score of a DISC as seen by another DISC. From Eq. (1) and Table I, it follows that r_{i|i} = C, where C is the fraction of cooperative actions in the system. Note that w influences the variance of r_{i|i} but not its mean value. On average, a DISC will give to another DISC if r_{i|i} > H_0. The average payoffs for ALLC, ALLD, and DISC strategists are, respectively: where Θ stands for the Heaviside function, which is zero for negative arguments and one for positive ones.
Low H_0
For low enough values of H_0 (i.e., H_0 < C), DISC players, on average, will cooperate when facing another DISC. The average cooperation level within a generation (constant ρ_c, ρ_d, ρ_i) evolves according to: Within the first generation, given ρ_c = ρ_d = ρ_i = 1/3, C evolves to a value greater than 1/2. When H_0 < C, which holds for H_0 ≲ 0.5, the average payoff of a DISC is given by: and the average payoff difference between DISC and ALLD is: which implies that At the end of the first generation, this condition will be satisfied for b > 2c. For b > 2c, DISC will overcome both ALLD and ALLC, while ALLD will overcome ALLC. If C increases over time, the condition H_0 < C is preserved, and therefore so is Eq. (6). Furthermore, as ρ_i increases over time, Regarding the resilience of ALLC strategists, on the one hand the average payoff difference between ALLC and ALLD is given by: On the other hand, the payoff difference between DISC and ALLC is: that is, Actually, in the absence of ALLD players, ALLC and DISC are indistinguishable strategists. Summarizing, regarding panels (III) in Fig. 2, where c = 1:
• Upper left area. Provided b > 3, the higher the value of b, the higher the fraction of ALLC that will survive the first stages and will coexist with DISC players at the steady state.
• Bottom left corner. For b < 2, we have Π_d − Π_i = ρ_c , and ALLD will invade ALLC and DISC.
High H_0
For high values of H_0 (i.e., H_0 > C), the average cooperation level within a generation (constant ρ_c, ρ_d, ρ_i) evolves according to: Within the first generation (ρ_c = ρ_d = ρ_i = 1/3), C evolves to a value lower than 1/2. When H_0 > 0.5, the average payoff of a DISC is given by: and, therefore: For high enough values of b, both ALLC and ALLD will defeat DISC. Taking into account: the advantage of ALLD over ALLC increases as ρ_i decreases, which in turn leads to a reduction in ρ_c and to an absorbing mono-strategic state of ALLD (upper-right and central-right area of panels (III) in Fig. 2). As b decreases, the average payoff difference between ALLC and DISC (i.e., ρ_i b − (ρ_i − ρ_d)c) decreases. Note that, in the absence of ALLC, DISC strategists never donate; therefore, ALLD and DISC are indistinguishable strategists: if some DISC strategists survive the first stages surrounded by DISC and ALLD, they will coexist (as defectors) with ALLD, allowing a mixed equilibrium of ALLD and DISC (bottom-right area of panels (III) in Fig. 2). Note that, in any case, the only action will be to defect (right area of panel (c) in Fig. 1).
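The qualitative conclusions of this subsection can be checked with a small numerical sketch. The payoffs below are not copied from the model's equations; they are derived here from the verbal description, under the assumptions that, per period, every player donates once and is chosen as a recipient once on average, that DISC always donates to ALLC and never to ALLD, and that DISC donates to DISC only when C > H_0. The function name `image_scoring_payoffs` is hypothetical.

```python
def image_scoring_payoffs(rho_c, rho_d, rho_i, b, c, coop_level, h0):
    """Per-period mean-field payoffs (ALLC, ALLD, DISC) for Image Scoring.

    Assumed well-mixed bookkeeping (a sketch, not the paper's equations):
      ALLC receives from ALLC and DISC, and always pays c;
      ALLD receives from ALLC only, and never pays;
      DISC exchanges donations with ALLC, and with DISC if C > H_0.
    """
    if abs(rho_c + rho_d + rho_i - 1.0) > 1e-9:
        raise ValueError("strategy fractions must add up to one")
    disc_helps_disc = 1.0 if coop_level > h0 else 0.0  # Heaviside step
    pi_c = (rho_c + rho_i) * b - c
    pi_d = rho_c * b
    pi_i = (rho_c + disc_helps_disc * rho_i) * (b - c)
    return pi_c, pi_d, pi_i
```

Starting from ρ_c = ρ_d = ρ_i = 1/3 with c = 1, this sketch reproduces the thresholds quoted in the text: for low H_0, DISC beats ALLD when b > 2, and ALLC beats ALLD when b > 3; for high H_0 and large enough b, both ALLC and ALLD obtain a higher payoff than DISC.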

B. Shunning
In this subsection, we analyze the case when DISCs are Shunning strategists. In this case, discriminators never donate to ALLD. On the other hand, ALLC always cooperate, but they will be positively evaluated by DISC only when cooperating with a cooperator.
The mean reputation scores of ALLC, ALLD, and DISC players through the eyes of a DISC are, respectively:
Low H_0
For low relative values of H_0 (i.e., H_0 < C), DISC players, on average, will pay to ALLC. The average cooperation level within a generation (constant ρ_c, ρ_d, ρ_i) evolves according to: In the first generation (ρ_c = ρ_d = ρ_i = 1/3), C evolves to a value greater than ∼0.38. When H_0 < C^2, which holds in the early stages for H_0 ≲ 0.15, the average payoffs are given by: which are the same payoffs as those of Eqs. (4,6) corresponding to the previous Image Scoring, low H_0 case, and therefore, the same analysis applies here. Note that, although the payoffs in Eq. (16) were calculated for the first generation, C increases over time, and therefore so do the payoff differences. A consequence is the similarity between the left part of the respective panels (I) and (III) in Fig. 2, and between panels (a) and (c) in Fig. 1. Nevertheless, given that the condition of low H_0 is more restrictive in the current case, the cooperative green area on the right side of panel (a) in Fig. 1 is smaller than that in panel (c).
High H_0
For high values of H_0 (i.e., H_0 > C), DISC players, on average, will not donate to ALLC. The cooperation level within a generation evolves according to: Therefore, the condition H_0 > r_{i|i} = C^2 is satisfied at the end of the first generation (ρ_c = ρ_i = 1/3, C ≃ 0.38). It follows that DISC players, on average, will not donate to anybody. The payoffs are given by: For high values of H_0, DISC will play as ALLD, both having a higher payoff than ALLC. The payoffs in (18) were calculated for the first generation. Nevertheless, C decreases over time as DISC and ALLD overcome ALLC, and therefore r_{i|i} decreases according to (14). This means that the order of the payoffs is preserved over time: ALLD and DISC will invade ALLC. Although there is a mixed equilibrium composed of DISC and ALLD strategists (right side of panels (I) in Fig. 2), the former will act as defectors, and therefore the cooperation level will tend to zero (right side of panel (a) in Fig. 1).
Since in this case Π_d = Π_i, higher-order effects beyond the mean-field approach should play a key role. Although for H_0 > C, DISC players, on average, will not donate to any strategist, there is a probability ε > 0 for a DISC to pay to an ALLC (and an even smaller probability to pay to another DISC). By adding this corrective term, the average DISC payoff becomes: The lower the value of H_0 (also the shorter the memory length M), the higher the value of ε. Furthermore, the relative payoff difference between ALLD and DISC will decrease as b increases. To summarize, although according to the mean-field approximation ALLD and DISC payoffs are equal, higher-order effects imply a dependence of the final mixed equilibrium on b and H_0.

C. Stern Judging
Here, we investigate the case when DISCs are Stern Judging strategists. Depending on the values of the parameters, DISC players may or may not donate to both ALLC and ALLD players. The mean reputation scores of the different strategists, as seen by a DISC, are given by:
Low H_0
From (20) it follows that, provided H_0 < C and H_0 < (1 − C), DISC players, on average, will pay to ALLC and ALLD. At the first generation (ρ_c = ρ_d = ρ_i = 1/3), on average, half of the actions of a DISC will be considered as good actions by another DISC. The average cooperation level within a generation (constant ρ_c, ρ_d, ρ_i) evolves according to: Therefore, for ρ_c = ρ_d = ρ_i = 1/3, it follows that r_{i|i} ≃ 1/2. The corresponding average payoffs for H_0 < 1/3 will be: At the end of the first generation, ALLD players overcome ALLC and DISC, and C decreases. Therefore, according to (20), r_{c|i} decreases and r_{d|i} increases over time. In the same way, r_{i|i} tends to ρ_d + ρ_i as C decreases, and therefore to 1 as ρ_c decreases. Consequently, payoffs evolve over time towards: and the ALLD strategy will invade ALLC and DISC (left area of panels (II) in Fig. 2). The mean-field approximation cannot reproduce the cooperative behavior observed in the numerical simulations for high values of b, when payoff differences are small and other higher-order effects become key. As in the previous case, there are two equal payoffs in (23), here Π_c = Π_i. By adding a higher-order corrective term, the average DISC payoff for the first stages becomes: Note that, unlike the previous case (Shunning, high H_0), the corrective term now applies to the probability of a DISC to defect against any strategist (i.e., it is not multiplied by a density ρ), becoming higher than that for Shunning discriminators. The relative payoff difference between ALLD and the rest of the players will decrease as b increases. For high values of b, the differences between the payoffs cannot prevent the formation of cooperative clusters. Note that this cooperative behavior (upper left corner of panel (b) in Fig. 1) corresponds to DISC strategists that act as cooperators.
High H_0
For high relative values of H_0 (i.e., H_0 > C and H_0 > (1 − C)), DISC players, on average, will pay neither ALLC nor ALLD. As in the previous case (low H_0), at the first generation (ρ_c = ρ_d = ρ_i = 1/3), on average, half of the actions of a DISC will be considered as good actions by another DISC. The average cooperation level within a generation evolves according to: Solving it for ρ_c = ρ_i = 1/3, it is found that for H_0 > 1/2, a DISC will probably defect when facing any strategist. Regarding higher-order effects, the shorter the memory length M, the higher the probability ε for a DISC to cooperate. At the end of the first generation, the corresponding average payoffs will be: where the higher-order corrective term ε has been added (Π_i = Π_d in the mean-field). Given Π_d > Π_i > Π_c, ALLD players will beat ALLC and DISC. According to (20), the consequent decrease of C leads to an increase in r_{d|i} and r_{i|i}, and to a decrease in r_{c|i}. Consequently, payoffs evolve over time towards: and the ALLD strategy will invade ALLC and DISC, bringing the system to a mono-strategic ALLD state (right area of panels (II) in Fig. 2).

D. Simple Standing
In this subsection, we discuss the case when DISCs are Simple Standing strategists. In this case, discriminators always cooperate when facing an ALLC. Regarding ALLD strategists, an ALLD defecting against a defector will be positively evaluated by a DISC. The mean values of the reputation scores through the eyes of a DISC are: On average, a DISC will give to an ALLD if r_{d|i} = 1 − C > H_0. Regarding how a DISC evaluates another DISC, note that the function r_{i|i}(C) is not monotonic, reaching a minimum value for C = 1/2.
For low relative values of H_0 (i.e., H_0 < 1 − C), DISC players, on average, will pay to ALLD. Within a generation, the average cooperation evolves according to: Solving (28) for ρ_c = ρ_i = 1/3, it is found that C evolves towards C ≃ 0.59 within the first generation. It follows that, at the end of the first generation, DISC will pay to all the strategists for H_0 ≲ 0.41. The corresponding average payoffs are given by: Therefore, ALLD players will overcome ALLC and DISC. As ρ_d and (1 − C) increase over time, r_{i|i} increases, and the system will evolve in time towards r_{c|i} = r_{d|i} = r_{i|i} = 1, with DISC playing as ALLC. The system is characterized by a fraction ρ_c + ρ_i of cooperators and a fraction ρ_d of defectors. This is the classical scenario where the mean-field approach involves full defection (ALLD) and cannot explain cooperation for high enough values of b in structured populations (and also with memory in this model).
As H_0 increases, the probability for a DISC to pay to an ALLD decreases. For H_0 > 1 − C, DISC players, on average, will not donate to ALLD. Approximation (28) becomes: Solving (30) for ρ_c = ρ_i = 1/3, it is found that within the first generation the cooperation will tend to C → ∼0.59. At the end of the first generation, for H_0 ≳ 0.41, the average payoffs can be approximated by: and the average payoff difference between DISC and ALLD will be: In the first stages (ρ_c = ρ_i), DISC players will overcome ALLC and ALLD for b > 2c. Nevertheless, the consequent increase of ρ_i and C over time leads to a decrease in r_{i|i} and to an increase in r_{d|i}. This fact involves a trade-off between b and H_0: a lower value of H_0 requires a higher b to allow DISC to invade ALLD (column (IV) in Fig. 2). Regarding ALLC strategists, the average payoff differences are given by: which implies that Π_c > Π_d for ρ_i b > c. At the first stages Actually, for ρ_d = 0, ALLC and DISC are indistinguishable strategists. Provided b ≥ 3, the higher the value of b, the higher the fraction of ALLC players that will survive the first stages and will coexist with DISC ones at the steady state (panels (d1,d3) in Fig. 2).
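The mean reputation of the unconditional strategies under each rule, which drives the low/high H_0 conditions used throughout this section, can be sketched from Tab. I alone. The code assumes a well-mixed population in which the recipient's last action was cooperative with probability C; the names `GOOD_ACTIONS` and `mean_reputation` are hypothetical, and discriminators are deliberately left out because their score depends on their own conditional behavior.

```python
# Actions judged as good under each rule (donor's action + recipient's last action).
GOOD_ACTIONS = {
    "shunning":        {"CC"},
    "stern_judging":   {"CC", "DD"},
    "image_scoring":   {"CC", "CD"},
    "simple_standing": {"CC", "CD", "DD"},
}


def mean_reputation(rule: str, strategy: str, coop_level: float) -> float:
    """Mean score of an ALLC or ALLD player as seen by a discriminator.

    coop_level -- fraction C of cooperative actions in the system, taken as
                  the probability that a recipient's last action was C.
    """
    if not 0.0 <= coop_level <= 1.0:
        raise ValueError("coop_level must lie in [0, 1]")
    own = {"ALLC": "C", "ALLD": "D"}[strategy]   # unconditional action
    good = GOOD_ACTIONS[rule]
    vs_cooperator = 1.0 if own + "C" in good else 0.0
    vs_defector = 1.0 if own + "D" in good else 0.0
    return coop_level * vs_cooperator + (1.0 - coop_level) * vs_defector
```

For example, with C ≃ 0.59 an ALLD player has a mean score of about 0.41 under Simple Standing, in agreement with the condition r_{d|i} = 1 − C > H_0 quoted above, while under Shunning and Image Scoring his score is always zero.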

IV. NUMERICAL SIMULATIONS
In this section, we present and discuss the results of numerical simulations for the agent-based model proposed here. We reduce the payoff matrix parameters by fixing c = 1 and focus on the impact of the recipient's benefit b and the intolerance threshold H_0 on the cooperative behavior under the different second-order assessment rules considered. Based on previous experiments 40 , we fix the additional weight of the last action to w = 0.165 and the memory length to M = 5. Each independent realization is run for up to 2000 generations, ensuring that the system can reach a steady state, which typically happens after 100-1000 generations. Figs. 1-5, which will be discussed in the following subsections, display the results corresponding to N = 2500 (L = 50). Additionally, larger lattice sizes (e.g., N = 10^4) were also tested and qualitatively equivalent results were obtained (not shown here for brevity).
Fig. 1 displays the stationary fraction of cooperative actions C as a function of the benefit b and intolerance H_0, each panel corresponding to one of the four assessment rules considered: Shunning (panel a), Stern Judging (b), Image Scoring (c), and Simple Standing (d). In general, a very low recipient's benefit (b ∼ 1) does not encourage the donor to donate. For higher values of b, the level of cooperation depends on the benefit b and intolerance H_0 in different ways for different assessment rules. As shown, a low intolerance H_0 promotes cooperation for the Image Scoring and Shunning rules while, conversely, the Simple Standing rule performs better for high values of H_0. Finally, the Stern Judging rule is the least favorable for cooperation since it only allows cooperative actions for very high benefit b and low intolerance H_0. This last result differs from previous studies 20,33 where neither memory nor intolerance was considered.

A. Level of cooperation and strategies distribution
To further study the differences in the cooperation level for the different assessment rules, Fig. 2 shows the distribution of the different strategies (ALLC, ALLD, and DISC) as a function of b and H 0 . From left to right, each column corresponds to one of the four assessment rules: Shunning (column I), Stern Judging (II), Image Scoring (III) and Simple Standing (IV). Additionally, for each column, the fraction of each strategy at the stationary state is shown in a different row: ALLC (top row), ALLD (center), and DISC (bottom). Generally speaking, under a given assessment rule, the level of cooperation is determined by the competition between ALLD, DISC, and ALLC strategists. The coexistence of these three strategies is difficult, showing (simplex) inner points only for very specific regions of the parameter space.
The distribution of strategies can help explain cooperative behavior for the different rules. Note that the arguments used here, although qualitative in nature, include more ingredients than those used in the previous mean-field approximation, such as memory and spatial distribution, and the results exhibit some new non-trivial phenomena, as follows:
• Shunning: Here, DISC players will only positively evaluate CC actions and therefore do not cooperate against ALLD players. For a low intolerance threshold H 0 , given an initial homogeneous strategy distribution (ρ c ∼ ρ d ∼ ρ i ), and a large enough memory (in plots, M = 5), DISC and ALLC players will have, on average, a fraction µ S of CC actions in their memory such that µ S > H 0 , and therefore will be positively evaluated by DISC players. Thus, DISCs will likely cooperate when facing DISC and ALLC players. As ALLC strategists will cooperate against any strategist, ALLC and DISC players can group and form cooperative clusters for high enough b, invading ALLD players (who only receive donations from ALLC players). Without ALLD players, ALLC and DISC are equivalent strategies and will coexist as cooperators. On the other hand, for high values of intolerance H 0 , a small fraction of actions belonging to the set {DC, CD, DD} is enough to be negatively evaluated by other DISC players (every strategy is compatible with one or more actions in that set, which will likely be present in a player's history at the early stages). That is, any strategist will have, on average, a fraction µ S of CC actions in his memory such that µ S < H 0 , and therefore will be negatively evaluated by DISC players. Thus, DISC strategists will likely not cooperate against any strategist, acting as ALLD players. DISC and ALLD players (which constitute a majority acting as a single strategy) will invade ALLC and will coexist as defectors.
• Stern Judging: In this rule, DISC players will positively evaluate CC and DD actions. For low values of H 0 , at early stages (ρ c ∼ ρ d ∼ ρ i ), any strategist will have in his memory, on average, a fraction µ SJ of actions belonging to the set {CC, DD} such that µ SJ > H 0 , and therefore will be positively evaluated by DISC players. Thus, DISC players will likely cooperate when facing any strategist, behaving as ALLC players. Unlike the previous Shunning case, where ALLD players obtained benefit only from ALLC, now they get donations from both ALLC and DISC players, thus having a higher relative payoff and resulting in an invasion of ALLD over the rest of the strategies. Only for very high values of benefit (in Figs. 1-2, b ≳ 9) can DISC players resist invasion by ALLD; actually, this is the only region of parameters that allows cooperation. On the other hand, for high values of intolerance H 0 , and at early stages, any strategist will have, on average, a fraction µ SJ of actions belonging to the set {CC, DD} in his memory such that µ SJ < H 0 , and therefore will be negatively evaluated by DISC players. However, although DISC players tend to defect against any strategist, they have a non-zero probability of cooperating when facing any player (either ALLC, ALLD, or DISC), resulting in a lower accumulated payoff than that of ALLD players. Therefore, ALLD players will get the higher accumulated payoff, which leads to an invasion of the ALLD strategy over ALLC and DISC.
• Image Scoring: Here, DISC players will positively evaluate CC and CD actions and therefore will cooperate against ALLC but not against ALLD players. For low values of intolerance H 0 , at early stages, DISC players will have, on average, a fraction µ IS of cooperative actions (CC or CD) in their memory such that µ IS > H 0 , and therefore will be positively evaluated by other DISC players. Thus, DISCs will likely cooperate when facing DISC and ALLC players. This is the same situation as in the previous low-H 0 Shunning case: ALLC and DISC players can form cooperative clusters for high enough b, invading the ALLD strategy. Without ALLD players, ALLC and DISC will coexist as cooperators. On the other hand, for high values of intolerance H 0 , at early stages, DISC players will have, on average, a fraction 1 − µ IS of non-cooperative actions in their memory such that µ IS < H 0 , and therefore will likely be negatively evaluated by other DISC players. Each strategist will cooperate against a different set of strategies: ALLC against any strategist, DISC against ALLC ones, and ALLD against no strategist. This results in a three-strategy scenario where the higher payoff corresponds to ALLD players, who will invade ALLC and DISC.
• Simple Standing: This rule is the most tolerant: the only negatively evaluated action is defecting against a cooperator. Counterintuitively, high values of intolerance H 0 promote cooperation, while low values lead to non-cooperative states. Here, DISC players will positively evaluate CC, CD and DD actions. As the available actions for ALLC players are {CC, CD}, and those for DISC ones are {CC, CD, DD}, DISC strategists will always cooperate against any ALLC or DISC player. For low values of H 0 , at early stages, ALLD players will have, on average, a fraction µ SS of DD actions in their memory such that µ SS > H 0 , and therefore will be positively evaluated by DISC strategists. This is the same situation as in the previous low-H 0 Stern Judging case: DISC players will behave as ALLC players. ALLD players will obtain the higher accumulated payoff, resulting in an invasion of ALLD over the rest of the strategies. Only for very high values of b can DISC players resist invasion by ALLD. As H 0 increases, µ SS − H 0 decreases, moving towards the following scenario: for high values of intolerance H 0 , ALLD players will have, on average, a fraction µ SS of DD actions at the early stages such that µ SS < H 0 , and therefore will be negatively evaluated by DISC players. Thus, DISC players will cooperate when facing DISC and ALLC players, but are very unlikely to do so when facing ALLD players. As ALLC strategists will cooperate against any player, ALLC and DISC players can form cooperative clusters for high enough b, invading ALLD strategists, who only receive donations from ALLC. In the absence of ALLD players, ALLC and DISC strategists will behave alike and will coexist.
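The qualitative arguments in the list above all rest on the same two ingredients: the set of actions that each rule evaluates positively, and the comparison of the fraction of such actions in a player's memory with the threshold H 0 . The following Python sketch illustrates this under simplifying assumptions: the function and dictionary names are hypothetical, the additional weight w of the last action is deliberately omitted, and a strict inequality is assumed at the threshold. It is meant as an illustration of the reasoning, not as the simulation code used for the figures.

```python
# Positively evaluated two-letter actions for each rule, as listed in the text.
GOOD_ACTIONS = {
    "shunning": {"CC"},
    "stern_judging": {"CC", "DD"},
    "image_scoring": {"CC", "CD"},
    "simple_standing": {"CC", "CD", "DD"},
}


def good_fraction(memory, rule):
    """Fraction of remembered actions that `rule` evaluates positively.

    `memory` is a non-empty list of action labels such as "CC" or "DC"
    (at most M entries in the model). The extra weight w given to the
    last action in the paper is NOT included here (simplifying assumption).
    """
    if not memory:
        raise ValueError("memory must contain at least one action")
    good = GOOD_ACTIONS[rule]
    return sum(action in good for action in memory) / len(memory)


def disc_cooperates(memory, rule, h0):
    """A DISC donor cooperates if the recipient's score exceeds H0.

    The strict inequality at the threshold is an assumption.
    """
    return good_fraction(memory, rule) > h0


# Hypothetical early-stage memory of an ALLD player with M = 5.
alld_memory = ["DD", "DD", "DC", "DD", "DC"]
for h0 in (0.2, 0.9):
    for rule in GOOD_ACTIONS:
        print(rule, h0, disc_cooperates(alld_memory, rule, h0))
```

For this hypothetical memory, the Simple Standing and Stern Judging rules both count three of the five actions as good, so a DISC donor would help the ALLD player at H 0 = 0.2 but not at H 0 = 0.9, which is the mechanism invoked above for the invasion of ALLD at low intolerance.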

B. Spatial patterns
In this subsection, we analyze the evolution of strategies through characteristic snapshots. Taking the Shunning rule as an example, Fig. 3 and Fig. 4 display the distribution of strategies on the square lattice at different generations (t) to scrutinize the evolutionary process, for a low intolerance (H 0 = 0.2) in Fig. 3 and a high one (H 0 = 0.9) in Fig. 4. As mentioned above, for low values of H 0 , ALLC and DISC players are more easily positively evaluated: DISC strategists will cooperate when facing DISC and ALLC players, allowing the formation of cooperative clusters. Fig. 3 shows that ALLD strategists (gray dots) are gradually invaded by ALLC and DISC ones, and disappear after around 100 generations. Even if we reduce the benefit of the recipient (say, b = 2 or 3), similar patterns are still observed (not shown here for brevity). However, when H 0 is large enough (e.g., H 0 ≳ 0.5), ALLC and DISC players are often negatively evaluated, which leads DISCs to act as ALLD strategists. Thus, for large H 0 , ALLD and DISC strategies are equivalent and invade ALLC. This behavior is shown in Fig. 4, where gray (ALLD) and blue (DISC) dots dominate the whole population after just 10 generations.
Similarly, the Image Scoring rule also leads to the coexistence of ALLC and DISC players for low values of the intolerance H 0 , and this coexistence extends to larger values of H 0 than under the Shunning rule (the corresponding characteristic snapshots are not shown here for brevity). Conversely, for high intolerance H 0 , as mentioned above, DISC players cooperate when facing ALLC but not DISC ones. Thus, ALLD players obtain the higher benefit and dominate the whole population. This evolutionary process is illustrated in Fig. 5, where the simulation setup is identical to that of Fig. 4 except that the Image Scoring rule is applied. The same approach can be used to characterize the competition among the three strategies under the Stern Judging and Simple Standing rules (not shown here for conciseness).

V. DISCUSSION AND CONCLUSIONS
In this paper, we combine four typical second-order assessment rules with the memory effect to explore the evolution of cooperation in the spatial donation game. To this end, the reputation evaluation takes into account the last M actions of the agents. We discuss the impact of four assessment rules (namely Shunning, Stern Judging, Image Scoring, and Simple Standing) on the level of cooperation among the population. It is found that the assessment rule plays a non-trivial role in the evolution of cooperation.
In our model, the interplay between any pair of players can be characterized as a Donation Game, where a player is chosen as a donor and the other one as a recipient. If the donor contributes by paying a cost c, the recipient will get a benefit b > c; otherwise, both will get nothing. We implement two dynamics, one in which strategies do not take into account neighbors' payoffs but their reputation, modulated by an intolerance parameter, and another evolutionary dynamic that takes place at a larger time scale.
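As a minimal illustration of the payoff rule just described, the Python sketch below returns the payoffs of a single donor-recipient interaction. The function name, the argument names and the (donor, recipient) ordering of the returned pair are illustrative assumptions; the default c = 1 mirrors the value fixed in the simulations.

```python
def donation_round(donor_cooperates, b, c=1.0):
    """Payoffs (donor, recipient) of one Donation Game interaction.

    If the donor helps, he pays the cost c and the recipient gains b;
    otherwise both obtain nothing. Names and return order are
    illustrative assumptions, not taken from the original model code.
    """
    if donor_cooperates:
        return -c, b
    return 0.0, 0.0


# Example with b = 3 and the default cost c = 1.
print(donation_round(True, 3.0))
print(donation_round(False, 3.0))
```

Accumulating these pairs over all interactions of a generation would give the accumulated payoffs that drive the slower imitation dynamics.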
We have studied the model through a mean-field approximation, finding that the role of intolerance varies according to the assessment rule: while under the Shunning, Stern Judging and Image Scoring rules intolerance hinders cooperation, it counterintuitively promotes it under the Simple Standing rule. Moreover, it is shown that the Stern Judging rule, despite being a positive rule (it positively evaluates more actions than the Shunning rule), is the one that shows, by far, the lowest values of cooperation. We have performed extensive simulations that confirm these findings. Furthermore, there are several other parameters (e.g., noise parameter K and memory length M) that deserve consideration in future research. As K increases, the uncertainty in strategy adoption also increases, but the level of cooperation remains qualitatively unchanged in the current setup. With regard to the impact of memory length or weight, we only adopt the parameter values (M = 5 and w = 0.165) from Ref. 40 , and other values may deserve further discussion in the future. Meanwhile, observation or reputation evaluation errors may take place during the donation decision, which is also worth investigating in future studies. Another potential direction is to explore the impact of second-order assessment rules in heterogeneous networks, such as small-world 45 and scale-free 46 networks.
Taken together, based on previous experimental findings on human behavior, we present a novel second-order evaluation model with memory effect to investigate the evolution of cooperation in the spatial donation game. These results may help to understand cooperative behavior under the indirect reciprocity and reputation mechanisms.