Figure 1. Temporal variation of the annual mean PDO index, 1960-2000.

**Abstract**

**The findings of the Vincent et al. (2005) study were used by the Intergovernmental Panel on Climate Change (IPCC) in its fourth assessment report (Trenberth et al., 2007) to support the widespread belief of anthropogenic global warming/climate change. Based on our robust results, we argue that the case for anthropogenic global warming/climate change is unsupported by the Vincent et al. (2005) study. Moreover, our findings cast doubt and suspicion on the conclusions of other similar studies on trends in extreme climate indices that suffer from much the same flaws in homogeneity testing and trend analysis as the Vincent et al. (2005) study. Some of these other studies include, but are not limited to, Frich et al. (2002), Peterson et al. (2002), Aguilar et al. (2005), Alexander et al. (2006), to identify just a few.**

**Introduction**

In Vincent et al.’s (2011) reply to our Comments (Stone, 2011), the authors made two main claims that are worthy of attention. Their first claim is that we inappropriately applied the Pettitt test (Pettitt, 1979) and therefore falsely identified the presence of step changes in the regional data series of cold and warm nights over southern South America for the period 1960-2000. They argued that since the Pettitt test cannot discriminate between a real step change and a long-term trend, the 1976 change point in the cold-nights series and the 1979 change point in the warm-nights series are artifacts of the improper application of the Pettitt test.

Their second claim is that even after taking into account the influence of the Pacific decadal oscillation (PDO), the changes in the linear trend of cold and warm nights, though statistically significant, were relatively small in comparison with the much larger statistically significant long-term regional warming trend. They arrived at this conclusion by using a multiple linear regression model with cold nights or warm nights as the dependent variable and with time (year), and the annual mean of the PDO index as the independent variables.

We refute the first claim by reporting the results of analyses using a diverse suite of twelve additional change point tests, including two two-phase regression tests that are related to the two-phase regression-based approach proposed by Vincent (1998). All of the tests detected the same two change points detected by the Pettitt test, thereby convincingly invalidating the claim that the Pettitt test was improperly applied.

These robust results suggest that the practice adopted by the authors of not checking for change points in the data series of temperature indices, at each individual station or at the regional level, is seriously flawed as it is based on the erroneous assumption that the absence of inhomogeneities at the annual time scale implies the absence of inhomogeneities at the daily time scale. It is, of course, quite possible for annual daily maximum and minimum temperatures to appear homogeneous but still contain inhomogeneities at the daily time scale. Since extreme temperature indices are computed using daily maximum and minimum values, it is not surprising to detect inhomogeneities in the data series of extreme temperature indices despite not detecting any in the annual mean maximum and minimum temperature series. Thus, homogeneity testing of the data series of extreme temperature indices at each station is essential before embarking on a search for trends.

Another plausible explanation for the existence of change points in the data series is the reliance on one change point test. Several performance evaluation and comparison studies have clearly shown that no single change point test is able to provide accurate and reliable results for the wide variety of inhomogeneity situations encountered in practice (e.g. DeGaetano, 2006; Reeves et al., 2007). It is, therefore, plausible that the changepoints in some annual series simply escaped detection. To avoid this problem, a diverse suite of appropriate change point tests is highly recommended.

We refute the second claim by demonstrating that the correlations between warm nights or cold nights and both time (year) and the annual mean of the PDO index are all spurious, thereby yielding a spurious multiple linear regression model. Since no meaningful correlation exists between the variables, the results of the trend analyses using this regression model are inaccurate, unreliable and misleading.

**The Pettitt Test**

The authors used computer simulations to prove that the Pettitt test is likely to falsely identify a genuine trend as a significant step in a data series and, by so doing, consciously or unconsciously, evaded the key issue as to whether the two regional data series exhibit a trend or a flat-step change. The simulation exercise was unnecessary as its results are not unexpected since a data series with a significant trend is obviously not independent and identically distributed (random). Therefore, it should be obvious that the existence of a significant trend in a data series is likely to result in the rejection of the null hypothesis of randomness when the Pettitt test is inappropriately applied. This is why one should not blindly apply the Pettitt test, or any other statistical test, before visually inspecting a time series plot of the data to determine whether a step change, a trend, or both are present.

In this regard, Brockwell and Davis (2002) advised as follows:

“The first step in the analysis of any time series is to plot the data. If there are apparent discontinuities in the series, such as a sudden change of level, it may be advisable to analyze the series by first breaking it into homogeneous segments.”

On the other hand, if both a step change and a trend are apparent, Rodionov (2005) advised that the data series must first be de-trended before application of the Pettitt test to detect the step change. The visual inspection of a suitable graphical display of the data is therefore essential in deciding on the choice of appropriate formal statistical tests.

This explains why the importance of graphical data display prior to formal statistical analyses has been so widely emphasized in the statistical literature (Anscombe, 1973; Cox, 1978; Fienberg, 1979; Chatfield, 1985; Chatfield, 1991; Chambers et al., 1983). More specifically, Chambers et al. (1983) explained the reason for and benefit of the visual inspection of graphs in data analysis as follows:

“There is no single statistical tool that is as powerful as a well-chosen graph. Our eye-brain system is the most sophisticated information processor ever developed, and through graphical displays we can put this system to good use to obtain deep insight into the structure of data. An enormous amount of quantitative information can be conveyed by graphs; our eye-brain system can summarize vast information quickly and extract salient features, but it is also capable of focusing on detail. Even for small sets of data, there are many patterns and relationships that are easier to discern in graphical displays than by any other data analytic method.”

The renowned statistician, C.R. Rao, who also strongly advocated the use of graphs in data analysis, summed it up quite nicely when he declared (DeGroot, 1987):

“The analysis is only to make sure that we are not deceived by what we are seeing.”

The first step in our study, therefore, was to plot the time series of cold nights and warm nights and perform a visual inspection to identify patterns in the data. Our visual inspection revealed the presence of a flat-step change in each data series. To ensure that we were not deceived by what we were seeing, we applied seven statistical tests, namely, the Pettitt test (PT) (Pettitt, 1979), the standard normal homogeneity test (SNHT) (Alexandersson, 1986), the Buishand range test (BRT) (Buishand, 1982), the von Neumann ratio test (VNRT) (von Neumann, 1941; Buishand, 1982), the Buishand Bayesian test (BBT) (Buishand, 1984), the Worsley likelihood ratio test (WLRT) (Worsley, 1979) and the penalized maximal t-test (PMTT) (Wang et al., 2007). All the tests, with the exception of the VNRT, are location-specific tests, that is, they not only detect a change point but also indicate the year of the change point. The VNRT, on the other hand, confirms the presence of a change point but does not indicate when it occurs. It is worth noting that the PT, SNHT, BRT and VNRT together comprise the Wijngaard et al. (2003) homogeneity assessment procedure. According to this procedure, if three or four tests detect significant change points at the 1% level, the series is classified as “suspect” and deemed to be unsuitable for use in the analysis of trend and variability.
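As an illustration of how a location-specific test reports both the presence and the position of a change point, the following is a minimal Python sketch of the Pettitt statistic with its usual approximate significance level. The series `cold` is synthetic (a flat level of 12 dropping to 8.5 after 1976, with small alternating deviations) and is an assumption made purely for illustration; it is not the regional data analysed here, and the function name `pettitt_test` is hypothetical.

```python
import numpy as np

def pettitt_test(x):
    """Return (0-based index of the last point before the shift, K, approx. p-value)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # s[i, j] = sgn(x_i - x_j) for every pair of observations
    s = np.sign(x[:, None] - x[None, :])
    # U_t = sum over i <= t and j > t of sgn(x_i - x_j), for t = 1, ..., n-1
    u = np.array([s[:t, t:].sum() for t in range(1, n)])
    k_idx = int(np.argmax(np.abs(u)))      # u[k_idx] corresponds to t = k_idx + 1
    k = float(np.abs(u[k_idx]))
    # Approximate two-sided significance level given by Pettitt (1979)
    p = min(1.0, 2.0 * float(np.exp(-6.0 * k**2 / (n**3 + n**2))))
    return k_idx, k, p

# Synthetic flat-step series (illustrative assumption, not the paper's data)
years = np.arange(1960, 2001)
deviation = np.where(np.arange(years.size) % 2 == 0, 0.5, -0.5)
cold = np.where(years <= 1976, 12.0, 8.5) + deviation

idx, k, p = pettitt_test(cold)
print("last year before the shift:", years[idx], "K =", k, "p =", p)
```

Because the statistic is built from signs of pairwise differences only, it makes no distributional assumption, which is one of the reasons the test is attractive for series of this kind.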

The decision to use a diverse suite of tests was based on the recommendation of several researchers arising from studies on the performance evaluation and comparison of the various change point tests. DeGaetano (2006) carried out a comparative study of seven common techniques and reported that no one technique is a panacea for detecting discontinuities as each technique has its own set of strengths and weaknesses. Reeves et al. (2007) cautioned against the practice of using only one statistical technique for detecting change points by noting that there is no unique best technique among the numerous available techniques for all commonly encountered inhomogeneity situations, many of which yield conflicting conclusions even when applied to the same data series, and advised that different techniques may be required for different climate elements or the same climate elements on different time scales. More recently, Toreti et al. (2010) asserted that the comparison of detected break points by several tests is a good strategy, especially when metadata is not available, as it gives robust results avoiding overestimation and correction of false inhomogeneities.

Unfortunately, the results of six statistical tests mentioned above were not included in our published manuscript (Stone, 2011) because reviewer A thought it unnecessary and made the following comment/suggestion with respect to an earlier version of our manuscript:

“There is a considerable amount of overkill in the analysis. More than half a dozen statistical tests are applied and considerable discussion is devoted to the results. The main point, that some of the series exhibit change-points can be made using one of the change-point tests – – in fact, these change-points are obvious via visual examination.”

We concurred with reviewer A and acted on his suggestion as we also thought at that time that the step changes near the middle of the two data series would be obvious to anyone with minimal exposure to and experience in homogeneity assessment and testing. We opted for the nonparametric Pettitt test for three reasons: it does not rely on the assumption of normality or any particular distributional assumption, it is less sensitive to outliers, and it is more sensitive to change points in the middle of the series (Wijngaard et al., 2003).

We have now, in addition to the six additional tests mentioned above, also carried out analyses using six other popular change point tests found in the climatology, statistics and econometrics literatures, namely, the two-phase regression model with different trends (TPRLR) (Lund and Reeves, 2002), the two-phase regression model with common trend (TPRW) (Wang, 2003), the REC-CUSUM test (Brown et al., 1975), the OLS-CUSUM test (Ploberger and Krämer, 1992), the supF test (Andrews, 1993) and the Hinkley likelihood ratio test (HLRT) (Hinkley, 1970). The CUSUM and F tests were implemented in the R package **strucchange** (Zeileis et al., 2002) whereas the HLRT was implemented in the R package **changepoint** (Killick and Eckley, 2011). The results of these twelve additional tests are shown in Table 1 together with the results of the Pettitt test.

Table 1 shows that the additional eleven location-specific tests all detected the same change points identified by the Pettitt test, namely, 1976 and 1979, for the data series of cold nights and warm nights respectively. The VNRT confirmed the presence of the significant change points in both cases. The results of all tests were statistically significant at the 1% level except for the two two-phase regression tests which both produced results that were significant at the 5% level for the cold-nights data series. These results therefore convincingly demonstrate that the two change points are indeed real and, as a result, invalidate the authors’ claim that they are mere artifacts of the improper application of the Pettitt test.

To further confirm the presence of flat step-type changes in the two data series, we fitted the Lund and Reeves (2002) version of the two-phase regression model for changepoint detection at an undocumented time, $c$. The model allows both step-type ($\mu_1 \neq \mu_2$) and trend-type ($\beta_1 \neq \beta_2$) changepoints and takes the following form:

$$Y_t = \mu_1 + \beta_1 t + \varepsilon_t, \quad 1 \le t \le c$$

$$Y_t = \mu_2 + \beta_2 t + \varepsilon_t, \quad c + 1 \le t \le n$$

where $Y_t$ is the data series of interest (cold nights or warm nights),

$t$ = time in years (1 to 41)

$n$ = number of years (41)

$\mu_1$ = mean level before the change point

$\beta_1$ = slope before the change point

$\mu_2$ = mean level after the change point

$\beta_2$ = slope after the change point

and $\{\varepsilon_t\}$ are the errors, which are IID Gaussian with zero mean and variance $\sigma_\varepsilon^2$.

The null ($H_0$) and alternative ($H_A$) hypotheses of this two-phase regression model are:

$$H_0: \mu_1 = \mu_2 \text{ and } \beta_1 = \beta_2$$

$$H_A: \mu_1 \neq \mu_2 \text{ and/or } \beta_1 \neq \beta_2$$

The results of the regression analyses are shown in Table 2. For both the cold-nights and warm-nights data series, it can be seen that the slopes before and after the change points are not statistically significant (P > 0.05) and thus not significantly different from zero. In each data series, therefore, $\mu_1 \neq \mu_2$ and $\beta_1 = \beta_2 = 0$, proving, beyond doubt, the presence of flat step changes in the two data series. These results indicate an abrupt decreasing step change of 3.5% after 1976 for the percentage of cold nights and an abrupt increasing step change of 4.0% after 1979 for the percentage of warm nights, without any significant trends on either side of the changepoints. These step changes are similar to the corresponding values of 3.5% and 3.8% for cold and warm nights respectively, which we found previously based on the difference in median values of the subseries before and after the changepoints.
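The two-phase fit can be sketched in Python as a scan over candidate change points, with a separate least-squares line fitted to each segment and an F-type statistic comparing the two-phase fit with a single line over the whole record. This is a minimal sketch under stated assumptions: the series is synthetic (not the data behind Table 2), the names `sse_linear` and `two_phase_scan` are hypothetical, and the statistic is only computed, not tested, because its significance has to be judged against the percentiles of the maximum F statistic tabulated by Lund and Reeves (2002) rather than an ordinary F distribution.

```python
import numpy as np

def sse_linear(t, y):
    """OLS straight-line fit; returns (residual sum of squares, slope, intercept)."""
    slope, intercept = np.polyfit(t, y, 1)
    resid = y - (intercept + slope * t)
    return float(resid @ resid), float(slope), float(intercept)

def two_phase_scan(y, min_seg=3):
    """Scan candidate change points c (first segment is t = 1..c)."""
    y = np.asarray(y, dtype=float)
    n = y.size
    t = np.arange(1, n + 1, dtype=float)
    sse_reduced, _, _ = sse_linear(t, y)          # single line, no change point
    best = None
    for c in range(min_seg, n - min_seg + 1):
        sse1, b1, m1 = sse_linear(t[:c], y[:c])   # phase 1: intercept mu1, slope beta1
        sse2, b2, m2 = sse_linear(t[c:], y[c:])   # phase 2: intercept mu2, slope beta2
        sse_full = sse1 + sse2
        f_c = ((sse_reduced - sse_full) / 2.0) / (sse_full / (n - 4))
        if best is None or f_c > best["F"]:
            best = {"c": c, "F": f_c, "mu1": m1, "beta1": b1, "mu2": m2, "beta2": b2}
    return best

# Synthetic flat-step series (illustrative assumption): level 12 for t <= 17, 8.5 after
t_index = np.arange(41)
deviation = np.where(t_index % 2 == 0, 0.5, -0.5)
y = np.where(t_index < 17, 12.0, 8.5) + deviation

fit = two_phase_scan(y)
print(fit)
```

In this sketch `mu1` and `mu2` are the fitted intercepts at $t = 0$; when both slopes are close to zero, as in a flat step, they approximate the mean levels of the two segments.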

**Using the PDO to explain the variability in cold and warm nights**

Figure 1 shows a time series plot of the annual means of the standardized monthly Pacific decadal oscillation (PDO) index for the period 1960-2000. The PDO index can be seen to be fluctuating randomly about some constant level and then around the mid-1970s increases abruptly to once again fluctuate randomly about some new constant level until after 1998 when it seems to experience a sudden drop. The Pettitt test as well as the other location-specific tests listed in Table 1 identified 1975 as the statistically significant change point (P < 0.01). It is therefore evident that this data series does not exhibit a monotonic trend but a flat step change or level shift. The data series thus comprises two homogeneous subperiods, 1960-1975 and 1976-2000, that are statistically different from each other and therefore non-homogeneous relative to each other.

This abrupt step change is similar to those observed in the data series of cold and warm nights. Each data series is inhomogeneous but comprises two homogeneous subseries with different means. It is well known that the use of these kinds of data series in linear regression analysis could indicate the presence of a statistically significant linear trend when none in fact exists (Kundzewicz and Robson, 2000). This is a common type of spurious correlation/regression involving non-homogeneous groups (non-homogeneous relative to each other but homogeneous within group) and is explained in some introductory statistical texts (e.g. Hill and Lewicki, 2007; Kirk, 2008). Spurious correlation occurs when there is a high correlation between two variables despite the fact that no meaningful relationship exists between them.

Hill and Lewicki (2007) and Kirk (2008) show that a high correlation may result due to the heterogeneity of the data caused by clusters of two or more groups. Kirk (2008) provides a general explanation of the phenomenon as follows:

“A substantial correlation between X and Y can occur because the sample of participants contains two or more subgroups with means that differ for both variables. The correlation coefficient computed for the combined samples will be much higher than for each sample taken alone. This occurs because the means for the two subgroups differ with respect to both X and Y.”

Hill and Lewicki (2007) note that the high correlation is entirely due to the arrangement of the two groups as it does not represent the “true” relation between the two variables, which may be practically zero.

The phenomenon is demonstrated by the results of correlation analyses between cold nights, warm nights, PDO index and year shown in Table 3. For cold nights and year, the correlation coefficients for the subperiods 1960-1976 and 1977-2000 are low, 0.11 and 0.06 respectively, and not statistically significant (P > 0.05) indicating a lack of significant linear trend in cold nights in each subperiod. However, the correlation coefficient for the entire period is relatively high and highly statistically significant (P < 0.001). These results are typical of spurious correlation. The same pattern is observed for warm nights and year, and for the PDO index and year. These results suggest that there is no true linear relationship between cold nights, warm nights, PDO index and year and the statistically significant correlations are all spurious. This implies that the changes experienced by the data series of cold nights, warm nights and the PDO index over the period 1960-2000 are not linear trends but flat step changes or jumps.
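The pattern described above, with weak correlations within each subperiod but a strong correlation over the whole period, can be reproduced with a small Python sketch. The series below is synthetic and purely illustrative (two flat levels with small alternating deviations and a shift after 1976); it is an assumption and does not reproduce the values in Table 3.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])

# Synthetic series with a mean shift and no trend inside either subperiod
years = np.arange(1960, 2001)
deviation = np.where(np.arange(years.size) % 2 == 0, 0.5, -0.5)
series = np.where(years <= 1976, 12.0, 8.5) + deviation

before = years <= 1976
r_all = pearson_r(years, series)                      # whole period, mean shift ignored
r_before = pearson_r(years[before], series[before])   # 1960-1976 only
r_after = pearson_r(years[~before], series[~before])  # 1977-2000 only

print("whole period:", round(r_all, 2))
print("before shift:", round(r_before, 2))
print("after shift: ", round(r_after, 2))
```

The whole-period coefficient is large in magnitude only because the two subperiods sit at different levels; within each subperiod the series has essentially no linear relationship with year.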

Hassler (2003) investigated the phenomenon of nonsense or spurious regressions using a Monte Carlo experiment and showed that spurious regressions can arise because of simple mean shifts in time series without trends as long as the mean shifts are neglected. He suggested that one could use R^{2} (and thus R) to detect spurious regression due to mean shifts in time series and highlighted the importance of testing for mean shifts and removing them before running regressions as a remedy for dealing with this kind of nonsense or spurious regression.

Consequently, since the time series of cold nights, warm nights and the PDO index all exhibit mean shifts, their use in a regression model is likely to yield spurious results. The phenomenon is again demonstrated in Table 4. The absolute values of the correlation coefficients between cold nights and the PDO index for the subperiods 1960-1976 and 1977-2000 are both low, 0.15 and 0.18 respectively, and not statistically significant (P > 0.05) clearly indicating the lack of a meaningful relationship between these two variables. On the other hand, the absolute value of the correlation coefficient for the entire period, 1960-2000, is relatively high and highly statistically significant (P < 0.001). The same pattern is observed in the correlation between warm nights and the PDO index. These results suggest that there is no true linear relationship between cold nights or warm nights and the PDO index and that the statistically significant correlations are all spurious.

Ergo, the multiple regression model constructed by the authors to explain the variability in cold nights and warm nights as a function of year and the annual mean of the PDO index is spurious thereby yielding results that are inaccurate, unreliable and misleading. This suggests that there is no meaningful quantifiable relationship between the temperature indices and the PDO index. As a result, all one can reasonably say, based on the available data at this time, is that the change in the PDO from the cold phase to the warm phase was associated with an abrupt decrease and an abrupt increase in the frequency of cold nights and warm nights, respectively.

**Summary and concluding remarks**

We endorse the authors’ position that the proper use of any statistical test requires the careful consideration of its underlying assumptions. We also agree that a step change could be falsely identified as a trend while a trend could be falsely identified as a step change and therefore the nature of the change must first be verified in order to select appropriate statistical tests. In this regard, we strongly advocate the use of graphical displays as an indispensable aid in deciding on the selection of appropriate statistical tests.

We strongly disagree with the authors that

“RJS assumed that there was not a trend in the temperature indices, without any due justification, and indicated step changes in the indices series.”

On the contrary, we visually inspected time series plots of both temperature indices which clearly indicated the presence of changepoints. We confirmed that we were not being deceived by what we were seeing through the use of the Pettitt test which identified 1976 and 1979 as statistically significant changepoints in the data series of cold nights and warm nights respectively.

We have now applied a diverse suite of change point tests whose results conclusively corroborate those of the Pettitt test. In addition, we have used the two-phase regression model of Lund and Reeves (2002) to show that the trends before and after the change points are not statistically significant (P > 0.05) in both cases, thereby providing further confirmatory evidence for the presence of step-type change points. These results again verify the appropriateness of the Pettitt test and also substantiate the results of the Sen (1968) nonparametric linear regression method which showed a lack of significant trends before and after the change points (Stone, 2011).

We have demonstrated that the correlations between the regional temperature indices and year, and between the regional temperature indices and the PDO index, are all spurious or nonsensical owing to the presence of mean shifts in the temperature indices and the PDO index. Therefore, the results of the regression analyses which allegedly show that the PDO does not fully account for the trends in regional temperature indices are inaccurate and misleading. Indeed, our results convincingly demonstrate that there was no long-term increasing linear trend in regional warming during the period 1960-2000 in southern South America. Therefore, the conclusions in Stone (2011) remain valid and are, in fact, appreciably strengthened by the robust results of substantial additional analyses.

In closing, it is important to note that the findings of the Vincent et al. (2005) study were used by the Intergovernmental Panel on Climate Change (IPCC) in its fourth assessment report (Trenberth et al., 2007) to support the widespread belief of anthropogenic global warming/climate change. Based on our robust results, we argue that the case for anthropogenic global warming/climate change is unsupported by the Vincent et al. (2005) study. Moreover, our findings cast doubt and suspicion on the conclusions of other similar studies on trends in extreme climate indices that suffer from much the same flaws in homogeneity testing and trend analysis as the Vincent et al. (2005) study. Some of these other studies include, but are not limited to, Frich et al. (2002), Peterson et al. (2002), Aguilar et al. (2005), Alexander et al. (2006), to identify just a few.

Finally, we believe that a compelling case has been made for the above-mentioned studies to be critically re-visited by independent and objective researchers as it would be unreasonable, in light of our robust results, to continue to consider their dubious conclusions as accurate and reliable evidence in support of the current anthropogenic greenhouse warming theory.

**References**

Aguilar, E., and Coauthors, 2005: Changes in precipitation and temperature extremes in Central America and northern South America, 1961-2003. J. Geophys. Res., **110**, D23107, doi:10.1029/2005JD006119.

Alexander, L.V. and Coauthors, 2006: Global observed changes in daily climate extremes of temperature and precipitation. J. Geophys. Res., **111**, D05109, doi: 10.1029/2005JD006290.

Alexandersson, H., 1986: A homogeneity test applied to precipitation data. J. Climatol., **6**, 661-675.

Andrews, D.W.K., 1993: Tests for parameter instability and structural change with unknown change point. Econometrica, **61**, 821-856.

Anscombe, F.J., 1973: Graphs in statistical analysis. Amer. Statist., **27**, 17-21.

Brockwell, P.J. and R.A. Davis, 2002: Introduction to Time Series and Forecasting. 2nd edition, Springer-Verlag, New York.

Brown, R.L., J. Durbin, and J.M. Evans, 1975: Techniques for testing the constancy of regression relationships over time. J. Roy. Stat. Soc., Ser. B, **37**, 149-163.

Buishand, T.A., 1982: Some methods for testing the homogeneity of rainfall records. J. Hydrol., **58**, 11-27.

Buishand, T.A., 1984: Tests for detecting a shift in the mean of hydrological time series. J. Hydrol., **73**, 51-69.

Chambers, J.M., W.S. Cleveland, B. Kleiner and P.A. Tukey, 1983: Graphical methods for data analysis. Duxbury Press, Boston.

Chatfield, C., 1985: The initial examination of data. J. Roy. Stat. Soc., Ser. A, **148**, Part 3, 214-253.

Chatfield, C., 1991: Avoiding statistical pitfalls. Stat. Sci., **6**, 240-268.

Cox, D.R., 1978: Some remarks on the role in statistics of graphical methods. Appl. Stat., **27**, 4-9.

DeGaetano, A.T., 2006: Attributes of several methods for detecting discontinuities in mean temperature series. J. Climate, **19**, 838-853.

DeGroot, M.H., 1987: A conversation with C.R. Rao. Stat. Sci., **2**, 53-67.

Fienberg, S.E., 1979: Graphical methods in statistics. The American Statistician, **33**, 165-178.

Frich, P., L.V. Alexander, P. Della-Marta, B. Gleason, M. Haylock, A.M.G. Klein Tank and T. Peterson, 2002: Observed coherent changes in climatic extremes during the second half of the twentieth century. Clim. Res., **19**, 193-212.

Hassler, U., 2003: Nonsense regressions due to neglected time-varying means. Statistical Papers, **44**, 169-182.

Hill, T. and P. Lewicki, 2007: Statistics: Methods and Applications. StatSoft, Tulsa, OK.

Hinkley, D.V., 1970: Inference about the change-point in a sequence of random variables. Biometrika, **57**, 1-17.

Killick, R. and I.A. Eckley, 2011: **changepoint**: contains functions that run various single and multiple changepoint methods. URL: cran.r-project.org/web/packages/changepoint/

Kirk, R.E., 2008: Statistics: An Introduction. 5th edition. Thomson/Wadsworth, Belmont, CA.

Kundzewicz, Z. W., and A. Robson, Eds., 2000: Detecting trends and other changes in hydrological data. World Climate Programme – Water, World Climate Programme Data and Monitoring Rep. WCDMP-45, WMO-TD 1013, World Meteorological Organization.

Lund, R., and J. Reeves, 2002: Detection of undocumented changepoints: A revision of the two-phase regression model. J. Climate, **15**, 2547-2554.

Peterson, T.C., and Coauthors, 2002: Recent changes in climate extremes in the Caribbean region. J. Geophys. Res., **107**(D21), 4601, doi: 10.1029/2002JD002251.

Pettitt, A. N., 1979: A non-parametric approach to the changepoint problem. Appl. Stat., **28**, 126-135.

Ploberger, W., and W. Krämer, 1992: The CUSUM test with OLS Residuals. Econometrica, **60**, 271-285.

Reeves, J., J. Chen, X.L. Wang, R. Lund, and Q.Q. Lu, 2007: A review and comparison of changepoint detection techniques for climate data. J. Appl. Meteorol. and Climatol. **46**, 900-915.

Rodionov, S., 2005: A Brief Overview of the Regime Shift Detection Methods. In: Large-Scale Disturbances (Regime Shifts) and Recovery in Aquatic Systems: Challenges for Management Toward Sustainability, V. Velikova and N. Chipev (Eds.), UNESCO-ROSTE/BAS Workshop on Regime Shifts, 14-16 June 2005, Varna, Bulgaria, pp. 17-24.

Sen, P.K., 1968: Estimates of the regression coefficient based on Kendall’s tau. J. Amer. Stat. Assoc., **63**, 1379-1389.

Stone, R.J., 2011: Comments on “Observed Trends in Indices of Daily Temperature Extremes in South America”. J. Climate, **24**, 2880-2883. DOI:10.1175/2011JCLI3662.1.

Toreti, A., F.G. Kuglitsch, E. Xoplaki, P.M. Della-Marta, E. Aguilar, M. Prohom, and J. Luterbacher, 2010: A note on the use of the standard normal homogeneity test to detect inhomogeneities in climate series. Int. J. of Climatol., **31**, 630-632.

Trenberth, K.E., P.D. Jones, P. Ambenje, R. Bojariu, D. Easterling, A. Klein Tank, D. Parker, F. Rahimzadeh, J.A. Renwick, M. Rusticucci, B. Soden and P. Zhai, 2007: Observations: Surface and Atmospheric Climate Change. In: Climate Change 2007: The Physical Science Basis. Contribution of Working Group I to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change [Solomon, S., D. Qin, M. Manning, Z. Chen, M. Marquis, K.B. Averyt, M. Tignor and H.L. Miller (eds.)]. Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA, pp. 235-336.

Vincent, L.A., 1998: A technique for the identification of inhomogeneities in Canadian temperature series. J. Climate, **11**, 1094-1104.

Vincent, L.A., and Coauthors, 2005: Observed trends in indices of daily temperature extremes in South America 1960-2000. J. Climate, **18**, 5011-5023.

Vincent, L.A., X. Zhang and X.L. Wang, 2011: Reply to Comments on “Observed Trends in Indices of Daily Temperature Extremes in South America 1960-2000”. J. Climate, **24**, 2884-2887, doi:10.1175/2011JCLI3786.1.

von Neumann, J., 1941: Distribution of the ratio of the mean square successive difference to the variance. Ann. Math. Stat., **12**, 367-395.

Wang, X.L., 2003: Comments on “Detection of Undocumented Changepoints: A Revision of the Two-Phase Regression Model”. J. Climate, **16**, 3383-3385.

Wang, X.L., Q.H. Wen, and Y. Wu, 2007: Penalized maximal t test for detecting undocumented mean change in climate data series. J. Appl. Meteorol. and Climatol., **46**, 916-931.

Wijngaard, J. B., A.M.G. Klein Tank and G.P. Können, 2003: Homogeneity of 20th century European daily temperature and precipitation series. Int. J. Climatol., **23**, 679-692.

Worsley, K.J., 1979: On the likelihood ratio test for a shift in location of normal populations. J. Am. Stat. Assoc., **74**, 365-367.

Zeileis, A., F. Leisch, K. Hornik, and C. Kleiber, 2002: **strucchange**: An R package for testing for structural change in linear regression models. J. of Statistical Software, **7**, 1-38. URL: http://www.jstatsoft.org/v07/i02/.

Table 1. Detected change points in the regional annual series of cold nights and warm nights by the various statistical change point tests with their corresponding percentage levels of statistical significance.

| Statistical test | Cold nights: change point | Significance level (%) | Warm nights: change point | Significance level (%) |
|---|---|---|---|---|
| PT | 1976 | 1 | 1979 | 1 |
| SNHT | 1976 | 1 | 1979 | 1 |
| BRT | 1976 | 1 | 1979 | 1 |
| VNRT | Confirmed | 1 | Confirmed | 1 |
| BBT | 1976 | 1 | 1979 | 1 |
| WLRT | 1976 | 1 | 1979 | 1 |
| PMTT | 1976 | 1 | 1979 | 1 |
| TPRLR | 1976 | 1 | 1979 | 5 |
| TPRW | 1976 | 1 | 1979 | 5 |
| REC-CUSUM | 1976 | 1 | 1979 | 1 |
| OLS-CUSUM | 1976 | 1 | 1979 | 1 |
| supF | 1976 | 1 | 1979 | 1 |
| HLRT | 1976 | 1 | 1979 | 1 |

Table 2. Regression results of the Lund and Reeves two-phase regression model

| Index | $\mu_1$ | $\mu_2$ | $\beta_1$ | R (before) | P-value (before) | $\beta_2$ | R (after) | P-value (after) |
|---|---|---|---|---|---|---|---|---|
| Cold nights | 11.9 | 8.4 | -0.0088 | 0.05 | 0.83 | -0.0005 | 0.01 | 0.95 |
| Warm nights | 8.7 | 12.7 | 0.0128 | 0.08 | 0.74 | 0.0049 | 0.05 | 0.82 |

Table 3. Correlation analyses between cold nights, warm nights, PDO index and year

Table 4. Correlation analyses between cold nights, warm nights and the PDO index