Evaluation of Ordinary Least Square (OLS) and Geographically Weighted Regression (GWR) for Water Quality Monitoring: A Case Study for the Estimation of Salinity

Updated: 2016-07-05

1 Introduction

Salinity is the concentration of dissolved salts in water and is reported in Practical Salinity Units (PSU). It is a key physical parameter of an ecosystem because it affects water quality and the growth and development of aquatic vegetation and animal species (Meier et al., 2011). In an estuary, salinity varies with location, freshwater inflow and daily tides, whereas in the open ocean it is nearly constant at about 35 PSU (Gillanders and Kingsford, 2002). In a coastal ecosystem, salinity is generally lowest upstream, where a river enters the ocean. It also varies seasonally: it is low in spring, when abundant rainfall increases the freshwater flow into an estuary, and usually high in summer, when high temperatures increase the evaporation rate (Rudek et al., 1991). Salinity variations, one of the main drivers of ocean circulation, are closely connected with the cycling of freshwater around the planet and provide scientists with valuable information on global rainfall patterns.

Satellite remote sensing data have been used extensively to monitor water quality parameters (Nazeer and Nichol, 2016a, 2015). Aquarius, launched in June 2011, was the first satellite instrument built specifically to study the salt content of ocean water. The instrument collects data in 386 km swaths with a temporal resolution of 7 days. Several other sensors are available, including the Landsat-5 (L5) Thematic Mapper (TM), the Landsat-7 (L7) Enhanced Thematic Mapper Plus (ETM+) and the Moderate Resolution Imaging Spectroradiometer (MODIS) aboard Terra/Aqua. Although these sensors were not designed specifically for salinity estimation, several studies have explored their potential (e.g., Lavery et al., 1993; Xie et al., 2013; Zhao et al., 2016). Salinity has an inverse relationship with Coloured Dissolved Organic Matter (CDOM): as salinity increases, the solubility of organic matter decreases (Harvey et al., 2015). CDOM has two absorption peaks, one in the blue portion of the spectrum and the other in the ultraviolet. Because of this absorption property and the inverse relationship between CDOM and salinity, L5 TM band 1 is useful as an index for salinity (D’Sa et al., 2000). L5 TM bands 3 and 4 are correlated with water depth and aquatic vegetation, so these two bands are believed to be indirectly related to salinity. Previous studies (Alvera-Azcárate et al., 2016; Chen et al., 2016; Li et al., 2016; Qi and Wei, 2012) did not estimate salinity at high spatial resolution and did not directly compare the Ordinary Least Square (OLS) regression and Geographically Weighted Regression (GWR) models. Therefore, this study aims to develop an empirical model for remote-sensing-based estimation of salinity at 30 m spatial resolution using the OLS and GWR models.

2 Geographical Setting of Study Area

Hong Kong is a thriving port with a population of more than 7 million people (Census and Statistics Department, 2017). The coastal areas around Hong Kong and the Pearl River Delta (PRD) region are also home to various types of marine life, ranging from microscopic algae to dolphins. Based on sediment loads and water quality levels, this area can be divided into three zones (Nazeer and Nichol, 2016b). The eastern water zone is influenced by oceanic currents. The western water zone is relatively clear compared to the eastern zone and is affected by water from the Pearl River estuary.

3 Data Used

3.1 Satellite Data

L5 TM images were used to estimate salinity. The area around Hong Kong is covered by four L5 TM scenes, each of which contains a small portion of Hong Kong. The L5 TM data were obtained free of charge from the United States Geological Survey (USGS) Earth Explorer website (http://earthexplorer.usgs.gov/). The image acquisition dates and the corresponding path/row numbers are provided in Table 1.

Table 1 L5 TM image acquisition date and orbit number (path/row)

| Acquisition date | Orbit number (path/row) |
|---|---|
| 10-Oct-2009 | 121/044 |
| 10-Oct-2009 | 121/045 |
| 17-Oct-2009 | 122/044 |
| 17-Oct-2009 | 122/045 |

3.2 In situ Salinity Data

The Environmental Protection Department (EPD) of Hong Kong has monitored water quality in ten water zones around Hong Kong since 1986. Water quality data are collected every month from 76 sampling stations in open sea areas, typhoon shelters and semi-enclosed bays. The EPD marine monitoring team measures salinity at three depths (surface, middle and bottom layers) using an electrical conductivity instrument (HKEPD, 2016). In this study, the surface salinity data collected in the ten water zones during October 2009 were obtained from the EPD (Table 2).

Table 2 Hong Kong water zones and in situ data collection date during October, 2009

| Zone name | Sampling date |
|---|---|
| Southern | 12-Oct-09 |
| Tolo Harbour and Channel | 23-Oct-09 |
| Port Shelter | 15-Oct-09 |
| Junk Bay | 5-Oct-09 |
| Deep Bay | 19-Oct-09 |
| Mirs Bay | 19-Oct-09, 22-Oct-09 |
| North Western | 9-Oct-09 |
| Western Buffer | 8-Oct-09 |
| Eastern Buffer | 5-Oct-09 |
| Victoria Harbour | 7-Oct-09, 8-Oct-09 |

4 Methodology

4.1 Atmospheric Correction

To remove the atmospheric artifacts introduced by molecules and aerosols during image acquisition, atmospheric correction was performed using the 6S model (Nazeer et al., 2014). 6S is a physically based method that uses ancillary data on water vapor, ozone and aerosol optical depth, together with the sensor view and solar zenith angles (Vermote et al., 2006). In a previous study, Nazeer et al. (2014) validated the atmospheric correction results against in-situ water surface reflectance data collected with a multispectral radiometer. The 6S model showed smaller atmospheric correction errors (less than 3%) than the other physically based methods. Therefore, the 6S model was used in this study for the atmospheric correction of all the Landsat TM images. After atmospheric correction, water surface reflectance was extracted at the 76 sampling locations using a window of 3×3 pixels.
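The station-level extraction can be sketched as follows. This is a minimal illustration rather than the authors' processing chain: the text does not say how the nine pixels in each window are combined, so averaging them is an assumption, and the function name `window_mean`, the toy grid and the use of `None` for masked pixels are hypothetical.

```python
from statistics import mean

def window_mean(band, row, col, size=3):
    """Mean reflectance in a size x size window centred on (row, col).

    `band` is a 2-D list of reflectance values; None marks masked pixels
    (for example land or cloud). Averaging is an assumption, not something
    stated in the text.
    """
    half = size // 2
    values = []
    for r in range(row - half, row + half + 1):
        for c in range(col - half, col + half + 1):
            # Skip pixels that fall outside the image or are masked.
            inside = 0 <= r < len(band) and 0 <= c < len(band[0])
            if inside and band[r][c] is not None:
                values.append(band[r][c])
    return mean(values) if values else None

# Toy 4 x 4 reflectance grid (illustrative values only).
band1 = [
    [0.05, 0.05, 0.06, 0.07],
    [0.05, 0.06, 0.06, None],
    [0.04, 0.05, 0.05, 0.06],
    [0.04, 0.04, 0.05, 0.05],
]
station_value = window_mean(band1, 1, 1)
```

Averaging over a small window reduces the effect of single-pixel noise and of small positional errors in the station coordinates, at the cost of mixing in neighbouring water.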

4.2 Model Development

Table 3 Summary statistics of salinity and L5 TM bands

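| Variable | Mean | St Dev | Variance | Minimum | Median | Maximum |
|---|---|---|---|---|---|---|
| Salinity | 31.436 | 1.844 | 3.40 | 22.10 | 32.0 | 33.6 |
| Band 1 | 0.052 | 0.019 | 0.00035 | 0.016 | 0.054 | 0.153 |
| Band 2 | 0.071 | 0.020 | 0.00041 | 0.031 | 0.073 | 0.161 |
| Band 3 | 0.037 | 0.016 | 0.00026 | 0.014 | 0.035 | 0.140 |
| Band 4 | 0.021 | 0.015 | 0.00024 | 0.008 | 0.017 | 0.139 |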

To overcome the problem of spatial heterogeneity, the GWR model is used to predict a variable of interest. GWR is often used when the relationship between the dependent and independent variables varies across the study area, which is the case in this study because the salinity data were collected at different locations (i.e., latitude and longitude) and at different times. As with the OLS model, the GWR model was applied to salinity and the same band combinations (i.e., B1/B3 and B4).

To develop the OLS regression model, the Pearson correlation between salinity and the individual TM bands, band ratios and band combinations was analyzed. Among these, the ratio of band 1 to band 3 (B1/B3) showed a high correlation of 0.613, which was statistically significant at the 95% confidence level (Table 4). The other independent variable considered was TM band 4 (B4). Although its correlation with salinity was low (0.123), it was retained in the regression model because this band is believed to be indirectly related to salinity.
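The screening step can be illustrated with a short sketch that computes the Pearson correlation between a band ratio and salinity, plus the t statistic used to judge significance. The function names and toy values are hypothetical, and the sample size of 76 used for the reported correlation is an assumption based on the number of stations.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for testing r = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Toy reflectances and salinities (illustrative values only).
band1 = [0.040, 0.052, 0.060, 0.048, 0.055]
band3 = [0.040, 0.037, 0.035, 0.039, 0.036]
salinity = [29.5, 31.8, 33.0, 30.9, 32.4]

ratio_b1_b3 = [b1 / b3 for b1, b3 in zip(band1, band3)]
r_toy = pearson_r(ratio_b1_b3, salinity)

# Significance of the reported correlation, assuming n = 76 observations.
t_reported = t_statistic(0.613, 76)
```

With n = 76 assumed, a correlation of 0.613 gives a t statistic of roughly 6.7, well above the two-sided 5% critical value of about 1.99 for 74 degrees of freedom.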

Table 4 Pearson correlation coefficient (R) and P-value for TM bands 1–4 and their combinations in correlation with salinity (AV represents the average of bands)

| Variable | R | P-value | Variable | R | P-value |
|---|---|---|---|---|---|
| B1 | 0.231 | 0.045 | AV(B1, B2) | 0.087 | 0.456 |
| B2 | −0.051 | 0.664 | AV(B1, B3) | 0.052 | 0.657 |
| B3 | −0.157 | 0.175 | AV(B1, B4) | 0.201 | 0.082 |
| B4 | 0.123 | 0.029 | AV(B2, B3) | −0.101 | 0.387 |
| B4/B1 | −0.072 | 0.538 | AV(B2, B4) | 0.029 | 0.805 |
| B4/B2 | 0.135 | 0.244 | AV(B3, B4) | −0.022 | 0.849 |
| B4/B3 | 0.192 | 0.097 | AV(B1, B2, B3) | 0.013 | 0.909 |
| B3/B2 | −0.298 | 0.009 | AV(B1, B2, B4) | 0.108 | 0.353 |
| B3/B1 | −0.736 | 0.000 | AV(B2, B3, B4) | −0.036 | 0.755 |
| B3/B4 | −0.27 | 0.018 | AV(B1, B3, B4) | 0.08 | 0.493 |
| B2/B1 | −0.503 | 0.000 | BB2 | 0.099 | 0.396 |
| B2/B3 | 0.259 | 0.024 | BB3 | 0.04 | 0.732 |
| B2/B4 | −0.117 | 0.316 | BB4 | 0.118 | 0.31 |
| B1/B2 | 0.487 | 0.000 | BB3 | −0.085 | 0.468 |
| B1/B3 | 0.613 | 0.000 | BB4 | 0.078 | 0.504 |
| B1/B4 | 0.058 | 0.617 | BB4 | 0.054 | 0.645 |

A summary of the OLS regression model is provided in Table 5. The Variance Inflation Factor (VIF) for both explanatory variables was 1.015, which indicates that there is no multicollinearity between the predictor variables.
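For a model with exactly two predictors, the VIF of either predictor reduces to 1/(1 − r²), where r is the correlation between the two predictors. The sketch below inverts that relation to show what the reported VIF implies. It assumes the reported value comes from the two-predictor model with B1/B3 and B4, and the function names are hypothetical.

```python
import math

def vif_two_predictors(r):
    """VIF when the model has exactly two predictors with correlation r."""
    return 1.0 / (1.0 - r * r)

def implied_abs_correlation(vif):
    """Invert the two-predictor VIF formula to recover |r|."""
    return math.sqrt(1.0 - 1.0 / vif)

# Correlation between the two predictors implied by the reported VIF.
r_implied = implied_abs_correlation(1.015)
```

Under that assumption, a VIF of 1.015 corresponds to a correlation of only about 0.12 between B1/B3 and B4, so the two predictors carry nearly independent information.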

4.2.1 OLS regression model development

The OLS regression model (also known as a global model) estimates a parameter of interest independently of the location of each observation, whereas a GWR model estimates it under more localized conditions by taking the location of each observation into account. GWR is especially advantageous over OLS when dealing with spatial datasets. OLS models are reported to present only average relationships over the whole study area and therefore discard large amounts of potentially interesting information on spatial variations in relationships and model performance (Fotheringham et al., 1998; Fotheringham, 1993). This study explores the potential of the OLS and GWR regression models for the estimation of salinity. Before model development, a basic statistical analysis was performed to characterize each variable, i.e., salinity and the surface reflectance of the first four bands of L5 TM (Table 3).

A regression model was developed using the ratio of TM bands 1 and 3 (B1/B3) and B4. The regression model for salinity estimation is given in Eq. (1).
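A linear model with these two predictors has the general form salinity = b0 + b1 (B1/B3) + b2 B4. A minimal sketch of fitting that form by ordinary least squares is given here, using NumPy and synthetic data. The coefficient values are made up for illustration and are not those of the study, and `fit_ols` is a hypothetical helper name.

```python
import numpy as np

def fit_ols(ratio_b1_b3, b4, salinity):
    """Return [b0, b1, b2] for salinity = b0 + b1 * (B1/B3) + b2 * B4."""
    design = np.column_stack([np.ones_like(ratio_b1_b3), ratio_b1_b3, b4])
    coef, *_ = np.linalg.lstsq(design, salinity, rcond=None)
    return coef

# Synthetic predictors in plausible ranges (illustrative only).
rng = np.random.default_rng(0)
ratio = rng.uniform(0.8, 2.0, size=76)
b4 = rng.uniform(0.008, 0.14, size=76)

# Made-up coefficients, used only to generate noise-free test data.
true_coef = np.array([25.0, 4.0, 10.0])
salinity = true_coef[0] + true_coef[1] * ratio + true_coef[2] * b4

coef = fit_ols(ratio, b4, salinity)
```

Because the synthetic salinity contains no noise, the fit recovers the generating coefficients; with real data the residuals would feed the diagnostics reported for the model.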

5 Results and Discussion

5.1 OLS Regression Model

GWR is available as a statistical tool in the ArcGIS 9.3 toolbox. To perform GWR, salinity was selected as the dependent variable and B1/B3 and B4 as the explanatory variables. The kernel type is selected based on the spatial configuration of the input feature class: a fixed kernel is preferred when the observation stations are distributed homogeneously over the study area, whereas an adaptive kernel is used when the observations are clustered. There are three bandwidth methods: the Akaike Information Criterion (AIC), Cross Validation (CV) and the Bandwidth Parameter option. AIC and CV select the bandwidth automatically, whereas the Bandwidth Parameter option requires the bandwidth value to be specified. In this study the observations were clustered (as suggested by the Moran’s I test), so an adaptive kernel was used, with its bandwidth determined by the AIC method. Moran’s I is a measure of spatial autocorrelation based on feature location and value. For a given set of features and their associated attributes, the Moran’s I test evaluates whether the pattern expressed is clustered, dispersed or random.
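The idea of an adaptive kernel can be sketched in a few lines of NumPy. This is not the ArcGIS implementation: the Gaussian weighting function, the fixed number of neighbours `k` (instead of AIC-based selection), the synthetic coordinates and the function name `gwr_fit` are all assumptions made for illustration.

```python
import numpy as np

def gwr_fit(coords, X, y, k):
    """Local coefficients at every observation using an adaptive kernel.

    coords: (n, 2) station coordinates; X: (n, p) predictors; y: (n,) response.
    The bandwidth at each location is the distance to its k-th nearest
    observation, counting the location itself, so the kernel widens where
    stations are sparse. Gaussian weights are an assumption.
    """
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    betas = np.empty((n, design.shape[1]))
    for i in range(n):
        dist = np.linalg.norm(coords - coords[i], axis=1)
        bandwidth = np.sort(dist)[k - 1]
        weights = np.exp(-0.5 * (dist / bandwidth) ** 2)
        root_w = np.sqrt(weights)
        # Weighted least squares, solved as ordinary least squares on scaled rows.
        coef, *_ = np.linalg.lstsq(design * root_w[:, None], y * root_w, rcond=None)
        betas[i] = coef
    return betas

rng = np.random.default_rng(42)
n = 76
coords = rng.uniform(0.0, 50.0, size=(n, 2))  # hypothetical coordinates in km
ratio = rng.uniform(0.8, 2.0, size=n)
b4 = rng.uniform(0.008, 0.14, size=n)

# Synthetic salinity whose intercept drifts from west to east.
salinity = (26.0 + 0.1 * coords[:, 0]) + 3.0 * ratio + 5.0 * b4
local_coefs = gwr_fit(coords, np.column_stack([ratio, b4]), salinity, k=34)
```

Each row of `local_coefs` holds the intercept and the two slopes estimated around one station, so mapping a column shows how that coefficient varies over the study area. A global OLS fit would return a single row instead.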

The R² of the OLS model was 0.42, which indicates that the regression model explains 42% of the variation in salinity. The root mean square error of 1.43 indicates that salinity values estimated from Landsat TM may have an uncertainty of about 1.43 PSU.
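The two goodness-of-fit measures quoted here can be computed from observed and predicted values as in the following sketch. The function names and the four toy values are hypothetical; whether the study's RMSE divides by n or by the residual degrees of freedom is not stated, so dividing by n is an assumption.

```python
import math

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def rmse(observed, predicted):
    """Root mean square error, dividing by the number of observations."""
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(ss_res / len(observed))

# Toy salinity values (illustrative only).
observed = [30.0, 32.0, 33.0, 31.0]
predicted = [31.0, 32.0, 32.0, 31.0]

r2_toy = r_squared(observed, predicted)
rmse_toy = rmse(observed, predicted)
```

R² is unitless and describes the share of variance explained, while the RMSE is in the units of salinity and describes the typical size of a prediction error.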

4.2.2 GWR model development

Table 5 Salinity regression model diagnostics

The normal probability plot (Fig.1) shows a linear pattern consistent with a normal distribution. The two points in the lower left corner of the normal probability plot are outliers; they are the 35th and 36th observations in the salinity data, recorded at the same station, and were flagged as unusual observations in the output. The versus fits plot shows that the residuals approach the reference line as the fitted values increase and lie close to it at a fitted value of 32. The residuals that lie far from the reference line may indicate non-constant variance. The outliers are also visible in the left part of the histogram. The versus order plot shows the residuals against the order of the observations; observations 35 and 36 behave differently from the others. Fig.2 shows the salinity map predicted using the OLS regression model (Eq. (1)).

Fig.1 Residual plots for salinity.

Fig.2 Salinity distribution predicted by applying OLS regression on Landsat TM.

5.2 GWR Model

The GWR model was applied to the same datasets as the OLS model. In Table 6 (obtained from the GWR model), the neighbours value of 34 is the number of nearest neighbours used to estimate each set of coefficients out of a total of 76 observations. This means that about 45% of the data fall under each kernel. The residual square field gives the sum of the squared residuals, which is 35.04. The effective number is 19.046 and is related to the kernel bandwidth. For large bandwidths, the effective number is close to the actual number of coefficients and the GWR model is likely to behave like the OLS model. Sigma, the estimated standard deviation of the residuals, has a small value of 0.784. The OLS and GWR models were compared using their AIC values: the AIC was 272.897 for the OLS model and 197.861 for the GWR model. The difference of 75.036 is strong evidence that the local model (GWR) improves on the global model (OLS). The R² of the GWR model was 0.863, about twice that of the OLS model. Fig.3 shows the salinity map predicted by the GWR model.
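The diagnostics quoted in this paragraph can be cross-checked with simple arithmetic. The sketch below uses only values reported in the text and in Table 6; the formula for sigma, the square root of the residual sum of squares divided by the number of observations minus the effective number, is an assumed definition.

```python
import math

n_obs = 76
neighbours = 34
rss_gwr = 35.043           # residual sum of squares
effective_number = 19.046  # effective number of parameters
aic_ols, aic_gwr = 272.897, 197.861

# Share of the observations that fall under each adaptive kernel.
share_under_kernel = neighbours / n_obs

# Drop in AIC when moving from the global to the local model.
aic_difference = aic_ols - aic_gwr

# Assumed definition: sigma = sqrt(RSS / (n - effective number)).
sigma = math.sqrt(rss_gwr / (n_obs - effective_number))
```

Under these assumptions the three quantities come out at about 45%, 75.036 and 0.784, in agreement with the reported values.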

Table 6 GWR model diagnostic report

| Name | Value |
|---|---|
| Neighbours | 34 |
| Residual square | 35.043 |
| Effective number | 19.046 |
| Sigma | 0.784 |
| AIC | 197.861 |
| R² | 0.863 |
| R² adjusted | 0.819 |

Fig.3 Predicted salinity map obtained from GWR model. The black dots represent the in-situ measurement values of salinity.

6 Discussion and Conclusions

In this study, the coastal water quality of Hong Kong was assessed by estimating the salinity of the water. The Hong Kong Environmental Protection Department (EPD) has collected water quality data from 76 monitoring stations every month since 1986. To monitor the spatial distribution of salinity, two types of regression model were developed to estimate salinity from remote sensing data. One was the Ordinary Least Square (OLS) regression model, or global regression model, and the other was the Geographically Weighted Regression (GWR) model, or local regression model. These models were developed using in-situ salinity data collected at the 76 monitoring stations during October 2009 and Landsat-5 Thematic Mapper (TM) data from the same period.

The OLS regression model yielded an R² of 0.42 for salinity, while the GWR model improved the R² to 0.86. This increase suggests that salinity is location dependent and is better predicted by the GWR model. The salinity values measured by the EPD ranged from 22.1 to 33.6. The OLS regression model predicted salinity values ranging from 28.9 to 33.5, while the GWR model predicted values ranging from 24.1 to 32.7.

The results showed that the GWR model predicted salinity well in Hong Kong coastal waters. Salinity maps were also generated using the OLS regression equation to analyze the spatial variation of salinity. Salinity decreased in the north-western waters (Deep Bay), which indicates that the water of this region is biologically active. This may be caused by the influx of industrial water from Shenzhen. The industrial water has a high phosphorus content, which feeds micro-organisms, and therefore this area has high chl-a values. This is because salinity has an inverse relationship with Coloured Dissolved Organic Matter (CDOM). Salinity has a nearly constant value of about 33 in the open ocean. This study concludes that the GWR model is more robust than the OLS model for the estimation of salinity over regions with large spatial differences.

Acknowledgements

The authors would like to acknowledge the Hong Kong Environmental Protection Department (EPD) for providing the in-situ salinity data and the U.S. Geological Survey for providing the Landsat TM images. This research was sponsored by the National Key Research and Development Program of China (No. 2016YFC1400901).

References

Alvera-Azcárate, A., Barth, A., Parard, G., and Beckers, J. M., 2016. Analysis of SMOS sea surface salinity data using DINEOF. Remote Sensing of Environment, 180: 137-145.

Census and Statistics Department, 2017. Population. Available online at http://www.censtatd.gov.hk/hkstat/sub/so20.jsp (last accessed: 28 February, 2017).

Chen, L., Alabbadi, B., Tan, C. H., Wang, T. S., and Li, K. C., 2016. Predicting sea surface salinity using an improved genetic algorithm combining operation tree method. Journal of the Indian Society of Remote Sensing, 45 (4): 699-707.

D’Sa, E. J., Zaitzeff, J. B., and Steward, R. G., 2000. Monitoring water quality in Florida Bay with remotely sensed salinity and in situ bio-optical observations. International Journal of Remote Sensing, 21: 811-816.

Fotheringham, A. S., Charlton, M. E., and Brunsdon, C., 1998. Geographically weighted regression: A natural evolution of the expansion method for spatial data analysis. Environment and Planning A, 30 (11): 1905-1927.

Fotheringham, A. S., 1993. On the future of spatial analysis: The role of GIS. Environment and Planning A, 25: 30-34.

Gillanders, B., and Kingsford, M., 2002. Impact of changes in flow of freshwater on estuarine and open coastal habitats and the associated organisms. Oceanography and Marine Biology, 40: 233-309.

Harvey, E. T., Kratzer, S., and Andersson, A., 2015. Relationships between colored dissolved organic matter and dissolved organic carbon in different coastal gradients of the Baltic Sea. Ambio, 44 (Suppl. 3): 392-401.

Hong Kong Environmental Protection Department, 2016. Marine water quality in Hong Kong in 2015. Environmental Protection Department, Hong Kong. DOI: http://www.epd.gov.hk/epd/.

Lavery, P., Pattiaratchi, C., Wyllie, A., and Hick, P., 1993. Water quality monitoring in estuarine waters using the Landsat Thematic Mapper. Remote Sensing of Environment, 46: 268-280.

Li, C., Zhao, H., Li, H., and Lv, K., 2016. Statistical models of sea surface salinity in the South China Sea based on SMOS satellite data. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 9: 2658-2664.

Meier, H., Eilola, K., and Almroth, E., 2011. Climate-related changes in marine ecosystems simulated with a 3-dimensional coupled physical-biogeochemical model of the Baltic Sea. Climate Research, 48: 31-55, DOI: 10.3354/cr00968.

Nazeer, M., and Nichol, J. E., 2015. Combining Landsat TM/ETM+ and HJ-1 A/B CCD sensors for monitoring coastal water quality in Hong Kong. IEEE Geoscience and Remote Sensing Letters, 12 (9): 1898-1902.

Nazeer, M., and Nichol, J. E., 2016a. Development and application of a remote sensing-based chlorophyll-a concentration prediction model for complex coastal waters of Hong Kong. Journal of Hydrology, 532: 80-89.

Nazeer, M., and Nichol, J. E., 2016b. Improved water quality retrieval by identifying optically unique water classes. Journal of Hydrology, 541: 1119-1132.

Nazeer, M., Nichol, J. E., and Yung, Y. K., 2014. Evaluation of atmospheric correction models and Landsat surface reflectance product in an urban coastal environment. International Journal of Remote Sensing, 35: 6271-6291.

Qi, Z., and Wei, E., 2012. Analysis of cost functions for retrieving sea surface salinity. Journal of Ocean University of China, 11: 147-152.

Rudek, J., Paerl, H. W., Mallin, M. A., and Bates, P. W., 1991. Seasonal and hydrological control of phytoplankton nutrient limitation in the lower Neuse River Estuary, North Carolina. Marine Ecology Progress Series, 75: 133-142.

Vermote, E. F., Tanré, D., Deuzé, J. L., Herman, M., Morcrette, J. J., and Kotchenova, S. Y., 2006. Second Simulation of a Satellite Signal in the Solar Spectrum–Vector (6SV), 6S user guide, version 3. International Geoscience & Remote Sensing Symposium, 1-55.

Xie, Z., Zhang, C., and Berry, L., 2013. Geographically weighted modelling of surface salinity in Florida Bay using Landsat TM data. Remote Sensing Letters, 4: 75-83.

Zhao, H., Li, C., Li, H., Lv, K., and Zhao, Q., 2016. Retrieve sea surface salinity using principal component regression model based on SMOS satellite data. Journal of Ocean University of China, 15 (3): 399-406.

NAZEER Majid and BILAL Muhammad
Journal of Ocean University of China, 2018, Issue 2
