Regional Sampling of Forest Canopy Covers Using UAV Visible Stereoscopic Imagery for Assessment of Satellite-Based Products in Northeast China

Canopy cover is an important parameter affecting forest succession, carbon fluxes, and wildlife habitats. Several global maps with different spatial resolutions have been produced from satellite images, but their accuracy assessment suffers from a shortage of reliable reference data. The rapid development of unmanned aerial vehicles (UAVs) equipped with consumer-grade cameras enables the acquisition of high-resolution images at low cost, providing the research community with a promising tool for collecting reference data. However, it remains a challenge to distinguish tree crowns from understory green vegetation in UAV-based true-color (RGB) images because of their limited spectral information. In addition, the canopy height model (CHM) derived from photogrammetric point clouds has also been used to identify tree crowns but is limited by the unavailability of understory terrain elevations. This study proposes a simple method, referred to as BAMOS, to distinguish tree crowns and understories based on UAV visible images. The central idea of BAMOS is the synergy of spectral information from the digital orthophoto map (DOM) and structural information from the digital surface model (DSM). Samples of canopy cover were produced by applying BAMOS to UAV images collected at 77 sites, each covering about 1.0 km², across the Daxing'anling forested area in northeast China. Results showed that canopy cover extracted by BAMOS was highly consistent with visually interpreted values, with a correlation coefficient (r) of 0.96 and a root mean square error (RMSE) of 5.7%. The UAV-based canopy covers then served as references for assessing satellite-based maps, including the MOD44B Version 6 Vegetation Continuous Fields (MODIS VCF) and the maps developed by the Global Land Cover Facility (GLCF) and the Global Land Analysis and Discovery laboratory (GLAD).
Results showed that both the GLAD and GLCF canopy covers could capture the dominant spatial patterns, but the GLAD canopy cover tended to miss scattered trees in highly heterogeneous areas, and the GLCF failed to capture non-tree areas. Most importantly, obvious underestimation, with an RMSE of about 20%, was observed in all satellite-based maps, although temporal inconsistency with the references may have contributed to it.


Introduction
Canopy cover is defined as the fraction of ground covered by the vertical projection of tree crowns; it plays an important role in driving forest succession, evaluating carbon fluxes, and managing wildlife habitats [1][2][3][4][5]. Canopy cover maps have been widely used in assessing forest disturbance and restoration caused by climate change or anthropogenic activities [6][7][8]. Several global maps have been produced from satellite images with low or moderate spatial resolution, such as those acquired by the Advanced Very High Resolution Radiometer (AVHRR) aboard NOAA satellites, the Moderate Resolution Imaging Spectroradiometer (MODIS) aboard the Terra and Aqua satellites, and Landsat. The early maps at 1 km resolution were generated from AVHRR data by a linear mixture model or a regression tree method [9,10]. Then, annual global canopy cover datasets (i.e., MOD44B Version 6 Vegetation Continuous Fields) at 250 m resolution from 2000 onward were produced from MODIS images (hereafter referred to as the MODIS VCF) [11,12]. Two global products with finer resolution were derived from Landsat images. One was developed by the Global Land Cover Facility and is composed of four maps centered on the years 2000, 2005, 2010, and 2015 (hereafter referred to as the GLCF canopy cover) [4]. The other is distributed by the Global Land Analysis and Discovery laboratory and has two maps centered on the years 2000 and 2010 (hereafter referred to as the GLAD canopy cover) [7].
Independent assessment is indispensable before these satellite-based maps are applied, and it depends on accurate canopy cover measurements as references. The most straightforward assessment method is direct comparison with field measurements [13,14]. However, collecting field measurements is laborious and time-consuming for evaluating maps with coarse resolution (e.g., the 250 m MODIS VCF), especially over mountainous areas [13,15,16]. Alternatively, high-resolution satellite images, including QuickBird, WorldView, IKONOS, and GeoEye, have been used to produce reference data [8,17-20]. For example, Montesano et al. [19] visually collected canopy covers from QuickBird images to evaluate the MODIS VCF over the circumpolar taiga-tundra ecotone. Airborne LiDAR can also provide reference data but is limited by its high cost [4,21-24]. For example, Tang et al. [24] estimated the accuracy of the GLAD and GLCF canopy covers based on airborne LiDAR data in the Teakettle Experimental Forest, CA, USA.
Reference canopy covers derived from high-resolution images or LiDAR have been used to evaluate satellite-based maps at different spatial scales and in different geographical regions. Results showed that each satellite-based map has its own uncertainties. For example, the root mean square error (RMSE) of the MODIS VCF varied from 5.2% to 31% [18,19,21], the RMSE of the GLCF canopy cover from 13% to 31.5% [4,22,24], and the RMSE of the GLAD canopy cover from 17.64% to 30.4% [8,17,20,24]. In addition to definitional discrepancies between maps and references (e.g., whether within-crown gaps are counted) and deficiencies of the maps themselves (e.g., underestimation of dense forests and overestimation of sparse forests), the uncertainties of the references cannot be ignored, especially those collected from high-resolution satellite images. Some studies reported that references acquired from meter-level satellite images are not always accurate because tree crowns and understory green vegetation (e.g., shrubs and grasses) have similar spectral features. Montesano et al. [19] found that two visual interpretations of the same QuickBird images differed by about 14.8%. Furthermore, shadows caused by illumination occlusion also affect the interpretation of satellite images [18].
Most studies evaluating global products have concentrated on plots or small local areas because field or LiDAR-based measurements are difficult and costly to collect at large scales. Many assessments have been carried out in Eurasian high-latitude forests and American temperate and tropical forests [18,22,24], while assessments in other regions, such as northeast China, are rare. Therefore, more evaluations are still needed to understand the performance of global canopy cover maps [25].
Recently, the unmanned aerial vehicle (UAV) has been widely used as a flexible platform in forest inventory [26][27][28][29]. A camera onboard a UAV can collect high-resolution images (i.e., centimeter resolution), making it a promising alternative to field measurements of canopy cover [30,31]. In addition, a UAV can collect images at the landscape scale (e.g., 1.0 km²), making it practical to collect references at low cost [8,32].
UAV-based LiDAR has been used to acquire reference canopy cover data by virtue of its strong penetration of forest canopies [1,5,33-35]. For example, Cai et al. [1] used a canopy height model (CHM) derived from UAV-based LiDAR to estimate canopy covers over 18 samples of 25 m × 25 m in temperate forest, and the results indicated that the estimation was reliable, with an RMSE of 1.49%. Wallace et al. [35] used all UAV-based LiDAR returns higher than 1.3 m to reconstruct the shape of tree crowns and evaluated the canopy cover in a 30 m × 50 m patch of native dry forest; the estimate differed from the field measurement by only 4%. Compared with UAV-based LiDAR, a UAV equipped with an RGB camera is more widely applicable to forest inventory thanks to advances in image processing (e.g., the Structure from Motion (SfM) algorithm) [36][37][38]. Photogrammetric point clouds can be generated from UAV-based RGB images by SfM and further used to produce the digital orthophoto map (DOM) and digital surface model (DSM) of forested areas. In sparse forested areas, many ground points can be determined by point cloud classification algorithms and used to produce the digital terrain model (DTM) and CHM. Therefore, some studies have used UAV-based RGB images to estimate canopy cover in sparse forests. For example, Cunliffe et al. [39] identified the spatial pattern of woody plants by applying a height threshold to the CHM in a dryland ecosystem, and Li et al. [40] evaluated the canopy cover in a seminatural forest. However, in dense forests, there are few ground points because optical images penetrate the canopy poorly, so a reliable DTM and CHM cannot be produced. In this case, auxiliary terrain information (e.g., a LiDAR-derived DTM) is needed to generate the CHM, but such information often does not exist.
Therefore, it is important to develop a new method for accurately estimating canopy cover over structurally complex forests without auxiliary understory terrain data.
In this study, a new method is proposed to accurately identify canopy cover based on the UAV-based DSM and DOM. Sampling data of UAV-based canopy covers collected at 77 sites across the Daxing'anling forested area were then produced and used as references to evaluate three published satellite-based canopy cover maps in the study area. The aims of this study are (1) to develop a new method to produce high-accuracy canopy covers from RGB images without a priori understory terrain in natural forests and (2) to evaluate the performance of satellite-based maps in the Daxing'anling forested area.

2.1. Study Area. The study region, located in the Daxing'anling forested area of northeast China, is shown in Figure 1. It is about 770 km long (north to south) and 350 km wide (east to west), with elevations ranging from 330 to 1750 m above sea level. The climate is cold temperate continental monsoon; the mean annual temperature is -2.8°C, with historical extremes from -52.3°C in January to 39°C in July. The annual average precipitation is 450~550 mm, most of which falls from July to August. Snow accumulates for about 5 months each year, with depths of up to 30~50 cm [41]. Dahurian larch (Larix gmelinii Kuzen.) is the dominant tree species; other tree species include Scots pine (Pinus sylvestris L.), white birch (Betula platyphylla Suk.), and aspen (Populus davidiana Dode) [42].

2.2. Collection and Processing of UAV Visible Imagery. The high-resolution red-green-blue (RGB) images were collected by a UAV system at 77 sampling sites from June 27 to July 18, 2018. The weather was sunny at 45 sites and cloudy at the rest. The spatial distribution of the sampling sites is shown by green dots in Figure 1.
The UAV platform and sensor used in this study were a DJI S900 and a Sony NEX-5T digital camera, respectively. The UAV was a six-rotor platform with a take-off weight of 6.8 kg. The flying height was 350 m above the ground elevation of the take-off point, and the flying speed was 10 m/s. The forward and side overlaps were 90% and 60%, respectively. The flight at each sampling site lasted about 18~20 minutes including take-off and landing, and the coverage of each sampling site was close to 1.0 km². The NEX-5T camera had 4912 × 3264 pixel detectors (Exmor APS HD CMOS). With a focal length of 16 mm and an exposure of 1/60 s, the camera acquired 16.03-megapixel images with fields of view of 72.58° and 51.98° across and along the flying direction, respectively. The positions of the images were determined by the UAV-embedded GPS/IMU instrument.
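As a rough consistency check on the image resolution, the ground sampling distance implied by these flight parameters can be estimated with a simple pinhole-camera approximation (a sketch under that assumption; the function name is ours, not from the paper):

```python
import math

def ground_sampling_distance(flight_height_m, focal_mm, sensor_px, fov_deg):
    """Approximate ground sampling distance (m/pixel).

    The sensor width is recovered from the across-track field of view
    and the focal length, then divided by the pixel count to get the
    pixel pitch; scaling by flight height over focal length gives GSD.
    """
    sensor_width_mm = 2.0 * focal_mm * math.tan(math.radians(fov_deg) / 2.0)
    pixel_pitch_mm = sensor_width_mm / sensor_px
    return flight_height_m * pixel_pitch_mm / focal_mm

# Flight parameters from the campaign: 350 m AGL, 16 mm lens,
# 4912 pixels and 72.58 deg field of view across track.
print(round(ground_sampling_distance(350.0, 16.0, 4912, 72.58), 3))  # ~0.105
```

The resulting ~10 cm native pixel is of the same order as the approximately 8.0 cm DOM and DSM products obtained after SfM processing.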
Images were processed by SfM in Agisoft PhotoScan (Agisoft LLC, St. Petersburg, Russia). They were first aligned to estimate positions and orientations using common points automatically detected from image textures. Then, a dense point cloud was generated by image matching and further used to generate the DOM and DSM at a resolution of approximately 8.0 cm. Please refer to Ni et al. [43] for details of the data processing.
2.3. Satellite-Based Canopy Cover Maps. Three satellite-based maps were evaluated in this study: one MODIS-derived and two Landsat-derived datasets. The first dataset, developed by DiMiceli et al. [11], is the annual MODIS VCF available from the year 2000 onward. Its inputs are annual surface reflectance composites of MODIS bands, from which a linear regression tree algorithm generates the MODIS VCF. The original dataset adopts the sinusoidal projection and has a nominal 250 m resolution. In this study, the MODIS VCF of 2018 was assessed.
The second dataset was GLCF canopy covers developed by Sexton et al. [4], which had four maps centered on 2000, 2005, 2010, and 2015. This dataset was developed by rescaling the MODIS VCF using Landsat 5 Thematic Mapper (TM) and Landsat 7 Enhanced Thematic Mapper Plus (ETM+) images archived in Global Land Survey (GLS). The GLCF adopted the coordinate system of Universal Transverse Mercator and had 30 m resolution. The GLCF canopy cover of the year 2015 was evaluated in this study.
The last dataset is the GLAD canopy cover developed by Hansen et al. [7], which includes the 2000 and 2010 global maps and is an integral part of the Global Forest Change dataset. The GLAD canopy cover of 2010 was evaluated in this study. This map was calculated by a regression tree model based on top-of-atmosphere reflectance derived from Landsat 7 ETM+ data. The GLAD adopts the WGS 1984 coordinate system and has a nominal 30 m resolution.

3.1. Canopy Cover Extraction.
A new method, referred to as the Background Analysis Method based on Object Segmentation (BAMOS), is proposed in this study to identify tree crowns over the sampling areas from the UAV-based DSM and DOM. The BAMOS method consists of two steps. First, two types of backgrounds, i.e., shaded gaps and sunlit gaps, are detected using the spectral information of the DOM and structural analysis of the DSM, respectively. Then, the DSM is inverted and segmented by the watershed method, and the tree crowns are the mosaic of segments excluding the two identified types of backgrounds and the remaining gap pixels located within segments. The canopy cover is the ratio of the tree-crown area to the area of the forest stand.

3.1.1. Extraction of Backgrounds
(1) Shaded Gaps. In dense forest, tree crowns are usually brighter than shaded gaps, and even shaded crowns are slightly brighter than the surrounding gaps on the DOM. Based on these differences, methods such as between-class variance, corner detection, and the minimum-distance-to-means algorithm have achieved high-accuracy separation of tree crowns and shaded gaps on close-range photographs [44]. In this study, the between-class variance algorithm (i.e., Otsu's method) is used to detect shaded gaps by segmenting the grayscale DOM into a binary image with an automatically determined threshold [45,46]. Otsu's algorithm selects the threshold that best separates the two pixel populations by maximizing the between-class variance. The pixels below the threshold are the shaded gaps, which form the first type of background.
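The shaded-gap detection can be sketched as follows, assuming an 8-bit grayscale DOM as input (a minimal reimplementation of Otsu's method for illustration, not the authors' code):

```python
import numpy as np

def otsu_threshold(gray):
    """Otsu's between-class-variance threshold on an 8-bit grayscale array."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    cum_prob = np.cumsum(prob)
    cum_mean = np.cumsum(prob * np.arange(256))
    global_mean = cum_mean[-1]
    best_t, best_var = 0, -1.0
    for t in range(1, 255):
        w0, w1 = cum_prob[t], 1.0 - cum_prob[t]
        if w0 == 0 or w1 == 0:
            continue
        m0 = cum_mean[t] / w0                   # mean of the dark class
        m1 = (global_mean - cum_mean[t]) / w1   # mean of the bright class
        var = w0 * w1 * (m0 - m1) ** 2          # between-class variance
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def shaded_gap_mask(gray_dom):
    # Shaded gaps are the pixels at or below the automatic threshold.
    return gray_dom <= otsu_threshold(gray_dom)
```

In practice a library implementation such as `skimage.filters.threshold_otsu` would serve equally well.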
(2) Sunlit Gaps. The detection of understory background by Otsu's method can be expected to work well over dense forests but tends to fail over sparse forests, because in addition to shaded gaps, sparse forests contain sunlit backgrounds, such as grass and shrubs, whose brightness on the DOM may be similar to or even higher than that of tree crowns. In fact, it is hard to distinguish tree crowns from sunlit backgrounds using only the limited spectral information of the three visible bands of the DOM (i.e., red, green, and blue). Therefore, additional information is needed. In this study, structural analysis of the DSM is proposed to detect sunlit backgrounds over sparse forests, as described in the following procedure.
On the DSM, a drastic increase in elevation occurs at the transitional zones from sunlit backgrounds to tree crowns, whereas elevation changes within tree crowns or within sunlit backgrounds are small. Therefore, the interface between sunlit backgrounds and tree crowns can be detected with a common edge detection algorithm. In this study, the Sobel operator is used to quantify the elevation changes. Because the elevation change within sunlit backgrounds is smaller than in the transitional zones, a conservative threshold is used to detect potential sunlit background regions. However, the conservative threshold causes some tree-crown regions to be misidentified as sunlit backgrounds, so further structural analysis is needed to remove them. For each potential region, inner and outer buffer zones with a width of 1.0 m are set up along the boundary. Since the average elevation of the inner buffer should be significantly lower than that of the outer buffer for a real sunlit background region, whereas no such difference occurs for tree-crown regions, real sunlit backgrounds can be identified by the elevation difference between the inner and outer buffers. The sunlit backgrounds produced by this structural analysis of the DSM are the second type of understory background.
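This structural analysis can be sketched as follows, assuming the DSM is a NumPy array; the function name, the use of SciPy morphology for the buffers, and the exact gradient scaling are illustrative rather than the authors' implementation, although the threshold values follow those reported in the text:

```python
import numpy as np
from scipy import ndimage

def sunlit_background_mask(dsm, pixel_size=0.08, grad_thresh=1.35,
                           buffer_m=1.0, dz_thresh=2.0):
    """Sketch of the DSM-based sunlit-gap detection.

    1) Low Sobel gradient -> candidate flat regions (crowns or sunlit gaps).
    2) For each candidate region, compare the mean elevation of a 1 m
       inner buffer against a 1 m outer buffer; real sunlit gaps sit
       clearly below their surroundings.
    """
    gx = ndimage.sobel(dsm, axis=1)
    gy = ndimage.sobel(dsm, axis=0)
    flat = np.hypot(gx, gy) < grad_thresh

    labels, n = ndimage.label(flat)
    it = max(1, int(round(buffer_m / pixel_size)))  # buffer width in pixels
    keep = np.zeros_like(flat)
    for i in range(1, n + 1):
        region = labels == i
        inner = region & ~ndimage.binary_erosion(region, iterations=it)
        outer = ndimage.binary_dilation(region, iterations=it) & ~region
        if not inner.any() or not outer.any():
            continue
        # Real sunlit background: inner rim clearly lower than outer rim.
        if dsm[inner].mean() <= dsm[outer].mean() - dz_thresh:
            keep |= region
    return keep
```

A misidentified crown region fails the buffer test because its inner and outer rims sit at nearly the same elevation, which is exactly the discrimination described above.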
3.1.2. Extraction of Tree Crowns. Two background maps have been produced by the method described in the previous section. Directly removing the background areas does not guarantee accurate identification of tree crowns, because additional background pixels within the transitional zones are not considered when detecting sunlit backgrounds. A method for extracting tree crowns at the object level is therefore proposed. Object segmentation is performed on the inverted DSM by the watershed algorithm [47,48]: the tree crowns form the lowlands that accumulate water, whereas the understory backgrounds become the drainage divides. The segmented objects are categorized as sparse or dense according to whether they contain sunlit background pixels. For each sparse object, pixels in the intersection with the sunlit backgrounds are regarded as understory pixels, which provide the elevation of the ground surface. A height threshold is then used to remove background pixels within the transitional zones. To diminish terrain effects, the mean elevation of the inner buffer, rather than of all sunlit backgrounds within the object, is taken as the understory elevation; pixels whose elevations are lower than a height threshold above the understory elevation are labeled as additional sunlit background. After the sunlit backgrounds are completely removed, the segmented objects still contain shaded gaps, which are further excluded from both sparse and dense objects using the first type of background. The mosaic of all processed objects forms the tree-crown identification of the sampling area. Canopy cover maps at different resolutions can then be calculated as the percentage of tree-crown pixels within each new pixel.
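The object-level crown extraction could be sketched as below, assuming binary masks of shaded and sunlit backgrounds from the previous step. The marker-based watershed of scikit-image, with markers seeded at local DSM maxima, stands in for whichever watershed variant the authors used, and the mean of all sunlit pixels in an object stands in for the inner-buffer understory elevation:

```python
import numpy as np
from scipy import ndimage
from skimage.feature import peak_local_max
from skimage.segmentation import watershed

def crown_mask(dsm, shaded, sunlit, pixel_size=0.08, height_thresh=2.0):
    """Sketch of the BAMOS second step.

    Watershed on the inverted, lightly smoothed DSM turns each crown
    into a catchment; "sparse" objects (those touching sunlit
    background) keep only the pixels more than `height_thresh` above
    the local understory; shaded gaps are removed from every object.
    """
    # ~1.0 m mean filter to suppress within-crown micro-relief.
    win = max(1, int(round(1.0 / pixel_size)))
    smooth = ndimage.uniform_filter(dsm, size=win)

    # Local maxima of the smoothed DSM seed the watershed of -DSM.
    peaks = peak_local_max(smooth, min_distance=win)
    markers = np.zeros(dsm.shape, dtype=int)
    markers[tuple(peaks.T)] = np.arange(1, len(peaks) + 1)
    objects = watershed(-smooth, markers)

    crowns = np.ones(dsm.shape, dtype=bool)
    for i in range(1, objects.max() + 1):
        obj = objects == i
        ground = obj & sunlit
        if ground.any():  # sparse object: strip low transitional pixels
            understory = dsm[ground].mean()
            crowns &= ~(obj & (dsm < understory + height_thresh))
    crowns &= ~sunlit
    crowns &= ~shaded
    return crowns
```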

3.2. Assessment of UAV-Based Canopy Covers. The UAV-based canopy covers of the 77 sampling sites are produced by BAMOS. To avoid marginal effects, the central region of approximately 700 m × 700 m at each sampling site is considered effective. The accuracy of these canopy covers needs further evaluation against more reliable reference data. The reference data used here consist of 231 plots with a size of 30 m × 30 m, randomly selected from the 77 sampling areas. For each plot, the tree crowns are delineated by hand-editing vector polygons in ArcGIS (Esri, Redlands, CA, USA), and the reference canopy cover is the ratio of the cumulative area of the vector polygons to the plot size. Compared with line sampling or point sampling [5,19,49], this polygon-based interpretation provides a more complete reference.

3.3. Assessment of Satellite-Based Canopy Cover Maps. The UAV-based canopy covers are further used as references to evaluate the aforementioned satellite-based maps in the Daxing'anling forested area. Given the different coordinate systems of the satellite-based maps and the UAV-based canopy covers, the three satellite-based maps are first converted to the Universal Transverse Mercator coordinate system, which is also the projection of the UAV-based canopy covers; then, the satellite-based maps are evaluated by comparison with the UAV-based canopy covers over the same sampling areas. Two evaluation plans are designed: (1) evaluation at the sampling-area level, where, for each sampling area, the canopy cover of a satellite-based map is the average of all its pixels located within the sampling area, and the reference value is calculated directly on the 8.0 cm UAV-based canopy cover map by equation (1); and (2) evaluation at the pixel level of the satellite-based maps.
For each pixel of the satellite-based maps located within a sampling area, the reference value is calculated directly from the UAV pixels it covers by equation (1):

CC = (1/n) ∑ P_i × 100%,  (1)

where CC is the canopy cover, n is the number of pixels of the UAV-based canopy cover map within the evaluated extent (i.e., a pixel of a satellite-based map or a sampling area), and P_i is 1 or 0 according to whether pixel i is a tree-crown pixel or not.

The potential sunlit background regions detected by thresholding the Sobel gradients are shown in Figure 2(f). The buffer analysis is needed to discriminate false from real sunlit backgrounds, and an elevation-difference threshold of 2 m is adopted in this study. In Figure 2(g), the red and green regions are the inner and outer buffer zones, respectively. The average elevation of the inner buffer is 790.03 m and that of the outer buffer is 790.05 m; the inner average is only 0.02 m lower than the outer average, so this region is a false sunlit background caused by misidentified tree crowns. In Figure 2(h), the average elevations of the inner and outer buffers are 776.18 m and 778.76 m, respectively; the inner average is about 2.6 m lower than the outer average, so this region is a real sunlit background. The finally detected sunlit backgrounds are consistent with the visual interpretation in Figure 2(i). Figure 3 shows the identification of tree crowns in three typical forest stands with different densities. From Figures 3(a), 3(g), and 3(m), it can be seen that each segmented object contains both tree crowns and understory backgrounds. The distribution of sunlit backgrounds (Figures 3(b), 3(h), and 3(n)) is first used to classify the segments as sparse or dense objects according to whether they contain sunlit background pixels. For sparse objects, the height threshold of 2 m is used to remove additional sunlit backgrounds. The results after completely removing the sunlit backgrounds are shown in Figures 3(c), 3(i), and 3(o).
Then, the shaded backgrounds (Figures 3(d), 3(j), and 3(p)) are further excluded to produce the final distribution of tree crowns (Figures 3(e), 3(k), and 3(q)). The tree crowns overlaid on the DOM (Figures 3(f), 3(l), and 3(r)) are consistent with the visual interpretation, indicating that BAMOS is robust in forests of different densities.
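Equation (1) amounts to block-averaging the binary crown map onto a coarser grid; a minimal sketch:

```python
import numpy as np

def aggregate_canopy_cover(crowns, factor):
    """Apply equation (1) on non-overlapping blocks of a binary crown map:
    CC = (1/n) * sum(P_i) * 100% within each coarse pixel.

    `factor` is the number of fine pixels per coarse pixel along one
    axis (e.g., a 30 m Landsat pixel over an 8 cm map gives factor 375).
    """
    h, w = crowns.shape
    h, w = h - h % factor, w - w % factor  # trim partial blocks
    blocks = crowns[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3)) * 100.0

cc = aggregate_canopy_cover(np.array([[1, 1, 0, 0],
                                      [1, 0, 0, 0]]), 2)
print(cc)  # [[75.  0.]]
```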

Results
In this study, the UAV images are collected at 77 sampling sites in the Daxing'anling forested area. Given limited time and expense, it is impossible to collect data only under ideal weather conditions, so some sampling sites are acquired under unfavorable light conditions. Figure 4(a) shows a DOM with scattered cloud shadows, which cause an uneven distribution of light over the sampling area, as is clearer in the enlarged subimage. Figure 4(b) is a DOM of a mountainous site acquired at a low sun elevation angle, where the light conditions are complicated, as shown in the enlarged subimage. Figures 4(c) and 4(d) are the identifications of tree crowns corresponding to Figures 4(a) and 4(b), respectively. It can be seen that the unfavorable conditions have no obvious effect on the performance of BAMOS in extracting tree crowns, which demonstrates its robustness. Figure 5 shows the accuracy assessment of the UAV-based canopy covers based on the 231 manually interpreted plots described in Section 3.2. The horizontal and vertical coordinates of each scatter point represent the reference canopy cover and the UAV-based estimate of a 30 m × 30 m plot, respectively. Most scatter points are distributed along the 1:1 diagonal line. Quantitatively, the correlation coefficient between the UAV-based canopy covers and the manually interpreted references is 0.96, with a root mean square error (RMSE) of 5.7% and a relative root mean square error (rRMSE) of 8.9%, indicating that the UAV-based canopy covers are reliable. Figure 6 shows the geographical matching of the satellite-based maps and the UAV-based canopy cover map at one sampling site. The pixel grids marked in red in Figures 6(a)-6(c) correspond to the GLAD, GLCF, and MODIS VCF, respectively. Due to different reprojection defaults in ArcGIS, their grid sizes are 25.6 m, 30.0 m, and 237.1 m, respectively.
The background image (i.e., the binary image) in Figure 6 is the UAV-based canopy cover map, which provides the reference for evaluating satellite-based maps as described in Section 3.3. To avoid marginal effects, the effective range highlighted in green is determined by the 9 central pixels of MODIS VCF in each sampling area. Figure 7 shows the spatial patterns of the UAV-based canopy covers and the satellite-based GLAD and GLCF maps at three sampling areas. The area in Figure 7(a) has large areas of grass, and the trees are concentrated on the southeast side; Figure 7(b) is a forested area with many scattered sunlit non-tree patches (e.g., grasslands); Figure 7(c) has denser forest and fewer non-tree areas. Figures 7(d)-7(f) are UAV-based canopy covers whose spatial resolution is identical to that of the GLCF (i.e., 30 m). It can be seen that the UAV-based canopy covers accurately capture the distribution pattern of forest on the DOMs in all sampling areas, even in highly heterogeneous areas, e.g., the scattered tree clusters along the road. Although both satellite-based maps can capture the dominant spatial patterns of forest distribution, they obviously underestimate canopy cover in forested areas compared with the UAV maps. In addition, while the GLCF maps tend to overestimate canopy cover in non-forest areas (e.g., grassland regions are mistakenly assigned low but nonzero canopy cover), the GLAD maps tend to neglect tree crowns close to non-tree regions (e.g., grassland), so these regions have high UAV-based canopy cover but very low GLAD values. Figure 8 shows the relative frequency distributions of pixel values of the three canopy cover maps corresponding to Figure 7. The obvious underestimation is the basic feature of the GLAD and GLCF maps when taking the UAV maps as references. Figure 8(a) shows the statistics corresponding to Figure 7(a): for pixels with canopy cover > 10%, the peak position is about 75% in the UAV map but only 45% in the GLAD map and 35% in the GLCF map. The situation is similar at the other two sampling areas.
In Figure 8(b), the three peaks are at 75%, 55%, and 45% for the UAV, GLAD, and GLCF maps, respectively; in Figure 8(c), they are at 75%, 35%, and 45%. For pixels with canopy cover > 10% at all three sites, the dominant frequencies of the UAV maps lie between 60% and 90%, whereas those of the GLAD and GLCF maps are concentrated between 30% and 60%. Figure 9 shows the accuracy assessment of the satellite-based maps over the 77 sampling sites.

In summary, this study proposed the BAMOS method for sampling canopy covers across the Daxing'anling forest areas. The BAMOS consists of two steps: first extracting two types of backgrounds (i.e., shaded gaps and sunlit gaps) through the automatic Otsu algorithm and structural analysis of the DSM, and then identifying the remaining gap pixels (i.e., those located in the transition from sunlit gaps to tree crowns) by object-level analysis of the segmentation.
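The agreement metrics used throughout (correlation coefficient, RMSE, rRMSE) can be computed as below; note that normalizing the rRMSE by the mean of the reference values is our assumption, since the text does not define it:

```python
import numpy as np

def agreement_stats(reference, estimate):
    """Correlation coefficient, RMSE, and rRMSE between two samples.

    Inputs are canopy covers in percent; RMSE is in percentage points,
    and rRMSE is RMSE divided by the reference mean, expressed in %.
    """
    ref = np.asarray(reference, dtype=float)
    est = np.asarray(estimate, dtype=float)
    r = np.corrcoef(ref, est)[0, 1]
    rmse = np.sqrt(np.mean((est - ref) ** 2))
    rrmse = rmse / ref.mean() * 100.0 if ref.mean() else float("nan")
    return r, rmse, rrmse
```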

Novelty of BAMOS.
Although several well-known image processing algorithms are employed, including Otsu's algorithm, edge detection by the Sobel operator, and DSM segmentation by the watershed algorithm, the BAMOS is not just a simple combination of them. Otsu's algorithm is effective at detecting small shaded gaps but fails to detect large sunlit backgrounds, as shown in Figures 2(b) and 2(c). A set of potential sunlit backgrounds can certainly be obtained by thresholding the results of the Sobel operator, but the detected regions can be either true sunlit backgrounds or false ones caused by forest crowns with small elevation changes, as shown in Figure 2. The watershed algorithm is widely used in image segmentation; however, each segmented object contains both tree crowns and backgrounds, as shown in Figures 3(a), 3(g), and 3(m). The innovation of BAMOS lies in accurately identifying tree crowns from these imperfect intermediate results.

Settings of Parameters in BAMOS.
There are three key parameters in the structural analysis of the DSM: the threshold applied to the Sobel operator results to find potential sunlit backgrounds, the width of the buffer zones in the buffer analysis, and the threshold on the elevation difference between the inner and outer buffers. The final results are not highly sensitive to any of them, but extreme settings should be avoided. In this study, the threshold for the Sobel operator is 1.35 m, which corresponds to a slope of 45°. A higher threshold moves the detected interface between sunlit backgrounds and forest crowns outward into the crowns; a lower one moves it inward into the sunlit backgrounds. The buffer width used in this study is 1.0 m, and a smaller width is not recommended. This parameter is coupled with the third parameter: a wider buffer zone tends to produce larger elevation differences between the inner and outer buffers, so a higher elevation-difference threshold is appropriate for identifying true sunlit backgrounds. Otsu's algorithm is used to detect small shaded gaps, with the threshold automatically determined from the histogram of pixel values. Some image preprocessing, such as contrast stretching, may therefore be needed when the contrast between the sunlit and shaded parts of tree crowns is too strong; otherwise, shaded parts of tree crowns may be confused with shaded gaps.
The watershed algorithm is used to split the inverted DSM into a mosaic of objects. Given the ultra-high resolution of the DSM, some trivial structural features within a tree crown may be segmented out if the watershed algorithm is applied directly. Although such over-fragmentation has little effect on the final results, it is better to apply smoothing before the segmentation in order to reduce the computational load of the subsequent object-based analysis. A smoothing window of 1.0 m × 1.0 m is suggested.

Impact of Terrain Effects and Forest Vertical Structure.
Terrain generally affects the extraction of tree crowns because it changes spectral metrics. Within BAMOS, the spectral Otsu algorithm used in dense forests is more affected by terrain than the structural analysis used in sparse forests. In dense forests, the key is to discriminate shaded crowns from shaded gaps. Although terrain can change spectral features to some degree, Otsu's method can still differentiate them because shaded crowns remain brighter than shaded gaps regardless of terrain, as validated in this study: although Figure 4(b) was acquired in a complex mountainous region with obvious terrain effects, the UAV-based canopy cover map (Figure 4(d)) is still reliable, indicating that terrain has no significant impact on the BAMOS method.
The forest vertical structure may have some effects on the setting of parameters. In this study, the vertical structure of Daxing'anling forested areas usually has two layers (i.e., understory vegetation and overstory tree crowns), which ensures that the regions with significantly lower inner buffer are sunlit backgrounds. The applicability of the structural analysis of DSM in forests with three or more layers should be further examined.

Assessment of Global Canopy Cover Maps
5.2.1. Impact of Temporal Inconsistency. The three global products have different temporal offsets from the UAV-based reference data: the time intervals relative to the UAV data are 8, 3, and less than 1 year for GLAD, GLCF, and MODIS VCF, respectively. Temporal inconsistency could explain the differences between satellite-based and UAV-based canopy covers to some extent; e.g., Figure 7(i) shows a quite different spatial pattern from Figure 7(f) or Figure 7(l), especially in forested regions adjoining non-forest areas, which may be related to the growth of young forests during the 8-year interval. According to the results in Figure 9, the RMSE decreases with decreasing temporal interval at both the sampling-area scale (from 28.17% to 19.59%) and the pixel scale (from 33.94% to 22.84%), which also indicates that the temporal interval may partly explain the different performance of the global products. Nevertheless, the MODIS VCF of the same year, 2018, still shows obvious underestimation with an RMSE of about 20%, indicating that temporal inconsistency cannot be the main factor behind the obvious underestimation of the global products in the Daxing'anling forested area.

Impact of Geolocation.
The geolocation of the UAV-based canopy covers is determined by the position/attitude measurement system, an approach also used by Chianucci et al. [30]. The geolocation error of the UAV-based digital orthophoto maps (DOMs) is less than 3 m, which is far smaller than a satellite-based pixel (e.g., 30 m or 250 m). Figure 7 also intuitively shows the geographical matching between the UAV-based canopy covers and the global products (i.e., GLAD and GLCF). Therefore, although we acknowledge that geolocation may be a source of error, it does not have a significant impact on our assessments.

Conclusion
In this study, a new method (i.e., BAMOS) is proposed to distinguish tree crowns and understories by synergizing the UAV-based DOM and DSM; its strength is that it does not depend on understory terrain and is robust to terrain and weather effects. Highly accurate UAV-based canopy cover maps over the 77 sampling areas across the Daxing'anling forested area were produced by BAMOS, with an RMSE of 5.7%. These UAV-based canopy cover samples provide reference data for the assessment of coarse satellite-based canopy cover maps. Results show that both the GLAD and GLCF canopy covers can capture the dominant spatial patterns, but the GLAD canopy cover tends to miss scattered trees in highly heterogeneous areas, and the GLCF fails to capture non-tree areas. Most importantly, obvious underestimation with an RMSE of about 20% is observed in all satellite-based maps, although the temporal inconsistency with the references may have contributed to it.

Data Availability
The UAV-based canopy covers of the 77 sampling sites, at a spatial resolution of 30 m, are freely available at https://zenodo.org/record/5702373.