Data Set Title / Coverage Name: SAA Land Cover - Final Version Polygons by States in the SAA
SAA Version 3.0
Version Date: 3/21/96
----------------------------------------------------------------
----------------------------------------------------------------
Identification Information

Data Layer Name: **_lcp where ** is the postal code for the state
Description: SAA Land Cover (2 acre minimum mapping unit) (Final Polygon Version)
Keywords: land cover, land use, Landsat TM satellite data
Citation:
Native Data Set Environment: UNIX; ARC/INFO 7.0.3
pathname:
Scale: 1:100,000
File Format: ARC/INFO 7.0.3 GRID
Use Restrictions: none
Access Restrictions:
----------------------------------------------------------------
----------------------------------------------------------------
Spatial Reference Information

Datum: North American Datum 1983 (NAD83)
Precision: single
Projection: Albers Equal Area
Units: meters
Spheroid: WGS-84
1st Std Parallel: 34 00 00
2nd Std Parallel: 38 00 00
Central Meridian: -82 00 00
Origin: 33 00 00
False Northing: 0.0
False Easting: 0.0
Extent:
    West Bounding Coord.: -421536.500
    East Bounding Coord.: 354593.500
    North Bounding Coord.: 732100.594
    South Bounding Coord.: 16270.594
Distance Resolution:
Vertical Resolution: n/a
----------------------------------------------------------------
----------------------------------------------------------------
Data Quality Information

Thematic Accuracy: unknown
    Confidence: unknown
    Accuracy Method: unknown
Horizontal Accuracy: unknown
    Confidence: unknown
    Accuracy Method: unknown
Vertical Accuracy: n/a
Logical Consistency:
Completeness:
----------------------------------------------------------------
----------------------------------------------------------------
Source Information

Source Material: Landsat TM satellite scenes
    Organization: Multi-Resolution Land Characteristics Consortium, US Forest Service, Tennessee Valley Authority
    Date:
    Distance Resolution: unknown
    Contribution: spectral data, multitemporal coverage

Source Material: Stands, Spruce-Fir extent delineations
    Organization: US Forest Service
    Date:
    Distance Resolution: unknown
    Contribution: spruce-fir class, training material, aerial photography

Source Material: USGS LUDA (GIRAS) data, DMA 3 Arc second DEM
    Organization: US Geological Survey
    Date:
    Distance Resolution: unknown
    Contribution:

Source Material: River Reach 3 File (RF3)
    Organization: US Environmental Protection Agency (USGS - Digital Line Graph (DLG))
    Date:
    Distance Resolution: unknown
    Contribution: streams

Source Material: National Wetland Inventory
    Organization: National Biological Service
    Date:
    Distance Resolution: unknown
    Contribution: wetlands class
---------------------------------------------------------------
---------------------------------------------------------------
Processing History Information

Process Description: The following text has been extracted from the "Southern Appalachian Assessment Land Cover Mapping Project Final Report" produced by Pacific Meridian Resources, Inc. (PMR). PMR was contracted to perform the land cover classification. I did minor editing to the extracted material. Karl A. Hermann 3/21/96
*************************************************************
All image classifications were performed using ERDAS (Atlanta, Georgia) software. The initial design of this project was to acquire all the imagery at once and then process all of the data simultaneously through each of the image processing tasks. Delays in receiving data resulted in the imagery being processed as three sets of scenes. The following describes the methods used to classify a single scene from start to finish.

Classification Scheme

The first step in any mapping project is the specification of a classification system which categorizes the features of the earth to be mapped.
Specifications of the system are driven by (1) the anticipated uses of the map information, and (2) the features of the earth that can be discerned with the data (e.g., aerial photography, satellite imagery) being used to create the map. A classification system has two critical components: (1) a set of labels (Herbaceous, Spruce-Fir, Northern Hardwood, etc.); and (2) a set of rules--or a system--for assigning labels (e.g., a "Northern Hardwood forest will consist of 70% or more of the total forested area in sugar maple, beech or yellow birch."). Without a clear set of rules, the assignment of labels to types can be arbitrary and lack consistency.

The land use/land cover classification scheme for this project was designed to best meet the needs of all agencies involved in the SAA study. The primary concern in determining the scheme was balancing project time and budget constraints against those needs. Table 2 presents the final classification labels for the data layers developed from the satellite image classification. Appendix A contains the definitions for determining the appropriate land cover classification label.

Table 2. Final Classification Scheme

Class Number    SAA Class Descriptions
     1          Northern Hardwood Forests
     2          Mixed Mesophytic Hardwood Forests
     3          Oak Forests
     4          Bottomland Hardwood Forests
     5          White Pine / Hemlock Forests
     6          Montane Spruce-Fir Forests
     7          Southern Yellow Pine Forests
     8          White Pine / Hemlock / Hardwood Forests
     9          Mixed Pine / Hardwood Forests
    10          Herbaceous
    11          Barren
    12          Agriculture - Pasture
    13          Agriculture - Cropland
    14          Wetlands
    15          Developed
    16          Water
    17          Indeterminate - Clouds, Shadows

Data Preprocessing

A total of fourteen Landsat TM scenes (Table 3) were needed to cover the SAA study area. This imagery was obtained from three sources: Multiresolution Land Characteristics (MRLC) through the EROS Data Center, EOSAT, and the Tennessee Valley Authority (TVA).
With multitemporal data available for most scenes, the amount of data involved in this mapping project was enormous. Both hardware and time constraints for the project made data reduction a necessity. The first step in dealing with the imagery was therefore to subset the bands to be used in the classification.

Table 3. Landsat TM Imagery Used to Classify the Southern Appalachians

Scene    Date
1633     3/1/92, 5/20/92
1634     5/20/92, 9/28/93
1733     10/2/92, 7/17/93
1734     9/14/91, 10/2/92
1735     5/11/92, 11/3/92
1834     9/29/94, 11/29/93
1835     10/25/92, 6/6/93
1836     6/6/93, 4/19/93
1935     4/23/92, 7/31/93
1936     7/12/92, 11/17/92
1937     7/31/93, 10/3/93
2035     6/28/90
2036     6/12/90
2037     11/11/93, 8/26/94

Several statistical techniques exist to select the best combination of spectral bands of the imagery (Jensen, 1986; Swain, 1978; Johnson and Wichern, 1982). In cover type classification and biophysical variable analysis, there is no one optimum choice of spectral bands. Band selection for this project was based on the sensor involved, the area of interest, hardware limitations, and time constraints. For each scene area, an eight-band, multi-temporal image was created that consisted of bands 3, 4, 7 and a ratio of band 3 to band 4 from both image dates. There were two exceptions to this eight-band selection. Only one scene date was received from the TVA for scenes 20/35 and 20/36, so only four bands were used for them; band 7 was also not delivered with these scenes, so band 5 was used in its place.

Since the majority of the TM scenes and most of the ancillary data were delivered in UTM zone 17 on the NAD83 datum, processing was completed in this coordinate system. Any imagery not in UTM zone 17, NAD83 was converted using PROJECT in ARC/INFO. The nearest neighbor algorithm was used. In addition, the single-date scenes received from the TVA had to be resampled to 30 meter pixels and then projected into UTM zone 17.
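
The eight-band, multi-temporal image described above can be illustrated with a short sketch. This is not the ERDAS workflow used in the project; it is a minimal NumPy example that assumes each band is already available as a co-registered 2-D array. The function names, the float32 output, and the handling of zero-valued band 4 pixels (ratio set to 0) are assumptions made for illustration only.

```python
import numpy as np


def date_bands(b3, b4, b7):
    """Return a 4-layer stack for one image date: band 3, band 4, band 7, and band 3 / band 4.

    Hypothetical helper: inputs are 2-D arrays of equal shape. Where band 4 is
    zero the ratio is set to 0 (an assumption; the report does not say how
    such pixels were handled).
    """
    b3 = np.asarray(b3, dtype=np.float32)
    b4 = np.asarray(b4, dtype=np.float32)
    b7 = np.asarray(b7, dtype=np.float32)
    ratio = np.divide(b3, b4, out=np.zeros_like(b3), where=b4 != 0)
    return np.stack([b3, b4, b7, ratio])


def multitemporal_stack(date1, date2):
    """Combine two dates, each a (band3, band4, band7) tuple, into an 8-layer image."""
    return np.concatenate([date_bands(*date1), date_bands(*date2)], axis=0)
```

For the single-date TVA scenes, calling `date_bands` once with band 5 passed in place of band 7 would give the four-layer variant described in the text.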
All scenes were checked for anomalies such as striping, clouds, leaf status and proper projection/rectification. Several scenes were reordered because both dates were 'leaf off'. Hydrology and roads coverages were used to check rectification of the scenes. One date for each of the three scenes 17/35, 18/35 and 20/37 was mis-registered. Their registration was corrected by shifting the upper left coordinates by an amount specified by the EROS Data Center, which supplied the data. Each scene was cut to the SAA study area with an additional one mile buffer, and scene-to-scene overlap was minimized.

Preliminary Field Visit

The purpose of visiting the study area for field data collection is twofold: 1) to assess and document ground vegetation variation and 2) to transfer knowledge between Forest Service and Pacific Meridian personnel. Variation in vegetation is directly correlated with the spectral variation in satellite imagery, and the image classification process relies on this correlation. For this to be successful, a merging of the knowledge of forest vegetation with the knowledge of spectral reflectance is imperative. The best means of bringing these two factors together is through teamwork in the field.

Prior to any processing, an initial field visit was made to acquire a general knowledge of the vegetation in the Southern Appalachians. This was a week-long trip visiting forests from northern Georgia to northern Virginia. Forest Service personnel accompanied Pacific Meridian in the field, giving an overview tour of their forest and explaining the causes of vegetation variation in their districts. This was an excellent start to the project, and notes from this trip were used in all steps that follow.

Image Classification

Upon completion of pre-processing and the preliminary field visit, each image was stratified into urban, agriculture, and other areas using the Land Use Data Analysis (LUDA) layer provided by the National Biological Service. Stratification of imagery allows the image processor to narrow the range of expected classification results based on ancillary knowledge. Areas in the LUDA data labelled as agriculture or urban were used to mask corresponding areas from the imagery. An ISODATA unsupervised classification of approximately 25 clusters was then run on these areas, and the classes were identified using aerial photos and the image processors' knowledge of satellite imagery.

The image processors first labelled the clusters as either forested, non-forested or mixed classes. Forested areas were set aside to be processed later. Non-forested areas were further separated into urban, pasture, crop or mixed. Areas that were labelled as agriculture in the LUDA data and that Pacific Meridian identified as crop or pasture were considered to be correctly classified. Likewise, urban areas identified by Pacific Meridian which matched the LUDA data were also considered to be correct. These areas were set aside and not considered for future processing.

The mixed classes were then run through another ISODATA unsupervised classification for further class separation and identification. Again, classes were identified as urban, pasture, crop, forested, or mixed and compared to the LUDA data as discussed above. Any classes still identified as mixed then defaulted to the LUDA classification, with agricultural areas being labelled pasture or crop based on photo interpretation by the image processors. Forested areas were placed back with the original imagery for the remainder of the classification.

An ISODATA unsupervised classification was then run on the remaining imagery.
Depending on the variation in the imagery, fifty to seventy-five spectral classes were identified in the classification. Several classes were identified as being spectrally very similar but representing very general land cover types within the study area. For example, water, pasture, and herbaceous were each represented by multiple spectral classes. In cases where a spectral class could reliably be found to represent a single land cover type (e.g., water or barren), the spectral classes were simply relabeled to that land cover type. Classes that could be identified as a general class type, such as deciduous, but not as a specific cover type, such as northern hardwood, were labeled as deciduous, coniferous, etc. More confused spectral classes were given a unique color so they could be easily distinguished.

This newly recolored spectral variation map was used to identify areas to be field visited. As many of the different spectral classes as possible were targeted for field visits to collect field data pertaining to vegetation cover type for the entire range of spectral reflectance present in the study area. The goal was to correlate the spectral variation in the imagery with the variation in the land cover.

Field Data Collection

Field work occurred during the months of May and July, 1995. Draft maps of the initial classification were taken to the field and compared to the vegetation on the ground. Notes were taken to indicate where the classification was accurately identifying the land cover and where it was inaccurate. General notes were also recorded that documented all spectral and vegetative variation encountered. These field notes were used to assist image processors in checking the results of unsupervised classifications and identifying training sites for completing image classification of the study area. More than 100 person days were spent visiting the study area and meeting with Forest Service personnel from each of the forests in the study area.
Pacific Meridian image processors completed the field data collection with the assistance of Forest Service personnel. The image processors responsible for completing image classification were also responsible for the field data collection, ensuring consistency throughout the entire mapping project.

Re-Processing

After returning from the field, Pacific Meridian used the knowledge gained in the field to refine the unsupervised classifications. Areas noted in the field as correctly classified were set aside. Areas noted as being incorrect were masked from the imagery and re-classified. In addition, a few supervised training sites were added where the unsupervised classification could not distinguish between classes.

As a classification was completed, digital elevation data, hydrology layers, field notes, aerial photography, and ARC/INFO stand inventory layers were used to identify possible problem areas in the maps. For example, digital elevation models were compared with the classification to identify areas where northern hardwoods were appearing below 4,000 feet in elevation. While this combination is not impossible for the area, it is an unusual occurrence and warranted checking. Aerial photos and field notes were used to check these areas. Where the classification was determined to be correct, the area was set aside so as not to be changed. Areas classifying incorrectly were selected for re-processing and run through another iteration. This continued until no further gains in class identification could be achieved from the imagery. Any remaining modifications would be achieved with the use of ancillary data and manual editing.

From the start of the mapping project, several classes were known to be difficult to achieve strictly from the satellite imagery. These included wetlands and bottomland hardwoods. Bottomland hardwoods were classified using the imagery along with hydrology and elevation data.
A one hundred meter buffer of streams at lower elevations was used to confine these hardwoods to the bottomlands. An unsupervised classification was run and bottomland hardwood classes were identified using field notes and aerial photography.

Wetlands were based exclusively on the Fish and Wildlife Service National Wetland Inventory (NWI). Wetlands were defined as the palustrine wetland sub-classes of forested, emergent, scrub-shrub, and moss/lichen that were greater than two acres in size. The only exceptions were that "Water" and "Bottomland Hardwood" classes would always take precedence where the NWI and SAA classes overlapped. In addition, the Forest Service's 1992 montane spruce-fir delineation was used as the spruce-fir classification.

Draft Map Review and Editing

Draft maps were plotted at 1:100,000 scale and delivered to the National Forests within the study area for review and comment. Unfortunately, Forest Service input during the draft map review was inconsistent. Several of the draft map reviews were excellent, identifying both correct and incorrect classes as well as having comments on the general ecology of the area being reviewed. However, several draft map reviews were incomplete or were never returned (see Appendix B). Of the total 74 draft 1:100,000 maps prepared for review, only 22 were returned with comments.

As Pacific Meridian received the reviewed draft maps, image processors made final enhancements and corrections to the classification. Since few corrections had been identified by Forest Service personnel during the draft map review, image processors reviewed aerial photography where available while making edits. As no aerial photo coverage was provided for much of the private lands, only limited editing could be accomplished in these areas.

Edgematching

The remaining step in the development of the land use/land cover raster classification was the joining of all fourteen scene classifications into one file.
When several scenes are classified independently and then joined together, inconsistencies in data can result along scene boundaries. These inconsistencies are the result of several factors:

1.) Differences in adjoining scene dates, which can capture the seasonal variation in vegetation as well as natural or man-made disturbances and cloud cover variation.
2.) Incomplete ancillary data, such as gaps in the wetlands data where data collection is not yet complete.
3.) Variation in or lack of draft map comments. For example, the comments received for scenes 16-17/33 indicated that "White Pine/Hemlock" areas occurred throughout the area. Comments from adjoining scenes 16-17/34 indicated that "White Pine/Hemlock" occurred very infrequently in the area.
4.) Inconsistency in image classification due to processing of adjoining scenes by different personnel.

Edgematching minimizes inconsistencies between adjoining scenes. The classified scenes were stitched together and then visually checked against the imagery and aerial photography. Inconsistencies caused by variation in scene dates and incomplete ancillary data were not changed because the data, based on the scene classified, is correct. Inconsistencies due to differences in draft map reviews were changed only if a definitive distinction between the correct class and the incorrect class could be identified on the aerial photography. Remaining inconsistencies were checked against the imagery, aerial photography and other ancillary data and the appropriate edits were made.

After edgematching, the final step was to transfer the data to an Arc GRID format and convert it to the SAA Albers projection.

Polygon Creation

The final land cover classification raster layer was used to develop the polygon GIS coverage. The polygonization process has three distinct stages: raster scan; conversion to ARC for elimination; and polygon labeling.

1. Raster Scan - The raster scanning process smooths or generalizes pixel data to produce homogeneous units. The resulting units represent areas of homogeneous land cover. The scanning process was accomplished using a series of scanning routines in ERDAS.

2. Conversion to ARC and Elimination - The products of the ERDAS scanning routine were converted to an ARC polygon coverage and all polygons less than 2 acres were removed. First, all polygons less than 0.3 acres were removed using a simple DISSOLVE command. The remaining polygons less than 2 acres in size were removed using a series of intelligent eliminates. Intelligent eliminates compare the less than 2 acre polygon to the polygons surrounding it and prioritize the surrounding polygon with which it should be merged. For example, if a one acre white pine/hemlock polygon was surrounded by an oak polygon and a white pine/hemlock/hardwood polygon, the one acre polygon would be merged with the latter. Due to software constraints, the study area had to be processed in five parts.

3. Polygon Labeling - After vectorization of the classified pixel data, polygon attributes or labels were added to the polygon data. Polygon pixel summaries were calculated for each polygon and a label was assigned based on the composition of the classes within the polygon. For example, if a polygon was 65% oak, 25% white pine - hemlock, and 10% barren, the polygon received the label of Oak Forest. The rules used to label the polygons can be found in Appendix C.
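
The polygon labeling example above can be traced with a short sketch of the first four Appendix C rules. This is a partial illustration, not the project's implementation: the function name, the dictionary input, and the `None` return for polygons that fall through to the later (unimplemented) rules are assumptions. The sketch also assumes that "30% more" means 30 percentage points of polygon area, that class 17 may compete in the majority rule, and that a tie between classes 12 and 13 resolves to class 12; the source does not settle any of these.

```python
from typing import Dict, Optional


def label_polygon(pct: Dict[int, float]) -> Optional[int]:
    """Apply the first four Appendix C rules to one polygon.

    pct maps class number (1-17) to percent of polygon area.
    Returns a class number, or None when the later rules would be needed.
    """
    def get(c: int) -> float:
        return pct.get(c, 0.0)

    # Rule 1: mostly clouds/shadows.
    if get(17) >= 75:
        return 17

    # Rule 2: pasture and cropland are pooled, then the larger one wins
    # (tie goes to class 12, an assumption).
    ag = get(12) + get(13)
    larger_ag = 12 if get(12) >= get(13) else 13
    if ag >= 66:
        return larger_ag

    # Rule 3: any other single class covering at least 66%.
    for c in list(range(1, 12)) + [14, 15, 16]:
        if get(c) >= 66:
            return c

    # Rule 4: majority class leads the runner-up by at least 30 points,
    # with classes 12 and 13 pooled as one candidate.
    candidates = {c: get(c) for c in range(1, 18) if c not in (12, 13)}
    candidates["ag"] = ag
    ranked = sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)
    (top, top_pct), (_, second_pct) = ranked[0], ranked[1]
    if top_pct - second_pct >= 30:
        return larger_ag if top == "ag" else top

    return None  # remaining Appendix C rules are not sketched here
```

With the report's example, `label_polygon({3: 65, 5: 25, 11: 10})` reaches rule 4: oak leads white pine - hemlock by 40 points, so the polygon is labeled class 3 (Oak Forests).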

Crosswalk between Map Class and Accuracy Assessment Class

SAA Class Descriptions                      Accuracy Assessment Class
Northern Hardwood Forests                   Hardwood
Mixed Mesophytic Hardwood Forests           Hardwood
Oak Forests                                 Hardwood
Bottomland Hardwood Forests                 Hardwood
White Pine / Hemlock Forests                Conifer
Montane Spruce-Fir Forests                  Conifer
Southern Yellow Pine Forests                Conifer
White Pine / Hemlock / Hardwood Forests     Conifer/Hardwood
Mixed Pine / Hardwood Forests               Conifer/Hardwood
Herbaceous                                  Herbaceous
Barren                                      Barren
Agriculture - Pasture                       Agriculture
Agriculture - Cropland                      Agriculture
Wetlands                                    Wetlands
Developed                                   Developed
Water                                       Water
Indeterminate - Clouds, Shadows             Indeterminate

The USFS and Pacific Meridian will incorporate fuzzy set theory to assess the accuracy of this data. Color infrared, Forest Health Monitoring (FHM) stereo triplicate photography at a scale of approximately 1:12,000 was chosen as the primary sampling unit (PSU) for accuracy assessment. The center of each forest health photo set was digitized and buffered to create a circular PSU of approximately 1,236 acres. A total of 236 PSUs were buffered. The polygon coverage was then intersected by these PSUs to produce a final coverage representing the population of polygons available for accuracy assessment sampling. The list of polygons available for sampling was summarized in a database which was then sent to the Rocky Mountain Forest and Range Experiment Station for polygon sample selection. The goal was to select 50 polygons of at least 40 acres for each land use/land cover class for accuracy assessment analysis. An accuracy assessment is still being conducted. Results are expected by July or August 1996.
****************************************************************
This classification is a comprehensive database for the area and provides a powerful tool for resource planners to utilize in increasing the effectiveness of resource management. This database is currently being assessed for accuracy.
The database will be made available by the Forest Service on multiple CD-ROMs in early 1996. For information on the CD-ROM, contact Karl Hermann at the National Biological Service, University of Tennessee, 17 Ridgeway Road, Norris, TN 37828.
************************************************************
Appendices
***************************************************************
Appendix A

Southern Appalachian Assessment Land Cover Classification Key

If clouds, shadows . . . . . . . . . . . . . . . . . . . . Indeterminate
If NWI = Palustrine or Lacustrine wetland and area is not
   open water or bottomland hardwood species . . . . . . . Wetlands
If vegetation > 25% . . . . . . . . . . . . . . . . . . . . Vegetated
   If > 25% tree crown cover . . . . . . . . . . . . . . . Forested
      If > 70% of the total tree crown cover is deciduous . Deciduous
            Northern Hardwood Forests
            Mixed Mesophytic Hardwood Forests
            Oak Forests
            Bottomland Hardwood Forests
      Else if > 70% of the total tree crown cover is
         evergreen . . . . . . . . . . . . . . . . . . . . Evergreen
            White Pine - Hemlock Forests
            Montane Spruce - Fir Forests
            Southern Yellow Pine Forests
      Else . . . . . . . . . . . . . . . . . . . . . . . . Mixed
            White Pine - Hemlock - Hardwood Forests
            Mixed Pine - Hardwood Forests
   Else if tree crown cover < 25%
      If cropland or pasture . . . . . . . . . . . . . . . Agriculture
            Improved Pastures
            Croplands
      Else . . . . . . . . . . . . . . . . . . . . . . . . Herbaceous
Else if vegetation < 25% . . . . . . . . . . . . . . . . . Non-Vegetated
   If > 50% synthetic surface (LUDA) . . . . . . . . . . . Developed
   If lake, river, pond . . . . . . . . . . . . . . . . . . Water
   Else if > 75% is non-vegetated . . . . . . . . . . . . . Barren
**************************************************************
Appendix C

Polygon Labeling Rules

In the rules below, sum(...) denotes the summed percentage of the listed classes.

If c17 >= 75% then label = c17
If (c12+c13) >= 66% then label = whichever is greater between c12 and c13
If any other class (c1 through c11 or c14 through c16) >= 66% then label = that class
If class with majority of pixels has >= 30% more of the polygon area than the next highest class then label = majority class. For this rule, c12 and c13 should be added together for consideration as a majority class and then if these two classes are >= 30% then the label will be whichever is greater between c12 or c13.
If sum(c1 through c10, c12, c13, c14) >= 25% then
   If sum(c1 through c9) > 25% then
      If sum(c1 through c4) / sum(c1 through c9) >= 70% then
         If c1 >= c2, c3 and c4 then label = c1
         If c4 >= c1, c2 and c3 then label = c4
         If c3 >= c1, c2 and c4 then label = c3
         else label = c2
      If sum(c5, c6, c7) / sum(c1 through c9) >= 70% then
         If c6 >= c5 and c7 then label = c6
         If c5 >= c6 and c7 then label = c5
         else label = c7
      If sum(c8, c9) / sum(c1 through c9) >= 70% then
         If c8 >= c9 then label = c8
         else label = c9
      If sum(c5, c6, c8) >= sum(c7, c9) then label = c8
         (**Please don't simplify this and the next line with "else c7". I want to catch labels that fall through)
      If sum(c5, c6, c8) < sum(c7, c9) then label = c7
   Else if sum(c1 through c9) <= 25% then
      If sum(c12, c13) >= c10 and c12 >= c13 then label = c12
      else if sum(c12, c13) >= c10 and c12 < c13 then label = c13
      else label = c10
Else if sum(c1 through c10, c12, c13, c14) < 25% then
   If c15 >= 50% then label = c15
   else if c16 >= 50% then label = c16
   else if c11 >= 50% then label = c11
   else if c11 >= c15 and c11 >= c16 then label = c11
   else if c15 >= c16 then label = c15
   else if (c16 - c15) <= 15% then label = c15
   else if (c16 - c15) > 15% then label = c16
Else label = default value of 20

Where
   c1  = Class 1  = Northern Hardwood Forests
   c2  = Class 2  = Mixed Mesophytic Hardwood Forests
   c3  = Class 3  = Oak Forests
   c4  = Class 4  = Bottomland Hardwood Forests
   c5  = Class 5  = White Pine - Hemlock
   c6  = Class 6  = Spruce-Fir
   c7  = Class 7  = Southern Yellow Pine
   c8  = Class 8  = White Pine - Hemlock - Hdwd
   c9  = Class 9  = Mixed Pine - Hardwood
   c10 = Class 10 = Herbaceous
   c11 = Class 11 = Barren
   c12 = Class 12 = Pastures
   c13 = Class 13 = Crop
   c14 = Class 14 = Wetlands
   c15 = Class 15 = Developed
   c16 = Class 16 = Water
   c17 = Class 17 = Indeterminate
----------------------------------------------------------------
----------------------------------------------------------------
Entity/Attribute Information

Entity: LANDCOV.VAT
Definition: land cover class
Attributes: VALUE
            COUNT

Entity:
Definition:
Attributes:

Entity:
Definition:
Attributes:
----------------------------------------------------------------
----------------------------------------------------------------
Status Information

Data Set Status: Available
Release Date: 3/21/96
----------------------------------------------------------------
----------------------------------------------------------------
Metadata Reference Information

Date: 3/20/96
Review Date:
Contact: Karl A. Hermann
         National Biological Service Cooperative
         University of Tennessee
         17 Ridgeway Road
         Norris, TN 37828
         samab@utk.edu
         (423)-632-1452
----------------------------------------------------------------
----------------------------------------------------------------
Distribution Information

Distribution Contact: Karl A. Hermann
                      National Biological Service Cooperative
                      University of Tennessee
                      17 Ridgeway Road
                      Norris, TN 37828
                      samab@utk.edu
                      (423)-632-1452
Distribution Liability:
File Compression Technique: UNIX compress, Arc/Info export format
Transfer Size:
---------------------------------------------------------------
---------------------------------------------------------------
Contact Information

Contact Person: Karl A. Hermann
Contact Mail Address: National Biological Service Cooperative
                      University of Tennessee
                      17 Ridgeway Road
                      Norris, TN 37828
Contact Phone: (423)-632-1452
Contact Fax: (423)-632-1612
Contact email: samab@www.lib.utk
Home Page for Updates: http://www.lib.utk.edu/samab
----------------------------------------------------------------
----------------------------------------------------------------